-
Notifications
You must be signed in to change notification settings - Fork 9
Read_Mapping
Read_Mapping starts a task array of QSub job submissions to the Portable Batch System job scheduler for lab-long to read map using the Burrows-Wheeler Aligner (BWA-MEM). It can also index a FastA file using BWA if the provided reference is not already indexed. To run Read_Mapping, all common and wrapper-specific variables must be defined within the configuration file. Once the variables have been defined, Read_Mapping can be submitted to a job scheduler with the following command (assuming that you are in the directory containing sequence_handling):
./sequence_handling Read_Mapping ConfigWhere Config is the full file path to the configuration file.
The following are a list of variables that need to be defined within Config. In addition to the handler-specific variables, all common variables must be defined. The default parameters can be found on the BWA manual page and are designed for human genomes. Parameters may need to be adjusted on a per-species basis.
| Variable | Function | Default Value |
|---|---|---|
RM_QSUB |
QSub settings for batch submission. Recommended settings are "mem=8gb,nodes=1:ppn=8,walltime=48:00:00". |
|
TRIMMED_LIST |
A list of adapter-trimmed or quality-trimmed samples to read map. This will be ${OUT_DIR}/Adapter_Trimming/${PROJECT}_trimmed_adapters.txt (Adapter_Trimming) or ${OUT_DIR}/Quality_Trimming/${PROJECT}_trimmed_quality.txt (Quality_Trimming). |
|
FORWARD_TRIMMED |
Suffix for forward reads. This will be _Forward_ScytheTrimmed.fastq.gz (Adapter_Trimming) or _R1_trimmed.fastq.gz (Quality_Trimming). |
|
REVERSE_TRIMMED |
Suffix for reverse reads. This will be _Reverse_ScytheTrimmed.fastq.gz (Adapter_Trimming) or _R2_trimmed.fastq.gz (Quality_Trimming). |
|
SINGLES_TRIMMED |
Suffix for single reads. This will be _Single_ScytheTrimmed.fastq.gz (Adapter_Trimming) or _single_trimmed.fastq.gz (Quality_Trimming). |
|
THREADS |
How many threads to use. | 1 |
SEED |
Minimum seed length. | 19 |
WIDTH |
Band width. | 100 |
DROPOFF |
Off-diagonal x-dropoff (Z-dropoff). | 100 |
RE_SEED |
Re-seed value. | 1.5 |
CUTOFF |
Cutoff value. | 10000 |
MATCH |
Matching score. | 1 |
MISMATCH |
Mismatch penalty. | 4 |
GAP |
Gap penalty. | 6 |
EXTENSION |
Gap extension penalty. | 1 |
CLIP |
Clipping penalty. | 6 |
UNPAIRED |
Unpaired read penalty. | 9 |
INTERLEAVED |
Is the first input query interleaved? | false |
RM_THRESHOLD |
Minimum threshold. | 30 |
SECONDARY |
Output all alignments and mark as secondary. | false |
APPEND |
Append FastA/Q comments to SAM files. | false |
HARD |
Use hard clipping. | false |
SPLIT |
Mark split hits as secondary. | true |
VERBOSITY |
Verbosity level. Choose from 'disabled', 'errors', 'warnings', 'all', or 'debug'. |
'all' |
Note: If running single-end samples, leave FORWARD_TRIMMED and REVERSE_TRIMMED filled with values that do NOT match your samples. If running paired-end samples, leave SINGLES_TRIMMED filled with values that do NOT match your samples.
If your reference genome is not indexed, Read_Mapping generates an index file for the reference genome in the same directory as the reference genome. Please make sure you have write permissions for said directory. After indexing Read_Mapping will exit, so you will need to run Read_Mapping again to map reads.
Read_Mapping generates aligned SAM files for each sample, located under
${OUT_DIR}/Read_Mapping/${SAMPLE}.samwhere ${OUT_DIR} is specified in the configuration file. These SAM files have the '@SQ', '@RG', and '@PG' headers included in them. The '@HD' header is not generated from this process.
A list of files is not generated from Read_Mapping. However, you can generate one using sample_list_generator.sh.
Read_Mapping depends on the Burrows-Wheeler Aligner and the Portable Batch System to run. If you want to use a different job scheduler or read mapper, you will need to modify this script extensively. Future implementations of Read_Mapping using Bowtie 2 are in progress.
Next: SAM_Processing
- Getting Started
- Recommended Workflow
- Configuration
- Dependencies
- sample_list_generator.sh
- Slurm specific options
- Common Problems and Errors