Read_Mapping

Basic Usage

Read_Mapping starts a task array of QSub job submissions to the Portable Batch System job scheduler for lab-long to read map using the Burrows-Wheeler Aligner (BWA-MEM). It can also index a FastA file using BWA if the provided reference is not already indexed. To run Read_Mapping, all common and wrapper-specific variables must be defined within the configuration file. Once the variables have been defined, Read_Mapping can be submitted to a job scheduler with the following command (assuming that you are in the directory containing sequence_handling):

./sequence_handling Read_Mapping Config

Where Config is the full file path to the configuration file.

Handler-Specific Variables

The following are a list of variables that need to be defined within Config. In addition to the handler-specific variables, all common variables must be defined. The default parameters can be found on the BWA manual page and are designed for human genomes. Parameters may need to be adjusted on a per-species basis.

Variable	Function	Default Value
`RM_QSUB`	QSub settings for batch submission. Recommended settings are `"mem=8gb,nodes=1:ppn=8,walltime=48:00:00"`.
`TRIMMED_LIST`	A list of adapter-trimmed or quality-trimmed samples to read map. This will be `${OUT_DIR}/Adapter_Trimming/${PROJECT}_trimmed_adapters.txt` (Adapter_Trimming) or `${OUT_DIR}/Quality_Trimming/${PROJECT}_trimmed_quality.txt` (Quality_Trimming).
`FORWARD_TRIMMED`	Shared suffix for forward reads. This will be `_Forward_ScytheTrimmed.fastq.gz` (Adapter_Trimming) or `_R1_trimmed.fastq.gz` (Quality_Trimming).
`REVERSE_TRIMMED`	Shared suffix for reverse reads. This will be `_Reverse_ScytheTrimmed.fastq.gz` (Adapter_Trimming) or `_R2_trimmed.fastq.gz` (Quality_Trimming).
`SINGLES_TRIMMED`	Shared suffix for single reads. This will be `_Single_ScytheTrimmed.fastq.gz` (Adapter_Trimming) or `_single_trimmed.fastq.gz` (Quality_Trimming).
`THREADS`	How many threads to use.	1
`SEED`	Minimum seed length.	19
`WIDTH`	Band width.	100
`DROPOFF`	Off-diagonal x-dropoff (Z-dropoff).	100
`RE_SEED`	Re-seed value.	1.5
`CUTOFF`	Cutoff value.	10000
`MATCH`	Matching score.	1
`MISMATCH`	Mismatch penalty.	4
`GAP`	Gap penalty.	6
`EXTENSION`	Gap extension penalty.	1
`CLIP`	Clipping penalty.	6
`UNPAIRED`	Unpaired read penalty.	9
`RESCUE`	Attempt to rescue missing hits in paired-end mode? Note: this means that reads may not be matched	false
`INTERLEAVED`	Is the first input query interleaved?	false
`RM_THRESHOLD`	Minimum threshold.	30
`SECONDARY`	Output all alignments and mark as secondary.	false
`APPEND`	Append FastA/Q comments to SAM files.	false
`HARD`	Use hard clipping.	false
`SPLIT`	Mark split hits as secondary.	true
`VERBOSITY`	Verbosity level. Choose from `'disabled'`, `'errors'`, `'warnings'`, `'all'`, or `'debug'`.	`'all'`

Note: If running single-end samples, leave FORWARD_TRIMMED and REVERSE_TRIMMED filled with values that do NOT match your samples. If running paired-end samples, leave SINGLES_TRIMMED filled with values that do NOT match your samples.

Output

If your reference genome is not indexed, Read_Mapping generates an index file for the reference genome in the same directory as the reference genome. Please make sure you have write permissions for said directory. After indexing Read_Mapping will exit, so you will need to run Read_Mapping again to map reads.

Read_Mapping generates aligned SAM files for each sample, located under

${OUT_DIR}/Read_Mapping/${SAMPLE}.sam

where ${OUT_DIR} is specified in the configuration file. These SAM files have the '@SQ', '@RG', and '@PG' headers included in them. The '@HD' header is not generated from this process.

A list of files is not generated from Read_Mapping. However, you can generate one using sample_list_generator.sh.

Dependencies

Read_Mapping depends on the Burrows-Wheeler Aligner and the Portable Batch System to run. If you want to use a different job scheduler or read mapper, you will need to modify this script extensively. Future implementations of Read_Mapping using Bowtie 2 are in progress.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Read_Mapping

Basic Usage

Handler-Specific Variables

Output

Dependencies

Next: SAM_Processing

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Main Information

Recommended Workflow Handlers

Other Handlers

Clone this wiki locally