Read_Mapping

Skylar Wyant edited this page Jun 2, 2016 · 21 revisions

Basic Usage

Read_Mapping submits a task array of jobs through QSub to the Portable Batch System (PBS) job scheduler, mapping reads with the Burrows-Wheeler Aligner (BWA). It can also index a FASTA file using BWA. To run Read_Mapping, all common and handler-specific variables must be defined within the configuration file. Once the variables have been defined, Read_Mapping can be submitted to the job scheduler with the following command:

sequence_handling Read_Mapping Config

Where Config is the full file path to the configuration file.
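
Because Config must be given as a full path, it can help to check the path before submitting. The following is a minimal sketch, not part of sequence_handling: the function name check_config is hypothetical, and /path/to/Config is a placeholder.

```shell
# Hypothetical helper (not part of sequence_handling): succeed only if
# the argument is an absolute path to a readable file.
check_config() {
    config="$1"
    case "$config" in
        /*) ;;  # absolute path, keep going
        *)
            echo "Config must be a full (absolute) path: $config" >&2
            return 1
            ;;
    esac
    if [ ! -r "$config" ]; then
        echo "Config not found or not readable: $config" >&2
        return 1
    fi
    return 0
}

# Usage (placeholder path):
# check_config /path/to/Config && sequence_handling Read_Mapping /path/to/Config
```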

Handler-Specific Variables

The following is a list of variables that must be defined within Config. In addition to these handler-specific variables, all common variables must be defined.

| Mapping Argument | Function |
|---|---|
| map | Start the read mapping process using BWA |
| Scratch | A directory to put the finished SAM files from BWA |
| Reference Genome | The genome to map the reads against. The genome must be indexed before read mapping can happen |
| Sample Info | A list of samples for read_mapping_start.sh to work with |
| Project | The name of the project or capture facility, used for the Read Group header |
| Platform | The platform used for sequencing, used for the Read Group header |
| Email | An email address for the QSub scheduler to notify when each read mapping job starts, ends, or aborts |

| Indexing Argument | Function |
|---|---|
| index | Start the indexing process using BWA |
| Reference Genome | The genome to be indexed, must be in FASTA format |
| Email | An email address for the QSub scheduler to notify when the indexing job starts, ends, or aborts |

Output

If your reference genome is not indexed, Read_Mapping generates the index in the same directory as the reference genome. Make sure you have write permission for that directory.
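
The write-permission requirement can be checked ahead of time. This is a small sketch under assumptions: can_write_index is a hypothetical helper name, and the reference path in the usage line is a placeholder.

```shell
# Hypothetical helper: succeed only if the directory holding the
# reference genome exists and is writable by the current user.
can_write_index() {
    ref_dir="$(dirname "$1")"
    [ -d "$ref_dir" ] && [ -w "$ref_dir" ]
}

# Usage (placeholder path):
# can_write_index /path/to/reference.fasta || echo "Cannot write index next to reference" >&2
```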

Read_Mapping also generates an aligned SAM file for each sample. These SAM files include the '@SQ', '@RG', and '@PG' headers. The '@HD' header is not generated by this process.
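
To confirm that a finished SAM file carries the expected headers, one option is to scan its header section. The sketch below is an illustration only: missing_sam_headers is a hypothetical helper, and it relies on the SAM convention that header lines start with '@' and come before any alignment lines.

```shell
# Hypothetical helper: print each expected header record type
# (@SQ, @RG, @PG) that is absent from the SAM file given as $1.
# Prints nothing when all three are present.
missing_sam_headers() {
    awk '
        !/^@/ { exit }                    # header ends at the first alignment line
        { seen[substr($0, 1, 3)] = 1 }    # record the header type, e.g. "@SQ"
        END {
            n = split("@SQ @RG @PG", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }
    ' "$1"
}

# Usage (placeholder path):
# missing_sam_headers /path/to/sample.sam
```

Reading only up to the first alignment line keeps the check fast on large SAM files.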

Read_Mapping does not generate a list of its output files. To create one, use sample_list_generator.sh. However, SAM_Processing does not require a sample list, only a directory containing all the samples to be processed.

Dependencies

Read_Mapping depends on BWA and the Portable Batch System to run. If you want to use a different job scheduler or read mapper, you will need to modify this script extensively.
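
Before submitting, you may want to confirm that both dependencies are on your PATH. This is a minimal sketch: missing_tools is a hypothetical helper, and it assumes the executables are named bwa and qsub on your system.

```shell
# Hypothetical helper: print the name of every listed tool that
# cannot be found on PATH. Prints nothing when all are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (assumes the executables are named bwa and qsub):
# missing_tools bwa qsub
```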
