This README shows a number of example datasets and their read structures for hts-specs#270.
See Read Structures.
Note: Operators beyond [TBMS] may be used below to denote read segments that map to platform-specific tags. For example, C denotes cell-partition identifiers and can map to the CR tag for 10x cell-partitioned data.
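
As an illustration of how the read structures in the tables below can be interpreted, here is a minimal Python sketch of a parser. It assumes the syntax visible in the examples: a sequence of segments, each a length followed by a single operator letter, where + stands for "all remaining bases" and may only appear on the last segment. The names `Segment` and `parse_read_structure` are hypothetical and not part of any specification.

```python
import re
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    length: Optional[int]  # None means "all remaining bases" (written as +)
    operator: str          # e.g. T, B, M, S, or a platform-specific letter such as C


# One segment: a positive integer or "+", followed by one operator letter.
_SEGMENT = re.compile(r"(\d+|\+)([A-Z])")


def parse_read_structure(text: str) -> List[Segment]:
    """Parse a string such as '8B8M+T' into a list of segments (assumed syntax)."""
    segments: List[Segment] = []
    pos = 0
    for match in _SEGMENT.finditer(text):
        if match.start() != pos:  # unparseable characters between segments
            raise ValueError(f"invalid read structure: {text!r}")
        length, operator = match.groups()
        segments.append(Segment(None if length == "+" else int(length), operator))
        pos = match.end()
    if not segments or pos != len(text):
        raise ValueError(f"invalid read structure: {text!r}")
    # Assumption: only the final segment may have an indefinite (+) length.
    if any(segment.length is None for segment in segments[:-1]):
        raise ValueError(f"'+' is only allowed on the last segment: {text!r}")
    return segments


print(parse_read_structure("8B8M+T"))
```

Under these assumptions, 8B8M+T parses into an 8-base sample barcode, an 8-base molecular identifier, and a template segment covering the rest of the read.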
Note: if a sample barcode or molecular identifier is extracted from multiple reads, the bases are typically concatenated with a dash (-) delimiter.
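
The concatenation rule above can be sketched as follows. This is only an illustration: the function name `collect_tags` and the example bases are made up, and the input is assumed to be (tag, bases) pairs listed in the order the reads were processed.

```python
from typing import Dict, Iterable, List, Tuple


def collect_tags(extracted: Iterable[Tuple[str, str]], delimiter: str = "-") -> Dict[str, str]:
    """Join bases that map to the same tag, in input order, with a delimiter."""
    parts: Dict[str, List[str]] = {}
    for tag, bases in extracted:
        parts.setdefault(tag, []).append(bases)
    return {tag: delimiter.join(values) for tag, values in parts.items()}


# Hypothetical dual-index example: both index reads contribute to BC.
tags = collect_tags([("BC", "ACGTACGT"), ("BC", "TTGACCAA"), ("RX", "GGATCC")])
print(tags)  # {'BC': 'ACGTACGT-TTGACCAA', 'RX': 'GGATCC'}
```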
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +B | BC |
| i2.fq | index/i5 | +B | BC |
| r1.fq | read-one/R1 | +T | SEQ |

Example technology: Illumina Standard
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +B | BC |
| i2.fq | index/i5 | +B | BC |
| r1.fq | read-one/R1 | +T | SEQ |
| r2.fq | read-two/R2 | +T | SEQ |

Example technology: Illumina Standard
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| r1.fq | read-one/R1 | 10B+T* | BC and SEQ |
| r2.fq | read-two/R2 | +T | SEQ |
- example has 10 bases of sample barcode in read-one
Example technology: Missing
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +B | BC |
| i2.fq | index/i5 | +M | RX |
| r1.fq | read-one/R1 | +T | SEQ |
| r2.fq | read-two/R2 | +T | SEQ |
- example has the molecular identifier (RX) in the i5 index read rather than a second sample barcode
Example technology: NEBNext Direct
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +B | BC |
| r1.fq | read-one/R1 | 8B8M+T | BC, RX, and SEQ |
| r2.fq | read-two/R2 | 8M+T | RX and SEQ |

Example technology: Riptide™ High Throughput Rapid Library Prep (HT-RLP)
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +B | BC |
| i2.fq | index/i5 | +B | BC |
| r1.fq | read-one/R1 | 8M1S+T | RX, Discarded, SEQ |
| r2.fq | read-two/R2 | 8M1S+T | RX, Discarded, SEQ |

This is a good example of bases that are discarded (the 1S segment in each read) and therefore not present in the BAM.
Example technology: TwinStrand Biosciences Duplex Sequencing
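
To make the discarded base in the 8M1S+T structure above concrete, the following Python sketch splits a made-up read into its segments and drops the skipped one. The helper `split_read` and the example read are hypothetical; the sketch assumes a well-formed read structure and a read at least as long as the fixed-length segments, and does no validation.

```python
import re
from typing import List, Tuple


def split_read(bases: str, read_structure: str) -> List[Tuple[str, str]]:
    """Return (operator, bases) pairs, one per segment of the read structure."""
    out: List[Tuple[str, str]] = []
    offset = 0
    for length, operator in re.findall(r"(\d+|\+)([A-Z])", read_structure):
        # "+" consumes everything that is left; otherwise take a fixed length.
        end = len(bases) if length == "+" else offset + int(length)
        out.append((operator, bases[offset:end]))
        offset = end
    return out


# Made-up read: 8 UMI bases, 1 skipped base, then template.
read = "ACGTACGT" + "N" + "TTGCAGGCTA"
segments = split_read(read, "8M1S+T")
kept = [(op, seq) for op, seq in segments if op != "S"]  # S bases are discarded

print(segments)  # [('M', 'ACGTACGT'), ('S', 'N'), ('T', 'TTGCAGGCTA')]
print(kept)      # [('M', 'ACGTACGT'), ('T', 'TTGCAGGCTA')]
```

Only the M and T segments survive, so the skipped base does not appear in RX or SEQ.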
Single-Index Paired-End with In-line Cell Partition Identifiers and In-line Unique Molecular Identifiers and Skipped Bases
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +B | BC |
| r1.fq | read-one/R1 | 16C10M+S | CR, UR, TR(unused) |
| r2.fq | read-two/R2 | +T | SEQ |

Example technology: 10X Genomics Chromium Single Cell 3’ v2 Libraries (and BAM tags).
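
The mapping from operators to platform-specific tags in the table above can be sketched as a lookup table applied per segment. The mapping `TENX_TAGS`, the function `tag_read`, and the example read are illustrative assumptions taken from the table (C to CR, M to UR, B to BC), not a definitive implementation. Note that other examples in this README map M to RX instead, which is why the mapping is passed in as a parameter.

```python
import re
from typing import Dict

# Assumed operator-to-tag mapping for the 10x example above.
TENX_TAGS: Dict[str, str] = {"B": "BC", "C": "CR", "M": "UR"}


def tag_read(bases: str, read_structure: str, operator_tags: Dict[str, str]) -> Dict[str, str]:
    """Extract the bases of every segment whose operator maps to a tag."""
    tags: Dict[str, str] = {}
    offset = 0
    for length, operator in re.findall(r"(\d+|\+)([A-Z])", read_structure):
        end = len(bases) if length == "+" else offset + int(length)
        if operator in operator_tags:
            # A repeated operator within one read would overwrite the earlier value here.
            tags[operator_tags[operator]] = bases[offset:end]
        offset = end  # unmapped segments (e.g. S) are simply skipped over
    return tags


# Made-up read-one: 16-base cell barcode, 10-base UMI, remaining bases skipped.
r1 = "AAACCTGAGAAACCAT" + "GTCATTGCAA" + "TTTTTTTTTTTT"
print(tag_read(r1, "16C10M+S", TENX_TAGS))
# {'CR': 'AAACCTGAGAAACCAT', 'UR': 'GTCATTGCAA'}
```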
Dual-Index Providing Cell Partition Identifiers and Sample Identifiers, Paired-End with In-line Unique Molecular Identifiers
| FASTQ | Description | Read Structure | SEQ/Tags |
|---|---|---|---|
| i1.fq | index/i7 | +C | CR |
| i2.fq | index/i5 | +B | BC |
| r1.fq | read-one/R1 | +T | SEQ |
| r2.fq | read-two/R2 | +M | UR |

Example technology: 10X Genomics Chromium Single Cell 3’ v1 Libraries (driving bcl2fastq)