bacterial_genomics_nextflow_pipeline

A modular Nextflow pipeline for bacterial genome QC and assembly, developed as part of Georgia Tech's BIOL7210 Computational Genomics course.

This repository contains a Nextflow pipeline that performs read quality control and assembles genomic sequences.

Course: BIOL7210 - Computational Genomics
Author: S Birendra Kumar
Institution: Georgia Tech
GitHub Repo: https://github.com/Birendra-Kumar-S/bacterial_genomics_nextflow_pipeline
Nextflow Version: 24.10.4.5934
Package manager: conda

Workflow Overview

This workflow performs read quality control, calculates statistics on the trimmed reads, and assembles genomic sequences.
The pipeline combines sequential and parallel execution: read trimming runs first, and the steps that depend only on the trimmed reads can then run side by side.

Workflow Execution Order

1️⃣ Sequential Execution:

  • FASTP → SKESA (Genome Assembly)

2️⃣ Parallel Execution:

  • FASTP → SEQKIT (Read Statistics)
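The execution order above can be sketched as a plain shell function. This is only an illustration of the data flow, not the pipeline's actual Nextflow code: the function names `run` and `process_sample`, the `DRY_RUN` switch, and all output file names are assumptions made for this example.

```shell
# Hypothetical stand-alone sketch of the data flow (not the pipeline's own code).

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: process_sample <sample_name> <reads_R1> <reads_R2>
process_sample() {
  local sample="$1" r1="$2" r2="$3"
  # Assumed output names for the trimmed reads.
  local t1="${sample}_R1.trim.fastq.gz" t2="${sample}_R2.trim.fastq.gz"

  # Step 1: fastp must finish first, because both later steps read its output.
  run fastp -i "$r1" -I "$r2" -o "$t1" -O "$t2" || return 1

  # Step 2: assembly and read statistics only need the trimmed reads,
  # so they are started as background jobs and run in parallel.
  run skesa --reads "$t1,$t2" --contigs_out "${sample}.contigs.fasta" &
  local skesa_pid=$!
  run seqkit stats -a -T -o "${sample}.stats.tsv" "$t1" "$t2" &
  local seqkit_pid=$!

  # Wait for both jobs and fail if either one failed.
  wait "$skesa_pid" || return 1
  wait "$seqkit_pid" || return 1
}
```

Setting `DRY_RUN=1` before calling `process_sample` prints the three commands instead of running them, which is a quick way to check the order without having the tools installed. In the real pipeline, Nextflow derives this ordering from the process inputs and outputs rather than from explicit background jobs.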

Key Features

  • Read Processing

    • Quality control and adapter trimming with FASTP (v0.24.0)
  • Assembly

    • De novo genome assembly with SKESA (v2.5.1)
  • Read statistics

    • Statistics for the quality-filtered and trimmed reads, calculated with SeqKit (v2.10.0)

Requirements

Tested Environment

System Version: macOS 15.3.2 (24D81)
OS       : Sequoia 15.3.2
Model Name: MacBook Pro
Kernel   : Darwin 24.3.0
Chip     : Apple M4
Number of Cores: 10 (4 performance and 6 efficiency)
RAM      : 16 GB
Nextflow : v24.10.5
Java     : OpenJDK 22 (via Conda)

DAG Workflow Diagram

Diagram illustrating the pipeline's workflow, showing the sequence of processes and their dependencies. Obtained using Nextflow's built-in DAG visualization tool.

[Figure: DAG of the pipeline workflow]

Test Data

The test data included in the test_data/ directory contains paired-end reads from Listeria monocytogenes (SRA accession: SRR1556296).

Tools Used

Quick Start

Perform the steps below in order.

Setup

# Clone the repository
git clone https://github.com/Birendra-Kumar-S/bacterial_genomics_nextflow_pipeline
cd bacterial_genomics_nextflow_pipeline

Conda env installation

Create a new conda environment with Nextflow installed, as shown below:

CONDA_SUBDIR=osx-64 conda create -n nf_test -c bioconda nextflow -y

conda activate nf_test

Pipeline execution

export CONDA_SUBDIR=osx-64
nextflow run pipeline.nf -with-conda
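
The DAG diagram described earlier can be regenerated on the same run with Nextflow's `-with-dag` option. The following is a minimal sketch, assuming the entry script is `pipeline.nf` as above; the `DAG_FILE` variable and the `dag.html` output name are placeholders chosen for this example.

```shell
# Assumed output name; the file extension selects the DAG format.
DAG_FILE="${DAG_FILE:-dag.html}"

# Same platform setting as the pipeline execution command.
export CONDA_SUBDIR=osx-64

if command -v nextflow >/dev/null 2>&1; then
  # Run the pipeline and also write the workflow DAG to DAG_FILE.
  nextflow run pipeline.nf -with-conda -with-dag "$DAG_FILE"
else
  echo "nextflow not found on PATH; activate the conda environment first" >&2
fi
```

An `.html` output needs no extra software, whereas image formats such as `.png` require Graphviz to be installed.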
