Commit cb4d451

Merge branch 'dev' into nf-core-template-merge-2.8
2 parents 7d1306d + e44c9ae commit cb4d451

83 files changed: +7334 -580 lines changed

.editorconfig

Lines changed: 8 additions & 0 deletions
@@ -22,3 +22,11 @@ indent_size = unset
 
 [/assets/email*]
 indent_size = unset
+
+# C++ compiles code
+[/bin/cutsite_trimming]
+end_of_line = unset
+insert_final_newline = unset
+trim_trailing_whitespace = unset
+indent_style = unset
+indent_size = unset

.github/workflows/awsfulltest.yml

Lines changed: 2 additions & 5 deletions
@@ -14,10 +14,7 @@ jobs:
 runs-on: ubuntu-latest
 steps:
 - name: Launch workflow via tower
-uses: seqeralabs/action-tower-launch@v1
-# TODO nf-core: You can customise AWS full pipeline tests as required
-# Add full size test data (but still relatively small datasets for few samples)
-# on the `test_full.config` test runs with only one set of parameters
+uses: nf-core/tower-action@v3
 with:
 workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
 access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
@@ -27,7 +24,7 @@ jobs:
 {
 "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/hic/results-${{ github.sha }}"
 }
-profiles: test_full,aws_tower
+profiles: test_full,public_aws_ecr
 - uses: actions/upload-artifact@v3
 with:
 name: Tower debug log file

.github/workflows/awstest.yml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ jobs:
 {
 "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/hic/results-test-${{ github.sha }}"
 }
-profiles: test,aws_tower
+profiles: test,public_aws_ecr
 - uses: actions/upload-artifact@v3
 with:
 name: Tower debug log file

.github/workflows/ci.yml

Lines changed: 1 addition & 4 deletions
@@ -31,13 +31,10 @@ jobs:
 uses: actions/checkout@v3
 
 - name: Install Nextflow
-uses: nf-core/setup-nextflow@v1
+uses: nf-core/setup-nextflow@v1.3.0
 with:
 version: "${{ matrix.NXF_VER }}"
 
 - name: Run pipeline with test data
-# TODO nf-core: You can customise CI pipeline run tests as required
-# For example: adding multiple test runs with different parameters
-# Remember that you can parallelise this by using strategy.matrix
 run: |
 nextflow run ${GITHUB_WORKSPACE} -profile test,docker --outdir ./results

CHANGELOG.md

Lines changed: 149 additions & 3 deletions
@@ -3,14 +3,160 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## v2.1.0dev - [date]
+## v2.1.0dev
 
-Initial release of nf-core/hic, created with the [nf-core](https://nf-co.re/) template.
+### `Added`
+
+- Added public_aws_ecr profile for using containers stored on ECR.
+
+### `Fixed`
+
+## v2.0.0 - 2023-01-12
 
 ### `Added`
 
+- DSL2 version of nf-core-hic pipeline
+- Add full test dataset (#80)
+- Replace local modules by the cooler nf-core module
+
+### `Fixed`
+
+- Fix error in the Arima preset (#127)
+
+## v1.3.1 - 2021-09-25
+
+### `Fixed`
+
+- Fix bug in conda environment for cooltools (#109)
+
+## v1.3.0 - 2021-05-22
+
+- Change the `/tmp/` folder to `./tmp/` folder so that all tmp files are now in the work directory (#24)
+- Add `--hicpro_maps` options to generate the raw and normalized HiC-Pro maps. The default is now to use cooler
+- Add chromosome compartments calling with cooltools (#53)
+- Add HiCExplorer distance decay quality control (#54)
+- Add HiCExplorer TADs calling (#55)
+- Add insulation score TADs calling (#55)
+- Generate cooler/txt contact maps
+- Normalize Hi-C data with cooler instead of iced
+- New `--digestion` parameter to automatically set the restriction_site and ligation_site motifs
+- New `--keep_multi` and `keep_dup` options. Default: false
+- Template update for nf-core/tools
+- Minor fix to summary log messages in pipeline header
+
 ### `Fixed`
 
-### `Dependencies`
+- Fix bug in stats report which were not all correcly exported in the results folder
+- Fix recurrent bug in input file extension (#86)
+- Fix bug in `--bin_size` parameter (#85)
+- `--min_mapq` is ignored if `--keep_multi` is used
 
 ### `Deprecated`
+
+- `--rm_dup` and `--rm_multi` are replaced by `--keep_dups` and `--keep_multi`
+
+## v1.2.2 - 2020-09-02
+
+### `Added`
+
+- Template update for nf-core/tools v1.10.2
+- Add the `--fastq_chunks_size` to specify the number of reads per chunks if split_fastq is true
+
+### `Fixed`
+
+- Bug in `--split_fastq` option not recognized
+
+## v1.2.1 - 2020-07-06
+
+### `Fixed`
+
+- Fix issue with `--fasta` option and `.fa` extension (#66)
+
+## v1.2.0 - 2020-06-18
+
+### `Added`
+
+- Bump v1.2.0
+- Merge template nf-core 1.9
+- Move some options to camel_case
+- Update python scripts for python3
+- Update conda environment file
+- python base `2.7.15` > `3.7.6`
+- pip `19.1` > `20.0.1`
+- scipy `1.2.1` > `1.4.1`
+- numpy `1.16.3` > `1.18.1`
+- bx-python `0.8.2` > `0.8.8`
+- pysam `0.15.2` > `0.15.4`
+- cooler `0.8.5` > `0.8.6`
+- multiqc `1.7` > `1.8`
+- iced `0.5.1` > `0.5.6`
+- _*New*_ pymdown-extensions `7.1`
+- _*New*_ hicexplorer `3.4.3`
+- _*New*_ bioconductor-hitc `1.32.0`
+- _*New*_ r-optparse `1.6.6`
+- _*New*_ ucsc-bedgraphtobigwig `377`
+- _*New*_ cython `0.29.19`
+- _*New*_ cooltools `0.3.2`
+- _*New*_ fanc `0.8.30`
+- _*Removed*_ r-markdown
+
+### `Fixed`
+
+- Fix error in doc for Arima kit usage
+- Sort output of `get_valid_interaction` process as the input files of `remove_duplicates`
+are expected to be sorted (sort -m)
+
+### `Deprecated`
+
+- Command line options converted to `camel_case`:
+- `--skipMaps` > `--skip_maps`
+- `--skipIce` > `--skip_ice`
+- `--skipCool` > `--skip_cool`
+- `--skipMultiQC` > `--skip_multiqc`
+- `--saveReference` > `--save_reference`
+- `--saveAlignedIntermediates` > `--save_aligned_intermediates`
+- `--saveInteractionBAM` > `--save_interaction_bam`
+
+## v1.1.1 - 2020-04-02
+
+### `Fixed`
+
+- Fix bug in tag. Remove '['
+
+## v1.1.0 - 2019-10-15
+
+### `Added`
+
+- Update hicpro2higlass with `-p` parameter
+- Support 'N' base motif in restriction/ligation sites
+- Support multiple restriction enzymes/ligattion sites (comma separated) ([#31](https://github.com/nf-core/hic/issues/31))
+- Add --saveInteractionBAM option
+- Add DOI ([#29](https://github.com/nf-core/hic/issues/29))
+- Update manual ([#28](https://github.com/nf-core/hic/issues/28))
+
+### `Fixed`
+
+- Fix bug for reads extension `_1`/`_2` ([#30](https://github.com/nf-core/hic/issues/30))
+
+## v1.0 - [2019-05-06]
+
+Initial release of nf-core/hic, created with the [nf-core](http://nf-co.re/) template.
+
+### `Added`
+
+First version of nf-core Hi-C pipeline which is a Nextflow implementation of
+the [HiC-Pro pipeline](https://github.com/nservant/HiC-Pro/).
+Note that all HiC-Pro functionalities are not yet all implemented.
+The current version supports most protocols including Hi-C, in situ Hi-C,
+DNase Hi-C, Micro-C, capture-C or HiChip data.
+
+In summary, this version allows :
+
+- Automatic detection and generation of annotation files based on igenomes
+if not provided.
+- Two-steps alignment of raw sequencing reads
+- Reads filtering and detection of valid interaction products
+- Generation of raw contact matrices for a set of resolutions
+- Normalization of the contact maps using the ICE algorithm
+- Generation of cooler file for visualization on [higlass](https://higlass.io/)
+- Quality report based on HiC-Pro MultiQC module
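
Reviewer note: the option renames listed in the v1.2.0 `Deprecated` entry of this changelog can be applied mechanically to old launch commands. The sketch below is illustrative only and is not part of the commit. The function name `modernise_command` is hypothetical, and the mapping covers only the seven renames listed in that entry.

```python
import shlex

# Renames copied from the v1.2.0 `Deprecated` list in CHANGELOG.md.
RENAMED_OPTIONS = {
    "--skipMaps": "--skip_maps",
    "--skipIce": "--skip_ice",
    "--skipCool": "--skip_cool",
    "--skipMultiQC": "--skip_multiqc",
    "--saveReference": "--save_reference",
    "--saveAlignedIntermediates": "--save_aligned_intermediates",
    "--saveInteractionBAM": "--save_interaction_bam",
}


def modernise_command(command: str) -> str:
    """Return `command` with deprecated option names replaced (hypothetical helper)."""
    updated = []
    for token in shlex.split(command):
        # Handle both `--opt value` and `--opt=value` spellings.
        name, sep, value = token.partition("=")
        updated.append(RENAMED_OPTIONS.get(name, name) + sep + value)
    return shlex.join(updated)


if __name__ == "__main__":
    old = "nextflow run nf-core/hic --skipMaps --saveReference --outdir results"
    print(modernise_command(old))
```

Options that are not in the mapping pass through unchanged, so the helper is safe to run on commands that already use the new names.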

CITATIONS.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # nf-core/hic: Citations
 
+## [HiC-Pro](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0831-x)
+
+> Servant N, Varoquaux N, Lajoie BR, Viara E, Chen C, Vert JP, Dekker J, Heard E, Barillot E. Genome Biology 2015, 16:259 doi: [10.1186/s13059-015-0831-x](https://dx.doi.org/10.1186/s13059-015-0831-x)
+
 ## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)
 
 > Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

README.md

Lines changed: 24 additions & 28 deletions
@@ -1,6 +1,6 @@
 # ![nf-core/hic](docs/images/nf-core-hic_logo_light.png#gh-light-mode-only) ![nf-core/hic](docs/images/nf-core-hic_logo_dark.png#gh-dark-mode-only)
 
-[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/hic/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)
+[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/hic/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.2669512-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.2669512)
 
 [![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A522.10.1-23aa62.svg)](https://www.nextflow.io/)
 [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)
@@ -12,20 +12,29 @@
 
 ## Introduction
 
-**nf-core/hic** is a bioinformatics pipeline that ...
+**nf-core/hic** is a bioinformatics best-practice analysis pipeline for Analysis of Chromosome Conformation Capture data (Hi-C).
 
-<!-- TODO nf-core:
-Complete this sentence with a 2-3 sentence summary of what types of data the pipeline ingests, a brief overview of the
-major pipeline sections and the types of output it produces. You're giving an overview to someone new
-to nf-core here, in 15-20 seconds. For an example, see https://github.com/nf-core/rnaseq/blob/master/README.md#introduction
--->
+The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!
 
-<!-- TODO nf-core: Include a figure that guides the user through the major workflow steps. Many nf-core
-workflows use the "tube map" design for that. See https://nf-co.re/docs/contributing/design_guidelines#examples for examples. -->
-<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->
+On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources.The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/hic/results).
+
+## Pipeline summary
 
 1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
-2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
+2. Hi-C data processing
+1. [`HiC-Pro`](https://github.com/nservant/HiC-Pro)
+1. Mapping using a two steps strategy to rescue reads spanning the ligation
+sites ([`bowtie2`](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml))
+2. Detection of valid interaction products
+3. Duplicates removal
+4. Generate raw and normalized contact maps ([`iced`](https://github.com/hiclib/iced))
+3. Create genome-wide contact maps at various resolutions ([`cooler`](https://github.com/open2c/cooler))
+4. Contact maps normalization using balancing algorithm ([`cooler`](https://github.com/open2c/cooler))
+5. Export to various contact maps formats ([`HiC-Pro`](https://github.com/nservant/HiC-Pro), [`cooler`](https://github.com/open2c/cooler))
+6. Quality controls ([`HiC-Pro`](https://github.com/nservant/HiC-Pro), [`HiCExplorer`](https://github.com/deeptools/HiCExplorer))
+7. Compartments calling ([`cooltools`](https://cooltools.readthedocs.io/en/latest/))
+8. TADs calling ([`HiCExplorer`](https://github.com/deeptools/HiCExplorer), [`cooltools`](https://cooltools.readthedocs.io/en/latest/))
+9. Quality control report ([`MultiQC`](https://multiqc.info/))
 
 ## Usage
 
@@ -34,30 +43,24 @@
 > to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline)
 > with `-profile test` before running the workflow on actual data.
 
-<!-- TODO nf-core: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
-Explain what rows and columns represent. For instance (please edit as appropriate):
 
 First, prepare a samplesheet with your input data that looks as follows:
 
 `samplesheet.csv`:
 
 ```csv
 sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
+HIC_ES_4,SRR5339783_1.fastq.gz,SRR5339783_2.fastq.gz
 ```
 
-Each row represents a fastq file (single-end) or a pair of fastq files (paired end).
-
--->
-
+Each row represents a pair of fastq files (paired end).
 Now, you can run the pipeline using:
 
-<!-- TODO nf-core: update the following command to include all required parameters for a minimal example -->
-
 ```bash
 nextflow run nf-core/hic \
 -profile <docker/singularity/.../institute> \
 --input samplesheet.csv \
+--genome GRCh37 \
 --outdir <OUTDIR>
 ```
 
@@ -78,10 +81,6 @@ For more details about the output files and reports, please refer to the
 
 nf-core/hic was originally written by Nicolas Servant.
 
-We thank the following people for their extensive assistance in the development of this pipeline:
-
-<!-- TODO nf-core: If applicable, make list of people who have also contributed -->
-
 ## Contributions and Support
 
 If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
@@ -90,10 +89,7 @@ For further information or help, don't hesitate to get in touch on the [Slack `#
 
 ## Citations
 
-<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
-<!-- If you use nf-core/hic for your analysis, please cite it using the following doi: [10.5281/zenodo.XXXXXX](https://doi.org/10.5281/zenodo.XXXXXX) -->
-
-<!-- TODO nf-core: Add bibliography of tools and data used in your pipeline -->
+If you use nf-core/hic for your analysis, please cite it using the following doi: doi: [10.5281/zenodo.2669512](https://doi.org/10.5281/zenodo.2669512)
 
 An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.
 

assets/methods_description_template.yml

Lines changed: 2 additions & 1 deletion
@@ -3,7 +3,7 @@ description: "Suggested text and references to use when describing pipeline usag
 section_name: "nf-core/hic Methods Description"
 section_href: "https://github.com/nf-core/hic"
 plot_type: "html"
-## TODO nf-core: Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline
+## nf-core: Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline
 ## You inject any metadata in the Nextflow '${workflow}' object
 data: |
 <h4>Methods</h4>
@@ -12,6 +12,7 @@ data: |
 <pre><code>${workflow.commandLine}</code></pre>
 <h4>References</h4>
 <ul>
+<li>Servant, N., Ewels, P. A., Peltzer, A., Garcia, M. U. (2021) nf-core/hic. Zenodo. https://doi.org/10.5281/zenodo.2669512</li>
 <li>Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316-319. <a href="https://doi.org/10.1038/nbt.3820">https://doi.org/10.1038/nbt.3820</a></li>
 <li>Ewels, P. A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M. U., Di Tommaso, P., & Nahnsen, S. (2020). The nf-core framework for community-curated bioinformatics pipelines. Nature Biotechnology, 38(3), 276-278. <a href="https://doi.org/10.1038/s41587-020-0439-x">https://doi.org/10.1038/s41587-020-0439-x</a></li>
 </ul>

assets/samplesheet.csv

Lines changed: 1 addition & 2 deletions
@@ -1,3 +1,2 @@
 sample,fastq_1,fastq_2
-SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz
-SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,
+SRR4292758,https://github.com/nf-core/test-datasets/raw/hic/data/SRR4292758_00_R1.fastq.gz,https://github.com/nf-core/test-datasets/raw/hic/data/SRR4292758_00_R2.fastq.gz
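
Reviewer note: this change drops the single-end example row, and the README in the same commit now says each row is a pair of fastq files. A pre-flight check along the following lines could catch single-end rows before launching. It is a minimal sketch and not the pipeline's own validation: the function name `check_samplesheet` and the accepted suffixes (`.fastq.gz`, `.fq.gz`) are assumptions made for illustration.

```python
import csv
import io

# Header as it appears in assets/samplesheet.csv.
EXPECTED_HEADER = ["sample", "fastq_1", "fastq_2"]
# Assumed suffixes; the pipeline's real rule may differ.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz")


def check_samplesheet(text: str) -> list[dict]:
    """Parse samplesheet text and require a paired-end FASTQ entry on every row."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    # Data rows start on line 2 of the file (line 1 is the header).
    for line_no, row in enumerate(reader, start=2):
        if not row["sample"]:
            raise ValueError(f"line {line_no}: missing sample name")
        for column in ("fastq_1", "fastq_2"):
            value = row[column] or ""
            if not value.endswith(FASTQ_SUFFIXES):
                raise ValueError(f"line {line_no}: {column} is not a gzipped FASTQ: {value!r}")
        rows.append(row)
    return rows


if __name__ == "__main__":
    sheet = "sample,fastq_1,fastq_2\nHIC_ES_4,SRR5339783_1.fastq.gz,SRR5339783_2.fastq.gz\n"
    print(check_samplesheet(sheet))
```

The check reads the sheet as text rather than opening a path, which keeps it easy to test; a caller would pass the contents of `samplesheet.csv`.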

bin/build_matrix

63.2 KB
Binary file not shown.
