Thank you for your interest in contributing to the WILDS WDL Library! This document provides guidelines for contributing modules, pipelines, and improvements to our centralized collection of bioinformatics WDL infrastructure.
- Getting Started
- Repository Structure
- Types of Contributions
- Module Development Guidelines
- Pipeline Development Guidelines
- Testing Requirements
- Documentation Standards
- Documentation Website
- Pull Request Process
- Code of Conduct
Before contributing code changes, please:
1. Fork the repository to your GitHub account
2. Set up your development environment with the required tools:
   - For local testing:
     - Docker Desktop for container execution
3. Make code changes and push them to your fork
4. Submit a pull request (PR) to merge your contributions into the `main` branch of the original repo
   - The title of your PR should briefly describe the change.
   - If your contribution resolves an issue, the body of your PR should contain `Fixes #issue-number`
The WILDS WDL Library follows a two-tier architecture:
- Modules: Collection of tasks that use a given tool
- Pipelines: Analysis workflows that import and combine module tasks (ranging from basic examples to production-ready pipelines)
```
wilds-wdl-library/
├── modules/
│   └── ww-toolname/
│       ├── ww-toolname.wdl
│       └── README.md
├── pipelines/
│   └── ww-pipeline-name/
│       ├── ww-pipeline-name.wdl
│       ├── inputs.json
│       └── README.md
└── .github/
    └── workflows/    # CI/CD automation
```
Reporting issues:
- Use the GitHub Issues page
- Provide detailed information about the problem
- Include error messages, info about input files, and steps to reproduce
- Tag issues appropriately (bug, enhancement, question, etc.)

Documentation improvements:
- Fix typos, improve clarity, or add missing information
- Enhance README files with better examples

New modules:
- Focus on one high-utility bioinformatics tool
- Follow the standardized module structure
- Include comprehensive testing and validation

New pipelines:
- Combine existing modules into analysis workflows
- Range from basic educational examples (2-3 modules) to advanced production pipelines (10+ modules)
- Document the complexity level in the README
- Provide educational and/or production value for the community
See our `ww-template` module as an example.
The module folder must contain:
- `ww-toolname.wdl` - Main WDL file containing task definitions for the tool
- `testrun.wdl` - Test workflow demonstrating module functionality (must be named `testrun.wdl`)
- `README.md` - Comprehensive documentation
The module folder may optionally contain:
- Custom scripts (e.g., `.R`, `.py`, `.sh`) - If your task requires a custom script that isn't part of the container image, place it directly in the module directory alongside the WDL files. The script can be fetched at runtime using `curl` or `wget` in the task's command block.
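As a sketch of that pattern, a task can download its helper script inside the command block before running it (the script name, URL, and container image below are illustrative placeholders, not real files in the library):

```wdl
version 1.0

task summarize_counts {
  input {
    File counts_file
  }
  command <<<
    # Fetch the module's helper script at runtime (illustrative URL)
    wget https://raw.githubusercontent.com/example-org/wilds-wdl-library/main/modules/ww-toolname/summarize.R
    Rscript summarize.R ~{counts_file} > summary.txt
  >>>
  output {
    File summary = "summary.txt"
  }
  runtime {
    docker: "rocker/r-ver:4.3.1"  # illustrative image; prefer WILDS Docker Library images
    cpu: 1
    memory: "2 GB"
  }
}
```

Fetching at runtime keeps the container image generic while the script stays versioned alongside the module's WDL.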
Your main WDL file (`ww-toolname.wdl`) must include:
- Version declaration: Use WDL version 1.0
- Task definitions: Individual tasks with proper resource requirements
- Metadata documentation: Describe properties of tasks (e.g., inputs, outputs) using `meta` and `parameter_meta` blocks
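A minimal task skeleton showing where these pieces go (the task name, inputs, and container image are illustrative, not from an actual module):

```wdl
version 1.0

task align_reads {
  meta {
    description: "Aligns sequencing reads to a reference genome"
    author: "Your Name"
  }
  parameter_meta {
    fastq: "Input FASTQ file of sequencing reads"
    reference: "Reference genome FASTA"
    aligned_bam: "BAM file of aligned reads"
  }
  input {
    File fastq
    File reference
  }
  command <<<
    # Actual alignment command for your tool goes here
    echo "aligning ~{fastq} to ~{reference}" > aligned.bam
  >>>
  output {
    File aligned_bam = "aligned.bam"
  }
  runtime {
    docker: "example/aligner:1.0.0"  # illustrative; pin a real versioned image
    cpu: 4
    memory: "8 GB"
  }
}
```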
Your test workflow file (`testrun.wdl`) must include:
- Version declaration: Use WDL version 1.0
- Module imports: Import the module being tested and the `ww-testdata` module using GitHub URLs
- Sample struct definition: Define a struct for organizing sample inputs if needed
- Test workflow: A `toolname_example` workflow that demonstrates all tasks (must follow the naming convention `{module}_example`, where `{module}` is the tool name, e.g., `star_example` for `ww-star`)
- Auto-downloading of test data: Use the `ww-testdata` module to automatically provision test data
- Validation task (optional): Consider including a validation task to verify output correctness
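Put together, a `testrun.wdl` might look roughly like this sketch (the import URLs, task names, and input names are placeholders — check `ww-testdata` and your module for the real ones):

```wdl
version 1.0

# Illustrative GitHub URLs; use the actual raw URLs for the repository
import "https://raw.githubusercontent.com/example-org/wilds-wdl-library/main/modules/ww-toolname/ww-toolname.wdl" as toolname
import "https://raw.githubusercontent.com/example-org/wilds-wdl-library/main/modules/ww-testdata/ww-testdata.wdl" as testdata

workflow toolname_example {
  # Provision test data automatically via the ww-testdata module
  call testdata.download_test_data { }

  # Exercise the module's task(s) on the downloaded data
  call toolname.run_tool {
    input:
      input_file = download_test_data.test_file
  }

  output {
    File result = run_tool.output_file
  }
}
```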
Parameter preferences:
- Use descriptive parameter names
- Include optional parameters with sensible defaults
- Support both single samples and batch processing where applicable
Docker image preferences:
- Use images from the WILDS Docker Library when available
- If creating new images, follow WILDS container standards and consider contributing to the WILDS Docker Library.
- Specify exact image versions (avoid `latest` tags)
- Document image dependencies in the README
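For example, a task's `runtime` block should pin an exact tag (the image name and tag here are illustrative):

```wdl
runtime {
  # Pin an exact version tag; never use "latest"
  docker: "example/samtools:1.19"
  cpu: 2
  memory: "4 GB"
}
```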
Pipelines should:
- Combine existing modules from the library
- Demonstrate realistic analysis workflows
- Serve as educational templates and/or production-ready analyses
- Use publicly available test data
- Document their complexity level (Basic, Intermediate, or Advanced)
Complexity Levels:
| Level | Modules | Typical Runtime | Description |
|---|---|---|---|
| Basic | 2-3 | < 30 minutes | Simple integrations ideal for learning |
| Intermediate | 4-6 | 1-4 hours | Multi-step analyses for common use cases |
| Advanced | 10+ | > 4 hours | Comprehensive production pipelines |
Prefer Existing Modules
- Pipelines should primarily combine existing modules rather than creating new task definitions. If you need new functionality, consider contributing it as a module first.
Pipeline inputs.json
Each pipeline should include an `inputs.json` file that serves as an example for users. This file demonstrates the expected input structure and helps users understand what values they need to provide when running the pipeline. Your `inputs.json` should:
- Use dummy/placeholder paths for file inputs (e.g., `"/path/to/your/sample.fastq.gz"`)
- Include common or recommended values for non-file parameters
- Document all required inputs with realistic example values
- Use the pipeline's README to provide descriptions and guidance for each input parameter
Note: GitHub Actions tests use the `ww-testdata` module to automatically download test data, so your `inputs.json` does not need to reference actual test files for CI purposes.
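A minimal `inputs.json` following these conventions might look like the following (the workflow name `my_pipeline` and its input names are hypothetical):

```json
{
  "my_pipeline.sample_fastq": "/path/to/your/sample.fastq.gz",
  "my_pipeline.reference_genome": "/path/to/your/reference.fa",
  "my_pipeline.threads": 4,
  "my_pipeline.output_prefix": "sample1"
}
```

File inputs use placeholder paths, while non-file parameters carry realistic recommended values.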
Platform-Specific Configurations (Optional)
Pipelines may include optional platform-specific configuration directories for execution on cloud platforms or workflow management systems:
- Location: Place platform configs in a subdirectory within the pipeline (e.g., `pipelines/ww-example/.cirro/`)
- Naming convention: Use dotfile directory names (`.cirro/`, `.terra/`, etc.) to indicate the platform
- Standalone principle: Keep all pipeline-related files (WDL, inputs, platform configs) in the pipeline directory
- Documentation: Document platform configurations in the pipeline's README with links to platform documentation
- Examples:
  - `.cirro/` for the Cirro platform (config documentation)
  - `.terra/` for Terra workspace configurations
  - Other platform-specific directories as needed
Platform configurations are entirely optional and should not be required to run the pipeline with standard WDL executors (Cromwell, miniWDL, Sprocket).
Cirro Configuration Validation: Pipelines with `.cirro/` directories are automatically validated in CI. The validation checks that all required files are present (`preprocess.py`, `process-form.json`, `process-input.json`, `process-output.json`, `process-compute.config`), that the JSON files are valid, and that `preprocess.py` has no syntax errors. You can run this validation locally with `make lint_cirro`.
Make sure you have these installed:
- sprocket (recommended)
- miniWDL
- uv for automated testing with our Makefile
- Docker Desktop for container execution
Test your WDL manually by navigating to the module directory:

```shell
cd modules/ww-toolname

# Linting with miniwdl (check both main module and test workflow)
miniwdl check ww-toolname.wdl
miniwdl check testrun.wdl

# Linting with sprocket (ignoring things we don't care about)
sprocket lint \
  -e TodoComment \
  -e ContainerUri \
  -e TrailingComma \
  -e CommentWhitespace \
  -e UnusedInput \
  ww-toolname.wdl
sprocket lint \
  -e TodoComment \
  -e ContainerUri \
  -e TrailingComma \
  -e CommentWhitespace \
  -e UnusedInput \
  testrun.wdl

# Test running (use testrun.wdl for execution tests)
sprocket run testrun.wdl --entrypoint toolname_example
miniwdl run testrun.wdl
```

Use our automated Makefile from the repository root for easier testing:
```shell
# Test a specific module (replace ww-toolname with your module name)
make lint MODULE=ww-toolname          # Run all linting checks
make lint_sprocket MODULE=ww-toolname # Run only sprocket linting
make lint_miniwdl MODULE=ww-toolname  # Run only miniwdl linting
make run_sprocket MODULE=ww-toolname  # Run sprocket with proper entrypoint
make run_miniwdl MODULE=ww-toolname   # Run miniwdl

# Test all modules
make lint # Lint all modules
make run  # Run all modules with both sprocket and miniwdl
```

The Makefile automatically handles:
- Proper entrypoint naming for sprocket (`{module}_example`)
- Module discovery and validation
- Dependency checking (sprocket, uv, etc.)
- Consistent test execution across all modules
- Use the `ww-testdata` module for standardized test datasets
- If you need additional test datasets, add them to the `ww-testdata` module as well
- Include small, representative test files in your examples
All contributions must pass our automated testing pipeline which executes on a PR via GitHub Actions:
- Multi-executor validation: Tests with Cromwell, miniWDL, and Sprocket
- Container verification: All Docker images must be accessible and functional
- Syntax validation: WDL syntax and structure validation
- Integration testing: Cross-module compatibility testing
- Cirro validation: Validates `.cirro/` configurations for pipelines that include them
The WILDS WDL Library includes an automatically-generated documentation website that provides comprehensive technical documentation for all modules and pipelines. Understanding how this documentation works is important for contributors.
The documentation website is built using Sprocket and automatically deployed to GitHub Pages. The documentation is generated from:
- README files: Each module and pipeline directory contains a README.md that becomes the documentation homepage for that component
- WDL files: Task descriptions, inputs, outputs, and metadata are automatically extracted from WDL files
- Main README: The repository's root README.md serves as the documentation site homepage
Documentation is automatically built and deployed when changes are merged to the main branch:
- The `build-docs.yml` GitHub Actions workflow triggers on a push to `main`
- The workflow runs the `make_preambles.py` script to prepare WDL files
- Sprocket generates static HTML documentation
- The `postprocess_docs.py` script applies final formatting
- Documentation is deployed to GitHub Pages at the repository's documentation URL
Important: You don't need to build or commit documentation files - they are generated automatically in CI/CD.
Before submitting a PR, you can preview how your changes will appear on the documentation website using the provided Makefile targets:
```shell
# Build documentation locally (mirrors the CI/CD process)
make docs-preview

# Serve the documentation on http://localhost:8000
make docs-serve

# Or do both in one command
make docs
```

The `docs-preview` target will:
- Check for uncommitted changes and warn you (docs are built from your last commit)
- Safely stash any uncommitted work
- Run the same build process as the GitHub Actions workflow
- Generate documentation in the `docs/` directory
- Restore your uncommitted changes when finished
- Clean up all temporary build files
Note: The `docs/` directory is gitignored and should never be committed to the repository.
When you run `make docs-preview`, the build process:
- Prepends each module's README to its WDL file for better documentation context
- Converts GitHub import URLs to relative paths for local navigation
- Generates comprehensive HTML documentation for all tasks, workflows, and components
- Applies custom styling and post-processing
When contributing, ensure your documentation is clear and complete:
- README files: Write clear, user-focused descriptions of what your module/pipeline does
- Task metadata: Use `meta` blocks to document task purpose, authors, and other high-level information
- Parameter metadata: Use `parameter_meta` blocks to describe all inputs and outputs
- Examples: Include usage examples in README files
- Preview locally: Always run `make docs-preview` before submitting a PR to verify how your documentation will appear
If you encounter issues with local documentation builds:
- Ensure you have the required dependencies installed (`sprocket`, `uv`, Python 3.13)
- Check that you're running the command from the repository root
- Review error messages - they often indicate issues with WDL syntax or README formatting
For questions about documentation, please contact wilds@fredhutch.org.
After meeting the requirements above, submit a PR to merge your forked repo into main:

1. Create a descriptive PR title
   - Examples: `Add BWA alignment module`, `Add RNA-seq analysis pipeline`
2. Fill out the PR template: Provide detailed information about your contribution
3. Link related issues: Reference any GitHub issues your PR addresses
4. Request reviews: Tag Emma Bishop (@emjbishop) or Taylor Firman (@tefirman)
Your PR will be evaluated on:
- Functionality: Does it work as intended?
- Testing: Are tests comprehensive and passing?
- Documentation: Is documentation clear and complete?
- Standards compliance: Does it follow WILDS conventions?
- Code quality: Is the WDL code well-structured and readable?
- Uniqueness: Does it avoid duplicating existing functionality in the library?
New contributors are welcome! If you're new to WDL or bioinformatics workflows:
- Review our WDL 101 course materials
- Check out existing modules for examples
- Don't hesitate to ask questions in issues or via email. If you have a `uw.edu` or `fredhutch.org` email, you can also ask questions in our `fh-data` Slack workspace
- Consider starting with documentation contributions
For more questions, you can contact the Fred Hutch Office of the Chief Data Officer (OCDO) at wilds@fredhutch.org.
By participating in this project, you agree to abide by our code of conduct:
- Be respectful: Treat all community members with respect and kindness
- Be collaborative: Work together constructively and help others learn
- Be inclusive: Welcome contributors from all backgrounds and experience levels
- Be patient: Remember that everyone is learning and growing
If you experience or witness unacceptable behavior, please report it to wilds@fredhutch.org.
By contributing to this project, you agree that your contributions will be licensed under the MIT License. See the LICENSE file for details.
Thank you for contributing to WILDS! Your contributions help advance reproducible bioinformatics research for the entire community.