Thank you for your interest in contributing to the WILDS Docker Library! This document provides guidelines for contributing Docker images, improvements, and documentation to our centralized collection of bioinformatics container infrastructure.
- Getting Started
- Repository Structure
- Types of Contributions
- Docker Image Development Guidelines
- Testing Requirements
- Documentation Standards
- Pull Request Process
- Code of Conduct
Before contributing code changes, please:
- Fork the repository to your GitHub account
- Set up your development environment with the required tools:
  - Docker Desktop for building and testing containers
  - Hadolint for Dockerfile linting
  - (Optional) Docker Scout for local security scanning
- Make code changes and push them to your fork
- Submit a pull request (PR) to merge your contributions into the `main` branch of the original repo
  - The title of your PR should briefly describe the change
  - If your contribution resolves an issue, the body of your PR should contain `Fixes #issue-number`
The WILDS Docker Library organizes each bioinformatics tool in its own directory:
wilds-docker-library/
├── toolname/
│   ├── Dockerfile_X.Y.Z      # Specific version
│   ├── Dockerfile_latest     # Most recent version
│   ├── README.md             # Tool documentation
│   └── CVEs_*.md             # Security vulnerability reports
├── template/
│   └── Dockerfile_template   # Template for new contributions
└── .github/
    └── workflows/            # CI/CD automation
Key conventions:
- Each tool has its own directory (use lowercase names)
- Dockerfiles use the naming pattern `Dockerfile_VERSION` (e.g., `Dockerfile_1.19`, `Dockerfile_latest`)
- Every tool directory must include a comprehensive `README.md`
- Vulnerability reports are auto-generated by CI/CD workflows
Bug reports and feature requests:
- Use the GitHub Issues page
- Provide detailed information about the problem
- Include error messages, Docker/system versions, and steps to reproduce
- Tag issues appropriately (bug, enhancement, question, etc.)
Documentation improvements:
- Fix typos, improve clarity, or add missing information
- Enhance README files with better examples
- Update outdated version information or usage instructions
New Docker images:
- Add new images for bioinformatics tools
- Follow our standardized Dockerfile structure and naming conventions
- Include comprehensive testing and documentation
Updates to existing images:
- Add new versions of existing images for updated tools
- Update existing Dockerfiles to address security vulnerabilities
- Ensure backward compatibility or document breaking changes
Optimizations:
- Reduce image sizes while maintaining functionality
- Improve build times and layer caching
- Add multi-platform support (linux/amd64 and linux/arm64)
See our template Dockerfile (`template/Dockerfile_template`) for a comprehensive reference.
- Create a directory named after the tool (use lowercase):
  mkdir toolname
  cd toolname
- Copy the template:
  cp ../template/Dockerfile_template Dockerfile_VERSION
- Customize the Dockerfile following the template's guidance
Your Dockerfile must include:
- Appropriate base image: Choose the minimal base that supports your tool (Ubuntu, Miniforge, Bioconductor, etc.)
- Complete metadata labels: All required OCI labels with accurate information
- Shell configuration: `SHELL ["/bin/bash", "-o", "pipefail", "-c"]` for better error handling
- Pinned versions: All dependencies should use explicit versions for reproducibility
- Smoke test: A simple `RUN` command to verify the tool installed correctly
- Cleanup commands: Remove temporary files and caches to minimize image size
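As a rough pre-submission aid, some of these requirements can be checked mechanically. The sketch below is not part of the repository's tooling: the function name `check_dockerfile` is hypothetical, and it assumes each OCI label sits on its own `LABEL` line, as in the label list in this guide. It only confirms that the `SHELL` line and the label keys are present, not that their values are accurate.

```bash
#!/usr/bin/env bash
# Sketch: check a Dockerfile for a few of the required elements.
# check_dockerfile is a hypothetical helper, not an existing WILDS tool.
# Assumes one LABEL instruction per line.

check_dockerfile() {
  local file="$1"
  local status=0
  local key

  # Shell configuration line recommended in this guide.
  if ! grep -Fq 'SHELL ["/bin/bash", "-o", "pipefail", "-c"]' "$file"; then
    echo "missing: SHELL pipefail configuration"
    status=1
  fi

  # OCI label keys from the required labels list.
  for key in title description version authors url documentation source licenses; do
    if ! grep -q "^LABEL org\.opencontainers\.image\.${key}=" "$file"; then
      echo "missing: org.opencontainers.image.${key} label"
      status=1
    fi
  done

  return "$status"
}
```

Running `check_dockerfile toolname/Dockerfile_X.Y.Z` prints one line per missing element and returns a non-zero status if anything is absent.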
Required labels (update all fields for your tool):
LABEL org.opencontainers.image.title="toolname"
LABEL org.opencontainers.image.description="Docker image for TOOLNAME in FH DaSL's WILDS"
LABEL org.opencontainers.image.version="X.Y.Z"
LABEL org.opencontainers.image.authors="wilds@fredhutch.org"
LABEL org.opencontainers.image.url=https://ocdo.fredhutch.org/
LABEL org.opencontainers.image.documentation=https://getwilds.org/
LABEL org.opencontainers.image.source=https://github.com/getwilds/wilds-docker-library
LABEL org.opencontainers.image.licenses=MIT
Image Size:
- Target size: A few hundred MB (2GB maximum)
- Combine `RUN` commands to reduce layers
- Remove build dependencies after compilation
- Clean package manager caches (`rm -rf /var/lib/apt/lists/*`, `mamba clean -afy`, etc.)
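The size ceiling can be checked with a small script once an image is built. This is a minimal sketch: `size_within_limit` is a hypothetical helper, and reading "2GB" as 2 × 1000³ bytes is an assumption, since this guide does not say whether decimal or binary units are meant.

```bash
#!/usr/bin/env bash
# Sketch: compare an image size in bytes against the 2GB ceiling.
# MAX_BYTES reads "2GB" as 2 * 1000^3 bytes (an assumption).
MAX_BYTES=2000000000

# size_within_limit BYTES -> exit status 0 if BYTES <= MAX_BYTES
size_within_limit() {
  local bytes="$1"
  [ "$bytes" -le "$MAX_BYTES" ]
}

# Usage against a locally built image (requires Docker):
#   bytes="$(docker image inspect --format '{{.Size}}' test-toolname:VERSION)"
#   size_within_limit "$bytes" || echo "image exceeds 2GB"
```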
Reproducibility:
- Pin ALL software versions explicitly
- Avoid `latest` tags in downloads and dependencies
- Use `apt-cache policy` to get current security-patched versions of system packages
- Document the exact source URLs and versions
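One way to turn `apt-cache policy` output into a pinned package spec is sketched below. The helper names `candidate_version` and `pin_spec` are hypothetical. The sketch assumes the usual `Candidate: <version>` line in the command's output, and it should be run inside a container started from the same base image the Dockerfile uses so the versions match.

```bash
#!/usr/bin/env bash
# Sketch: extract the candidate version from `apt-cache policy` output.
# Both helpers are hypothetical and read the policy output on stdin.

# Print the version that follows "Candidate:".
candidate_version() {
  awk '$1 == "Candidate:" { print $2; exit }'
}

# pin_spec PACKAGE -> prints PACKAGE=VERSION, the form apt-get install accepts.
pin_spec() {
  printf '%s=%s\n' "$1" "$(candidate_version)"
}

# Usage (inside a container based on the same base image):
#   apt-cache policy curl | pin_spec curl
```

The printed `package=version` string can then be pasted into the `apt-get install` line of the Dockerfile.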
Security:
- Never include secrets, credentials, or sensitive data
- Use minimal base images when possible
- Follow the principle of least privilege
- Let automated workflows scan for vulnerabilities
Platform Support:
- Build for multi-platform (linux/amd64 and linux/arm64) by default
- Our CI/CD automatically attempts multi-platform builds
- If your tool has platform restrictions, document them clearly in the README
- See existing AMD64-only images (BWA, DESeq2, HISAT2, etc.) for examples
Tool Focus:
- One primary tool per image (maximum 1-2 closely related tools)
- Include only necessary dependencies
- If building a complex workflow, consider multiple separate images
Dockerfile naming:
- `Dockerfile_X.Y.Z` for specific versions (e.g., `Dockerfile_1.19`)
- `Dockerfile_latest` for the most current version
- The version in the filename determines the Docker tag automatically
Tool directory naming:
- Use lowercase
- Use hyphens for multi-word tools (e.g., `sra-tools`, `combine-counts`)
- Match the common name of the tool
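The directory naming rules can be expressed as a quick check. In this sketch the helper `valid_tool_dirname` is hypothetical, and the exact character set is an assumption: it accepts lowercase letters, digits, and single hyphens between words, which covers names such as `sra-tools` and `deseq2`.

```bash
#!/usr/bin/env bash
# Sketch: validate a tool directory name against the naming rules.
# valid_tool_dirname is a hypothetical helper; allowing digits and
# rejecting underscores is an assumption drawn from the examples.

# Runs in a subshell so the LC_ALL change stays local to the check.
valid_tool_dirname() (
  LC_ALL=C  # keep [a-z] from matching uppercase in some locales
  case "$1" in
    # Reject: empty, any disallowed character, leading/trailing/double hyphen.
    '' | *[!a-z0-9-]* | -* | *- | *--*) exit 1 ;;
    *) exit 0 ;;
  esac
)

# Usage:
#   valid_tool_dirname sra-tools && echo "ok"
```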
Image tagging (handled automatically by CI/CD):
- `getwilds/toolname:X.Y.Z` (from `Dockerfile_X.Y.Z`)
- `getwilds/toolname:latest` (from `Dockerfile_latest`)
- Images are pushed to both DockerHub and GitHub Container Registry
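The filename-to-tag mapping can be illustrated with a short script. This sketch mirrors the convention described here rather than the actual CI/CD implementation, and the helper names `tag_from_dockerfile` and `image_ref` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: derive the image tag from a Dockerfile name.
# Illustrates the naming convention only; not the CI/CD code itself.

# tag_from_dockerfile PATH -> prints the part after "Dockerfile_"
tag_from_dockerfile() {
  local name
  name="$(basename "$1")"
  case "$name" in
    Dockerfile_?*) printf '%s\n' "${name#Dockerfile_}" ;;
    *) return 1 ;;  # not a Dockerfile_VERSION file
  esac
}

# image_ref TOOLNAME PATH -> prints getwilds/TOOLNAME:TAG
image_ref() {
  local tag
  tag="$(tag_from_dockerfile "$2")" || return 1
  printf 'getwilds/%s:%s\n' "$1" "$tag"
}

# Usage:
#   image_ref samtools samtools/Dockerfile_1.19
```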
Before submitting a pull request, test your Docker images locally. You can test manually or use our automated Makefile (recommended).
The repository includes a Makefile that automates linting and building for standardized testing. You'll need hadolint installed for linting.
Quick start - see all available commands:
make help
Test a specific image:
# Lint Dockerfiles in a specific tool directory
make lint IMAGE=toolname
# Build for AMD64 only
make build_amd64 IMAGE=toolname
# Build for ARM64 only
make build_arm64 IMAGE=toolname
# Build for both architectures
make build IMAGE=toolname
# Full validation: lint + build for both architectures
make validate IMAGE=toolname
# Clean up built images
make clean IMAGE=toolname
Test all images in the repository:
# Lint all Dockerfiles
make lint
# Build all images for both architectures
make build
# Full validation of all images
make validate
# Clean up all built images
make clean
Notes about the Makefile:
- The Makefile automatically handles multi-platform builds (AMD64 and ARM64)
- ARM64 builds skip tools listed in `amd64_only_tools.txt`
- Images are tagged as `getwilds/toolname:version-amd64` or `getwilds/toolname:version-arm64`
- The template directory is automatically skipped
- When building all images (`IMAGE=*`), the Makefile automatically prunes build cache and removes images after building to save disk space
- Built images are labeled with `built-by=makefile` for easy cleanup
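The AMD64-only skip rule can be pictured as a simple lookup. This is a sketch and not the Makefile's actual logic: `platforms_for` is a hypothetical helper, and it assumes `amd64_only_tools.txt` lists one tool directory name per line.

```bash
#!/usr/bin/env bash
# Sketch: decide which platforms to build for a given tool.
# Assumes amd64_only_tools.txt holds one tool name per line;
# platforms_for is a hypothetical helper, not the Makefile's code.

# platforms_for TOOLNAME [LIST_FILE]
platforms_for() {
  local tool="$1"
  local list="${2:-amd64_only_tools.txt}"
  # -F fixed string, -x whole-line match, so "bwa" does not match "bwa-mem2".
  if [ -f "$list" ] && grep -Fxq -- "$tool" "$list"; then
    echo "linux/amd64"
  else
    echo "linux/amd64,linux/arm64"
  fi
}

# Usage:
#   docker buildx build --platform "$(platforms_for toolname)" ...
```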
1. Build the image:
cd toolname
docker build -t test-toolname:VERSION -f Dockerfile_VERSION .
2. Verify functionality:
# Test basic functionality (adjust command for your tool)
docker run --rm test-toolname:VERSION toolname --version
# Test with real data (recommended)
docker run --rm -v /path/to/test-data:/data test-toolname:VERSION \
  toolname [appropriate-test-command]
3. Check image size:
docker images test-toolname:VERSION
# Should be a few hundred MB, max 2GB
4. Run security scan (if Docker Scout available):
docker scout cves test-toolname:VERSION
5. Test multi-platform build (optional but recommended):
docker buildx build --platform linux/amd64,linux/arm64 -t test-toolname:VERSION -f Dockerfile_VERSION .
All contributions must pass our automated CI/CD pipeline, which runs via GitHub Actions:
- Dockerfile linting: Validates Dockerfile syntax and best practices
- Multi-platform builds: Attempts to build for both linux/amd64 and linux/arm64
- Security scanning: Runs Docker Scout to identify vulnerabilities
- Image publishing: Pushes successfully built images to registries
- Documentation sync: Updates DockerHub descriptions from README files
The workflows trigger automatically on:
- Dockerfile modifications in a pull request or push to `main`
- Monthly scheduled scans (first day of each month)
- Manual workflow dispatch by maintainers
Each tool directory must include a README.md with:
1. Header section:
- Tool name and description
- Links to official tool documentation
- Brief overview of what the tool does
2. Available versions:
- Table or list of all supported versions
- Indicate which is the `latest` version
3. Platform availability (if applicable):
- Note if the image is AMD64-only or has platform restrictions
- Explain why (e.g., "Contains x86-specific optimizations")
4. Image locations:
- DockerHub: `docker pull getwilds/toolname:VERSION`
- GHCR: `docker pull ghcr.io/getwilds/toolname:VERSION`
5. Usage examples:
- Basic command examples with Docker
- Basic command examples with Apptainer/Singularity
- Real-world usage scenarios when possible
6. Installed components:
- List all major tools and versions in the image
- Note any additional utilities or dependencies
7. Security information:
- Link to or mention the vulnerability reports in the directory
- Note when last scanned
8. Contributing/Support:
- Link back to this CONTRIBUTING.md
- Contact information (wilds@fredhutch.org)
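A README can be checked for the expected sections before a PR is opened. In this sketch the helper `missing_readme_sections` is hypothetical, and it assumes the README uses the same `##` heading text as the template in this guide. The platform section is left out because it applies only to some images.

```bash
#!/usr/bin/env bash
# Sketch: report README sections that appear to be missing.
# Heading names are taken from the README template in this guide;
# requiring an exact "## Heading" match is an assumption.

# missing_readme_sections README_PATH
missing_readme_sections() {
  local readme="$1"
  local heading
  local status=0
  for heading in "Available Versions" "Usage" "Installed Components" "Security" "Contributing"; do
    if ! grep -Fxq "## ${heading}" "$readme"; then
      echo "missing section: ${heading}"
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   missing_readme_sections toolname/README.md
```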
# Tool Name
Brief description of what this tool does.
[Link to official documentation](https://example.com)
## Available Versions
| Tag | Tool Version | Image Size |
|-----|--------------|------------|
| latest | X.Y.Z | XXX MB |
| X.Y.Z | X.Y.Z | XXX MB |
## Platform Availability
Available for: linux/amd64, linux/arm64
(Or note restrictions if AMD64-only)
## Usage
### Docker
\`\`\`bash
docker pull getwilds/toolname:latest
docker run --rm getwilds/toolname:latest toolname --version
\`\`\`
### Apptainer/Singularity
\`\`\`bash
apptainer pull docker://getwilds/toolname:latest
apptainer run toolname_latest.sif toolname --version
\`\`\`
## Installed Components
- Tool Name: vX.Y.Z
- Dependency1: vA.B.C
## Security
Vulnerability reports are available in this directory as `CVEs_*.md` files.
Images are scanned monthly and on each build.
## Contributing
See the [CONTRIBUTING.md](../.github/CONTRIBUTING.md) for guidelines.
After meeting the requirements above, submit a PR to merge your forked repo into main.
- Test locally: Build and run your Docker image successfully
- Review checklist: Complete the Dockerfile best practices checklist in the template
- Update documentation: Ensure README.md is complete and accurate
- Clean up: Remove any test images, temporary files, or debugging code
- Create descriptive PR title:
  - New images: `Add [ToolName] Docker image (vX.Y.Z)`
  - Updates: `Update [ToolName] to vX.Y.Z`
  - Fixes: `Fix [brief description] in [ToolName]`
- Fill out PR template:
  - Describe what changed and why
  - List any new dependencies or breaking changes
  - Include testing performed
- Link related issues: Reference any GitHub issues your PR addresses
- Request reviews: Tag Emma Bishop (@emjbishop) or Taylor Firman (@tefirman)
Your PR will be evaluated on:
- Functionality: Does the image work as intended?
- Testing: Have you tested the build and basic functionality?
- Documentation: Is the README clear, complete, and accurate?
- Standards compliance: Does it follow WILDS conventions and best practices?
- Image quality: Is it appropriately sized and optimized?
- Security: Are there any obvious security concerns?
- Uniqueness: Does it avoid duplicating existing functionality?
Once your PR is submitted:
- Automated workflows will build and test your images
- Security scans will run and generate reports
- Reviewers will provide feedback
- Address any requested changes
- Once approved and merged, images are automatically published to DockerHub and GHCR
New contributors are welcome! If you're new to Docker or bioinformatics containers:
- Start with our template Dockerfile which has extensive comments and examples
- Review existing tool directories for real-world examples:
  - Simple compiled tool: `samtools/`
  - Java application: `picard/`
  - R/Bioconductor: `deseq2/`
  - Python environment: `scanpy/`
- Check out Docker's best practices guide
- Don't hesitate to ask questions in issues or via email
- If you have a `uw.edu` or `fredhutch.org` email you can also ask questions in our `fh-data` Slack workspace
- Consider starting with documentation contributions or version updates before adding entirely new tools
For further questions, contact the Fred Hutch Data Science Lab at wilds@fredhutch.org.
By participating in this project, you agree to abide by our Code of Conduct:
- Be respectful: Treat all community members with respect and kindness
- Be collaborative: Work together constructively and help others learn
- Be inclusive: Welcome contributors from all backgrounds and experience levels
- Be patient: Remember that everyone is learning and growing
If you experience or witness unacceptable behavior, please report it to wilds@fredhutch.org.
By contributing to this project, you agree that your contributions will be licensed under the MIT License. See the LICENSE file for details.
Thank you for contributing to WILDS! Your contributions help advance reproducible bioinformatics research for the entire community.