This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Install dependencies with development extras
uv sync --all-extras
# OR using pip (but do not use pip unless you have to - prefer uv for dependency management)
pip install -e ".[dev]"Always write unit tests for new features and bug fixes. Ensure all tests pass before committing code. Also add integration tests that exercise crump at the cli level using cli_runner - see tests/test_cli.py for examples. Tests should be passing before you commit code.
# Start with fast unit tests and sqlite only as these are fast to run (and no Docker required)
uv run pytest tests -k "not [postgres]" -v
# Then run the postgres tests (these are slower) - you should use the version of postgres that is already installed locally
pytest tests -k "[postgres]" -v
# Or run all tests
uv run pytest -vOther useful testing commands:
# Run all tests with coverage
uv run pytest --cov=src --cov-report=term-missing
# database integration tests for sqlite only so VERY FAST (and no Docker required)
uv run pytest tests -k database -k "[sqlite]" -v
# Integration tests only (requires Docker)
uv run pytest tests/test_database_integration.py -v
# Run specific test
uv run pytest tests/test_config.py::TestCrumpConfig::test_load_from_yaml -vAlways check code quality before committing code. Use the commands below to format, lint, and type check your code.
# Format code
uv run ruff format .
# Check and fix linting
uv run ruff check --fix .
# Type checking
uv run mypy srcDocumentation is in markdown files in docs/ folder. All new features should be documented. Documentation should include code and CLI examples. The code and cli examples will be tested in the test suite automatically to ensure they are valid.
# Generate and serve documentation locally
./generate-docs.sh build
# OR manually
uv run mkdocs serveCLI Interface (cli.py, cli_*.py):
- Main entry point with Click-based commands
- Commands:
sync,prepare,inspect,extract - Each command has dedicated module (e.g.,
cli_sync.py) extractcommand supports both raw CDF dump and config-based extraction with column mapping
Configuration System (config.py):
- YAML-based job configuration with
CrumpConfigandCrumpJobclasses - Column mappings between CSV and database
- Filename extraction patterns for metadata (dates, versions, etc.)
- Compound primary key support via
id_mapping
Database Operations (database.py):
- PostgreSQL sync with
sync_csv_to_postgres() - Dry-run mode for previewing changes
- Automatic table creation and schema updates
- Stale record cleanup based on filename-extracted values
Type Detection (type_detection.py):
- Automatic CSV analysis for data types and nullable columns
- Primary key suggestion based on column characteristics
CDF Support (cdf_*.py):
- Reading and extracting CDF (Common Data Format) science files
- Conversion to CSV for database sync
- Config-based extraction with column mapping and transformations
- Two extraction modes: raw dump or config-based with same transformations as sync
See README.md and docs/index.md for detailed feature list.
jobs:
job_name:
target_table: "table_name"
id_mapping: # Compound primary key
csv_col1: db_col1
csv_col2: db_col2
filename_to_column: # Extract from filename
template: "data_[date]_[version].csv"
columns:
date:
db_column: sync_date
type: date
use_to_delete_old_rows: true
columns: # Column mappings
csv_col: db_col- Click: CLI framework
- Rich: Terminal output formatting
- PyYAML: Configuration parsing
- psycopg: PostgreSQL adapter
- cdflib: CDF file reading
- testcontainers: Integration testing with real databases
- pyarrow: Parquet file support
Project version is stored in pyproject.toml at the version field. Follow semantic versioning (semver):
- Major version (X.0.0): Breaking changes, incompatible API changes
- Minor version (0.X.0): New features, backwards-compatible functionality additions
- Patch version (0.0.X): Bug fixes, backwards-compatible bug fixes
When making changes:
- Bug fixes: Increment patch version (e.g., 0.6.0 → 0.6.1)
- New features: Increment minor version (e.g., 0.6.0 → 0.7.0)
- Breaking changes: Increment major version (e.g., 0.6.0 → 1.0.0)
Always update the version number in pyproject.toml when making changes that will be released.