Add Traject smoke tests to Tier 1 testing infrastructure #21

Merged: alexdryden merged 6 commits into copilot/add-comprehensive-testing-infrastructure from copilot/update-testing-infrastructure on Feb 26, 2026

Copilot AI commented Feb 23, 2026

Adds fast-feedback traject configuration validation to catch Ruby syntax errors and config issues before Solr indexing. Tests run in ~2s with cached gems, suitable for CI and agent iteration.

Changes

GitHub Actions workflow (.github/workflows/test.yml)

  • Ruby 3.1 + Python 3.12 setup with bundler and pip caching
  • Explicit minimal permissions (contents: read)

Traject smoke tests (tests/unit/test_traject_smoke.py)

  • Ruby syntax validation via ruby -c
  • Config load verification without input files
  • XML transformation using Traject::DebugWriter (bypasses Solr)
  • Graceful skip when traject config doesn't exist

Dependencies (Gemfile)

gem 'traject', '~> 3.0'
gem 'traject_plus'  # Required by example_traject_config_eac_cpf.rb

Test infrastructure

  • Shared sample_eac_cpf_xml fixture in conftest.py
  • Documentation in tests/README.md covering performance and skip behavior
  • .gitignore entries for Ruby artifacts

Performance

  • First run: ~60s (gem install)
  • Cached runs: ~2s (fast enough for Tier 1)
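
One way to keep an eye on these timings is to wrap each subprocess call in a small timer and compare it against a budget. The sketch below is an assumption about how that might be done; `run_timed` and the 60-second budget are illustrative and do not come from the merged tests.

```python
import subprocess
import sys
import time
from typing import Sequence, Tuple


def run_timed(cmd: Sequence[str],
              budget_seconds: float) -> Tuple[subprocess.CompletedProcess, float, bool]:
    """Run a command and report wall-clock time against a budget.

    Hypothetical helper: returns the completed process, the elapsed seconds,
    and whether the run stayed within budget_seconds.
    """
    start = time.perf_counter()
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= budget_seconds


if __name__ == "__main__":
    # Stand-in command so the example runs without Ruby installed.
    result, elapsed, within_budget = run_timed(
        [sys.executable, "-c", "print('ok')"], budget_seconds=60.0
    )
    print(f"exit={result.returncode} elapsed={elapsed:.2f}s within_budget={within_budget}")
```

In a real test the stand-in command would be replaced by the traject invocation, with the budget chosen from the figures above.
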
Original prompt

Update the testing infrastructure PR to include Traject smoke tests in Tier 1 (fast feedback loop).

Additional Requirements

1. Update GitHub Actions Workflow

Add Ruby setup and bundler caching to .github/workflows/test.yml:

- name: Setup Ruby
  uses: ruby/setup-ruby@v1
  with:
    ruby-version: '3.1'
    bundler-cache: true

- name: Install Ruby dependencies
  run: |
    bundle install
    # Or if Gemfile doesn't exist yet, create minimal one:
    # gem install traject

2. Add Traject Smoke Tests

Create tests/unit/test_traject_smoke.py:

"""
Traject smoke tests - verify traject config and XML processing work.

These tests run traject without Solr to catch config errors quickly.
Goal: < 60 seconds total including Ruby setup (with caching).
"""

import pytest
import subprocess
from pathlib import Path

@pytest.fixture
def sample_eac_cpf_xml():
    """Minimal valid EAC-CPF XML for testing"""
    return '''<?xml version="1.0" encoding="UTF-8"?>
<eac-cpf xmlns="urn:isbn:1-931666-33-4">
  <control>
    <recordId>creator_people_1</recordId>
    <maintenanceStatus>new</maintenanceStatus>
    <maintenanceAgency>
      <agencyName>Test</agencyName>
    </maintenanceAgency>
  </control>
  <cpfDescription>
    <identity>
      <nameEntry>
        <part>Test Person</part>
      </nameEntry>
    </identity>
  </cpfDescription>
</eac-cpf>'''

def test_traject_config_syntax_valid():
    """Verify traject config has valid Ruby syntax"""
    # Find traject config (might be in different locations)
    possible_paths = [
        "traject_config_eac_cpf.rb",
        "example_traject_config_eac_cpf.rb",
    ]
    
    config_path = None
    for path in possible_paths:
        if Path(path).exists():
            config_path = path
            break
    
    if not config_path:
        pytest.skip("No traject config found (expected if not yet created)")
    
    # Ruby syntax check (fast, doesn't execute)
    result = subprocess.run(
        ["ruby", "-c", config_path],
        capture_output=True,
        text=True
    )
    
    assert result.returncode == 0, f"Invalid Ruby syntax: {result.stderr}"

@pytest.mark.skipif(
    not Path("example_traject_config_eac_cpf.rb").exists(),
    reason="Traject config not yet available"
)
def test_traject_loads_config():
    """Verify traject can load config without errors"""
    result = subprocess.run(
        ["bundle", "exec", "traject", "-c", "example_traject_config_eac_cpf.rb"],
        capture_output=True,
        text=True
    )
    
    # Should show usage/help, not crash
    assert "error" not in result.stderr.lower() or result.returncode == 1

@pytest.mark.skipif(
    not Path("example_traject_config_eac_cpf.rb").exists(),
    reason="Traject config not yet available"
)
def test_traject_processes_sample_xml(tmp_path, sample_eac_cpf_xml):
    """Verify traject can transform XML without Solr (smoke test)"""
    xml_file = tmp_path / "sample.xml"
    xml_file.write_text(sample_eac_cpf_xml)
    
    # Use NullWriter to process without Solr
    result = subprocess.run([
        "bundle", "exec", "traject",
        "-c", "example_traject_config_eac_cpf.rb",
        "-w", "Traject::NullWriter",
        str(xml_file)
    ], capture_output=True, text=True)
    
    # Should complete without errors
    assert result.returncode == 0, f"Traject processing failed: {result.stderr}"
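
The tests above locate the config in two different ways: the syntax test searches a list of candidate paths, while the other two hard-code `example_traject_config_eac_cpf.rb`. A shared lookup could keep them consistent. The sketch below is one possible shape; `find_traject_config` and `CANDIDATE_CONFIGS` are hypothetical names, and the candidate file names are the two listed in the syntax test.

```python
from pathlib import Path
from typing import Iterable, Optional

# Candidate locations, in priority order (taken from the syntax test above).
CANDIDATE_CONFIGS = (
    "traject_config_eac_cpf.rb",
    "example_traject_config_eac_cpf.rb",
)


def find_traject_config(root: Path = Path("."),
                        candidates: Iterable[str] = CANDIDATE_CONFIGS) -> Optional[Path]:
    """Return the first candidate config that exists under root, else None."""
    for name in candidates:
        candidate = Path(root) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_traject_config()
    print(found if found else "No traject config found")
```

Each test could then call `find_traject_config()` and run `pytest.skip(...)` when it returns `None`, instead of repeating the path checks.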

3. Create Minimal Gemfile

If one doesn't exist, create Gemfile in repo root:

source 'https://rubygems.org'

gem 'traject', '~> 3.0'

# Optional: for testing
group :test do
  gem 'rspec', '~> 3.0'
end

4. Update tests/README.md

Add section about traject smoke tests:

## Traject Smoke Tests

Tests in `tests/unit/test_traject_smoke.py` verify traject configuration without requiring Solr.

### What They Test
- Ruby syntax validity of traject configs
- Traject can load and parse configs
- XML transformation logic (without indexing)

### Setup Requirements
- Ruby 3.1+
- Bundler
- Run `bundle install` to install traject gem

### Performance
- First run: ~60 seconds (includes gem install)
- Cached runs: ~40 seconds (gems cached)
- Still fast enough for CI/agent iteration

### Skipping
These tests skip gracefully if traject config doesn't exist yet.

5. Update conftest.py

Add traject-related fixtures:

@pytest.fixture
def sample_eac_cpf_xml():
    """Minimal valid EAC-CPF XML for testing"""
    return '''<?xml version="1.0" encoding="UTF-8"?>
<eac-cpf xmlns="urn:isbn:1-931666-33-4">
  <control>
    <recordId>creator_people_1</recordId>
  </control>
  <cpfDescription>
    <identity>
      <nameEntry><part>Test Person</part></nameEntry>
    </identity>
  </cpfDescription>
</eac-cpf>'''
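
Before handing the fixture to traject, it may be worth confirming that the XML is well-formed and that the namespaced elements are where the config expects them. The following is a sketch using only the Python standard library; `summarize_eac_cpf` is a hypothetical helper, and `SAMPLE` repeats the fixture text shown above.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the fixture; the "eac" prefix is arbitrary.
EAC_NS = {"eac": "urn:isbn:1-931666-33-4"}

SAMPLE = '''<?xml version="1.0" encoding="UTF-8"?>
<eac-cpf xmlns="urn:isbn:1-931666-33-4">
  <control>
    <recordId>creator_people_1</recordId>
  </control>
  <cpfDescription>
    <identity>
      <nameEntry><part>Test Person</part></nameEntry>
    </identity>
  </cpfDescription>
</eac-cpf>'''


def summarize_eac_cpf(xml_text: str) -> dict:
    """Parse an EAC-CPF document and pull out the fields the tests rely on.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    # Encode to bytes so the XML declaration's encoding is honoured by the parser.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        "root_tag": root.tag,
        "record_id": root.findtext("eac:control/eac:recordId", namespaces=EAC_NS),
        "name": root.findtext(
            "eac:cpfDescription/eac:identity/eac:nameEntry/eac:part",
            namespaces=EAC_NS,
        ),
    }


if __name__ == "__main__":
    print(summarize_eac_cpf(SAMPLE))
```

This only checks well-formedness and element placement; it does not validate the document against the EAC-CPF schema.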

Key Points

  • Traject tests should be part of unit test suite (not integration)
  • Use pytest.skipif to gracefully skip if traject config doesn't exist yet
  • Use Traject::NullWriter to avoid needing Solr
  • Leverage bundler-cache in Git...

This pull request was created from Copilot chat.



Copilot AI and others added 4 commits February 23, 2026 20:38
Co-authored-by: alexdryden <47127862+alexdryden@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update testing infrastructure to include Traject smoke tests" to "Add Traject smoke tests to Tier 1 testing infrastructure" Feb 23, 2026
Copilot AI requested a review from alexdryden February 23, 2026 20:46
alexdryden changed the base branch from index_creators to copilot/add-comprehensive-testing-infrastructure February 26, 2026 16:57
alexdryden marked this pull request as ready for review February 26, 2026 17:04
alexdryden merged commit ba49963 into copilot/add-comprehensive-testing-infrastructure Feb 26, 2026
1 check passed
alexdryden deleted the copilot/update-testing-infrastructure branch February 26, 2026 17:05