This document tracks the progress of porting the Python xmlschema package to Rust.
- Python Source: sissaschool/xmlschema
- License: MIT
- Python Package Stats: 79 Python source files, more than 50k lines of code
Overall: ~85% complete
Current Stage: Phase 5 - Polish & Remaining Features
- Error types and error handling
- Namespace handling with QName support
- XML name validation (NCName, QName)
- Resource loading (file-based)
- Security limits and constraints
- Module structure
- Schema parsing from file and string
- Simple type parsing (atomic, list, union)
- Complex type parsing
- Element declarations
- Attribute declarations
- Attribute groups
- Model groups (sequence, choice, all)
- Group references
- Type restrictions and extensions
- Forward reference resolution
- Built-in XSD types (string, integer, decimal, date, etc.)
- Simple type restrictions
- Complex types with simple/complex content
- Type derivation (extension/restriction)
- Qualified name resolution
- enumeration
- pattern (regex)
- length, minLength, maxLength
- minInclusive, maxInclusive, minExclusive, maxExclusive
- totalDigits, fractionDigits
- whiteSpace
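
The length and enumeration facets listed above can be illustrated with a minimal sketch. The `Facet` enum and the `check_all` helper are hypothetical names chosen for this example and are not taken from the port's actual code; pattern and numeric facets are left out to keep the sketch free of external crates.

```rust
/// Hypothetical facet representation; the port's real types may differ.
#[derive(Debug, Clone, PartialEq)]
enum Facet {
    Length(usize),
    MinLength(usize),
    MaxLength(usize),
    Enumeration(Vec<String>),
}

impl Facet {
    /// Returns Ok(()) when `value` satisfies this facet, or a message
    /// describing the violation.
    fn check(&self, value: &str) -> Result<(), String> {
        // XSD measures string length in characters, not bytes.
        let len = value.chars().count();
        match self {
            Facet::Length(n) if len != *n => {
                Err(format!("length is {}, expected exactly {}", len, n))
            }
            Facet::MinLength(n) if len < *n => {
                Err(format!("length is {}, expected at least {}", len, n))
            }
            Facet::MaxLength(n) if len > *n => {
                Err(format!("length is {}, expected at most {}", len, n))
            }
            Facet::Enumeration(allowed) if !allowed.iter().any(|a| a == value) => {
                Err(format!("{:?} is not one of {:?}", value, allowed))
            }
            _ => Ok(()),
        }
    }
}

/// Collects every facet violation instead of stopping at the first one.
fn check_all(facets: &[Facet], value: &str) -> Vec<String> {
    facets.iter().filter_map(|f| f.check(value).err()).collect()
}

fn main() {
    let facets = vec![
        Facet::Length(3),
        Facet::MinLength(2),
        Facet::MaxLength(4),
        Facet::Enumeration(vec!["red".to_string(), "blue".to_string()]),
    ];
    for problem in check_all(&facets, "green") {
        println!("facet violation: {}", problem);
    }
}
```

Returning all violations at once matches the error-collecting style of validation mentioned elsewhere in this list, but a fail-fast variant would work equally well.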
- Sequence compositor
- Choice compositor
- All compositor
- Mixed content
- Occurrence constraints (minOccurs/maxOccurs)
- ModelVisitor state machine
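
As an illustration of the occurrence constraints above, here is a small sketch of parsing and checking `minOccurs`/`maxOccurs`. The `Occurs` struct and its methods are assumptions made for this example, not the port's API; the only facts relied on are the XSD defaults (both attributes default to 1) and the `unbounded` keyword.

```rust
/// Hypothetical occurrence range for a particle.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Occurs {
    min: u32,
    /// `None` models maxOccurs="unbounded".
    max: Option<u32>,
}

impl Occurs {
    /// Parses the raw attribute values. A missing attribute defaults to 1.
    fn parse(min: Option<&str>, max: Option<&str>) -> Result<Occurs, String> {
        let min = match min {
            Some(s) => s
                .parse::<u32>()
                .map_err(|e| format!("invalid minOccurs {:?}: {}", s, e))?,
            None => 1,
        };
        let max = match max {
            Some("unbounded") => None,
            Some(s) => Some(
                s.parse::<u32>()
                    .map_err(|e| format!("invalid maxOccurs {:?}: {}", s, e))?,
            ),
            None => Some(1),
        };
        // A bounded maximum must not be smaller than the minimum.
        if let Some(m) = max {
            if m < min {
                return Err(format!("maxOccurs {} is less than minOccurs {}", m, min));
            }
        }
        Ok(Occurs { min, max })
    }

    /// True when `count` repetitions fall inside the allowed range.
    fn allows(&self, count: u32) -> bool {
        count >= self.min && self.max.map_or(true, |m| count <= m)
    }
}

fn main() {
    let occurs = Occurs::parse(Some("0"), Some("unbounded")).expect("valid occurrence range");
    println!("{:?} allows 5 repetitions: {}", occurs, occurs.allows(5));
}
```

Using `Option<u32>` for the upper bound keeps `unbounded` out of the numeric domain, so no sentinel value is needed.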
- Any element wildcard
- Any attribute wildcard
- Namespace constraints
- Process contents (strict/lax/skip)
- Element validation
- Attribute validation
- Content model validation
- Simple type value validation
- ValidationContext with error collection
- Validation modes (strict/lax)
- Unique constraints
- Key constraints
- Keyref constraints
- Selector/field XPath evaluation
- Assertions (assert/report elements)
- Basic XSD 1.1 parsing
- Parker convention
- BadgerFish convention
- Unordered converter
- XPath expression evaluation
- Identity constraint selectors
- JSON export of schema structure
- Python parity for schema dumps
- HTTP/HTTPS schema loading
- URL resource resolution
- Schema caching from remote sources
- xs:include resolution across files
- xs:import with namespace mapping
- xs:redefine support
- Circular import detection
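
One common way to keep circular includes and imports from looping is to track which schema locations have already been visited. The sketch below assumes the dependency graph is already available as a map from location to referenced locations; `load_order` and the map layout are hypothetical and do not describe how the port actually resolves schemas.

```rust
use std::collections::{HashMap, HashSet};

/// Returns schema locations in the order they would be loaded, visiting
/// each location once so that cycles terminate.
fn load_order(root: &str, imports: &HashMap<String, Vec<String>>) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut order: Vec<String> = Vec::new();
    let mut stack: Vec<String> = vec![root.to_string()];

    while let Some(location) = stack.pop() {
        // `insert` returns false when the location was already visited,
        // which happens for cycles and for diamond-shaped imports.
        if !seen.insert(location.clone()) {
            continue;
        }
        if let Some(deps) = imports.get(&location) {
            // Push in reverse so dependencies are visited in declaration order.
            for dep in deps.iter().rev() {
                stack.push(dep.clone());
            }
        }
        order.push(location);
    }
    order
}

fn main() {
    let mut imports = HashMap::new();
    imports.insert(
        "a.xsd".to_string(),
        vec!["b.xsd".to_string(), "c.xsd".to_string()],
    );
    // b.xsd refers back to a.xsd, forming a cycle.
    imports.insert("b.xsd".to_string(), vec!["a.xsd".to_string()]);

    for location in load_order("a.xsd", &imports) {
        println!("load {}", location);
    }
}
```

A real resolver would normalize locations (for example, resolving relative URLs) before inserting them into the visited set, since the same file can be referenced under different spellings.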
- Substitution groups
- Default/fixed value application during validation
- Full conditional type assignment (XSD 1.1)
- xsi:type handling
- xsi:nil handling for nillable elements
- Validate command
- Convert command (XML to JSON)
- Inspect command (schema introspection)
- Download schemas command
- Performance optimization
- Memory optimization
- Documentation improvements
- More extensive error messages
- Clone Python reference code
- Initialize Rust cargo project
- Set up Cargo.toml with dependencies
- Create module structure
- README.md
- Documentation infrastructure
- Base validator infrastructure
- Simple type validators
- Facet validators
- Complex type validators
- Element validators
- Attribute validators
- Model groups
- Content models
- Wildcards
- Schema component parsing
- Forward reference resolution
- Identity constraints
- XSD 1.1 assertions
- Document validation
- Converter framework
- Parker converter
- BadgerFish converter
- Unordered converter
- XPath expression evaluation
- Schema context evaluation
- HTTP/HTTPS loading
- CLI tool commands (inspect, xml2json, validate)
- Schema composition (include/import)
- Substitution groups
- Performance optimization
- Schema dump comparison framework
- Python parity validation
- Book.xsd comparison test (passing)
- DITA schema bundle tests
- NISO schema bundle tests
- Per-module functionality tests
- Integration tests
- W3C XSD 1.0 conformance suite
- W3C XSD 1.1 conformance suite
- Property-based testing
- Performance benchmarks
- XML Parser: Using roxmltree for DOM-like access
- Error Handling: thiserror for error types
- Type Safety: Arc<dyn SimpleType + Send + Sync> for thread-safe type references
- Memory: Arc/Rc for shared references, cloning where needed
- API Style: Similar to Python where idiomatic in Rust
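
The `Arc<dyn SimpleType + Send + Sync>` decision can be sketched with the standard library alone. Only the trait name `SimpleType` comes from the notes above; its methods, the `XsdBoolean` type, the `SharedType` alias, and `validate_in_parallel` are assumptions made for illustration.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical trait; the real trait in the port may have other methods.
trait SimpleType {
    fn name(&self) -> &str;
    fn is_valid(&self, lexical: &str) -> bool;
}

/// Illustrative built-in type accepting the xs:boolean lexical forms.
struct XsdBoolean;

impl SimpleType for XsdBoolean {
    fn name(&self) -> &str {
        "xs:boolean"
    }

    fn is_valid(&self, lexical: &str) -> bool {
        // `trim` approximates whitespace collapsing for this sketch.
        matches!(lexical.trim(), "true" | "false" | "1" | "0")
    }
}

/// Shared, thread-safe handle to any simple type.
type SharedType = Arc<dyn SimpleType + Send + Sync>;

/// Validates each value on its own thread, sharing one type definition.
fn validate_in_parallel(ty: SharedType, values: Vec<String>) -> Vec<bool> {
    let handles: Vec<_> = values
        .into_iter()
        .map(|value| {
            // Cloning the Arc only bumps a reference count.
            let ty = Arc::clone(&ty);
            thread::spawn(move || ty.is_valid(&value))
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let ty: SharedType = Arc::new(XsdBoolean);
    println!("validating against {}", ty.name());
    let values = vec!["true".to_string(), "maybe".to_string()];
    println!("{:?}", validate_in_parallel(ty, values));
}
```

The `Send + Sync` bounds on the trait object are what allow the `Arc` to cross the `thread::spawn` boundary; without them the compiler would reject the closure.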
- Type safety: Compile-time error catching
- Performance: Faster validation than Python
- Memory safety: No use-after-free or dangling references in safe Rust
- Concurrency: Thread-safe type system with Send + Sync
- Cloned Python reference repository
- Initialized Rust project
- Created TODO tracking document
- Implemented core XSD parsing
- Added type system and facets
- Implemented content model validation
- Added document validation
- Implemented data converters
- Achieved Python parity for schema dumps
- Updated README and TODO documentation
- Implemented CLI tool with clap
- Added `inspect` command for schema introspection (JSON output, element/type lookup)
- Added `xml2json` command with multiple formats (default, parker, badgerfish, unordered)
- Added `validate` command with strict/lax modes
- Created DITA/NISO bundle comparison test infrastructure with static facts
Last Updated: 2025-12-29