[Draft] Cedarling Protobuf Based Schema Support

Safin Wasi edited this page Jan 26, 2026 · 1 revision

Protobuf Schema Modeling for Cedarling

Overview

This specification defines the requirements for modeling Cedarling schemas in Protocol Buffers (Protobuf), with layered schema management and inheritance modeled on LDAP. The goal is a structured, type-safe, and extensible schema definition system with core, organization, and application layers, all validated at startup.

Background

Currently, Cedarling defines entity types, actions, and context structures using the Cedar schema format (.cedarschema) and its JSON representation. This approach works, but it has limitations:

  • No native inheritance support for entity types
  • Limited type safety and validation
  • Difficult to extend and maintain complex hierarchies
  • No built-in versioning or evolution support
  • Limited tooling ecosystem compared to Protobuf
  • No layered schema management for different organizational needs
  • No comprehensive schema validation during system startup

LDAP's schema management model provides a proven pattern for hierarchical schema organization:

  • Core schemas define fundamental object classes and attributes (like inetOrgPerson, organizationalUnit)
  • Organization schemas extend core schemas with organization-specific attributes and classes
  • Application schemas add application-specific extensions without modifying lower layers
  • Schema validation ensures consistency across all layers during directory startup
  • Multiple inheritance is supported through auxiliary object classes
  • Clear inheritance hierarchies enable schema reuse and extension

User Stories

US-1: Core Schema Designer

As a core schema designer
I want to define fundamental entity types and attributes in core schemas
So that all organizations and applications have consistent base types to build upon

US-2: Organization Schema Administrator

As an organization schema administrator
I want to extend core schemas with organization-specific attributes and entity types
So that I can customize the schema for my organization's needs without modifying core definitions

US-3: Application Developer

As an application developer
I want to add application-specific schema extensions
So that I can support my application's unique requirements while inheriting from organization and core schemas

US-4: System Administrator

As a system administrator
I want to validate all schema layers during Cedarling startup
So that I can detect schema conflicts, missing dependencies, and inheritance errors before the system becomes operational

US-5: Policy Developer

As a policy developer
I want to write policies against the combined schema from all layers
So that I can reference any entity type or attribute defined across core, organization, and application schemas

US-6: Schema Maintainer

As a schema maintainer
I want to version and evolve schemas at each layer independently
So that I can maintain backward compatibility while adding new features at the appropriate layer

US-7: System Integrator

As a system integrator
I want to generate type-safe code from the combined schema
So that I can have compile-time validation across all schema layers

Acceptance Criteria

AC-1: Layered Schema Architecture

  • Given core, organization, and application schema files
  • When I load the schema system
  • Then schemas should be loaded in dependency order (core → organization → application)
  • And each layer should be able to extend and reference entities from lower layers
  • And higher layers should not be able to modify lower layer definitions (they may only add to them or, as in AC-3, further constrain optional attributes)
  • And schema conflicts should be detected and reported
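
The load order in AC-1 can be sketched as a topological sort over file dependencies. This is a minimal sketch rather than Cedarling's loader: the `LAYER_DEPS` map and the `load_order` helper are hypothetical names, and the file names follow the examples used elsewhere on this page.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each schema file lists the files it extends.
LAYER_DEPS = {
    "core.proto": [],
    "org-acme.proto": ["core.proto"],
    "app-hr-system.proto": ["org-acme.proto"],
}

def load_order(deps):
    """Return schema files ordered so that every file follows its dependencies."""
    referenced = {dep for dep_list in deps.values() for dep in dep_list}
    missing = referenced - set(deps)
    if missing:
        raise ValueError(f"missing schema dependencies: {sorted(missing)}")
    try:
        # TopologicalSorter expects node -> predecessors, which matches deps.
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        raise ValueError(f"circular schema dependency: {err.args[1]}") from err
```

Calling `load_order(LAYER_DEPS)` yields the core file first, then the organization file, then the application file.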

AC-2: Core Schema Definition

  • Given fundamental Cedarling entity types (User, Workload, Application, etc.)
  • When I define them in core schema using Protobuf
  • Then core schema should define base entity types and common attributes
  • And provide structural object classes for inheritance
  • And include fundamental actions and context types
  • And maintain semantic equivalence to current Cedar schema

AC-3: Organization Schema Extensions

  • Given a core schema and organization-specific requirements
  • When I define organization schema extensions
  • Then I should be able to add organization-specific attributes to core entities
  • And define new entity types that inherit from core types
  • And add organization-specific actions and context attributes
  • And override optional attributes with organization-specific constraints

AC-4: Application Schema Extensions

  • Given core and organization schemas
  • When I define application-specific schema extensions
  • Then I should be able to add application-specific attributes to any entity type
  • And define application-specific entity types and actions
  • And create auxiliary object classes for optional features
  • And maintain compatibility with organization and core schemas

AC-5: Startup Schema Validation

  • Given multiple schema layers (core, organization, application)
  • When Cedarling starts up
  • Then all schemas should be validated for consistency
  • And inheritance relationships should be verified
  • And attribute conflicts should be detected and reported
  • And missing dependencies should be identified
  • And circular dependencies should be prevented
  • And startup should fail with clear error messages if validation fails
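
One way to picture the inheritance checks in AC-5 is a validator that collects every problem and then fails once with all messages. The sketch below assumes a simple map from entity type to its structural base; `SchemaValidationError` and `validate_inheritance` are hypothetical names, not part of Cedarling.

```python
class SchemaValidationError(Exception):
    """Raised at startup when one or more schema checks fail."""

def validate_inheritance(parents):
    """Check a {entity_type: base_type or None} map for missing bases and cycles."""
    errors = []
    # Missing dependencies: a base type that no layer defines.
    for entity, base in parents.items():
        if base is not None and base not in parents:
            errors.append(f"{entity}: base type '{base}' is not defined in any layer")
    # Circular dependencies: walk each chain and stop if a type repeats.
    for entity in parents:
        seen, current = set(), entity
        while current is not None and current in parents:
            if current in seen:
                errors.append(f"{entity}: circular inheritance through '{current}'")
                break
            seen.add(current)
            current = parents[current]
    if errors:
        # Report everything at once so startup fails with a complete picture.
        raise SchemaValidationError("; ".join(errors))
```

A valid chain such as `{"User": None, "AcmeUser": "User", "HRUser": "AcmeUser"}` passes silently, while a missing or circular base raises before the system becomes operational.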

AC-6: Schema Composition and Merging

  • Given validated schema layers
  • When I compose the final schema
  • Then all entity types should include attributes from all applicable layers
  • And inheritance hierarchies should be properly resolved
  • And attribute precedence should follow layer hierarchy (application > organization > core)
  • And the composed schema should be valid Cedar schema
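
The precedence rule in AC-6 can be illustrated with a small merge over per-layer attribute maps. This is a sketch under the assumption that each layer is represented as `{entity: {attribute: type}}`; `LAYER_ORDER` and `compose` are hypothetical names.

```python
# Lowest to highest precedence, as stated in AC-6.
LAYER_ORDER = ["core", "organization", "application"]

def compose(layers):
    """Merge {layer: {entity: {attr: type}}} into one {entity: {attr: type}} map."""
    composed = {}
    for layer in LAYER_ORDER:
        for entity, attrs in layers.get(layer, {}).items():
            # A higher layer wins when it declares an attribute name again.
            composed.setdefault(entity, {}).update(attrs)
    return composed
```

Whether a repeated attribute should silently win, as here, or be rejected as a conflict during validation is a policy decision this sketch does not settle.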

AC-7: LDAP-Style Inheritance Support

  • Given entity types defined across schema layers
  • When I define inheritance relationships
  • Then structural inheritance should be supported (single base class)
  • And auxiliary inheritance should be supported (multiple auxiliary classes)
  • And attribute inheritance should follow LDAP semantics
  • And inheritance conflicts should be detected during validation
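
The structural and auxiliary model in AC-7 might be represented as follows. This is an illustrative sketch only: `EntityClass`, `resolve_attrs`, and the `MfaCapable` auxiliary class in the usage note are hypothetical, and the conflict rule (same attribute name, different type) is an assumption about what "inheritance conflict" means.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class EntityClass:
    name: str
    attrs: dict                                        # attribute name -> type name
    structural_base: Optional["EntityClass"] = None    # at most one structural base
    auxiliaries: list = field(default_factory=list)    # any number of auxiliary classes

def resolve_attrs(cls):
    """Collect own and inherited attributes; raise if two sources disagree on a type."""
    inherited = [cls.structural_base] if cls.structural_base else []
    inherited += cls.auxiliaries
    resolved = {}
    for attrs in [resolve_attrs(c) for c in inherited] + [cls.attrs]:
        for attr, typ in attrs.items():
            if resolved.setdefault(attr, typ) != typ:
                raise TypeError(f"{cls.name}: conflicting types for attribute '{attr}'")
    return resolved
```

For example, a `User` with structural base `Principal` and an auxiliary `MfaCapable` resolves to the union of all three attribute sets, and redeclaring `id` with a different type raises.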

AC-8: Cedar Schema Generation

  • Given a composed Protobuf schema from all layers
  • When I generate Cedar schema format
  • Then the output should be valid Cedar schema
  • And preserve all entity relationships and attributes from all layers
  • And maintain compatibility with existing Cedar policies
  • And support Cedar's built-in types (String, Long, Bool, Set, Record, etc.)
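
As a rough illustration of the generation step in AC-8, the sketch below renders one composed entity in Cedar's human-readable schema syntax. It assumes attributes have already been flattened across layers and that attribute names are valid Cedar identifiers; `to_cedar_entity` is a hypothetical helper, not an existing Cedarling function.

```python
def to_cedar_entity(name, attrs):
    """Render one entity declaration from a {attribute: cedar_type} map."""
    body = ",\n".join(f"  {attr}: {typ}" for attr, typ in attrs.items())
    return f"entity {name} = {{\n{body}\n}};"

if __name__ == "__main__":
    print(to_cedar_entity("User", {"sub": "String", "roles": "Set<String>"}))
```

A real generator would also need to emit namespaces, actions, and context types, which this sketch leaves out.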

AC-9: Type Safety and Validation

  • Given layered Protobuf schema definitions
  • When I validate entity instances
  • Then type checking should enforce constraints from all applicable layers
  • And inheritance rules should be validated across layers
  • And required fields should be enforced according to layer precedence
  • And field types should be strictly validated

AC-10: Schema Evolution and Versioning

  • Given existing layered schemas
  • When I need to evolve schemas at any layer
  • Then changes should maintain backward compatibility within the layer
  • And version information should be embedded in each schema layer
  • And migration paths should be clearly defined for each layer
  • And deprecated fields should be properly marked and handled
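
A backward-compatibility check for AC-10 could compare two versions of a message by field number. The sketch below is deliberately conservative and hypothetical: it treats every type change as breaking, although Protobuf considers some type changes wire-compatible, and `breaking_changes` is not an existing tool.

```python
def breaking_changes(old, new, reserved=frozenset()):
    """Compare {field_number: (name, type)} maps for one message version pair."""
    problems = []
    for number, (name, typ) in old.items():
        if number not in new:
            # Removed fields should have their numbers reserved to prevent reuse.
            if number not in reserved:
                problems.append(
                    f"field {number} ('{name}') removed without reserving its number"
                )
        elif new[number][1] != typ:
            problems.append(
                f"field {number} ('{name}') changed type {typ} -> {new[number][1]}"
            )
    return problems
```

Adding a field under a new number produces no findings, which matches the usual Protobuf guidance that additions are safe.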

Technical Requirements

TR-1: Schema Layer Structure

  • Core schemas define base entity types, actions, and context
  • Organization schemas extend core with organization-specific elements
  • Application schemas add application-specific extensions
  • Each layer should be independently versionable
  • Schema files should follow naming conventions (core.proto, org-{name}.proto, app-{name}.proto)
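
The naming convention in TR-1 lends itself to a simple classifier that assigns each file to a layer. This is a sketch: the allowed characters in `{name}` (lowercase letters, digits, hyphens) are an assumption, and `classify` is a hypothetical helper.

```python
import re

# Patterns for core.proto, org-{name}.proto, and app-{name}.proto.
LAYER_PATTERNS = [
    ("core", re.compile(r"^core\.proto$")),
    ("organization", re.compile(r"^org-(?P<name>[a-z0-9-]+)\.proto$")),
    ("application", re.compile(r"^app-(?P<name>[a-z0-9-]+)\.proto$")),
]

def classify(filename):
    """Return (layer, name) for a schema file name, or raise ValueError."""
    for layer, pattern in LAYER_PATTERNS:
        match = pattern.match(filename)
        if match:
            # The core pattern has no name group, so name is None for it.
            return layer, match.groupdict().get("name")
    raise ValueError(f"'{filename}' does not follow the schema naming convention")
```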

TR-2: Protobuf Schema Organization

  • Use proto3 syntax (syntax = "proto3";) for all schema definitions
  • Implement proper package namespacing for each layer
  • Support nested message types for complex attributes
  • Use oneof for union types where appropriate
  • Implement proper field numbering for evolution

TR-3: Inheritance Implementation

  • Use message composition to simulate structural inheritance
  • Implement auxiliary message types for optional features
  • Support multiple inheritance through embedded messages
  • Provide clear inheritance resolution rules
  • Maintain type hierarchy metadata across layers
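
Because composition nests the base message inside the derived one, a consumer typically has to flatten the nesting to see a single attribute set. The sketch below shows one possible resolution rule using the field names from the Schema Layer Examples section; the `BASE_FIELD` and `BASE_TYPE` metadata tables and the `flatten` helper are hypothetical.

```python
# Hypothetical hierarchy metadata: which field embeds the structural base,
# and which message type that field holds.
BASE_FIELD = {"User": "principal", "AcmeUser": "base_user", "HRUser": "acme_user"}
BASE_TYPE = {"User": "Principal", "AcmeUser": "User", "HRUser": "AcmeUser"}

def flatten(message_type, value):
    """Lift attributes of embedded base messages to the top level."""
    flat = {}
    base_field = BASE_FIELD.get(message_type)
    for key, val in value.items():
        if key == base_field:
            # Recurse into the embedded base and merge its attributes.
            flat.update(flatten(BASE_TYPE[message_type], val))
        else:
            flat[key] = val
    return flat
```

Flattening a nested `HRUser` value yields one dictionary containing `id`, `sub`, `employee_id`, and `payroll_id` side by side.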

TR-4: Schema Validation Engine

  • Implement comprehensive schema validation during startup
  • Detect and report inheritance conflicts
  • Validate attribute type consistency across layers
  • Check for circular dependencies
  • Provide detailed error messages with layer context
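
Error messages with layer context, as TR-4 asks for, might look like the output of the following sketch. It assumes attribute declarations are available as `(layer, entity, attribute, type)` tuples in load order; `type_conflicts` is a hypothetical helper.

```python
def type_conflicts(declarations):
    """Find attributes declared with different types in different layers."""
    first_seen = {}
    errors = []
    for layer, entity, attr, typ in declarations:
        key = (entity, attr)
        if key not in first_seen:
            first_seen[key] = (layer, typ)
        elif first_seen[key][1] != typ:
            prev_layer, prev_type = first_seen[key]
            # Name both layers so the message points at the files to fix.
            errors.append(
                f"{entity}.{attr}: {layer} layer declares {typ}, "
                f"but {prev_layer} layer declares {prev_type}"
            )
    return errors
```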

TR-5: Cedar Type Mapping

  • Map Cedar String to protobuf string
  • Map Cedar Long to protobuf int64
  • Map Cedar Set<T> to protobuf repeated T
  • Map Cedar Record to nested message
  • Support Cedar entity references through string identifiers
  • Maintain type mapping consistency across layers
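
The mapping rules above can be sketched as a small function. Several details are assumptions rather than part of this specification: the `Bool` entry, the rejection of nested sets, and the `cedar_to_proto` helper itself, which takes the known record and entity type names as parameters.

```python
import re

# String and Long follow TR-5. Assumption: Bool/Boolean is not listed in TR-5;
# protobuf bool is used here as the natural counterpart.
SCALARS = {"String": "string", "Long": "int64", "Bool": "bool", "Boolean": "bool"}

def cedar_to_proto(cedar_type, records=frozenset(), entities=frozenset()):
    """Map a Cedar type name to a protobuf field type."""
    match = re.fullmatch(r"Set<(.+)>", cedar_type)
    if match:
        inner = cedar_to_proto(match.group(1), records, entities)
        if inner.startswith("repeated "):
            # proto3 cannot nest repeated fields directly.
            raise ValueError("nested sets would need a wrapper message")
        return "repeated " + inner
    if cedar_type in SCALARS:
        return SCALARS[cedar_type]
    if cedar_type in records:
        return cedar_type   # Record -> nested message of the same name
    if cedar_type in entities:
        return "string"     # entity reference -> string identifier
    raise ValueError(f"no mapping for Cedar type '{cedar_type}'")
```

Note that `repeated` does not enforce uniqueness, so set semantics would have to be checked separately.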

TR-6: Schema Composition Engine

  • Merge schemas from all layers in correct order
  • Resolve inheritance hierarchies across layers
  • Handle attribute precedence and conflicts
  • Generate final composed schema for Cedar engine
  • Support incremental composition for performance

TR-7: Code Generation

  • Generate Rust structs with serde support for all layers
  • Generate Python classes with proper inheritance
  • Generate TypeScript interfaces and classes
  • Include validation logic in generated code
  • Support custom serialization formats

TR-8: Tooling and CLI

  • Provide schema validation tools for each layer
  • Include Cedar schema conversion utilities
  • Support schema diff and migration tools across layers
  • Implement code generation CLI
  • Include documentation generation for composed schemas

Non-Functional Requirements

NFR-1: Performance

  • Schema validation should complete within 50ms for all layers
  • Schema composition should complete within 100ms
  • Code generation should complete within 10 seconds for full schema
  • Runtime type checking should add <5% overhead
  • Memory usage should not exceed 3x current implementation

NFR-2: Maintainability

  • Schema definitions should be self-documenting
  • Inheritance relationships should be clearly visible across layers
  • Generated code should be readable and debuggable
  • Error messages should include layer context and be actionable
  • Schema evolution should be trackable across layers

NFR-3: Compatibility

  • Must maintain 100% compatibility with existing Cedar policies
  • Should support gradual migration from current schema format
  • Must work with all current Cedarling language bindings
  • Should integrate with existing build and deployment processes
  • Should support both monolithic and layered schema deployment

NFR-4: Scalability

  • Support up to 1000 entity types across all layers
  • Handle up to 10,000 attributes across all entity types
  • Support up to 10 schema layers (core + 9 extension layers)
  • Maintain performance with large schema hierarchies

Schema Layer Examples

Core Schema (core.proto)

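// Core entity types that all organizations inherit.
// The package names in these examples are illustrative assumptions.
syntax = "proto3";

package cedarling.core;

// Common identity and audit fields shared by every principal type.
message Principal {
  string id = 1;
  int64 created_at = 2;
  int64 updated_at = 3;
}

// Url is referenced by Application; this definition is a placeholder assumption.
message Url {
  string protocol = 1;
  string host = 2;
  string path = 3;
}

message User {
  Principal principal = 1;  // structural base, embedded by composition
  string sub = 2;
  repeated string roles = 3;
}

message Application {
  Principal principal = 1;  // structural base, embedded by composition
  string app_id = 2;
  string name = 3;
  Url url = 4;
}

Organization Schema (org-acme.proto)

// ACME Corp specific extensions
syntax = "proto3";

package cedarling.org.acme;

import "core.proto";

message AcmeUser {
  cedarling.core.User base_user = 1;  // extends the core User
  string employee_id = 2;
  string department = 3;
  string manager_id = 4;
}

message AcmeApplication {
  cedarling.core.Application base_app = 1;  // extends the core Application
  string cost_center = 2;
  repeated string compliance_tags = 3;
}

Application Schema (app-hr-system.proto)

// HR System specific extensions
syntax = "proto3";

package cedarling.app.hr_system;

import "org-acme.proto";

message HRUser {
  cedarling.org.acme.AcmeUser acme_user = 1;  // extends the organization's AcmeUser
  string payroll_id = 2;
  int64 vacation_days = 3;  // int64, matching the Cedar Long mapping in TR-5
  repeated string certifications = 4;
}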

Out of Scope

  • Runtime schema modification (schemas are compile-time artifacts)
  • Dynamic inheritance (inheritance structure is fixed at compile time)
  • Schema registry or centralized schema management
  • GUI-based schema editors
  • Integration with external schema repositories
  • Real-time schema synchronization across distributed systems

Dependencies

  • Protocol Buffers compiler (protoc)
  • Rust protobuf libraries (prost)
  • Python protobuf libraries
  • JavaScript/TypeScript protobuf libraries
  • Cedar policy engine compatibility
  • Existing Cedarling build system integration

Success Metrics

  • 100% of existing Cedar entity types can be represented in layered Protobuf schemas
  • Schema validation catches 100% of inheritance and type conflicts
  • Generated schemas pass all existing Cedar policy tests
  • Code generation produces valid, compilable output for all target languages
  • Performance benchmarks show no regression in policy evaluation
  • Schema evolution scenarios work without breaking existing policies
  • Developer adoption rate for layered schema format exceeds 80% within 6 months
  • Startup validation prevents 100% of schema-related runtime errors

Risks and Mitigations

Risk: Complex layered schema semantics

Mitigation: Start with a simple three-layer model and document it comprehensively

Risk: Schema validation performance impact

Mitigation: Implement caching and incremental validation strategies

Risk: Protobuf limitations for Cedar type system

Mitigation: Implement custom extensions and validation layers

Risk: Developer adoption resistance to layered approach

Mitigation: Provide migration tools and a clear demonstration of the benefits

Risk: Schema conflict resolution complexity

Mitigation: Implement clear precedence rules and detailed error reporting

Risk: Cedar engine compatibility with composed schemas

Mitigation: Maintain existing JSON/Cedar format support during transition
