Skip to content

Add taxonomy system and refactor agents & ledger models#554

Merged
jfrench9 merged 12 commits into
mainfrom
feature/taxonomy-system
Mar 29, 2026
Merged

Add taxonomy system and refactor agents & ledger models#554
jfrench9 merged 12 commits into
mainfrom
feature/taxonomy-system

Conversation

@jfrench9
Copy link
Copy Markdown
Member

Summary

This PR introduces a comprehensive taxonomy system for financial data classification and simultaneously refactors the agent architecture into a unified, modular design. The ledger model layer is renamed from "accounts" to "elements" to support a more flexible, taxonomy-driven data model.

Key Accomplishments

Taxonomy System

  • New database tables and migration: Adds taxonomy, structure, element, and element association models with a full Alembic migration (rename_accounts_to_elements_add_taxonomy)
  • Seed data: Implements a taxonomy seed configuration (robosystems/config/taxonomy/seed.py) providing a canonical set of financial taxonomy definitions
  • API layer: New REST endpoints for taxonomy CRUD operations (robosystems/routers/ledger/taxonomies.py) with corresponding Pydantic API models
  • MCP tooling: Adds taxonomy_tools.py to expose taxonomy mapping capabilities to the graph-based MCP agent framework, integrated via the tools manager

Ledger Model Refactor: Accounts → Elements

  • Renames the core account model to element across the entire codebase (ORM models, API schemas, dbt models, QuickBooks adapter, routers, and tests)
  • Introduces element_association model for many-to-many relationships between elements and taxonomies
  • Adds structure model to represent hierarchical taxonomy structures
  • Updates all downstream references in line items, classification rules, trial balance, and summary endpoints

Agent System Overhaul

  • Unified agent base: Refactored base.py with a cleaner base class; removed legacy context.py, registry.py, cypher_agent.py, and financial.py
  • New modular architecture:
    • agent_context.py — lightweight, well-scoped execution context
    • agent_registry.py — decorator-based agent registration (@register_task pattern)
    • adapters/ — separated API, worker, and worker task adapters for clean integration boundaries
    • implementations/ — agent implementations (cypher, mapping) organized by domain
    • credit_consumer.py, progress.py, tracked_ai.py, tool_access.py — composable concerns extracted into single-responsibility modules
  • Orchestrator simplification: Significantly reduced orchestrator complexity (~510 lines removed, streamlined flow)
  • Worker initialization: Task modules are now imported at worker init to trigger @register_task decorators

Client & Dependencies

  • Bumps robosystems-client to 0.2.37
  • Bumps cryptography from 46.0.5 to 46.0.6

Breaking Changes

  • Database migration required: The migration renames the accounts table to elements and adds new taxonomy-related tables. Must be applied before deployment.
  • Account model removed: All references to Account ORM model are replaced by Element. Any external code depending on the old model or import paths will break.
  • Agent imports restructured: Legacy agent modules (context.py, registry.py, cypher_agent.py, financial.py) are deleted. Any external references to these modules must be updated to the new implementations/ and adapters/ paths.
  • API contract changes: Ledger endpoints that previously referenced "accounts" now reference "elements". API consumers must update accordingly.

Testing Notes

  • New test suites added for: taxonomy seed validation, taxonomy tools, taxonomy router endpoints, and the unified agent system
  • Existing tests updated across: QuickBooks pipeline/load/transform, MCP tools manager, orchestrator, loader, materializer, agent router, accounts router, and transactions router
  • Legacy test files removed: test_context.py, test_registry.py, test_financial.py
  • All test changes align with the accounts → elements rename and new agent architecture

Infrastructure Considerations

  • A database migration must be run against the extensions database before deploying this change
  • Worker processes need to be restarted to pick up the new task registration mechanism
  • The taxonomy seed data should be reviewed for completeness before running in production; seeding may be required as a post-migration step
  • MCP tool registration now includes taxonomy tools — ensure MCP service configurations are updated if tool whitelisting is in place

🤖 Generated with Claude Code

Branch Info:

  • Source: feature/taxonomy-system
  • Target: main
  • Type: feature

Co-Authored-By: Claude noreply@anthropic.com

jfrench9 added 11 commits March 27, 2026 21:46
…ructure changes

- Updated the `robosystems-client` dependency in `pyproject.toml` and `uv.lock` to version 0.2.37.
- Introduced a new migration script to rename the `accounts` table to `elements` and added a taxonomy system with new tables for `taxonomies` and `structures`.
- Refactored related code to replace references from `accounts` to `elements`, ensuring consistency across the codebase.
- Added new models for `Element`, `ElementAssociation`, `Structure`, and `Taxonomy` to support the updated taxonomy structure.
- Created a migration to add `taxonomies` and `structures` tables, enhancing the database schema for taxonomy management.
- Updated the `_insert_element` function to include `created_at` and `updated_at` timestamps.
- Introduced new API endpoints for CRUD operations on taxonomies, structures, and element associations, improving the taxonomy management capabilities.
- Added response models for taxonomies and structures to support the new API functionality.
- Introduced four new taxonomy mapping tools: `get-unmapped-elements`, `suggest-mapping`, `create-mapping-association`, and `get-mapping-summary`, gated by the `roboledger` extension and `EXTENSIONS_ENABLED` flag.
- Implemented methods to retrieve tool definitions and execute the new tools, enhancing the taxonomy management capabilities.
- Added unit tests to ensure proper registration and dispatching of the taxonomy tools, including checks for availability based on extension settings and read-only mode.
- Introduced a new SQL model `elements.sql` to replace the previous `accounts.sql`, aligning with the updated taxonomy structure.
- Updated references in existing SQL files and tests to use `elements` instead of `accounts`, ensuring consistency across the codebase.
- Modified the loading and processing logic to accommodate the new terminology, enhancing clarity in the data pipeline.
- Added a new function to seed tenant reporting taxonomy, improving the management of shared reporting structures.
- Added `WORKER_API_BASE_URL` environment variable to `compose.yaml` for API access.
- Updated `worker.yaml` to include `WORKER_API_BASE_URL` in the CloudFormation template.
- Introduced `WorkerAPIClient` for handling HTTP requests from the worker to the API.
- Implemented `AgentMappingTask` for autonomous mapping of Chart of Accounts to US GAAP using Bedrock AI.
- Added new endpoint for auto-mapping elements in the ledger router.
- Removed the `FinancialAgent` class and its associated tests to streamline the agent system.
- Added tests for the new `WorkerAPIClient` and `AgentMappingTask` functionalities.
- Removed `WORKER_API_BASE_URL` from environment variables in `compose.yaml` and `worker.yaml`.
- Deleted `WorkerAPIClient` as it is no longer needed for API interactions.
- Introduced `AgentMappingTask` for autonomous mapping of Chart of Accounts to US GAAP using Bedrock AI.
- Added new files for agent mapping tasks and prompts to support the mapping functionality.
- Implemented tests for the new `AgentMappingTask` to ensure proper functionality and coverage.
- Introduced a new unified agent protocol with classes for Agent, AgentSpec, AgentResult, and AgentContext to streamline agent definitions across execution contexts.
- Added legacy support for existing orchestrator and routers, allowing gradual migration to the new protocol.
- Implemented various agent context services, including tool access and credit consumption, to facilitate agent operations.
- Created new agent registry for managing agent registrations and lookups.
- Developed multiple agent implementations, including CypherAgent for natural language to Cypher query conversion and MappingAgent for autonomous Chart of Accounts to US GAAP mapping.
- Enhanced progress reporting and credit consumption mechanisms for better tracking and management during agent execution.
- Updated the agent operations module to implement a unified agent protocol, enhancing dynamic routing and execution capabilities.
- Removed legacy components including the context enricher, Cypher agent, and agent registry to streamline the agent framework.
- Introduced execution adapters for running agents in different contexts (API and worker).
- Enhanced agent registration and metadata handling for improved compatibility and performance.
- Updated documentation to reflect changes in the agent protocol and system architecture.
@jfrench9
Copy link
Copy Markdown
Member Author

@claude please review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Mar 29, 2026

Claude finished @jfrench9's task in 2m 9s —— View job


PR Review: Add taxonomy system and refactor agents & ledger models

Todo List:

  • Gather context and understand the PR
  • Review migration and database models
  • Review taxonomy system (seed, API models, router)
  • Review agent system overhaul
  • Review MCP taxonomy tools
  • Review tests
  • Summarize findings

This is a large, well-structured PR that introduces a solid taxonomy system and cleans up significant agent complexity. Overall the direction is excellent. Below are issues ranging from bugs to design observations.


Bugs

1. credits_remaining metadata is always wrong (adapters/api.py:91-93)

result.metadata["credits_remaining"] = (
    tracked_ai._credit_consumer is not None
    and getattr(tracked_ai, "_last_remaining", None)
)

TrackedAIClient never sets _last_remaining, so this evaluates to either False (no consumer) or None (consumer present, attribute missing). It never returns actual remaining credits. This is dead/broken metadata that misleads API consumers. Either implement it properly or remove the field. Fix this →

2. idx_associations_unapproved index is missing from migration

ElementAssociation.__table_args__ in element_association.py:42-45 defines:

Index(
    "idx_associations_unapproved",
    "approved_by",
    postgresql_where="approved_by IS NULL AND suggested_by = 'ai'",
),

This index exists in the ORM model but is not created in the migration (a3f8b2c1d4e5). The tenant schema creation in _upgrade_tenant_schema also doesn't include it. Fix this →

3. TrackedAIClient.credit_consumer type annotation accepts None but parameter declares it non-optional (tracked_ai.py:31, adapters/api.py:64)

# tracked_ai.py:31
credit_consumer: CreditConsumer,   # non-optional Protocol

# api.py:64
credit_consumer = SessionCreditConsumer(db_session) if db_session else None  # can be None

When db_session is None (e.g. in tests), credit_consumer=None is passed to TrackedAIClient. Then in create_message, await self._credit_consumer.consume(...) will raise AttributeError. Make credit_consumer optional: credit_consumer: CreditConsumer | None = None and guard the call.


Code Quality Issues

4. CreateTaxonomyRequest, CreateStructureRequest, CreateAssociationRequest have no input validation (models/api/extensions/taxonomies.py:32-60, 83-90)

taxonomy_type, structure_type, and association_type accept arbitrary strings at the Pydantic layer. Invalid values only fail at the DB CHECK constraint, which returns a 500 rather than a 400. Use Literal types to validate at the request level:

from typing import Literal

class CreateTaxonomyRequest(BaseModel):
    taxonomy_type: Literal["chart_of_accounts", "reporting", "mapping"]

5. except (ValueError, ProgrammingError) is too broad (routers/ledger/taxonomies.py:91, 137, 179…)

The pattern raise _ledger_404() is used in every handler to signal "ledger not initialized". But any ValueError raised by business logic (e.g. from Pydantic, a bad ID format, or a misconfigured service) would be silently converted to a misleading 404. Consider catching a more specific exception or checking the ProgrammingError pgcode for the relation-does-not-exist case.

6. get_mapped_trial_balance has no response model (routers/ledger/taxonomies.py:687-748)

The endpoint returns an untyped dict. All other endpoints in this router have explicit response_model=.... Defining a Pydantic model here would catch schema drift and enable OpenAPI documentation.

7. AgentResponse.timestamp uses deprecated datetime.utcnow() (base.py:223, 456)

timestamp: datetime = field(default_factory=datetime.utcnow)  # deprecated in Py3.12
"timestamp": datetime.utcnow().isoformat(),                    # same issue

The rest of the codebase uses datetime.now(UTC). These two lines in the legacy AgentResponse and BaseAgent.prepare_context should follow the same pattern.

8. Module docstring in agent_registry.py is stale (agent_registry.py:1-8)

The docstring mentions "Separate from the legacy AgentRegistry (registry.py)" but registry.py was deleted in this PR. The note about coexistence during migration is outdated.


Design Observations (non-blocking)

9. Migration seeds data inline (migration upgrade():357)

seed_reporting_taxonomy(conn)

Calling seed_reporting_taxonomy directly inside upgrade() couples schema changes with data initialization. This means if the seed function changes, a new migration is required to re-seed. A common alternative is a separate seeding step invoked by deployment scripts (so seed data can be re-run without a new migration). This is a deliberate trade-off worth documenting.

10. Missing UPDATE/DELETE endpoints for Taxonomy and Structure (routers/ledger/taxonomies.py)

The router supports POST /taxonomies and POST /structures but no PATCH/DELETE. Shared/locked seed data is read-only by design, but tenant-owned CoA taxonomies and custom structures should be updatable. If this is intentional for the initial MVP, consider documenting it with a TODO.

11. suggested_by hardcoded to "cowork" in MCP tool (taxonomy_tools.py:299)

suggested_by="cowork",

This should probably reflect the actual tool/agent performing the suggestion (e.g. "mcp" or "ai"). The suggested_by field is also used in the idx_associations_unapproved index filter suggested_by = 'ai', so there's an inconsistency — MCP-created associations would not appear in the pending-approval index.

12. _get_all_agents() instantiates agents on every routing call (orchestrator.py:108-116)

def _get_all_agents(self) -> dict[str, Agent]:
    agents = {}
    for agent_type in list_agents():
        agents[agent_type] = get_agent(agent_type)

Since agents are stateless, this is functionally safe, but instantiating on every route_query call (including _best_match_routing, _capability_based_routing, etc., each of which calls this) means multiple instantiations per request. A property cache or lazy initialization would be cleaner.

13. FactoryCreditConsumer doesn't commit (credit_consumer.py:92-107)

session = SessionFactory()
try:
    service = CreditService(session)
    result = service.consume_ai_tokens(...)
    return float(result.get("credits_consumed", 0))

There's no session.commit() before session.close(). If CreditService.consume_ai_tokens doesn't auto-commit internally, credit consumption in worker context would silently be rolled back. Verify that CreditService handles its own commit, or add an explicit commit here.


Summary

The taxonomy system is well-modeled with a clean separation between shared (public schema) GAAP elements and per-tenant CoA/mapping data. The agent refactor achieves its stated goals: stateless agents, protocol-based service injection, and clean separation between domain logic (implementations/) and execution wiring (adapters/). The migration is comprehensive and correctly handles tenant schema iteration.

The most critical issues to address before merge are #1 (broken credits_remaining metadata), #2 (missing idx_associations_unapproved in migration), and #3 (nullable credit_consumer passed to TrackedAIClient).

Comment thread robosystems/operations/agents/agent_context.py Fixed
Comment thread robosystems/operations/agents/agent_context.py Fixed
Comment thread robosystems/operations/agents/agent_context.py Fixed
Comment thread robosystems/operations/agents/agent_context.py Fixed
Comment thread robosystems/operations/agents/base.py Fixed
Comment thread robosystems/operations/agents/credit_consumer.py Fixed
- Added a new index for unapproved element associations in the database migration to improve query performance.
- Updated the suggested_by field in the CreateMappingAssociationTool to use 'ai' instead of 'cowork' for better tracking of AI-generated associations.
- Refactored agent context and registry components to streamline the agent protocol, including optional credit consumption handling in TrackedAIClient.
- Improved metadata handling in the agent API to indicate whether credit tracking is enabled.
@jfrench9
Copy link
Copy Markdown
Member Author

@claude can you continue to review including the new commit

@claude
Copy link
Copy Markdown

claude Bot commented Mar 29, 2026

Claude finished @jfrench9's task in 2m 13s —— View job


Todo List:

  • Gather context and understand the request
  • Check new commits since last review
  • Read new/changed files from recent commits
  • Review and update findings
  • Post updated comprehensive review

Updated PR Review: Add taxonomy system and refactor agents & ledger models

The new commit (9c0e36b) directly addresses all three critical bugs and several quality issues flagged in the previous review. Here's a status update on each finding, followed by remaining items.


Fixed in 9c0e36b

Bug #1credits_remaining always wrong (adapters/api.py:91)
Fixed. Replaced the broken credits_remaining field with has_credit_tracking: bool, which correctly reflects whether a CreditConsumer is wired in. Clean solution.

Bug #2idx_associations_unapproved missing from migration
Fixed. The index is now added to both the public schema table creation (line 301) and _upgrade_tenant_schema (line 631), with the correct WHERE approved_by IS NULL AND suggested_by = 'ai' predicate. The public schema uses the ORM op.create_index form; the tenant loop uses raw SQL — both are correct.

Bug #3credit_consumer nullable crash in TrackedAIClient
Fixed well. credit_consumer: CreditConsumer | None = None and the consumption is now guarded by if self._credit_consumer is not None. A NoOpCreditConsumer was also added as an explicit test double, which is a nice bonus.

Issue #7 — datetime.utcnow() deprecated (base.py:219, 452)
Fixed. Both occurrences replaced with datetime.now(UTC).

Issue #8 — Stale docstring in agent_registry.py
Fixed. The migration-era note about coexistence with registry.py is removed.

Issue #11 — suggested_by hardcoded to "cowork" in MCP tool (taxonomy_tools.py:299)
Fixed. Changed to "ai", which now matches the idx_associations_unapproved index predicate.

CodeQL: ... in Protocol/abstract methods
The ... expressions were removed from ProgressReporter, ToolAccess, CreditConsumer, and Agent.run. Protocol methods with only a docstring body are valid Python and won't trigger the "statement has no effect" CodeQL rule. ✅


Design Observation Resolved (new info)

#13FactoryCreditConsumer doesn't commit
This is not actually a bug. CreditService.consume_credits() calls self.session.commit() internally (confirmed at credit_service.py:388, 443, 722). The FactoryCreditConsumer session is committed inside the service. No change needed — the original concern was unfounded.


Remaining Issues

1. FactoryCreditConsumer doesn't check success flag (credit_consumer.py:102)

SessionCreditConsumer checks result.get("success") before returning credits consumed. FactoryCreditConsumer does not:

# SessionCreditConsumer — checks success:
if result.get("success"):
    return float(result.get("credits_consumed", 0))
logger.warning(...)
return 0.0

# FactoryCreditConsumer — doesn't check success:
return float(result.get("credits_consumed", 0))

A failed consumption (e.g. insufficient credits) in worker context would return the consumed amount as if it succeeded. Minor but worth fixing for consistency. Fix this →

2. No input validation on taxonomy_type, structure_type, association_type (models/api/extensions/taxonomies.py:32-60, 83-90)

These fields accept arbitrary strings at the Pydantic layer; invalid values silently reach the DB CHECK constraint and surface as 500s. Use Literal types:

from typing import Literal

class CreateTaxonomyRequest(BaseModel):
    taxonomy_type: Literal["chart_of_accounts", "reporting", "mapping"]

Fix this →

3. except (ValueError, ProgrammingError) too broad (routers/ledger/taxonomies.py:91, 137, 179…)

Any ValueError from business logic (bad UUID, Pydantic coercion, misconfigured service) is silently swallowed and returned as a 404. Consider narrowing to ProgrammingError only, or checking e.orig.pgcode == "42P01" (undefined_table) for the "ledger not initialized" case.

4. get_mapped_trial_balance has no response model (routers/ledger/taxonomies.py:687)

This is the only endpoint in the taxonomy router without an explicit response_model=. It returns a plain dict. Defining a Pydantic model would catch schema drift and populate the OpenAPI docs.

5. Migration seeds data inline (migration upgrade():357)

seed_reporting_taxonomy(conn) is called directly inside upgrade(). This couples seed data to schema migration: if seed content needs to change post-deploy, a new (no-op schema) migration must be written just to re-seed. If this is intentional for MVP, worth adding a comment to that effect.

6. No UPDATE/DELETE for Taxonomy or Structure (routers/ledger/taxonomies.py)

Shared/seeded data being read-only makes sense. But tenant-owned CoA taxonomies and custom structures should presumably be mutable. If PATCH/DELETE are out of scope for this PR, a # TODO: noting it would prevent this becoming a silent gap.

7. _get_all_agents() instantiates on every call (orchestrator.py:108)

def _get_all_agents(self) -> dict[str, Agent]:
    for agent_type in list_agents():
        agents[agent_type] = get_agent(agent_type)

This is called multiple times per route_query (once per routing strategy). Since agents are stateless, correctness is fine, but it's unnecessary work on every request. A @cached_property or module-level cache would be cleaner.


Summary

All three critical pre-merge bugs are fixed. The commit also resolves the datetime.utcnow() deprecation, the stale docstring, the MCP suggested_by inconsistency, and the CodeQL ... warnings — thorough cleanup. The remaining items are all non-critical: one consistency gap in FactoryCreditConsumer, some missing input validation and response models in the taxonomy router, and a few design-level observations. None should block merge, but #1 (success-check inconsistency) and #2 (Literal type validation) are worth fixing before the next release.

@jfrench9 jfrench9 merged commit 3a3dee0 into main Mar 29, 2026
7 checks passed
@jfrench9 jfrench9 deleted the feature/taxonomy-system branch March 29, 2026 03:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant