Add taxonomy system and refactor agents & ledger models by jfrench9 · Pull Request #554 · RoboFinSystems/robosystems

jfrench9 · 2026-03-29T03:23:27Z

Summary

This PR introduces a comprehensive taxonomy system for financial data classification and simultaneously refactors the agent architecture into a unified, modular design. The ledger model layer is renamed from "accounts" to "elements" to support a more flexible, taxonomy-driven data model.

Key Accomplishments

Taxonomy System

New database tables and migration: Adds taxonomy, structure, element, and element association models with a full Alembic migration (rename_accounts_to_elements_add_taxonomy)
Seed data: Implements a taxonomy seed configuration (robosystems/config/taxonomy/seed.py) providing a canonical set of financial taxonomy definitions
API layer: New REST endpoints for taxonomy CRUD operations (robosystems/routers/ledger/taxonomies.py) with corresponding Pydantic API models
MCP tooling: Adds taxonomy_tools.py to expose taxonomy mapping capabilities to the graph-based MCP agent framework, integrated via the tools manager

Ledger Model Refactor: Accounts → Elements

Renames the core account model to element across the entire codebase (ORM models, API schemas, dbt models, QuickBooks adapter, routers, and tests)
Introduces element_association model for many-to-many relationships between elements and taxonomies
Adds structure model to represent hierarchical taxonomy structures
Updates all downstream references in line items, classification rules, trial balance, and summary endpoints

Agent System Overhaul

Unified agent base: Refactored base.py with a cleaner base class; removed legacy context.py, registry.py, cypher_agent.py, and financial.py
New modular architecture:
- agent_context.py — lightweight, well-scoped execution context
- agent_registry.py — decorator-based agent registration (@register_task pattern)
- adapters/ — separated API, worker, and worker task adapters for clean integration boundaries
- implementations/ — agent implementations (cypher, mapping) organized by domain
- credit_consumer.py, progress.py, tracked_ai.py, tool_access.py — composable concerns extracted into single-responsibility modules
Orchestrator simplification: Significantly reduced orchestrator complexity (~510 lines removed, streamlined flow)
Worker initialization: Task modules are now imported at worker init to trigger @register_task decorators

Client & Dependencies

Bumps robosystems-client to 0.2.37
Bumps cryptography from 46.0.5 to 46.0.6

Breaking Changes

Database migration required: The migration renames the accounts table to elements and adds new taxonomy-related tables. Must be applied before deployment.
Account model removed: All references to Account ORM model are replaced by Element. Any external code depending on the old model or import paths will break.
Agent imports restructured: Legacy agent modules (context.py, registry.py, cypher_agent.py, financial.py) are deleted. Any external references to these modules must be updated to the new implementations/ and adapters/ paths.
API contract changes: Ledger endpoints that previously referenced "accounts" now reference "elements". API consumers must update accordingly.

Testing Notes

New test suites added for: taxonomy seed validation, taxonomy tools, taxonomy router endpoints, and the unified agent system
Existing tests updated across: QuickBooks pipeline/load/transform, MCP tools manager, orchestrator, loader, materializer, agent router, accounts router, and transactions router
Legacy test files removed: test_context.py, test_registry.py, test_financial.py
All test changes align with the accounts → elements rename and new agent architecture

Infrastructure Considerations

A database migration must be run against the extensions database before deploying this change
Worker processes need to be restarted to pick up the new task registration mechanism
The taxonomy seed data should be reviewed for completeness before running in production; seeding may be required as a post-migration step
MCP tool registration now includes taxonomy tools — ensure MCP service configurations are updated if tool whitelisting is in place

🤖 Generated with Claude Code

Branch Info:

Source: feature/taxonomy-system
Target: main
Type: feature

Co-Authored-By: Claude noreply@anthropic.com

…ructure changes - Updated the `robosystems-client` dependency in `pyproject.toml` and `uv.lock` to version 0.2.37. - Introduced a new migration script to rename the `accounts` table to `elements` and added a taxonomy system with new tables for `taxonomies` and `structures`. - Refactored related code to replace references from `accounts` to `elements`, ensuring consistency across the codebase. - Added new models for `Element`, `ElementAssociation`, `Structure`, and `Taxonomy` to support the updated taxonomy structure.

- Created a migration to add `taxonomies` and `structures` tables, enhancing the database schema for taxonomy management. - Updated the `_insert_element` function to include `created_at` and `updated_at` timestamps. - Introduced new API endpoints for CRUD operations on taxonomies, structures, and element associations, improving the taxonomy management capabilities. - Added response models for taxonomies and structures to support the new API functionality.

- Introduced four new taxonomy mapping tools: `get-unmapped-elements`, `suggest-mapping`, `create-mapping-association`, and `get-mapping-summary`, gated by the `roboledger` extension and `EXTENSIONS_ENABLED` flag. - Implemented methods to retrieve tool definitions and execute the new tools, enhancing the taxonomy management capabilities. - Added unit tests to ensure proper registration and dispatching of the taxonomy tools, including checks for availability based on extension settings and read-only mode.

- Introduced a new SQL model `elements.sql` to replace the previous `accounts.sql`, aligning with the updated taxonomy structure. - Updated references in existing SQL files and tests to use `elements` instead of `accounts`, ensuring consistency across the codebase. - Modified the loading and processing logic to accommodate the new terminology, enhancing clarity in the data pipeline. - Added a new function to seed tenant reporting taxonomy, improving the management of shared reporting structures.

…taxonomy-system

- Added `WORKER_API_BASE_URL` environment variable to `compose.yaml` for API access. - Updated `worker.yaml` to include `WORKER_API_BASE_URL` in the CloudFormation template. - Introduced `WorkerAPIClient` for handling HTTP requests from the worker to the API. - Implemented `AgentMappingTask` for autonomous mapping of Chart of Accounts to US GAAP using Bedrock AI. - Added new endpoint for auto-mapping elements in the ledger router. - Removed the `FinancialAgent` class and its associated tests to streamline the agent system. - Added tests for the new `WorkerAPIClient` and `AgentMappingTask` functionalities.

- Removed `WORKER_API_BASE_URL` from environment variables in `compose.yaml` and `worker.yaml`. - Deleted `WorkerAPIClient` as it is no longer needed for API interactions. - Introduced `AgentMappingTask` for autonomous mapping of Chart of Accounts to US GAAP using Bedrock AI. - Added new files for agent mapping tasks and prompts to support the mapping functionality. - Implemented tests for the new `AgentMappingTask` to ensure proper functionality and coverage.

- Introduced a new unified agent protocol with classes for Agent, AgentSpec, AgentResult, and AgentContext to streamline agent definitions across execution contexts. - Added legacy support for existing orchestrator and routers, allowing gradual migration to the new protocol. - Implemented various agent context services, including tool access and credit consumption, to facilitate agent operations. - Created new agent registry for managing agent registrations and lookups. - Developed multiple agent implementations, including CypherAgent for natural language to Cypher query conversion and MappingAgent for autonomous Chart of Accounts to US GAAP mapping. - Enhanced progress reporting and credit consumption mechanisms for better tracking and management during agent execution.

- Updated the agent operations module to implement a unified agent protocol, enhancing dynamic routing and execution capabilities. - Removed legacy components including the context enricher, Cypher agent, and agent registry to streamline the agent framework. - Introduced execution adapters for running agents in different contexts (API and worker). - Enhanced agent registration and metadata handling for improved compatibility and performance. - Updated documentation to reflect changes in the agent protocol and system architecture.

…itialization

jfrench9 · 2026-03-29T03:23:28Z

@claude please review this PR

claude · 2026-03-29T03:23:56Z

Claude finished @jfrench9's task in 2m 9s —— View job

PR Review: Add taxonomy system and refactor agents & ledger models

Todo List:

Gather context and understand the PR
Review migration and database models
Review taxonomy system (seed, API models, router)
Review agent system overhaul
Review MCP taxonomy tools
Review tests
Summarize findings

This is a large, well-structured PR that introduces a solid taxonomy system and cleans up significant agent complexity. Overall the direction is excellent. Below are issues ranging from bugs to design observations.

Bugs

1. credits_remaining metadata is always wrong (adapters/api.py:91-93)

result.metadata["credits_remaining"] = (
    tracked_ai._credit_consumer is not None
    and getattr(tracked_ai, "_last_remaining", None)
)

TrackedAIClient never sets _last_remaining, so this evaluates to either False (no consumer) or None (consumer present, attribute missing). It never returns actual remaining credits. This is dead/broken metadata that misleads API consumers. Either implement it properly or remove the field. Fix this →

2. idx_associations_unapproved index is missing from migration

ElementAssociation.__table_args__ in element_association.py:42-45 defines:

Index(
    "idx_associations_unapproved",
    "approved_by",
    postgresql_where="approved_by IS NULL AND suggested_by = 'ai'",
),

This index exists in the ORM model but is not created in the migration (a3f8b2c1d4e5). The tenant schema creation in _upgrade_tenant_schema also doesn't include it. Fix this →

3. TrackedAIClient.credit_consumer type annotation accepts None but parameter declares it non-optional (tracked_ai.py:31, adapters/api.py:64)

# tracked_ai.py:31
credit_consumer: CreditConsumer,   # non-optional Protocol

# api.py:64
credit_consumer = SessionCreditConsumer(db_session) if db_session else None  # can be None

When db_session is None (e.g. in tests), credit_consumer=None is passed to TrackedAIClient. Then in create_message, await self._credit_consumer.consume(...) will raise AttributeError. Make credit_consumer optional: credit_consumer: CreditConsumer | None = None and guard the call.

Code Quality Issues

4. CreateTaxonomyRequest, CreateStructureRequest, CreateAssociationRequest have no input validation (models/api/extensions/taxonomies.py:32-60, 83-90)

taxonomy_type, structure_type, and association_type accept arbitrary strings at the Pydantic layer. Invalid values only fail at the DB CHECK constraint, which returns a 500 rather than a 400. Use Literal types to validate at the request level:

from typing import Literal

class CreateTaxonomyRequest(BaseModel):
    taxonomy_type: Literal["chart_of_accounts", "reporting", "mapping"]

5. except (ValueError, ProgrammingError) is too broad (routers/ledger/taxonomies.py:91, 137, 179…)

The pattern raise _ledger_404() is used in every handler to signal "ledger not initialized". But any ValueError raised by business logic (e.g. from Pydantic, a bad ID format, or a misconfigured service) would be silently converted to a misleading 404. Consider catching a more specific exception or checking the ProgrammingError pgcode for the relation-does-not-exist case.

6. get_mapped_trial_balance has no response model (routers/ledger/taxonomies.py:687-748)

The endpoint returns an untyped dict. All other endpoints in this router have explicit response_model=.... Defining a Pydantic model here would catch schema drift and enable OpenAPI documentation.

7. AgentResponse.timestamp uses deprecated datetime.utcnow() (base.py:223, 456)

timestamp: datetime = field(default_factory=datetime.utcnow)  # deprecated in Py3.12
"timestamp": datetime.utcnow().isoformat(),                    # same issue

The rest of the codebase uses datetime.now(UTC). These two lines in the legacy AgentResponse and BaseAgent.prepare_context should follow the same pattern.

8. Module docstring in agent_registry.py is stale (agent_registry.py:1-8)

The docstring mentions "Separate from the legacy AgentRegistry (registry.py)" but registry.py was deleted in this PR. The note about coexistence during migration is outdated.

Design Observations (non-blocking)

9. Migration seeds data inline (migration upgrade():357)

seed_reporting_taxonomy(conn)

Calling seed_reporting_taxonomy directly inside upgrade() couples schema changes with data initialization. This means if the seed function changes, a new migration is required to re-seed. A common alternative is a separate seeding step invoked by deployment scripts (so seed data can be re-run without a new migration). This is a deliberate trade-off worth documenting.

10. Missing UPDATE/DELETE endpoints for Taxonomy and Structure (routers/ledger/taxonomies.py)

The router supports POST /taxonomies and POST /structures but no PATCH/DELETE. Shared/locked seed data is read-only by design, but tenant-owned CoA taxonomies and custom structures should be updatable. If this is intentional for the initial MVP, consider documenting it with a TODO.

11. suggested_by hardcoded to "cowork" in MCP tool (taxonomy_tools.py:299)

suggested_by="cowork",

This should probably reflect the actual tool/agent performing the suggestion (e.g. "mcp" or "ai"). The suggested_by field is also used in the idx_associations_unapproved index filter suggested_by = 'ai', so there's an inconsistency — MCP-created associations would not appear in the pending-approval index.

12. _get_all_agents() instantiates agents on every routing call (orchestrator.py:108-116)

def _get_all_agents(self) -> dict[str, Agent]:
    agents = {}
    for agent_type in list_agents():
        agents[agent_type] = get_agent(agent_type)

Since agents are stateless, this is functionally safe, but instantiating on every route_query call (including _best_match_routing, _capability_based_routing, etc., each of which calls this) means multiple instantiations per request. A property cache or lazy initialization would be cleaner.

13. FactoryCreditConsumer doesn't commit (credit_consumer.py:92-107)

session = SessionFactory()
try:
    service = CreditService(session)
    result = service.consume_ai_tokens(...)
    return float(result.get("credits_consumed", 0))

There's no session.commit() before session.close(). If CreditService.consume_ai_tokens doesn't auto-commit internally, credit consumption in worker context would silently be rolled back. Verify that CreditService handles its own commit, or add an explicit commit here.

Summary

The taxonomy system is well-modeled with a clean separation between shared (public schema) GAAP elements and per-tenant CoA/mapping data. The agent refactor achieves its stated goals: stateless agents, protocol-based service injection, and clean separation between domain logic (implementations/) and execution wiring (adapters/). The migration is comprehensive and correctly handles tenant schema iteration.

The most critical issues to address before merge are #1 (broken credits_remaining metadata), #2 (missing idx_associations_unapproved in migration), and #3 (nullable credit_consumer passed to TrackedAIClient).

- Added a new index for unapproved element associations in the database migration to improve query performance. - Updated the suggested_by field in the CreateMappingAssociationTool to use 'ai' instead of 'cowork' for better tracking of AI-generated associations. - Refactored agent context and registry components to streamline the agent protocol, including optional credit consumption handling in TrackedAIClient. - Improved metadata handling in the agent API to indicate whether credit tracking is enabled.

jfrench9 · 2026-03-29T03:32:04Z

@claude can you continue to review including the new commit

claude · 2026-03-29T03:32:31Z

Claude finished @jfrench9's task in 2m 13s —— View job

Todo List:

Gather context and understand the request
Check new commits since last review
Read new/changed files from recent commits
Review and update findings
Post updated comprehensive review

Updated PR Review: Add taxonomy system and refactor agents & ledger models

The new commit (9c0e36b) directly addresses all three critical bugs and several quality issues flagged in the previous review. Here's a status update on each finding, followed by remaining items.

Fixed in `9c0e36b` ✅

Bug #1 — credits_remaining always wrong (adapters/api.py:91)
Fixed. Replaced the broken credits_remaining field with has_credit_tracking: bool, which correctly reflects whether a CreditConsumer is wired in. Clean solution.

Bug #2 — idx_associations_unapproved missing from migration
Fixed. The index is now added to both the public schema table creation (line 301) and _upgrade_tenant_schema (line 631), with the correct WHERE approved_by IS NULL AND suggested_by = 'ai' predicate. The public schema uses the ORM op.create_index form; the tenant loop uses raw SQL — both are correct.

Bug #3 — credit_consumer nullable crash in TrackedAIClient
Fixed well. credit_consumer: CreditConsumer | None = None and the consumption is now guarded by if self._credit_consumer is not None. A NoOpCreditConsumer was also added as an explicit test double, which is a nice bonus.

Issue #7 — datetime.utcnow() deprecated (base.py:219, 452)
Fixed. Both occurrences replaced with datetime.now(UTC).

Issue #8 — Stale docstring in agent_registry.py
Fixed. The migration-era note about coexistence with registry.py is removed.

Issue #11 — suggested_by hardcoded to "cowork" in MCP tool (taxonomy_tools.py:299)
Fixed. Changed to "ai", which now matches the idx_associations_unapproved index predicate.

CodeQL: ... in Protocol/abstract methods
The ... expressions were removed from ProgressReporter, ToolAccess, CreditConsumer, and Agent.run. Protocol methods with only a docstring body are valid Python and won't trigger the "statement has no effect" CodeQL rule. ✅

Design Observation Resolved (new info)

#13 — FactoryCreditConsumer doesn't commit
This is not actually a bug. CreditService.consume_credits() calls self.session.commit() internally (confirmed at credit_service.py:388, 443, 722). The FactoryCreditConsumer session is committed inside the service. No change needed — the original concern was unfounded.

Remaining Issues

1. FactoryCreditConsumer doesn't check success flag (credit_consumer.py:102)

SessionCreditConsumer checks result.get("success") before returning credits consumed. FactoryCreditConsumer does not:

# SessionCreditConsumer — checks success:
if result.get("success"):
    return float(result.get("credits_consumed", 0))
logger.warning(...)
return 0.0

# FactoryCreditConsumer — doesn't check success:
return float(result.get("credits_consumed", 0))

A failed consumption (e.g. insufficient credits) in worker context would return the consumed amount as if it succeeded. Minor but worth fixing for consistency. Fix this →

2. No input validation on taxonomy_type, structure_type, association_type (models/api/extensions/taxonomies.py:32-60, 83-90)

These fields accept arbitrary strings at the Pydantic layer; invalid values silently reach the DB CHECK constraint and surface as 500s. Use Literal types:

from typing import Literal

class CreateTaxonomyRequest(BaseModel):
    taxonomy_type: Literal["chart_of_accounts", "reporting", "mapping"]

Fix this →

3. except (ValueError, ProgrammingError) too broad (routers/ledger/taxonomies.py:91, 137, 179…)

Any ValueError from business logic (bad UUID, Pydantic coercion, misconfigured service) is silently swallowed and returned as a 404. Consider narrowing to ProgrammingError only, or checking e.orig.pgcode == "42P01" (undefined_table) for the "ledger not initialized" case.

4. get_mapped_trial_balance has no response model (routers/ledger/taxonomies.py:687)

This is the only endpoint in the taxonomy router without an explicit response_model=. It returns a plain dict. Defining a Pydantic model would catch schema drift and populate the OpenAPI docs.

5. Migration seeds data inline (migration upgrade():357)

seed_reporting_taxonomy(conn) is called directly inside upgrade(). This couples seed data to schema migration: if seed content needs to change post-deploy, a new (no-op schema) migration must be written just to re-seed. If this is intentional for MVP, worth adding a comment to that effect.

6. No UPDATE/DELETE for Taxonomy or Structure (routers/ledger/taxonomies.py)

Shared/seeded data being read-only makes sense. But tenant-owned CoA taxonomies and custom structures should presumably be mutable. If PATCH/DELETE are out of scope for this PR, a # TODO: noting it would prevent this becoming a silent gap.

7. _get_all_agents() instantiates on every call (orchestrator.py:108)

def _get_all_agents(self) -> dict[str, Agent]:
    for agent_type in list_agents():
        agents[agent_type] = get_agent(agent_type)

This is called multiple times per route_query (once per routing strategy). Since agents are stateless, correctness is fine, but it's unnecessary work on every request. A @cached_property or module-level cache would be cleaner.

Summary

All three critical pre-merge bugs are fixed. The commit also resolves the datetime.utcnow() deprecation, the stale docstring, the MCP suggested_by inconsistency, and the CodeQL ... warnings — thorough cleanup. The remaining items are all non-critical: one consistency gap in FactoryCreditConsumer, some missing input validation and response models in the taxonomy router, and a few design-level observations. None should block merge, but #1 (success-check inconsistency) and #2 (Literal type validation) are worth fixing before the next release.

jfrench9 added 11 commits March 27, 2026 21:46

Merge commit '7a481b0d26ecbc0e0e524536e9e98b49af87f6ee' into feature/…

88e4145

…taxonomy-system

Merge commit 'd1423cb43db0bae8e4c80169ff819f2bf4dca819' into feature/…

b995b01

…taxonomy-system

Import task modules to trigger @register_task decorators in worker in…

ac720ec

…itialization

github-code-quality Bot found potential problems Mar 29, 2026

View reviewed changes

jfrench9 merged commit 3a3dee0 into main Mar 29, 2026
7 checks passed

jfrench9 deleted the feature/taxonomy-system branch March 29, 2026 03:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add taxonomy system and refactor agents & ledger models#554

Add taxonomy system and refactor agents & ledger models#554
jfrench9 merged 12 commits into
mainfrom
feature/taxonomy-system

jfrench9 commented Mar 29, 2026

Uh oh!

jfrench9 commented Mar 29, 2026

Uh oh!

claude Bot commented Mar 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jfrench9 commented Mar 29, 2026

Uh oh!

claude Bot commented Mar 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jfrench9 commented Mar 29, 2026

Summary

Key Accomplishments

Taxonomy System

Ledger Model Refactor: Accounts → Elements

Agent System Overhaul

Client & Dependencies

Breaking Changes

Testing Notes

Infrastructure Considerations

Uh oh!

jfrench9 commented Mar 29, 2026

Uh oh!

claude Bot commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Add taxonomy system and refactor agents & ledger models

Bugs

Code Quality Issues

Design Observations (non-blocking)

Summary

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jfrench9 commented Mar 29, 2026

Uh oh!

claude Bot commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Updated PR Review: Add taxonomy system and refactor agents & ledger models

Fixed in 9c0e36b ✅

Design Observation Resolved (new info)

Remaining Issues

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Mar 29, 2026 •

edited

Loading

claude Bot commented Mar 29, 2026 •

edited

Loading

Fixed in `9c0e36b` ✅