Enhance API docs, refactor models, and remove deprecated scripts by jfrench9 · Pull Request #660 · RoboFinSystems/robosystems

jfrench9 · 2026-05-07T20:16:23Z

Summary

This PR delivers a broad set of improvements focused on API model documentation, codebase cleanup, and structural refinements across the platform. It enhances the developer experience by adding comprehensive docstrings, examples, and descriptions to API models, removes deprecated seed generation scripts in favor of manual curation, and refactors several core components for consistency.

Key Accomplishments

📖 API Model Documentation & Enhancement

Added detailed docstrings, field descriptions, and usage examples across the full spectrum of API models: event blocks, event handlers, publish lists, reports, information blocks, taxonomy blocks, view configs, and all extension models (investor, entity, agent, fiscal calendar, taxonomies, journal entries, etc.)
Introduced a shared DeleteResult model in models/api/common.py for consistent delete response handling across endpoints
Updated request/response structures for clarity and consistency

🧹 Deprecated Script Removal

Removed 7 deprecated scripts related to automated seed generation and upstream data import (build_fac_calculations_seed, build_fac_presentation_seed, build_rs_gaap_seed, curate_fac_axes, import_upstream_seeds, load_fac_taxonomy, sec_extractor_benchmark)
Updated taxonomy __init__.py and seeds README.md to document the transition from automated seed generation to manual curation

🔧 Refactoring & Dependency Updates

Added _pydantic.py utility module for GraphQL type helpers
Refactored GraphQL types (ledger, investor, information block, library, common) to leverage the new Pydantic integration
Updated demo examples (main.py, validate.py) to use dot notation for object property access instead of dictionary-style access
Refactored information block API models and commands for improved structure
Expanded roboledger and roboinvestor router operations
Removed robosystems-client from main dependencies (moved or no longer needed)

Breaking Changes

Deprecated scripts removed: Any workflows depending on the removed seed generation scripts must transition to the manual curation process documented in the seeds README
API model structure changes: Request/response models for information blocks and other endpoints have been refactored — consumers of these models may need to update field references
Dependency change: robosystems-client has been removed from main project dependencies; downstream consumers relying on it as a transitive dependency should add it explicitly
Dot notation access: Examples and potentially client-facing code now expect object property access rather than dictionary-style access on response objects

Testing Notes

Test fixtures updated in conftest.py to align with model changes
Tests updated across information block commands, registry, statement handlers, router operations, GraphQL queries, and security race condition tests
Verify that all GraphQL type changes are compatible with existing schema consumers
Validate that the expanded router operations in roboledger/roboinvestor function correctly end-to-end

Infrastructure Considerations

The removal of automated seed generation scripts means taxonomy seeds are now maintained as curated artifacts — ensure documentation and onboarding materials reflect this process change
The lock file (uv.lock) has been updated to reflect dependency changes; environments should be refreshed accordingly
The new shared DeleteResult model should be adopted by any future delete endpoints for consistency

🤖 Generated with Claude Code

Branch Info:

Source: feature/ops-docs
Target: main
Type: feature

Co-Authored-By: Claude noreply@anthropic.com

- Introduced typed arms for Create, Update, and Delete requests in Information Block API models to enhance validation and clarity. - Implemented a `_coerce_payload` function to streamline payload validation against request models, ensuring typed payloads are returned as-is while untyped payloads are validated at dispatch time. - Updated tests to validate the new discriminated union structure, ensuring unknown block types are rejected at construction, providing clearer error messages. - Enhanced OpenAPI schema examples for Create and Update requests to improve developer experience. - Adjusted operations in the roboledger and roboinvestor routers to reflect the new response types for information block operations.

…ptions

…d descriptions and examples

…ponse Structures - Added detailed docstrings to various API models including FiscalPeriodSummary, PeriodCloseItemResponse, LedgerSummaryResponse, and others to improve clarity on their purpose and usage. - Updated request and response models in journal entries, taxonomies, transactions, and trial balance to include descriptions for fields, enhancing the API documentation. - Introduced examples in model configurations for better understanding of expected input and output formats. - Refactored agent-related operations to use LedgerAgentResponse for consistency across the API. - Implemented a new helper function to auto-fill GraphQL type descriptions from Pydantic model docstrings, improving the integration between Pydantic and Strawberry GraphQL. - Adjusted error messages in operation handlers to provide clearer guidance on available actions and workarounds.

…se handling

…nsition from automated seed generation to manual curation. Deleted `import_upstream_seeds.py`, `load_fac_taxonomy.py`, and `sec_extractor_benchmark.py` scripts. Updated `__init__.py` and `README.md` to clarify that seeds are now curated by hand and no longer regenerated from scripts.

…ct properties; update robosystems-client version to 0.3.22

jfrench9 · 2026-05-07T20:16:26Z

@claude please review this PR

claude · 2026-05-07T20:16:55Z

Claude finished @jfrench9's task in 2m 40s —— View job

PR Review: Enhance API docs, refactor models, and remove deprecated scripts

Tasks:

Gather PR context and understand changes
Review key changed files
Analyze API model documentation changes
Review GraphQL type refactoring
Check middleware/operations changes
Review router changes
Check test coverage changes
Provide comprehensive feedback

This is a large, well-structured PR. The core themes — typed OpenAPI envelopes, discriminated-union request models, and Pydantic-docstring auto-propagation to GraphQL — are all directionally correct. A few items below warrant attention before merging.

Strengths

OperationEnvelope genericization (middleware/operations.py): The TypeVar(default=Any) (PEP 696) approach is elegant. Unparameterized callers keep the result: Any | None shape they have today; callers that opt into result_type on their OperationSpec get a typed result field in OpenAPI. The backward-compatibility guarantee is clear in the comment.

_pydantic.py utility: _own_docstring correctly avoids inheriting BaseModel's own docstring by checking model.__dict__ rather than inspect.getdoc (which walks the MRO and would pull BaseModel's internals). The sentinel cases (description="" to suppress) are well-documented.

Discriminated-union information block requests: Refactoring CreateInformationBlockRequest / UpdateInformationBlockRequest / DeleteInformationBlockRequest to RootModel[Annotated[... | ..., Discriminator("block_type")]] is the right call — unknown block types now fail at the API boundary (422 from FastAPI's validation) rather than inside the dispatcher. The @property delegation pattern preserving the .block_type / .payload surface for downstream command functions is clean.

Drift tests (tests/operations/information_block/test_registry.py): TestRequestUnionRegistryDrift and TestRequestUnionPayloadAlignment are excellent. They'll catch any future registry addition that isn't mirrored in the union arms, surfacing the gap as a loud test failure rather than an opaque 422 in production. The helper functions (_arm_block_types, _arm_payload_for_block_type) are readable and purposeful.

DeleteResult consolidation (models/api/common.py): Removing duplicate DeleteResult classes from roboledger and roboinvestor prevents OpenAPI component key collisions and standardizes the delete response shape.

Docstring coverage: The field-level description= additions across the extension models (investor.py, agent.py, fiscal_calendar.py, etc.) are comprehensive without being verbose. The multi-example model_config = ConfigDict(json_schema_extra={"examples": [...]}) blocks on request models are a significant SDK-developer-experience improvement.

Issues

1. __all__ re-export in investor.py masks the consolidation — Medium

# robosystems/models/api/extensions/investor.py:7
__all__ = ["DeleteResult"]  # re-exported for back-compat with existing imports

__all__ with a single entry means from robosystems.models.api.extensions.investor import * only exports DeleteResult, silently dropping every other class in the module. Any code that relied on wildcard imports from this module breaks silently. A targeted alias is safer and doesn't interfere with the module's public surface:

# At the bottom of investor.py, after DeleteResult is imported:
DeleteResult = DeleteResult  # re-exported for back-compat

Or simply remove __all__ and leave a comment on the import line — wildcard imports from model modules are uncommon in production code.

Fix this →

2. validate.py — incomplete dot-notation migration — Low

After the refactor, validate.py still uses .get() on several response fields (lines 103–379). Most are on raw GraphQL response dicts (data.get("informationBlocks", ...), b.get("factSet"), etc.), which is correct since GraphQL returns plain dicts. However, line 144:

fails = summary.get("fail", 0) + summary.get("error", 0)

summary is payload.summary where payload is an EvaluateRulesResponse. If .summary is typed as dict[str, int], .get() is correct. If it's a Pydantic model, it should use dot notation. Worth confirming the type of EvaluateRulesResponse.summary.

3. AgentResponse rename — no backward-compat alias — Low

AgentResponse was renamed to LedgerAgentResponse in models/api/extensions/agent.py. All internal usages have been updated correctly. However, agent.py exports LedgerAgentResponse with no alias for the old name. If any consumer (SDK, external integration, downstream service) was importing AgentResponse from this module, they'll get an ImportError without warning. Since this is noted as a breaking change in the PR description, that's fine — just worth confirming no external consumers need a deprecation shim.

4. DeleteResult docstring — deleted field is always True — Nit

The docstring correctly states "always true today — 404 covers the not-found case at the HTTP layer." If that contract holds permanently, the boolean adds no information. Consider whether a future requirement (soft-delete returning deleted=False for already-soft-deleted rows) is plausible. If not, a no-content 204 or a simpler {"deleted": true} without a named field would be cleaner. Not a blocker — consistency with DeletePortfolioBlockResponse (which does carry meaningful fields) justifies keeping the shape uniform.

5. _LEGACY_BLOCK_TYPES comment — promotion path is clear, enforcement is manual — Nit

The promotion checklist in information_block.py is good:

# To promote a value to its own typed arm:
# (1) add a `_Create<Block>Arm` ...
# (2) drop the value from `_LEGACY_BLOCK_TYPES`,
# (3) add the new arm to `_CreateInformationBlockArms`.
# The `test_create_arm_union_covers_registry` drift test enforces that
# every registry entry remains covered.

The drift test catches step (3) being missed. It does not catch step (2) being missed — i.e., a new arm added but the literal not removed from _LEGACY_BLOCK_TYPES. Pydantic's discriminated-union evaluation would then have two arms matching the same literal, which raises at schema build time (good), but this won't surface until app startup rather than in the test suite. A test asserting _LEGACY_BLOCK_TYPES values are disjoint from typed arm values would close this gap.

Minor Observations

generate_fy2025_report simplification (examples/main.py:689): Removing the sync_detailed bypass and going through client.create_report() is a clean fix. The removal of the workaround comment ("Drop this once the SDK helper is fixed") is appropriate.
conftest.py cleanup order: The added UserRepository* → Graph → billing table deletion order with an explanatory comment is correct and will prevent FK-violation failures in integration tests that previously required careful setup.
ClosePeriodOperation pattern validation (routers/roboledger/operations.py): Adding pattern=r"^\d{4}-(0[1-9]|1[0-2])$" to the period field is a good tightening — the operation docstring already says "YYYY-MM", and the regex makes the 422 error self-evident when the caller passes a malformed period.

Summary

Approve with the __all__ issue addressed. The discriminated-union refactor, genericized OperationEnvelope, _pydantic.py utility, and drift tests are all solid additions. The __all__ = ["DeleteResult"] side-effect on wildcard imports is the only change that could cause silent regressions; the rest are low/nit priority.

…verlap with typed arms

+from robosystems.models.api.common import (  # noqa: F401  DeleteResult re-exported
+  DeleteResult,
+  PaginationInfo,
+)


…or clarity; adjust validation and service source fields to align with demo pedagogy

…ry extraction in _Validator class

…n errors across API endpoints

…larity

jfrench9 · 2026-05-07T23:50:30Z

@claude can you review the changes since the last review

claude · 2026-05-07T23:51:03Z

Claude finished @jfrench9's task in 3m 12s —— View job

Review: Changes Since Last Review

6 commits added after the initial review (2026-05-07 ~15:16 CDT):

Commit	Summary
`34a7c60`	test: add `TestLegacyTypedArmDisjoint` to catch `_LEGACY_BLOCK_TYPES` / typed-arm overlap
`19be358`	feat(demo): switch journal entry creation and disposal to `create_event_block`
`2134644`	fix: dynamic period computation in `validate.py`; fix `summary.to_dict()`
`d0045d7`	feat: normalize 422 / HTTP exception shapes in `main.py`; add 422 to `OPERATION_ERROR_RESPONSES`
`63954d3`	fix: bump `robosystems-client` to 0.3.24
`23110e8`	fix: update `source` options in `ListEventBlocksTool` docs

Fixes from the Previous Review

Issue #1 (__all__ in investor.py) — resolved correctly. The __all__ = ["DeleteResult"] line is gone. The re-export is now done via from ... import (# noqa: F401 DeleteResult re-exported ...), which is the right approach.

Nit #5 (no test for _LEGACY_BLOCK_TYPES disjointness) — partially addressed, but the test is logically broken. See issue below.

Issue #2 (validate.py summary.get() on attrs object) — resolved. .summary.to_dict() confirms EvaluateRulesResponseSummary is an attrs class; .to_dict() is correct. Same fix applied at line ~416.

Issues

1. TestLegacyTypedArmDisjoint always passes vacuously — Medium

tests/operations/information_block/test_registry.py:

def _typed_arm_block_types(self, union_alias) -> set[str]:
    legacy_values = set(get_args(_LEGACY_BLOCK_TYPES))
    return _arm_block_types(union_alias) - legacy_values   # ← subtracts legacy first

def test_create_arms_disjoint_from_legacy(self) -> None:
    legacy = set(get_args(_LEGACY_BLOCK_TYPES))
    typed = self._typed_arm_block_types(...)  # already has legacy subtracted out
    overlap = legacy & typed                  # ← can never be non-empty
    assert not overlap                        # always True

_typed_arm_block_types subtracts legacy_values before returning, so typed can never contain a value from legacy. The legacy & typed check is vacuously empty and the test always passes — exactly the failure mode it was meant to catch would go undetected.

The test needs to identify typed-arm literals independently of _LEGACY_BLOCK_TYPES:

def _typed_arm_block_types(self, union_alias) -> set[str]:
    """Block types from arms whose discriminator is NOT _LEGACY_BLOCK_TYPES."""
    from typing import Annotated, Literal, Union, get_args
    from robosystems.models.api.information_block import _LEGACY_BLOCK_TYPES

    typed = set()
    for arm in get_args(get_args(union_alias)[0]):  # unwrap RootModel[Annotated[Union[...]]]
        for field in arm.model_fields.values():
            annotation = field.annotation
            if annotation is _LEGACY_BLOCK_TYPES:
                continue  # skip the catch-all legacy arm
            if hasattr(annotation, "__args__"):
                typed.update(a for a in get_args(annotation) if isinstance(a, str))
    return typed

(Exact introspection depends on how _arm_block_types walks the union; the key fix is to skip the arm whose discriminator is _LEGACY_BLOCK_TYPES rather than post-subtracting its values.)

Fix this →

2. http_exception_handler comment says "4xx" but handles all HTTP status codes — Nit

main.py:

# 4xx HTTPException pass-through — ...
@app.exception_handler(StarletteHTTPException)
async def http_exception_handler(...):

StarletteHTTPException captures every HTTPException regardless of status code (4xx and 5xx). The comment is misleading if a raise HTTPException(status_code=500) is ever used. Rename the comment to "HTTPException pass-through" or "all HTTP exceptions" to avoid confusion.

Strengths

main.py exception handler normalization (d0045d7): Clean fix. The RequestValidationError handler correctly joins Pydantic error loc + msg into a human-readable string and emits ErrorResponse. The StarletteHTTPException handler's dict-passthrough for structured details (e.g., close-period blockers) is the right call — avoids double-wrapping already-structured error bodies. Adding request_id to all error responses is a good DX improvement.

OPERATION_ERROR_RESPONSES gets 422 (d0045d7): Now that the runtime handler normalizes 422 into ErrorResponse, declaring 422: {"model": ErrorResponse} in the spec aligns what SDK generators see with what they'll actually receive. The explanatory comment in common.py (lines ~245–252) is exactly right.

source vocabulary in service.py (23110e8 / 19be358): Changing "internal" → "schedule" in the three ScheduleService event-creation paths makes the vocabulary consistent with what the MCP tool docs now advertise ('manual' | 'schedule' | 'system' | 'quickbooks' | 'xero' | 'plaid'). The validate.py disposal body also picks up "source": "manual" correctly.

Dynamic period computation in validate.py (2134644): Replacing hardcoded "2026-03-01" / "2026-04-30" dates with current_month_period() / next_period() / period_date_range() makes the validation script time-stable. The explanation in the comment ("using fixed dates breaks whenever current_month_period() advances past them") is clear.

asset_disposed.py preview parity (19be358): Adding assert_period_not_closed to dispatch_preview closes a real correctness gap — without it, a preview claiming would_succeed=True could still fail at dispatch time if the disposal period has been closed since the preview ran. The added docstring explains the invariant plainly.

Summary

Approve pending the TestLegacyTypedArmDisjoint fix. The exception-handler normalization, source vocabulary cleanup, dynamic period computation, and preview/dispatch parity are all solid. The disjoint test is the one item that needs fixing — it was added in direct response to the previous review's nit, so it's worth getting right before merge.

…tity across tests

jfrench9 added 8 commits May 7, 2026 01:37

feat(reports): enhance report request models with examples and descri…

2602141

…ptions

feat(api): enhance event handler and publish list models with detaile…

020ce66

…d descriptions and examples

feat(api): add shared DeleteResult model for consistent delete respon…

7ad4b0f

…se handling

chore(dependencies): remove robosystems-client from main dependencies

ee1b33f

refactor: update main.py and validate.py to use dot notation for obje…

4c5f9fc

…ct properties; update robosystems-client version to 0.3.22

test: add tests to ensure block types in _LEGACY_BLOCK_TYPES do not o…

34a7c60

…verlap with typed arms

github-code-quality Bot found potential problems May 7, 2026

View reviewed changes

Comment thread robosystems/models/api/extensions/investor.py

Comment on lines +9 to +12

from robosystems.models.api.common import ( # noqa: F401 DeleteResult re-exported

DeleteResult,

PaginationInfo,

)

jfrench9 added 5 commits May 7, 2026 16:48

feat(demo): update journal entry creation to use create_event_block f…

19be358

…or clarity; adjust validation and service source fields to align with demo pedagogy

fix: update date handling in generate_fy2025_report and improve summa…

2134644

…ry extraction in _Validator class

feat: enhance error handling and response normalization for validatio…

d0045d7

…n errors across API endpoints

fix: update robosystems-client version to 0.3.24 in dependencies

63954d3

fix: update source options in ListEventBlocksTool documentation for c…

23110e8

…larity

refactor: normalize error response structure for 422 Unprocessable En…

053b815

…tity across tests

jfrench9 merged commit 058faef into main May 8, 2026
7 checks passed

jfrench9 deleted the feature/ops-docs branch May 8, 2026 00:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhance API docs, refactor models, and remove deprecated scripts#660

Enhance API docs, refactor models, and remove deprecated scripts#660
jfrench9 merged 15 commits into
mainfrom
feature/ops-docs

jfrench9 commented May 7, 2026

Uh oh!

jfrench9 commented May 7, 2026

Uh oh!

claude Bot commented May 7, 2026 •

edited

Loading

Uh oh!

jfrench9 commented May 7, 2026

Uh oh!

claude Bot commented May 7, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jfrench9 commented May 7, 2026

Summary

Key Accomplishments

📖 API Model Documentation & Enhancement

🧹 Deprecated Script Removal

🔧 Refactoring & Dependency Updates

Breaking Changes

Testing Notes

Infrastructure Considerations

Uh oh!

jfrench9 commented May 7, 2026

Uh oh!

claude Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Enhance API docs, refactor models, and remove deprecated scripts

Strengths

Issues

Minor Observations

Summary

Uh oh!

jfrench9 commented May 7, 2026

Uh oh!

claude Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: Changes Since Last Review

Fixes from the Previous Review

Issues

Strengths

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented May 7, 2026 •

edited

Loading

claude Bot commented May 7, 2026 •

edited

Loading