Skip to content

Fix backend-frontend alignment with enhanced subgraph management#72

Merged
jfrench9 merged 3 commits into
mainfrom
bugfix/align-backend-frontend
Nov 12, 2025
Merged

Fix backend-frontend alignment with enhanced subgraph management#72
jfrench9 merged 3 commits into
mainfrom
bugfix/align-backend-frontend

Conversation

@jfrench9
Copy link
Copy Markdown
Member

Summary

This bugfix addresses alignment issues between backend and frontend systems by improving subgraph management capabilities and enhancing the development workflow. The changes focus on strengthening API consistency and providing better tooling for developers.

Key Accomplishments

  • Enhanced Subgraph Service: Significantly expanded subgraph management operations with 100+ lines of new functionality for better graph processing and manipulation
  • Improved API Router Logic: Updated multiple router modules (billing, graphs, organizations) to ensure consistent data handling and response formats
  • Better Schema Management: Enhanced schema export capabilities and utilities for more reliable graph schema operations
  • Development Workflow Improvements: Updated development documentation and tooling configuration to streamline the development process
  • Test Coverage Updates: Synchronized test cases across billing, graphs, and organization modules to maintain compatibility

Technical Changes

  • Extended subgraph service operations for more robust graph management
  • Refined API response handling in billing invoices, backup statistics, and organization member management
  • Enhanced schema export functionality with improved error handling
  • Updated subgraph utilities for better data processing
  • Improved organization usage tracking and member management logic

Testing Notes

  • All existing tests have been updated to reflect the new API behaviors
  • Focus testing on subgraph operations, particularly creation, modification, and export workflows
  • Verify billing invoice generation and organization member management functionality
  • Test schema export operations across different graph configurations

Infrastructure Considerations

  • No breaking changes to existing API contracts
  • Enhanced error handling may result in more informative error responses
  • Improved subgraph processing may affect performance characteristics for large graphs
  • Development environment setup has been streamlined with updated tooling requirements

🤖 Generated with Claude Code

Branch Info:

  • Source: bugfix/align-backend-frontend
  • Target: main
  • Type: bugfix

Co-Authored-By: Claude noreply@anthropic.com

…ement

- Updated CLAUDE.md to clarify the usage of `just restart` and `just rebuild` commands for code changes.
- Modified `justfile` to ensure environment variables are set correctly before rebuilding containers.
- Added a new field `subgraphs_enabled` in the ListSubgraphsResponse to indicate if subgraphs are supported for the parent graph tier.
- Implemented a new method `create_subgraph` in SubgraphService to handle subgraph creation with proper metadata management.
- Refactored various endpoints to improve subgraph handling and access control, ensuring better validation and error messaging.
- Updated tests to reflect changes in the billing and organization management logic, enhancing coverage and accuracy.
@jfrench9
Copy link
Copy Markdown
Member Author

@claude please review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Nov 12, 2025

Claude finished @jfrench9's task —— View job


Code Review Complete

I've reviewed all 18 changed files in this PR. Here's my comprehensive feedback:


Overview

This PR makes significant enhancements to subgraph management and API consistency. The changes are well-structured overall, but I've identified several issues that should be addressed before merging.


Critical Issues

1. Uninitialized Variable in Backup Creation (subgraph_service.py:559)

Location: robosystems/operations/graph/subgraph_service.py:559

Issue: The variable backup_location is initialized to an empty string to avoid an unbound variable error, but this is a code smell indicating incomplete error handling.

async def _create_backup(self, client, database_name: str, instance_id: str) -> str:
    backup_location = ""  # Initialize to avoid unbound variable error  # ⚠️
    try:
        timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
        backup_location = f"s3://robosystems-backups/{instance_id}/{database_name}_{timestamp}.backup"
        # ...
    except AttributeError:
        logger.warning(f"Backup not yet implemented for {database_name}")
        return backup_location  # Returns empty string if backup fails

Problem: If the backup method is not implemented (line 575), the function returns an empty string rather than raising an exception. This makes it impossible for callers to distinguish between a successful backup and a failed one.

Recommendation: Either:

  • Raise an exception if backup is not implemented
  • Return None to indicate no backup was created
  • Use a proper result type like Optional[str]

2. Missing Error Handling for Non-Existent User Graph (subgraphs/utils.py:76)

Location: robosystems/routers/graphs/subgraphs/utils.py:76

Issue: The code assumes user_graph exists without checking:

def verify_parent_graph_access(graph_id: str, current_user: User, session: Session, required_role: str = "read") -> Graph:
    # ... code to get user_graph ...
    user_graph = (
        session.query(GraphUser)
        .filter(GraphUser.user_id == current_user.id, GraphUser.graph_id == graph_id)
        .first()
    )
    
    # Check role requirements
    if required_role == "admin" and user_graph.role != "admin":  # ⚠️ AttributeError if user_graph is None
        raise HTTPException(...)

Problem: If the user doesn't have access to the graph, user_graph will be None, causing an AttributeError when trying to access .role.

Recommendation:

if not user_graph:
    raise HTTPException(
        status_code=status.HTTP_403_FORBIDDEN,
        detail=f"Access denied to graph {graph_id}"
    )

if required_role == "admin" and user_graph.role != "admin":
    # ...

High Priority Issues

3. Inconsistent Response Models (billing/invoices.py)

Location: robosystems/routers/billing/invoices.py:99-103

Issue: When there's no Stripe customer, the code returns InvoicesResponse with different structure than the Stripe path:

return InvoicesResponse(
    invoices=invoices,
    total_count=len(invoices),
    has_more=len(local_invoices) > limit,  # ⚠️ Could be True even if all invoices returned
)

Problem: The has_more logic is incorrect. If len(local_invoices) > limit, it means there ARE more invoices, but we've only returned limit invoices (sliced at line 70: for inv in local_invoices[:limit]).

Recommendation:

return InvoicesResponse(
    invoices=invoices,
    total_count=len(local_invoices),  # Total count, not just returned count
    has_more=len(local_invoices) > limit,
)

4. Potential Integer Division Issue (backups/stats.py:95)

Location: robosystems/routers/graphs/backups/stats.py:95

Issue:

compression_ratios.append(
    backup.compressed_size_bytes / backup.original_size_bytes  # ⚠️ No check for zero
)

Problem: While there's a check for original_size_bytes > 0 on line 93, the division could still produce unexpected results if compressed_size_bytes is None.

Recommendation:

if backup.original_size_bytes and backup.original_size_bytes > 0 and backup.compressed_size_bytes is not None:
    compression_ratios.append(
        backup.compressed_size_bytes / backup.original_size_bytes
    )

5. Schema Export Has Complex Fallback Logic (schema/export.py)

Location: robosystems/routers/graphs/schema/export.py:139-257

Issue: The schema export endpoint has very complex nested fallback logic with multiple code paths:

  1. Try to get from GraphSchema table
  2. Fall back to Graph metadata
  3. Reconstruct from Python definitions for extension-based schemas
  4. Fall back to DDL if JSON not available

Problem: This makes the code difficult to test and maintain. There are 4+ different ways to construct the schema definition, making it hard to ensure consistency.

Recommendation:

  • Consider refactoring into separate helper functions for each schema source
  • Add explicit logging at each fallback point
  • Consider whether all these fallbacks are necessary or if some represent incomplete migrations

Medium Priority Issues

6. Hardcoded S3 Bucket Name (subgraph_service.py:564)

Location: robosystems/operations/graph/subgraph_service.py:564

backup_location = (
    f"s3://robosystems-backups/{instance_id}/{database_name}_{timestamp}.backup"  # ⚠️
)

Problem: The S3 bucket name is hardcoded, making it difficult to use different buckets for different environments (staging/prod).

Recommendation: Move to environment configuration:

from robosystems.config import env
backup_location = (
    f"s3://{env.BACKUP_BUCKET_NAME}/{instance_id}/{database_name}_{timestamp}.backup"
)

7. Missing Type Hints (Multiple files)

Several helper methods are missing return type hints:

  • subgraph_service.py:487-526: Helper methods _install_schema_with_extensions, _install_base_schema, etc.
  • subgraphs/utils.py:230-258: Utility functions

Recommendation: Add return type hints for better type safety and IDE support.


8. Test Coverage Gaps

Issue: While tests were updated, I don't see tests for:

  • The new subgraph service methods (create_subgraph_database, delete_subgraph_database, list_subgraph_databases, get_subgraph_info)
  • Schema export with all its fallback paths
  • Backup statistics edge cases (zero compression ratio, missing sizes)

Recommendation: Add integration tests for the new subgraph service operations.


Low Priority / Code Quality

9. Magic Numbers (backups/stats.py)

Location: Multiple places in backups/stats.py

success_rate = (
    (successful_backups / total_backups * 100) if total_backups > 0 else 0.0  # Magic 100
)

Recommendation: Extract to named constants:

PERCENTAGE_MULTIPLIER = 100.0
success_rate = (successful_backups / total_backups * PERCENTAGE_MULTIPLIER) if total_backups > 0 else 0.0

10. Inconsistent Naming

Location: subgraphs/main.py:102-123

The response uses both subgraph_name and display_name:

SubgraphSummary(
    graph_id=subgraph.graph_id,
    subgraph_name=subgraph.subgraph_name or subgraph.graph_name,  # Fallback logic
    display_name=subgraph.graph_name,
    # ...
)

Problem: It's unclear when to use subgraph_name vs display_name. The fallback on line 115 suggests subgraph_name might be None in some cases.

Recommendation: Ensure the database schema guarantees subgraph_name is always set for subgraphs, eliminating the need for this fallback.


11. Schema Extension Field Name Inconsistency (subgraphs.py)

Location: robosystems/models/api/graphs/subgraphs.py:48-52

schema_extensions: list[str] | None = Field(
    default_factory=list,
    description="Schema extensions to include (inherits from parent by default)",
    examples=[["roboledger", "roboinvestor"]],
)

Problem: The field has default_factory=list but also allows None in the type hint. This is contradictory.

Recommendation:

schema_extensions: list[str] = Field(
    default_factory=list,  # Remove | None from type
    description="Schema extensions to include (inherits from parent by default)",
)

Positive Observations

What's Done Well:

  1. Comprehensive Documentation: Schema export endpoint has excellent documentation explaining the difference between runtime and declared schemas
  2. Security Checks: Good validation in member management (preventing owner role changes, ensuring at least one owner)
  3. Error Handling: Most endpoints have proper try/except blocks with appropriate HTTP status codes
  4. Metrics & Logging: Good use of metrics and logging throughout (circuit breakers, operation tracking)
  5. Validation: Strong input validation (subgraph names, alphanumeric checks)
  6. Feature Flags: Proper use of feature flags (SUBGRAPH_CREATION_ENABLED, ORG_MEMBER_INVITATIONS_ENABLED)

Recommendations Summary

Must Fix Before Merge:

  1. Fix verify_parent_graph_access null check (Critical - will cause crashes)
  2. Fix backup location error handling in _create_backup
  3. Fix has_more logic in invoice listing

Should Fix Before Merge:

  1. Add null checks for compression ratio calculation
  2. Move S3 bucket name to environment config
  3. Simplify schema export fallback logic or add better logging

Nice to Have:

  1. Add type hints to helper methods
  2. Add tests for new subgraph service methods
  3. Extract magic numbers to constants
  4. Fix schema_extensions type inconsistency

Overall Assessment

This is a solid PR that adds important functionality. The code quality is generally good with proper error handling, logging, and validation. However, there are 2-3 critical bugs that need to be fixed before merging to prevent runtime errors.

The subgraph service expansion is well-designed and follows good patterns. The API consistency improvements are valuable. Once the critical issues are addressed, this will be ready to merge.

Status: ⚠️ Changes Requested - Address critical issues #1 and #2, and strongly consider fixing #3.


| Branch: bugfix/align-backend-frontend

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 65 to 69
graph_id: str = Path(
...,
description="Parent graph ID (e.g., 'kg1a2b3c4d5')",
pattern="^(kg[a-f0-9]{10}|sec)$",
pattern="^(kg[a-f0-9]{16}|sec)$",
),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Path regex rejects valid graph IDs

The list/create subgraph endpoints now validate graph_id with pattern="^(kg[a-f0-9]{16}|sec)$", which only allows 16 hex characters after the kg prefix and the special sec repository. Production graph IDs are 18 hex characters (kg + 14 ULID chars + 4 hash chars) and other repositories such as industry or economic are also valid, as documented and implemented in _generate_graph_id where the expected pattern is ^(kg[a-z0-9]{10,20}|sec|industry|economic)$. With the new regex, most legitimate graph IDs will fail FastAPI’s path validation, making subgraph listing/creation endpoints unusable for real graphs. This effectively blocks subgraph functionality for existing deployments.

Useful? React with 👍 / 👎.

- Updated type hints for client parameters in SubgraphService methods to specify "GraphClient".
- Enhanced the _create_backup method to return None if the backup is not implemented, and updated the backup location format to use the configured S3 bucket.
- Adjusted the delete subgraph documentation to clarify the backup location format.
- Improved error handling in the verify_parent_graph_access function to raise an HTTPException for unauthorized access.
- Refactored utility functions to include return types for better clarity and maintainability.
- Modified `gha-runner.yml` to improve runner registration logic, including dynamic desired runner count and enhanced error handling for API responses.
- Adjusted `test.yml` to remove timeout from type checking command for better execution flow.
- Updated `gha-runner.yaml` to change instance types for better resource allocation, replacing `t3a.medium` and `t3.medium` with `t3a.large` and `t3.large`.
@jfrench9 jfrench9 merged commit 723d402 into main Nov 12, 2025
5 checks passed
@jfrench9 jfrench9 deleted the bugfix/align-backend-frontend branch November 20, 2025 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant