Skip to content
This repository was archived by the owner on May 7, 2026. It is now read-only.

feat(ibis): introduce database and statement timeout for connector#1260

Merged
douenergy merged 6 commits intoCanner:mainfrom
goldmedal:feat/query-timeout
Jul 21, 2025
Merged

feat(ibis): introduce database and statement timeout for connector#1260
douenergy merged 6 commits intoCanner:mainfrom
goldmedal:feat/query-timeout

Conversation

@goldmedal
Copy link
Copy Markdown
Contributor

@goldmedal goldmedal commented Jul 17, 2025

Description

This PR introduces a timeout mechanism to prevent long queries from blocking the database resource. Wren engine has a 3-level timeout to protect the request:

  • System Level: The maximum time for a gunicorn worker to process a request. The worker will be aborted by SIGABRT. We should avoid triggering this timeout as soon as possible.
  • Connection Level: The maximum time to query with database. It's Wren's engine configuration. It protects a request from ending beyond the expected time to avoid triggering the system-level kill.
  • Statement level: The maximum time to execute a statement on the database side. It's a server-side timeout. The query will be canceled by the database itself. Currently, it's only implemented for Postgres. We should also implement it for other data sources.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Added support for configurable query and statement timeouts via headers and environment variables.
    • Introduced asynchronous execution with robust timeout and cancellation handling for database queries and metadata operations.
    • Added comprehensive connection closing and cleanup across all supported connectors.
    • Enhanced connection info handling with additional parameters and consistent usage across API endpoints.
  • Bug Fixes

    • Improved error handling for database timeouts, returning clear 504 Gateway Timeout responses when queries exceed limits.
  • Tests

    • Added new tests to verify query timeout behavior and error responses for PostgreSQL connectors.
    • Updated existing tests to reflect new configuration parameters and function counts.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jul 17, 2025

Walkthrough

This set of changes introduces asynchronous execution with timeout controls for database operations, adds robust connection closing for connectors (especially PostgreSQL), and centralizes connection info handling. New configuration options, error classes, and test cases for connection timeouts are added. Endpoint implementations and utilities are refactored to support these enhancements.

Changes

File(s) Change Summary
app/config.py Config class updated to load app_timeout_seconds from env variable with default 240 seconds.
app/model/init.py ConnectionUrl gains optional kwargs dict; new DatabaseTimeoutError exception class added.
app/model/connector.py Adds robust close methods to all connectors, introduces PostgresConnector with safe close logic, adds QueryDryRunError and helper function for PG type names.
app/model/data_source.py get_connection_info accepts dict or ConnectionInfo, adds PostgreSQL-specific connect and statement timeouts via headers, passes kwargs to backend; fixes Redshift IAM connection info type.
app/routers/v2/connector.py Refactors endpoints to use async execution with timeout utilities and standardized connection info extraction; updates cache usage to new connection info.
app/routers/v3/connector.py Refactors endpoints for async execution with timeout utilities and standardized connection info extraction; updates cache usage and adds java_engine_connector param in SQL rewriting.
app/util.py Adds async utilities for database operations with timeout and safe connector cleanup; introduces error translation and cancellation handling.
tests/routers/v2/connector/test_postgres.py Adds async test for connection timeout using custom statement timeout header.
tests/routers/v3/connector/postgres/test_query.py Adds async test for connection timeout for both connection info and URL cases.
tests/routers/v3/connector/postgres/test_functions.py Updates expected function count in assertion by +1.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API_Endpoint
    participant DataSource
    participant Connector
    participant Util
    participant DB

    Client->>API_Endpoint: Send query request (with headers)
    API_Endpoint->>DataSource: get_connection_info(dto.connection_info, headers)
    DataSource-->>API_Endpoint: ConnectionInfo (with timeouts)
    API_Endpoint->>Connector: Instantiate with ConnectionInfo
    API_Endpoint->>Util: execute_query_with_timeout(connector, sql, ...)
    Util->>Connector: Run query in thread
    Connector->>DB: Execute SQL
    DB-->>Connector: Result or Timeout
    Connector-->>Util: Result or Exception
    Util-->>API_Endpoint: Result or DatabaseTimeoutError
    API_Endpoint-->>Client: Response (data or 504 error)
Loading

Suggested reviewers

  • douenergy

Poem

🐇
Hopping through async fields I go,
With timeouts set so queries don’t grow,
Connections close with gentle care,
Errors handled, no need to despair.
Now tests ensure our code’s robust—
In rabbit we trust, for queries just!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 273b7fa and 679e9c1.

📒 Files selected for processing (1)
  • ibis-server/app/model/__init__.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • ibis-server/app/model/init.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions Bot added ibis python Pull requests that update Python code labels Jul 17, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ibis-server/app/config.py (1)

23-23: Consider adding error handling for invalid timeout values.

The timeout configuration looks good with a reasonable default of 240 seconds. However, consider adding error handling for cases where APP_TIMEOUT_SECONDS contains non-numeric values.

-        self.app_timeout_seconds = int(os.getenv("APP_TIMEOUT_SECONDS", "240"))
+        try:
+            self.app_timeout_seconds = int(os.getenv("APP_TIMEOUT_SECONDS", "240"))
+        except ValueError:
+            logger.warning("Invalid APP_TIMEOUT_SECONDS value, using default of 240 seconds")
+            self.app_timeout_seconds = 240
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b7a32b and 0216d4e.

⛔ Files ignored due to path filters (1)
  • ibis-server/resources/function_list/postgres.csv is excluded by !**/*.csv
📒 Files selected for processing (10)
  • ibis-server/app/config.py (1 hunks)
  • ibis-server/app/model/__init__.py (2 hunks)
  • ibis-server/app/model/connector.py (9 hunks)
  • ibis-server/app/model/data_source.py (4 hunks)
  • ibis-server/app/routers/v2/connector.py (14 hunks)
  • ibis-server/app/routers/v3/connector.py (9 hunks)
  • ibis-server/app/util.py (3 hunks)
  • ibis-server/tests/routers/v2/connector/test_postgres.py (2 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_functions.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
ibis-server/app/model/__init__.py (1)
Learnt from: goldmedal
PR: Canner/wren-engine#1053
File: ibis-server/app/model/__init__.py:187-196
Timestamp: 2025-02-04T12:08:44.141Z
Learning: In GcsFileConnectionInfo, both secret-based (key_id/secret_key) and credentials-based (service account) authentication methods are required fields.
ibis-server/app/model/connector.py (1)
Learnt from: goldmedal
PR: Canner/wren-engine#1029
File: ibis-server/app/model/metadata/object_storage.py:44-44
Timestamp: 2025-01-07T03:56:21.741Z
Learning: When working with DuckDB in Python, use `conn.execute("DESCRIBE SELECT * FROM table").fetchall()` to get column types instead of accessing DataFrame-style attributes like `dtype` or `dtypes`.
🧬 Code Graph Analysis (4)
ibis-server/tests/routers/v2/connector/test_postgres.py (3)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)
  • test_connection_timeout (620-653)
  • manifest_str (124-125)
ibis-server/tests/conftest.py (1)
  • client (18-23)
ibis-server/tests/routers/v2/test_relationship_valid.py (3)
  • manifest_str (61-62)
  • postgres (66-69)
  • _to_connection_info (156-163)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (4)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
  • test_connection_timeout (1030-1046)
  • manifest_str (126-127)
ibis-server/tests/conftest.py (1)
  • client (18-23)
ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)
  • manifest_str (30-31)
ibis-server/tests/routers/v3/connector/postgres/conftest.py (2)
  • connection_info (37-44)
  • connection_url (48-50)
ibis-server/app/util.py (7)
ibis-server/app/config.py (1)
  • get_config (77-78)
ibis-server/app/model/__init__.py (1)
  • DatabaseTimeoutError (535-540)
ibis-server/app/model/metadata/metadata.py (1)
  • Metadata (7-21)
ibis-server/app/model/connector.py (18)
  • close (88-95)
  • close (116-140)
  • close (147-184)
  • close (233-243)
  • close (385-390)
  • close (434-439)
  • query (79-80)
  • query (105-110)
  • query (219-226)
  • query (266-304)
  • query (324-339)
  • query (419-427)
  • dry_run (82-86)
  • dry_run (113-114)
  • dry_run (191-199)
  • dry_run (229-231)
  • dry_run (342-348)
  • dry_run (430-432)
ibis-server/app/routers/v2/connector.py (4)
  • query (65-198)
  • validate (206-241)
  • get_table_list (250-264)
  • get_constraints (273-287)
ibis-server/app/routers/v3/connector.py (2)
  • query (58-205)
  • validate (301-359)
ibis-server/app/model/validator.py (1)
  • validate (23-31)
ibis-server/app/model/data_source.py (1)
ibis-server/app/model/__init__.py (4)
  • RedshiftIAMConnectionInfo (292-312)
  • SnowflakeConnectionInfo (321-352)
  • SSLMode (547-550)
  • TrinoConnectionInfo (355-377)
🔇 Additional comments (23)
ibis-server/tests/routers/v3/connector/postgres/test_functions.py (1)

58-58: LGTM! Test assertion updated to reflect function count change.

The expected function count has been correctly updated from 34 to 35, indicating that one additional function is now available when a remote function list path is set.

ibis-server/tests/routers/v2/connector/test_postgres.py (2)

13-13: LGTM! Import added for timeout header constant.

The import of X_WREN_DB_STATEMENT_TIMEOUT is correctly added to support the new timeout test functionality.


1030-1047: Excellent timeout test implementation.

The test effectively validates the timeout functionality with:

  • A deliberately slow query using pg_sleep(5)
  • A short timeout header of 1 second
  • Proper assertion of HTTP 504 status
  • Verification of the expected timeout cancellation message

This provides good coverage for the timeout feature.

ibis-server/app/model/__init__.py (2)

98-101: LGTM! kwargs field enhances connection flexibility.

The addition of the optional kwargs field to ConnectionUrl provides valuable flexibility for passing additional connection parameters, which aligns well with the timeout functionality being introduced.


535-541: Well-implemented timeout error class.

The DatabaseTimeoutError class properly:

  • Extends CustomHttpError following the established pattern
  • Uses HTTP 504 status code, which is appropriate for timeout scenarios
  • Provides consistent error message formatting
  • Accepts a descriptive message parameter
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)

7-7: LGTM! Import added for timeout header constant.

The import of X_WREN_DB_STATEMENT_TIMEOUT is correctly added to support the timeout test functionality.


620-654: Comprehensive timeout test covering both connection methods.

The test excellently validates timeout functionality by:

  • Testing both connectionInfo and connectionUrl connection methods
  • Using a slow query (pg_sleep(5)) with a 1-second timeout
  • Properly asserting HTTP 504 status and timeout message
  • Ensuring consistent behavior across different connection approaches

This provides thorough coverage for the timeout feature in the v3 API.

ibis-server/app/routers/v3/connector.py (3)

86-140: LGTM! Consistent async execution with timeout controls.

The refactoring properly extracts connection info and uses it consistently throughout the endpoint. All async operations are correctly awaited, and the timeout-enabled execution functions are properly integrated.


313-331: Consistent implementation with timeout controls.

The validation endpoint properly extracts connection info and uses the async timeout execution function correctly.


395-411: Proper async implementation with java_engine_connector support.

The endpoint correctly extracts connection info and passes the java_engine_connector parameter to the Rewriter, maintaining consistency with the refactoring pattern.

ibis-server/app/model/data_source.py (3)

81-103: Well-implemented connection info handling with timeout configuration.

The method properly handles both dict and ConnectionInfo inputs, and correctly configures PostgreSQL connection timeouts. The statement_timeout configuration from headers is properly appended to the options string.


125-126: Correct return type for Redshift IAM connections.

Good fix - returning the proper RedshiftIAMConnectionInfo type for redshift_iam connections.


166-168: Proper kwargs propagation to ibis.connect.

The change correctly extracts and passes kwargs to enable additional connection parameters like timeouts.

ibis-server/app/routers/v2/connector.py (2)

89-152: Consistent async refactoring with v3 implementation.

The v2 connector endpoints are properly refactored to use async timeout execution functions, maintaining consistency with the v3 implementation.


250-304: Metadata endpoints properly converted to async.

The metadata endpoints (get_table_list, get_constraints, get_db_version) are correctly converted to async functions with proper timeout execution.

ibis-server/app/model/connector.py (4)

143-185: Robust PostgreSQL connection closing implementation.

Excellent implementation with proper safety checks:

  • Query cancellation before closing prevents potential issues
  • The delay after cancellation allows for graceful cleanup
  • Multiple layers of error handling ensure connection is closed even on failure

116-141: Well-structured generic close implementation.

The SimpleConnector close method properly handles various connection types and prevents double-closing with the _closed flag.


74-96: Proper integration of PostgresConnector and unified close interface.

The Connector class correctly instantiates PostgresConnector for PostgreSQL connections and provides a clean delegation pattern for the close method.


46-54: Efficient type name caching and appropriate error class.

Good use of @cache decorator for performance and proper resource cleanup with closing context.

ibis-server/app/util.py (4)

251-258: Safe connector cleanup with error handling.

The helper function properly handles errors during connection cleanup and includes a delay that aligns with the PostgreSQL close implementation.


263-273: Clean timeout execution with proper error translation.

The function correctly handles both timeout and query cancellation scenarios, translating them to appropriate DatabaseTimeoutError exceptions.


275-308: Excellent implementation of task cancellation and cleanup.

This function demonstrates robust error handling:

  • Graceful task cancellation with timeout
  • Connection cleanup in separate thread to prevent blocking
  • Multiple timeout layers ensure eventual cleanup
  • Comprehensive error logging for debugging

310-400: Well-structured timeout functions for different operation types.

The implementation correctly differentiates between operations that need full cleanup (query, dry_run) and those that can use simpler timeout handling (validation, metadata).

Comment thread ibis-server/app/config.py Outdated
@goldmedal goldmedal requested a review from douenergy July 17, 2025 10:06
@douenergy douenergy merged commit f0f9523 into Canner:main Jul 21, 2025
5 checks passed
@douenergy
Copy link
Copy Markdown
Contributor

Thanks @goldmedal

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

ibis python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants