feat(ibis): introduce database and statement timeout for connector#1260
feat(ibis): introduce database and statement timeout for connector#1260douenergy merged 6 commits intoCanner:mainfrom
Conversation
WalkthroughThis set of changes introduces asynchronous execution with timeout controls for database operations, adds robust connection closing for connectors (especially PostgreSQL), and centralizes connection info handling. New configuration options, error classes, and test cases for connection timeouts are added. Endpoint implementations and utilities are refactored to support these enhancements. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant API_Endpoint
participant DataSource
participant Connector
participant Util
participant DB
Client->>API_Endpoint: Send query request (with headers)
API_Endpoint->>DataSource: get_connection_info(dto.connection_info, headers)
DataSource-->>API_Endpoint: ConnectionInfo (with timeouts)
API_Endpoint->>Connector: Instantiate with ConnectionInfo
API_Endpoint->>Util: execute_query_with_timeout(connector, sql, ...)
Util->>Connector: Run query in thread
Connector->>DB: Execute SQL
DB-->>Connector: Result or Timeout
Connector-->>Util: Result or Exception
Util-->>API_Endpoint: Result or DatabaseTimeoutError
API_Endpoint-->>Client: Response (data or 504 error)
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ibis-server/app/config.py (1)
23-23: Consider adding error handling for invalid timeout values.The timeout configuration looks good with a reasonable default of 240 seconds. However, consider adding error handling for cases where
APP_TIMEOUT_SECONDScontains non-numeric values.- self.app_timeout_seconds = int(os.getenv("APP_TIMEOUT_SECONDS", "240")) + try: + self.app_timeout_seconds = int(os.getenv("APP_TIMEOUT_SECONDS", "240")) + except ValueError: + logger.warning("Invalid APP_TIMEOUT_SECONDS value, using default of 240 seconds") + self.app_timeout_seconds = 240
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
ibis-server/resources/function_list/postgres.csvis excluded by!**/*.csv
📒 Files selected for processing (10)
ibis-server/app/config.py(1 hunks)ibis-server/app/model/__init__.py(2 hunks)ibis-server/app/model/connector.py(9 hunks)ibis-server/app/model/data_source.py(4 hunks)ibis-server/app/routers/v2/connector.py(14 hunks)ibis-server/app/routers/v3/connector.py(9 hunks)ibis-server/app/util.py(3 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py(2 hunks)ibis-server/tests/routers/v3/connector/postgres/test_functions.py(1 hunks)ibis-server/tests/routers/v3/connector/postgres/test_query.py(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
ibis-server/app/model/__init__.py (1)
Learnt from: goldmedal
PR: Canner/wren-engine#1053
File: ibis-server/app/model/__init__.py:187-196
Timestamp: 2025-02-04T12:08:44.141Z
Learning: In GcsFileConnectionInfo, both secret-based (key_id/secret_key) and credentials-based (service account) authentication methods are required fields.
ibis-server/app/model/connector.py (1)
Learnt from: goldmedal
PR: Canner/wren-engine#1029
File: ibis-server/app/model/metadata/object_storage.py:44-44
Timestamp: 2025-01-07T03:56:21.741Z
Learning: When working with DuckDB in Python, use `conn.execute("DESCRIBE SELECT * FROM table").fetchall()` to get column types instead of accessing DataFrame-style attributes like `dtype` or `dtypes`.
🧬 Code Graph Analysis (4)
ibis-server/tests/routers/v2/connector/test_postgres.py (3)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)
test_connection_timeout(620-653)manifest_str(124-125)ibis-server/tests/conftest.py (1)
client(18-23)ibis-server/tests/routers/v2/test_relationship_valid.py (3)
manifest_str(61-62)postgres(66-69)_to_connection_info(156-163)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (4)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
test_connection_timeout(1030-1046)manifest_str(126-127)ibis-server/tests/conftest.py (1)
client(18-23)ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)
manifest_str(30-31)ibis-server/tests/routers/v3/connector/postgres/conftest.py (2)
connection_info(37-44)connection_url(48-50)
ibis-server/app/util.py (7)
ibis-server/app/config.py (1)
get_config(77-78)ibis-server/app/model/__init__.py (1)
DatabaseTimeoutError(535-540)ibis-server/app/model/metadata/metadata.py (1)
Metadata(7-21)ibis-server/app/model/connector.py (18)
close(88-95)close(116-140)close(147-184)close(233-243)close(385-390)close(434-439)query(79-80)query(105-110)query(219-226)query(266-304)query(324-339)query(419-427)dry_run(82-86)dry_run(113-114)dry_run(191-199)dry_run(229-231)dry_run(342-348)dry_run(430-432)ibis-server/app/routers/v2/connector.py (4)
query(65-198)validate(206-241)get_table_list(250-264)get_constraints(273-287)ibis-server/app/routers/v3/connector.py (2)
query(58-205)validate(301-359)ibis-server/app/model/validator.py (1)
validate(23-31)
ibis-server/app/model/data_source.py (1)
ibis-server/app/model/__init__.py (4)
RedshiftIAMConnectionInfo(292-312)SnowflakeConnectionInfo(321-352)SSLMode(547-550)TrinoConnectionInfo(355-377)
🔇 Additional comments (23)
ibis-server/tests/routers/v3/connector/postgres/test_functions.py (1)
58-58: LGTM! Test assertion updated to reflect function count change.The expected function count has been correctly updated from 34 to 35, indicating that one additional function is now available when a remote function list path is set.
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
13-13: LGTM! Import added for timeout header constant.The import of
X_WREN_DB_STATEMENT_TIMEOUTis correctly added to support the new timeout test functionality.
1030-1047: Excellent timeout test implementation.The test effectively validates the timeout functionality with:
- A deliberately slow query using
pg_sleep(5)- A short timeout header of 1 second
- Proper assertion of HTTP 504 status
- Verification of the expected timeout cancellation message
This provides good coverage for the timeout feature.
ibis-server/app/model/__init__.py (2)
98-101: LGTM!kwargsfield enhances connection flexibility.The addition of the optional
kwargsfield toConnectionUrlprovides valuable flexibility for passing additional connection parameters, which aligns well with the timeout functionality being introduced.
535-541: Well-implemented timeout error class.The
DatabaseTimeoutErrorclass properly:
- Extends
CustomHttpErrorfollowing the established pattern- Uses HTTP 504 status code, which is appropriate for timeout scenarios
- Provides consistent error message formatting
- Accepts a descriptive message parameter
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)
7-7: LGTM! Import added for timeout header constant.The import of
X_WREN_DB_STATEMENT_TIMEOUTis correctly added to support the timeout test functionality.
620-654: Comprehensive timeout test covering both connection methods.The test excellently validates timeout functionality by:
- Testing both
connectionInfoandconnectionUrlconnection methods- Using a slow query (
pg_sleep(5)) with a 1-second timeout- Properly asserting HTTP 504 status and timeout message
- Ensuring consistent behavior across different connection approaches
This provides thorough coverage for the timeout feature in the v3 API.
ibis-server/app/routers/v3/connector.py (3)
86-140: LGTM! Consistent async execution with timeout controls.The refactoring properly extracts connection info and uses it consistently throughout the endpoint. All async operations are correctly awaited, and the timeout-enabled execution functions are properly integrated.
313-331: Consistent implementation with timeout controls.The validation endpoint properly extracts connection info and uses the async timeout execution function correctly.
395-411: Proper async implementation with java_engine_connector support.The endpoint correctly extracts connection info and passes the java_engine_connector parameter to the Rewriter, maintaining consistency with the refactoring pattern.
ibis-server/app/model/data_source.py (3)
81-103: Well-implemented connection info handling with timeout configuration.The method properly handles both dict and ConnectionInfo inputs, and correctly configures PostgreSQL connection timeouts. The statement_timeout configuration from headers is properly appended to the options string.
125-126: Correct return type for Redshift IAM connections.Good fix - returning the proper
RedshiftIAMConnectionInfotype for redshift_iam connections.
166-168: Proper kwargs propagation to ibis.connect.The change correctly extracts and passes kwargs to enable additional connection parameters like timeouts.
ibis-server/app/routers/v2/connector.py (2)
89-152: Consistent async refactoring with v3 implementation.The v2 connector endpoints are properly refactored to use async timeout execution functions, maintaining consistency with the v3 implementation.
250-304: Metadata endpoints properly converted to async.The metadata endpoints (get_table_list, get_constraints, get_db_version) are correctly converted to async functions with proper timeout execution.
ibis-server/app/model/connector.py (4)
143-185: Robust PostgreSQL connection closing implementation.Excellent implementation with proper safety checks:
- Query cancellation before closing prevents potential issues
- The delay after cancellation allows for graceful cleanup
- Multiple layers of error handling ensure connection is closed even on failure
116-141: Well-structured generic close implementation.The SimpleConnector close method properly handles various connection types and prevents double-closing with the _closed flag.
74-96: Proper integration of PostgresConnector and unified close interface.The Connector class correctly instantiates PostgresConnector for PostgreSQL connections and provides a clean delegation pattern for the close method.
46-54: Efficient type name caching and appropriate error class.Good use of @cache decorator for performance and proper resource cleanup with closing context.
ibis-server/app/util.py (4)
251-258: Safe connector cleanup with error handling.The helper function properly handles errors during connection cleanup and includes a delay that aligns with the PostgreSQL close implementation.
263-273: Clean timeout execution with proper error translation.The function correctly handles both timeout and query cancellation scenarios, translating them to appropriate DatabaseTimeoutError exceptions.
275-308: Excellent implementation of task cancellation and cleanup.This function demonstrates robust error handling:
- Graceful task cancellation with timeout
- Connection cleanup in separate thread to prevent blocking
- Multiple timeout layers ensure eventual cleanup
- Comprehensive error logging for debugging
310-400: Well-structured timeout functions for different operation types.The implementation correctly differentiates between operations that need full cleanup (query, dry_run) and those that can use simpler timeout handling (validation, metadata).
|
Thanks @goldmedal |
Description
This PR introduces a timeout mechanism to prevent long queries from blocking the database resource. Wren engine has a 3-level timeout to protect the request:
SIGABRT. We should avoid triggering this timeout as soon as possible.Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes
Tests