Add general-purpose SQL Statement Execution engine#5416
Merged
Conversation
Add libs/sqlexec, a reusable client over the Databricks SQL Statement Execution API: submit (sync or async), poll to terminal with additive backoff, paginate result chunks, cancel, and turn terminal non-success states into a typed StatementError. INLINE + JSON_ARRAY only (the 25 MiB cap covers every caller); EXTERNAL_LINKS is intentionally out of scope. Migrate the experimental aitools query/batch/discover-schema/statement commands onto the engine, deleting their duplicated poll/fetch/terminal /error helpers. CLI-specific concerns (signal handling, spinner, batch JSON shapes, the UNRESOLVED_MAP_KEY hint) stay in aitools as a thin presenter over the engine's structured results and errors. Co-authored-by: Isaac
Add two durable test layers beyond the mock-interface unit tests: - libs/sqlexec/sqlexec_http_test.go: drives the engine through a real SDK client over HTTP against testserver with the statement endpoints programmed per test. Covers the request/response JSON round-trip for sync success, polling across GetStatement, result-chunk pagination, the FAILED-as-HTTP-200 quirk, and submit+cancel. Hermetic; runs every PR. - integration/libs/sqlexec: exercises the same paths against a live warehouse in the nightly suite (skips without CLOUD_ENV / TEST_DEFAULT_WAREHOUSE_ID). Also address review: Statement.Columns() exposes column metadata from the manifest so `statement get` still surfaces columns when a later result chunk fetch fails (restores the previously-dropped behavior + assertion). Co-authored-by: Isaac
Collaborator
|
Commit: eef24de |
simonfaltum
approved these changes
Jun 2, 2026
Member
simonfaltum
left a comment
There was a problem hiding this comment.
LGTM. Reviewed the new sqlexec engine and aitools migration; focused tests passed locally. The external Integration Tests check currently reports "Report generation failed" without test annotations, so I don't see a code-review blocker here.
The shared classic warehouse on the non-UC azure-prod workspace fails to launch clusters, so the statement sat pending ~75 min and timed out the job. Gate newClient on TEST_METASTORE_ID, which is set only on the *-ucws environments, so the tests run only where a working UC warehouse is available. CLOUD_ENV can't make this distinction because azure-prod-ucws and the non-UC azure-prod both report CLOUD_ENV=azure. Co-authored-by: Isaac
Collaborator
|
Commit: 4008bbe |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds
libs/sqlexec, a general-purpose, non-interactive engine for running SQL through the Databricks SQL Statement Execution API, and refactors the experimental aitools query commands to use it instead of each re-implementing the submit/poll/fetch loop.A
Clientbinds to a single SQL warehouse and exposes the full lifecycle:Submit(async, returns immediately with a statement ID so callers can wire up cancellation),Poll(additive backoff between status checks),Get,Cancel,Results, and the convenience wrappersExecuteandExecuteScalar. Failures surface as a typed*StatementErrorcarrying the terminalState,Code, andMessage, so callers compare witherrors.Asrather than string-matching. TheClientholds no mutable state and is safe for concurrent use — aitools fans many statements out through one instance.The engine speaks only the INLINE disposition with JSON_ARRAY format (the API caps this at 25 MiB per result set), which covers every caller today; EXTERNAL_LINKS is intentionally left out as a separate concern. It exists to be shared by programmatic callers such as bundle deploy resources (e.g. metric views, which have no REST API and are managed via SQL DDL) and the aitools commands.
The aitools consumers (
query.go,batch.go,discover_schema.go,statement*.go) are reworked to delegate to the engine, removing the duplicated polling and result-assembly code (net ~549 lines deleted from those files).Test plan
libs/sqlexeccovering path/response decoding, polling, cancellation, parameters, and error mapping.integration/libs/sqlexec(skips withoutCLOUD_ENV/TEST_DEFAULT_WAREHOUSE_ID); all 6 tests verified green against a real workspace.This pull request and its description were written by Isaac.