[SPARK-43752][SQL] Support column DEFAULT values in V2 write commands#55127
LuciferYang wants to merge 1 commit into
dongjoon-hyun left a comment:
+1, LGTM. Thank you, @LuciferYang.
Thank you @dongjoon-hyun
The code changes in this PR appear to be dead code. The test is valuable, but it passes without the

Why it's dead code: For

Since The

Verification: I reverted the code changes (kept the test) on a branch without this commit and ran:

Result:
If there are no objections, I'll revert it tomorrow.
Thanks for pointing this out. My bad. This issue came up while I was working on the recent SPARK-56170 patches. As @cloud-fan suggested, let's keep the test case and revert the changes. We can fix them together later when DSV2 File Table is actually ready to advance to this part.
+1 for reverting. Thank you for fixing this. |
### What changes were proposed in this pull request?

This PR reverts the code changes from #55127 (commit eca57ea) and moves the test to `InsertIntoTests` where it properly runs across V2 INSERT SQL test suites.

Specifically:
- **Removes** the `V2WriteCommand` case added to `Analyzer.ResolveReferences` and `ResolveColumnDefaultInCommandInputQuery`
- **Removes** the test from `ResolveDefaultColumnsSuite` (wrong location for a V2 end-to-end test)
- **Adds** the test to `InsertIntoSQLOnlyTests` in `InsertIntoTests.scala`, which runs in suites with `includeSQLOnlyTests = true`: `DataSourceV2SQLSuite`, `DataSourceV2SQLSessionCatalogSuite`, and `V1WriteFallbackSessionCatalogSuite`. This is correct since `DEFAULT` is SQL-only syntax.

### Why are the changes needed?

The code added by #55127 is dead code. For `INSERT INTO v2_table VALUES (1, DEFAULT)`, the resolution flow is:

1. The parser produces `InsertIntoStatement` for **both** V1 and V2 tables.
2. In the Resolution batch, `ResolveReferences` dispatches `InsertIntoStatement` to `resolveColumnDefaultInCommandInputQuery`, which resolves DEFAULT using the table schema. For V2 tables, the `DataSourceV2Relation` schema includes column default metadata (via `CatalogV2Util.encodeDefaultValue`), so DEFAULT is resolved correctly at this stage.
3. `ResolveInsertInto` only converts `InsertIntoStatement` to a `V2WriteCommand` (`AppendData`/`OverwriteByExpression`/`OverwritePartitionsDynamic`) **after** the query is fully resolved, by which point DEFAULT is already gone.

No code path (SQL or DataFrame API) creates a `V2WriteCommand` with unresolved DEFAULT attributes. The `PlanResolutionSuite` test "INSERT INTO table with default column value" already verifies this resolution at the plan level.

The test is still valuable and passes without the code changes; this PR moves it to the proper location.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Moved the existing test to `InsertIntoSQLOnlyTests` in `InsertIntoTests.scala`. The test verifies:
- `INSERT INTO` with partial DEFAULT values
- `INSERT INTO` with all DEFAULT values
- `INSERT OVERWRITE` with DEFAULT values

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code 4.6

Closes #55348 from cloud-fan/SPARK-43752-revert.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
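To make the ordering argument in the revert message concrete, here is a minimal toy model in Python. It is not Spark code: the class names only mirror the plan nodes mentioned in the commit message, and `resolve_defaults`, `resolve_insert_into`, and `analyze` are hypothetical stand-ins for the real analyzer rules. The point it illustrates is that the conversion step refuses to run while a DEFAULT placeholder remains, so the converted command never carries one.

```python
from dataclasses import dataclass
from typing import List, Union


# Toy stand-ins for plan nodes; these are NOT the real Catalyst classes.
@dataclass(frozen=True)
class UnresolvedDefault:
    """Placeholder for a DEFAULT keyword in the input query."""


@dataclass(frozen=True)
class Literal:
    value: object


Expr = Union[UnresolvedDefault, Literal]


@dataclass(frozen=True)
class InsertIntoStatement:
    table_defaults: List[object]  # one default value per table column
    row: List[Expr]


@dataclass(frozen=True)
class AppendData:  # models one kind of V2WriteCommand
    row: List[Expr]


def resolve_defaults(stmt: InsertIntoStatement) -> InsertIntoStatement:
    """Step 2: replace each DEFAULT with the column default at that position."""
    row = [
        Literal(stmt.table_defaults[i]) if isinstance(e, UnresolvedDefault) else e
        for i, e in enumerate(stmt.row)
    ]
    return InsertIntoStatement(stmt.table_defaults, row)


def resolve_insert_into(stmt: InsertIntoStatement):
    """Step 3: convert only when no DEFAULT placeholder is left in the query."""
    if any(isinstance(e, UnresolvedDefault) for e in stmt.row):
        return stmt  # not fully resolved yet, leave the statement alone
    return AppendData(stmt.row)


def analyze(plan):
    """Apply both rules until the plan stops changing (simplified batch)."""
    while True:
        new_plan = plan
        if isinstance(new_plan, InsertIntoStatement):
            new_plan = resolve_defaults(new_plan)
        if isinstance(new_plan, InsertIntoStatement):
            new_plan = resolve_insert_into(new_plan)
        if new_plan == plan:
            return plan
        plan = new_plan


# Models INSERT INTO v2_table VALUES (1, DEFAULT) with a default of 42
# on the second column (the value 42 is made up for illustration).
stmt = InsertIntoStatement([None, 42], [Literal(1), UnresolvedDefault()])
result = analyze(stmt)
```

In this model `result` is an `AppendData` whose row contains only literals, which is why a DEFAULT-handling case on the write command itself would never be reached.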
### What changes were proposed in this pull request?

Resolve column `DEFAULT` references in V2 write commands (`AppendData`, `OverwriteByExpression`, `OverwritePartitionsDynamic`).

Previously, `ResolveColumnDefaultInCommandInputQuery` only handled `InsertIntoStatement` (V1 path) and `SetVariable`. V2 write commands with `DEFAULT` would fail with unresolved attribute errors.

Changes:
- `Analyzer.ResolveReferences`: dispatch `V2WriteCommand` to `resolveColumnDefaultInCommandInputQuery` when the query contains unresolved attributes
- `ResolveColumnDefaultInCommandInputQuery`: add `V2WriteCommand` case that resolves `DEFAULT` by matching query columns to the table schema by position
- `TODO (SPARK-43752)` comment

### Why are the changes needed?

`INSERT INTO v2_table VALUES (1, DEFAULT)` fails for V2 data sources even when the table has column default values defined. This blocks V2 adoption for catalogs that support `SUPPORT_COLUMN_DEFAULT_VALUE`.

### Does this PR introduce any user-facing change?

Yes. `DEFAULT` keyword now works in `INSERT INTO` / `INSERT OVERWRITE` for V2 tables.

### How was this patch tested?

`ResolveDefaultColumnsSuite`

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code 4.6
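As an illustration of what "matching query columns to the table schema by position" means, here is a small Python sketch. It is a toy under stated assumptions, not the Scala code from this PR: `Column`, `DEFAULT`, and `resolve_by_position` are hypothetical names, and treating a column with no declared default as NULL (`None`) is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

# Sentinel standing in for an unresolved DEFAULT reference in a VALUES row.
DEFAULT = object()


@dataclass(frozen=True)
class Column:
    name: str
    # Assumption of this sketch: no declared default behaves like NULL.
    default: Optional[Any] = None


def resolve_by_position(schema: List[Column], rows: List[List[Any]]) -> List[List[Any]]:
    """Replace each DEFAULT with the default of the column at the same index."""
    resolved = []
    for row in rows:
        if len(row) > len(schema):
            raise ValueError(
                f"query has {len(row)} columns, table has {len(schema)}"
            )
        resolved.append(
            [schema[i].default if v is DEFAULT else v for i, v in enumerate(row)]
        )
    return resolved


# Hypothetical table: id has no default, score defaults to 100.
schema = [Column("id"), Column("score", default=100)]

# Partial DEFAULT and all-DEFAULT rows.
print(resolve_by_position(schema, [[1, DEFAULT], [DEFAULT, DEFAULT]]))
# prints [[1, 100], [None, 100]]
```

The sketch resolves purely by index, so it never needs column names from the query; that mirrors the positional matching named in the change list, though the real rule operates on plan expressions rather than plain Python lists.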