
Core: Fix reference to nested column added to map/list in same transaction #17034

Open
huan233usc wants to merge 1 commit into apache:main from huan233usc:fix/schemaupdate-nested-added-shortname


@huan233usc

Contributor

Problem

SchemaUpdate records a column added in the current transaction in addedNameToId, keyed by the column's name, so that a later operation in the same transaction (updateColumn, updateColumnDoc, requireColumn, makeColumnOptional, or a move) can resolve the not-yet-committed column.

For a column added into a map value struct or a list element struct, internalAddColumn builds the key from the parent's canonical name via schema.findColumnName(parentId), which includes the synthetic value / element segment:

fullName = schema.findColumnName(parentId) + "." + name;   // e.g. "locations.value.alt"
addedNameToId.put(caseSensitivityAwareName(fullName), newId);

But callers address such a column by its user-facing short name (locations.alt, points.z) — the form the schema's own name index exposes for map value / list element structs. findForUpdate first tries findField(name) (which can't see the pending column) and then falls back to addedNameToId.get(name), so the lookup misses and the operation fails with Cannot update missing column: locations.alt.

Only the full canonical path (locations.value.alt) works today, which is not the name users write.
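The miss can be reproduced without Iceberg on the classpath. The sketch below is a simplified model, not the actual SchemaUpdate code: the class name, recordAdd, and lookupPending are hypothetical stand-ins, and the id 100 is arbitrary. It only shows that a map keyed by the canonical path cannot answer a lookup by the short name.

```java
import java.util.HashMap;
import java.util.Map;

class PendingColumnLookupMiss {
  // Stand-in for SchemaUpdate's addedNameToId (assumption: the real class holds more state).
  static final Map<String, Integer> addedNameToId = new HashMap<>();

  // Hypothetical model of the add: the key is the parent's canonical name plus the field name.
  static void recordAdd(String canonicalParentName, String name, int newId) {
    addedNameToId.put(canonicalParentName + "." + name, newId);
  }

  // Hypothetical model of the fallback lookup: only the pending-column map is consulted.
  static Integer lookupPending(String name) {
    return addedNameToId.get(name);
  }

  public static void main(String[] args) {
    // The canonical parent name of a map value struct carries the "value" segment.
    recordAdd("locations.value", "alt", 100);

    System.out.println(lookupPending("locations.value.alt")); // canonical path: found
    System.out.println(lookupPending("locations.alt"));       // short name: null, the miss
  }
}
```

In the real class, a null result from this fallback is what surfaces as Cannot update missing column.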

Example

new SchemaUpdate(schema, lastColumnId)
    .addColumn("locations", "alt", Types.FloatType.get()) // into a map<..., struct> value
    .updateColumnDoc("locations.alt", "altitude")          // -> IllegalArgumentException today
    .apply();

Fix

Also index the user-facing short name in addedNameToId for nested adds, so the shorthand resolves. This mirrors how the schema's own IndexByName tracks both the full and short paths (dropping value/element for structs nested in a map value / list element). Top-level and plain-struct adds are unaffected because their short name equals the full name.
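A minimal sketch of the dual indexing, under stated assumptions: it is a standalone model rather than the patch itself, the class and recordAdd helper are hypothetical, the caller passes in the parent's short name instead of deriving it from the schema, and the short name is added only when that key is still free. The ids are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

class ShortNameIndexSketch {
  // Stand-in for SchemaUpdate's addedNameToId.
  static final Map<String, Integer> addedNameToId = new HashMap<>();

  // Hypothetical helper: index a pending column under its canonical name and,
  // when it differs, under its user-facing short name as well.
  static void recordAdd(String canonicalParent, String shortParent, String name, int newId) {
    String fullName = canonicalParent + "." + name;
    String shortName = shortParent + "." + name;
    addedNameToId.put(fullName, newId);
    if (!shortName.equals(fullName)) {
      // putIfAbsent keeps an existing entry that already owns this key.
      addedNameToId.putIfAbsent(shortName, newId);
    }
  }

  public static void main(String[] args) {
    recordAdd("locations.value", "locations", "alt", 100); // map value struct
    recordAdd("points.element", "points", "z", 101);       // list element struct
    recordAdd("person", "person", "age", 102);             // plain struct: one key only

    System.out.println(addedNameToId.get("locations.alt"));
    System.out.println(addedNameToId.get("points.z"));
    System.out.println(addedNameToId.get("person.age"));
  }
}
```

The plain-struct case adds a single entry because its short name and full name coincide, which matches the statement that top-level and plain-struct adds are unaffected.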

Tests

Added TestSchemaUpdate.testUpdateColumnAddedToMapValueOrListElementInSameTransaction: adds a field into a map value struct and a list element struct, then references each by its short name (locations.alt, points.z) in the same transaction. It fails before the change with Cannot update missing column and passes after. The full TestSchemaUpdate suite remains green.

Comment on lines +171 to +177
// fullName uses the parent's canonical name, which for a map value or list element
// struct includes the synthetic "value"/"element" segment (e.g. locations.value.alt).
// Also index the user-facing short name (locations.alt) so later operations in this
// transaction that reference the column by that name resolve it.
// Add the short name only if it is not already mapped, so a name containing dots
// cannot shadow another added column that already owns that name (as its canonical
// full name or its own short name).

Member


This comment is too verbose. I suggest simplifying it. The same goes for the new tests.

Contributor Author


Thanks, simplified a bit.

…ction

SchemaUpdate records an added column in addedNameToId using the parent's
canonical name, which for a map value or list element struct includes the
synthetic "value"/"element" segment (e.g. locations.value.alt). But callers
address such a column by its user-facing short name (locations.alt), so a
later operation in the same transaction -- updateColumn, updateColumnDoc,
requireColumn, makeColumnOptional, or a move -- fails to resolve it with
"Cannot update missing column".

Also index the short name in addedNameToId so the shorthand resolves, matching
how the schema's own name index (IndexByName) tracks both the full and short
paths for map value / list element structs.

huan233usc force-pushed the fix/schemaupdate-nested-added-shortname branch from 2c14b2e to 2561fea on July 2, 2026 02:17
huan233usc requested a review from ebyhr on July 2, 2026 02:53

3 participants