Skip to content

Flink: Clean up UpdateSchema instantiator method#13640

Merged
pvary merged 2 commits into
apache:mainfrom
mxm:cleanup-test
Jul 24, 2025
Merged

Flink: Clean up UpdateSchema instantiator method#13640
pvary merged 2 commits into
apache:mainfrom
mxm:cleanup-test

Conversation

@mxm
Copy link
Copy Markdown
Contributor

@mxm mxm commented Jul 23, 2025

Currently, we need to supply the highest field id when instantiating the SchemaUpdate constructor. This can be automated.

@github-actions github-actions Bot added the flink label Jul 23, 2025
() ->
EvolveSchemaVisitor.visit(
loadUpdateApi(currentSchema, 3), currentSchema, targetSchema))
loadUpdateApi(currentSchema), currentSchema, targetSchema))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is not quite right, in testReplaceListWithPrimitive lastColumnId is 3, but schema.highestFieldId() is 2.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That observation is correct, but it makes no difference for the correctness of the test. The columns are matched by name, not by id. Before we even add the column, where the lastColumnId would come into place for generating a new field id, the test will already fail with the expected error message.

Generally, any value for lastColumnId >= the highest id (2 here) will work (the ids must not overlap). Note that the SchemaUpdate code will increase the provided value by 1 before using it as an id for a new field.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, wondering though that was is accidental, that a bigger id than the highest was used here, or it was intentional. I suppose it is the former.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good attention to detail. The refactoring is precisely to prevent setting the wrong id. 3 wasn't wrong, but it could have been 2 as well (the max current field id). At runtime, SchemaUpdate will set this id automatically, but we are using the test constructor here.

@pvary pvary merged commit a8c3a59 into apache:main Jul 24, 2025
18 checks passed
@pvary
Copy link
Copy Markdown
Contributor

pvary commented Jul 24, 2025

Merged to main.
Thanks for the simplification in tests for @mxm and @nandorKollar for the review!

mxm added a commit to mxm/iceberg that referenced this pull request Jul 24, 2025
pvary pushed a commit that referenced this pull request Jul 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants