Fix precision loss in large integral string conversions by fallintoplace · Pull Request #3405 · apache/iceberg-python

fallintoplace · 2026-05-22T23:51:56Z

Summary

Fixes precision loss when converting large integral strings in two runtime paths:

StringLiteral.to(IntegerType/LongType)
partition_to_py(...) for integral and time-based partition values backed by integers

Root cause

Both paths converted the string to float and then to int. An IEEE-754 double represents every integer exactly only up to 2^53 in magnitude, so larger values can be silently rounded to a neighboring integer.

That caused valid 64-bit integers like LongType.max and 9007199254740993 to be corrupted.
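The rounding can be reproduced with plain Python, independent of the library. This is a minimal illustration, assuming LongType.max is the signed 64-bit maximum (2**63 - 1); the variable names are made up for the example.

```python
# 2**53 + 1 is the smallest positive integer a double cannot represent exactly.
exact_text = "9007199254740993"
via_float = int(float(exact_text))   # lossy round trip through float
via_int = int(exact_text)            # exact parsing

# Assumption: LongType.max equals the signed 64-bit maximum.
long_max = 2**63 - 1
long_max_via_float = int(float(str(long_max)))

print(via_float)            # 9007199254740992 (off by one)
print(via_int)              # 9007199254740993
print(long_max_via_float)   # 9223372036854775808 (2**63, outside the 64-bit range)
```

The second case is worse than an off-by-one error: the rounded value no longer fits in a signed 64-bit integer at all.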

What changed

Replaced int(float(...)) with exact integer parsing in partition_to_py
For StringLiteral.to(IntegerType/LongType), exact integral strings now use exact integer parsing while fractional numeric strings retain the existing truncation behavior
Added regression tests for LongType.max and 9007199254740993
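The behavior described for StringLiteral.to(IntegerType/LongType) can be sketched roughly as follows. This is not the PR's code, which is not shown in this thread; parse_integral_string is a hypothetical helper that only illustrates "exact parsing first, truncation for fractional strings".

```python
def parse_integral_string(value: str) -> int:
    """Hypothetical helper: parse exactly when possible, else truncate."""
    try:
        # Plain integer literals are parsed exactly, with no float round trip.
        return int(value)
    except ValueError:
        # Fractional numeric strings keep the older behavior:
        # go through float and truncate toward zero.
        return int(float(value))


print(parse_integral_string("9007199254740993"))  # 9007199254740993
print(parse_integral_string("12.9"))              # 12
print(parse_integral_string("-12.9"))             # -12
```

In this sketch, any string that int() rejects, such as "9007199254740993.0", still takes the float branch.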

Validation

uv run pytest tests/expressions/test_literals.py tests/test_conversions.py

Closes #3404.

ndrluis · 2026-05-24T02:11:06Z

@fallintoplace Thank you for your contribution.

Can we also handle and add tests for large integral-looking strings that are not plain integer literals, e.g. "9007199254740993.0", "9007199254740993e0", and f"{LongType.max}.0"? These still go through the float path today and either lose precision or overflow incorrectly. A test for "1e3" would also catch the scientific-notation regression from the previous behavior.

fallintoplace · 2026-05-24T11:24:20Z

Thanks, pushed a follow-up in d443f37 to handle the remaining integral-looking string cases without going through float, and added regressions for "9007199254740993.0", "9007199254740993e0", f"{LongType.max}.0", and "1e3".
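For reference, the four strings named above can be parsed without float by using decimal.Decimal, whose string constructor is exact. This is a small sketch, not the test code from d443f37; LONG_MAX is assumed to be 2**63 - 1 and truncate_to_int is an illustrative name.

```python
from decimal import Decimal

# Assumption: LongType.max equals the signed 64-bit maximum.
LONG_MAX = 2**63 - 1


def truncate_to_int(value: str) -> int:
    """Illustrative helper: parse exactly, then truncate toward zero."""
    return int(Decimal(value))


results = {
    text: truncate_to_int(text)
    for text in (
        "9007199254740993.0",
        "9007199254740993e0",
        f"{LONG_MAX}.0",
        "1e3",
    )
}
print(results)
```

Fractional strings still truncate toward zero under this approach, so truncate_to_int("12.9") gives 12.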

ndrluis · 2026-05-24T14:40:33Z



+def _truncate_numeric_string_to_int(value: str) -> int:
+    return int(Decimal(value))


int(Decimal(value)) fixes the precision issue, but it also means we materialize the full integer before doing the bounds check. For example, literal("1e1000000").to(IntegerType()) eventually returns IntAboveMax, but only after constructing a very large Python int first. In my local check this took ~17s.

Can we compare the parsed Decimal against IntegerType.max/min or LongType.max/min before converting to int?

    number = Decimal(self.value)
    if number > IntegerType.max:
        return IntAboveMax()
    elif number < IntegerType.min:
        return IntBelowMin()
    return LongLiteral(int(number))

That should preserve the precision fix while avoiding excessive CPU/memory use for obviously out-of-range values.
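A self-contained version of that idea might look like the sketch below. It is not the code from the PR: the bounds are assumed 32-bit limits, the string tags stand in for the IntAboveMax/IntBelowMin sentinels, and bounded_int_from_string is a hypothetical name.

```python
from decimal import Decimal

# Assumed signed 32-bit bounds, standing in for IntegerType.min/max.
INT_MIN, INT_MAX = -(2**31), 2**31 - 1


def bounded_int_from_string(value: str, lo: int = INT_MIN, hi: int = INT_MAX):
    """Hypothetical helper: range-check the Decimal before building an int."""
    number = Decimal(value)
    if not number.is_finite():
        # NaN and infinities are outside the scope of this sketch.
        raise ValueError(f"not a finite number: {value!r}")
    # Comparing Decimal with int is exact and does not expand the exponent,
    # so a value like 1e1000000 is rejected without creating a huge int.
    if number > hi:
        return "above_max"
    if number < lo:
        return "below_min"
    return int(number)  # safe: the value is known to be within [lo, hi]


print(bounded_int_from_string("1e1000000"))   # above_max
print(bounded_int_from_string("-1e1000000"))  # below_min
print(bounded_int_from_string("1e3"))         # 1000
```

One consequence of checking before truncating: in this sketch "2147483647.9" is reported as above the maximum rather than truncated to 2147483647. Whether that matches the project's intended behavior is not stated in this thread.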

fallintoplace · May 24, 2026

Thanks, pushed a follow-up in 29f7d36.

fix: preserve precision for large integral string conversions

f9f2de0

fallintoplace marked this pull request as ready for review May 22, 2026 23:54

refine integral string parsing behavior

6bdb20c

fix: preserve exact numeric string literal conversions

d443f37

ndrluis reviewed May 24, 2026

fix: avoid materializing oversized numeric strings

29f7d36

ndrluis approved these changes May 24, 2026

fallintoplace wants to merge 4 commits into apache:main from fallintoplace:fix-integral-string-precision
