Skip to content

User-guide page: safe GeoTIFF IO usage (#2345 PR 1)#2386

Merged
brendancol merged 3 commits into
mainfrom
issue-2379
May 25, 2026
Merged

User-guide page: safe GeoTIFF IO usage (#2345 PR 1)#2386
brendancol merged 3 commits into
mainfrom
issue-2379

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2379. Part of epic #2345 PR 1.

Summary

Adds docs/source/user_guide/geotiff_safe_io.rst, a user-facing page that
answers "is this safe to rely on?" for xrspatial.geotiff without making
the caller read source code or SUPPORTED_FEATURES directly. Wires the
file into docs/source/user_guide/index.rst.

The page covers:

  • Entry points to prefer (open_geotiff, to_geotiff, read_vrt,
    write_vrt, write_geotiff_gpu) and what each one expects.
  • The SUPPORTED_FEATURES tier vocabulary: stable / advanced /
    experimental / internal-only, with a runtime check example.
  • Recommended codecs for the stable contract (none, deflate, lzw,
    packbits, zstd) and what disqualifies a file (experimental codecs,
    internal-only jpeg, non-stable read/write paths).
  • COG output: when to pass cog=True, the layout it produces, and the
    combinations that fall outside the stable contract (GPU, BigTIFF COG,
    HTTP COG, file-like destinations, rotated transforms, sidecar
    overviews, experimental codecs).
  • Fail-closed errors a caller will hit, what each one means, and the
    opt-in (where one exists).
  • Remote-read safety limits: max_cloud_bytes kwarg,
    XRSPATIAL_GEOTIFF_MAX_CLOUD_BYTES, XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS,
    HTTP connect / read timeouts, strict mode, mmap cache size.
  • Cross-links to the reference page, the release contract, the release
    gate audit checklist, and the attrs contract.

The page does not claim full GDAL / VRT / GPU parity anywhere.

Test plan

  • rst2html reports no structural errors on the new file (Sphinx
    roles flagged as "unknown" are expected outside Sphinx).
  • All cross-reference targets (reference.geotiff,
    reference.geotiff_release_contract, reference.geotiff_release_gate,
    user_guide.attrs_contract) exist in the docs tree.
  • SUPPORTED_FEATURES tier strings on the page match
    xrspatial/geotiff/_attrs.py at the current commit.
  • Sphinx CI builds the page without warnings.

New docs page walks a caller through the GeoTIFF entry points, the
SUPPORTED_FEATURES tier vocabulary, the codec subset inside the stable
contract, the COG layout produced by cog=True, the fail-closed errors
the reader and writer raise, and the env vars / kwargs that bound
remote reads. Cross-links to the existing reference, release contract,
release gate, and attrs contract pages.

Part of #2345 PR 1.

Closes #2379.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 25, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: User-guide page: safe GeoTIFF IO usage (#2345 PR 1)

Docs-only PR. The new page is accurate against _attrs.py, _errors.py, and open_geotiff at the current commit. The review focused on cross-reference correctness, factual claims, and Sphinx role usage.

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • docs/source/user_guide/geotiff_safe_io.rst:195 and :227: VRTUnsupportedError is written as inline code instead of a :class: cross-ref because it is not re-exported from xrspatial.geotiff. The class is already in _errors.__all__, so adding it to the from ._errors import (...) line in xrspatial/geotiff/__init__.py and the package __all__ would let both mentions switch to :class:\~xrspatial.geotiff.VRTUnsupportedError`` and match the rest of the table. Keeps the user-facing error names on one import path. If you want this PR to stay strictly docs-only, file a follow-up.

  • docs/source/user_guide/geotiff_safe_io.rst:269 (Byte budget): "The module default" leaves the reader without a concrete number. The default is set in xrspatial/geotiff/_sources.py near _resolve_max_cloud_bytes. Naming it (or at least pointing at the helper the way the page does for env vars) helps a caller decide whether they need to touch the env var at all.

  • docs/source/user_guide/geotiff_safe_io.rst:38-43 (open_geotiff row): says "A path or a binary file-like is the only required argument" but the file-like path is restricted to the eager numpy reader. Dask, GPU, VRT, and remote-URL paths require a string. The open_geotiff docstring spells this out under source. One sentence on this page would save a reader from passing a BytesIO with chunks=N and getting a confusing failure.

Nits (optional improvements)

  • docs/source/user_guide/geotiff_safe_io.rst:34: "Single dispatch for reading" reads a bit clipped. "The read entry point" matches the docstring tone better.

  • docs/source/user_guide/geotiff_safe_io.rst:182: the sentence says every error below subclasses ValueError, which is true. The carve-out for UnsupportedGeoTIFFFeatureError (not a GeoTIFFAmbiguousMetadataError) lives down at the bottom of the table at line 233. Mentioning the GeoTIFFAmbiguousMetadataError vs direct-ValueError split once at the top would save the second read.

  • docs/source/user_guide/geotiff_safe_io.rst:127 (Recommended codecs): the none bullet says "no compression" without the TIFF spec name. Worth noting none is the codec keyword the writer accepts; the TIFF spec calls this COMPRESSION_NONE. Optional.

  • The page uses both "Cloud-Optimised" and "Cloud-Optimized" spellings. The existing reference page uses "Cloud-Optimised"; keep this one consistent with that.

What looks good

  • Tier vocabulary section maps cleanly to the four strings in SUPPORTED_FEATURES at _attrs.py:264. The runtime check example is correct.
  • Stable COG contract bullets match _attrs.py:209-237.
  • The "outside the stable contract" combinations match the experimental / advanced tier strings in SUPPORTED_FEATURES: GPU read/write, BigTIFF COG, HTTP COG, file-like with cog=True, sidecar overviews, and rotated transforms all line up.
  • Error table covers every public class in xrspatial.geotiff._errors and the opt-in flags match the constructors of open_geotiff and to_geotiff.
  • Env var names match the strings in _sources.py and _runtime.py.
  • Cross-reference targets (reference.geotiff, reference.geotiff_release_contract, reference.geotiff_release_gate, user_guide.attrs_contract) all exist in the docs tree.
  • The page does not claim full GDAL / VRT / GPU parity, per the acceptance criterion on #2379.

Checklist

  • Cross-reference targets exist
  • Tier strings match SUPPORTED_FEATURES at HEAD
  • Error class names match xrspatial.geotiff._errors
  • Env var names match _sources.py / _runtime.py
  • No code or API changes (docs-only)
  • Wired into the user guide TOC
  • Sphinx CI confirms a clean build (waiting on CI)
  • [N/A] Backend parity / NaN handling / dask correctness (docs-only PR)
  • [N/A] README feature matrix (no new functions)

Apply the suggestions and nits from the PR #2386 review.

- Note that the binary file-like read form is restricted to the eager
  numpy reader; dask, GPU, VRT, and remote-URL paths require a string.
- Name the default max_cloud_bytes value (256 MiB) and point at
  MAX_CLOUD_BYTES_DEFAULT so callers can find it.
- Hoist the GeoTIFFAmbiguousMetadataError vs direct-ValueError split
  to the top of the error section so the carve-out for
  UnsupportedGeoTIFFFeatureError reads as introduction, not footnote.
- Add the TIFF spec name COMPRESSION_NONE next to the ``none`` codec
  bullet.
- Switch "Cloud-Optimised" to "Cloud-optimized" to match
  docs/source/reference/release_gate_geotiff.rst.
- Reword the open_geotiff blurb away from "single dispatch" to "the
  read entry point".

The VRTUnsupportedError re-export suggestion is deferred -- that's a
code change and #2379 is docs-only.
The intro said the first eight entries subclass GeoTIFFAmbiguousMetadataError.
The table actually lists nine such entries; UnsupportedGeoTIFFFeatureError
is the only direct ValueError subclass. Reword to "every entry except
UnsupportedGeoTIFFFeatureError" so the count cannot drift.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review (follow-up): User-guide page: safe GeoTIFF IO usage

Re-review after the fix-up commits at 3576b588 and e9089e68.

Blockers

None.

Suggestions

None outstanding. The original suggestions were applied:

  • File-like restriction now stated on the open_geotiff row.
  • Default max_cloud_bytes named as 256 MiB with a pointer to
    MAX_CLOUD_BYTES_DEFAULT.
  • GeoTIFFAmbiguousMetadataError vs direct-ValueError split hoisted
    to the top of the error section.
  • Off-by-one in "first eight entries" corrected to "every entry except
    UnsupportedGeoTIFFFeatureError" so the count cannot drift.
  • COMPRESSION_NONE noted next to the none codec.
  • "Cloud-Optimised" switched to "Cloud-optimized" to match
    release_gate_geotiff.rst.
  • "Single dispatch for reading" reworded to "the read entry point".

Deferred:

  • Re-exporting VRTUnsupportedError from xrspatial.geotiff so the
    two inline references can become :class: cross-refs. The fix is a
    one-line change to xrspatial/geotiff/__init__.py, but #2379 is
    scoped docs-only. Worth a follow-up issue rather than expanding
    this PR.

Nits

None.

What looks good

  • All cross-references still resolve.
  • rst2html reports no structural errors (Sphinx roles flagged as
    "unknown" are expected outside Sphinx).
  • The page does not claim full GDAL / VRT / GPU parity.
  • Tier strings still match SUPPORTED_FEATURES at HEAD.

Checklist

  • All actionable items from the first review either applied or
    deferred with a reason.
  • No structural RST errors.
  • Cross-references still valid.
  • Sphinx CI confirms a clean build (waiting on CI).

@brendancol
Copy link
Copy Markdown
Contributor Author

CI status

Three required pytest jobs failed on this PR:

  • run (ubuntu-latest, 3.14)
  • run (macos-latest, 3.14)
  • run (windows-latest, 3.14)

All five failures are in xrspatial/geotiff/tests/test_vrt_unsupported_2370.py::test_unsupported_resample_alg_raises (Bilinear, Cubic, Lanczos, Average, Mode parametrizations). The error message says the VRT validator is raising on differing SrcRect (4x4) and DstRect (2x2) sizes before the test asserts on the resample-algorithm rejection.

This is pre-existing on main: the pytest workflow run for commit 9cd54f6e (the current main HEAD that this branch branched from) reports the same failure. The failures predate the docs-only changes in this PR and are unrelated to anything in docs/source/user_guide/geotiff_safe_io.rst.

mergeable reports MERGEABLE with mergeStateStatus=BLOCKED only because the same pytest workflow runs on the PR.

The remaining checks (3.12 fast lane, ubuntu 3.14 alternate matrix entry, label, RTD docs) are passing.

Suggest landing the VRT test fix in a separate PR; this docs page does not affect runtime behaviour.

@brendancol brendancol merged commit bca64b5 into main May 25, 2026
4 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

User-guide page: safe GeoTIFF IO usage (#2345 PR 1)

1 participant