Implement scale/offset codec in the EOPF data model #134

@emmanuelmathot

Context

The scale/offset codec extension has been merged in the zarr-extensions repository, providing a standardized Zarr V3 array-to-array codec for scale/offset transformation. This closes the specification gap documented in zarr-developers/zarr-extensions#42.

The next step is implementation in zarr-python, which would make the codec transparently available through xarray and our downstream stack (data pipeline, TiTiler).

What this enables

With a standardized and implemented scale/offset codec, our GeoZarr data arrays can:

  • Store reflectance data efficiently as uint16 (0–10000) while transparently presenting float32 values (0.0–1.0) to users
  • Follow CF conventions for physical quantities without requiring users to manually apply scale_factor / add_offset attributes
  • Decouple storage encoding from data presentation, which is the design intent of Zarr V3
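
For illustration, the transformation the codec would apply on read and write can be sketched in plain NumPy. This is a minimal sketch, not the codec's actual API or parameter naming: it assumes the CF-style unpacking rule `decoded = stored * scale_factor + add_offset`, and the constants `SCALE_FACTOR` and `ADD_OFFSET` and the helpers `decode_reflectance` / `encode_reflectance` are hypothetical names chosen to match the uint16 (0–10000) to float32 (0.0–1.0) mapping described above.

```python
import numpy as np

# Assumed CF-style parameters for reflectance stored as uint16 in 0..10000.
# These names are illustrative, not taken from the codec specification.
SCALE_FACTOR = 1e-4
ADD_OFFSET = 0.0


def decode_reflectance(stored: np.ndarray) -> np.ndarray:
    """Unpack stored integers into float32 physical values."""
    unpacked = stored.astype(np.float32) * np.float32(SCALE_FACTOR)
    return unpacked + np.float32(ADD_OFFSET)


def encode_reflectance(values: np.ndarray) -> np.ndarray:
    """Pack float values into uint16, rounding to the nearest integer."""
    packed = np.rint((values.astype(np.float64) - ADD_OFFSET) / SCALE_FACTOR)
    # Clip to the representable uint16 range before casting.
    return np.clip(packed, 0, 65535).astype(np.uint16)


stored = np.array([0, 2500, 10000], dtype=np.uint16)
reflectance = decode_reflectance(stored)    # float32 values in 0.0..1.0
roundtrip = encode_reflectance(reflectance)  # back to the stored integers
```

With the codec in place, this unpacking would happen inside the Zarr read path rather than in user code, so consumers such as xarray would only ever see the float32 values.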

This directly addresses the reflectance data convention discussion with EOPF CPM, where both sides agree on integer storage with codec-based float presentation.

Tasks

  • Confirm zarr-python implementation status. @d-v-b: is the scale/offset codec from zarr-extensions#43 being implemented in zarr-python? If so, this is an activity we can do in this project.
  • Update the converter and the data pipeline to use the codec @lhoupert
  • Validate downstream compatibility: confirm that TiTiler correctly decodes data through the codec without additional handling @vincentsarago
  • Document the convention in the data model spec @emmanuelmathot
