Research: NaN handling for numeric observation fields — rejection vs. nilValue support #4

@Sam-Bolling

Finding

The server rejects observation payloads containing "NaN" as a string value for numeric (float/double) fields. This is a research spike to determine whether NaN handling for numeric observation fields is specified by OGC, what other servers do, and what the recommended behavior should be.

Review Source: Live integration testing from OSHConnect-Python OpenSky ADS-B publisher against both connected-systems-go and OSH SensorHub
Severity: P3-Minor (research spike)
Category: API Design / Data Quality
Ownership: connected-systems-go


Problem Statement

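Real-world sensor data frequently contains missing or undefined values. For example, ADS-B transponders may not report barometric altitude for all aircraft, resulting in null or NaN values in the source data. When our OpenSky publisher encounters these, the serialized payload carries "NaN" (a string), since JSON has no native NaN representation. Note that Python's json.dumps() on its own emits the bare token NaN for float("nan") by default (allow_nan=True), which is not valid JSON under RFC 8259; the quoted form seen below is what actually reaches the server.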
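For reference, the three ways a NaN can leave a Python client are sketched below using only the standard library. The `record` contents and the `nan_to_string` helper are illustrative assumptions, not code taken from opensky_publisher.py.

```python
import json
import math

# Illustrative record; field names mirror the payload in this issue.
record = {"baro_altitude": float("nan"), "geo_altitude": 10500.0}

# Default behaviour (allow_nan=True): the bare token NaN is written,
# which RFC 8259 does not allow.
default_text = json.dumps(record)

# Strict behaviour: json.dumps refuses to serialize NaN and raises ValueError.
try:
    json.dumps(record, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# Hypothetical helper: turn NaN into the string "NaN" before serializing,
# which yields valid JSON but a string where the schema expects a number.
def nan_to_string(value):
    if isinstance(value, float) and math.isnan(value):
        return "NaN"
    return value

string_text = json.dumps({k: nan_to_string(v) for k, v in record.items()})

print(default_text)
print(type(strict_error).__name__)
print(string_text)
```

Only the last form is syntactically valid JSON, and it is the one that fails numeric schema validation on the server.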

Observed behavior:

POST /api/datastreams/<uuid>/observations
{
  "result": {
    "baro_altitude": "NaN",
    "geo_altitude": 10500.0,
    ...
  }
}

→ connected-systems-go rejects this with a validation error: the field is declared as numeric in the schema but "NaN" is a string.

Comparative behavior:

Server | "NaN" string for numeric field | null for numeric field | 0.0 substitute
--- | --- | --- | ---
OSH SensorHub | ✅ Accepted (leniently) | Unknown | ✅ Accepted
connected-systems-go | ❌ Rejected | Unknown | ✅ Accepted

Workaround applied: Our OpenSky publisher detects the Go server and replaces NaN values with 0.0 before publishing. This loses the semantic meaning of "no data" vs "zero."
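A minimal sketch of why this substitution is lossy, assuming the workaround is a plain NaN-to-0.0 replacement on the result dictionary (the function name `replace_nan_with_zero` is hypothetical and not taken from the publisher):

```python
import math

def replace_nan_with_zero(result):
    """Return a copy of result with every float NaN replaced by 0.0."""
    return {
        field: 0.0 if isinstance(value, float) and math.isnan(value) else value
        for field, value in result.items()
    }

# An aircraft with no barometric reading...
missing = replace_nan_with_zero({"baro_altitude": float("nan")})
# ...and an aircraft genuinely reporting 0.0.
at_zero = replace_nan_with_zero({"baro_altitude": 0.0})

# After the workaround the two observations can no longer be told apart.
print(missing == at_zero)
```

Once stored, nothing on the server side can recover which of the two observations was originally a missing value.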

Research Questions

This spike should produce analysis, findings, and recommendations that may result in follow-on issues:

  1. How does JSON handle NaN? JSON spec (RFC 8259) does not include NaN. What are the common approaches for representing it?
  2. What does OGC SWE Common say about missing/no-data values? Is there a noDataValue or nilValue mechanism in the datastream schema?
  3. What does OGC 23-002 say about handling missing observation values? Are there provisions for partial results?
  4. How should the server handle null for a required numeric field? Is this a valid way to express "no data"?
  5. Should the server support "NaN" as a special sentinel? If so, how should it be stored in PostGIS?
  6. Is 0.0 a reasonable substitute, or does it corrupt data integrity? (For altitude, 0.0 means sea level, which is semantically different from "no reading.")
  7. Should the schema support declaring a nilValue per field? Then clients could use { "baro_altitude": -999.0 } as an explicit no-data sentinel.
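To make question 7 concrete, here is a rough sketch of what a per-field sentinel could look like on the client side. Everything in it is an assumption for discussion: the `NIL_VALUES` mapping, the -999.0 sentinel, and the `encode_result` / `decode_result` helpers are hypothetical and do not reflect any existing connected-systems-go or OSHConnect-Python API, nor the exact structure of SWE Common nilValues.

```python
import math

# Hypothetical per-field sentinels, as a datastream schema might declare them.
NIL_VALUES = {"baro_altitude": -999.0}

def is_missing(value):
    """True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def encode_result(result, nil_values):
    """Replace missing values with the declared sentinel for that field."""
    encoded = {}
    for field, value in result.items():
        if is_missing(value):
            if field not in nil_values:
                raise ValueError(f"no nil value declared for field {field!r}")
            encoded[field] = nil_values[field]
        else:
            encoded[field] = value
    return encoded

def decode_result(result, nil_values):
    """Map declared sentinels back to None when reading observations."""
    return {
        field: None if field in nil_values and value == nil_values[field] else value
        for field, value in result.items()
    }

sent = encode_result({"baro_altitude": float("nan"), "geo_altitude": 10500.0}, NIL_VALUES)
received = decode_result(sent, NIL_VALUES)
print(sent)
print(received)
```

Unlike the 0.0 substitution, this round-trips: the payload stays purely numeric, and a reader that knows the declared sentinel can still recover "no reading". The trade-off is that the sentinel must lie outside the field's valid range and must be known to every consumer.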

Expected Deliverables

  • Analysis of JSON NaN handling approaches across the ecosystem
  • Review of OGC SWE Common nilValue / noData mechanisms
  • Comparison of how CSAPI server implementations handle missing numeric data
  • Recommendation: reject, accept, convert, or implement nilValue support
  • If changes are recommended, file follow-on implementation issue(s)

Scope

  • ❌ Do NOT implement changes in this spike — research and recommend only
  • ❌ Do NOT change client-side workarounds — they are functional

References

# | Document | What It Provides
--- | --- | ---
1 | RFC 8259 | JSON specification (no NaN support)
2 | OGC SWE Common 3.0 §7.3 | Quantity component, nilValues
3 | OGC 23-002 §9.7 | Observation result encoding
4 | IEEE 754 | Floating-point NaN definition
5 | OSHConnect-Python opensky_publisher.py | Publisher with NaN→0.0 workaround

Metadata

Labels: enhancement (New feature or request)
