
Datastream creation without explicit uid stores empty string, violates unique constraint on second create #1

@Sam-Bolling

Finding

Datastream creation without an explicit uid field stores an empty string in the database, which violates a unique constraint on subsequent creates — making it impossible to create more than one datastream without manually providing UIDs.

Review Source: Live integration testing from ogc-csapi-explorer and OSHConnect-Python publishers
Severity: P1-Critical
Category: API Design / Data Integrity
Ownership: connected-systems-go


Problem Statement

When a client POSTs a new datastream resource without a uid field in the request body, the server stores an empty string "" as the UID value in the PostgreSQL/PostGIS database. The second datastream creation without uid then fails with a PostgreSQL unique constraint violation, because both rows would have uid = "".

Affected operation:

POST /api/datastreams
Content-Type: application/json

{
  "name": "My Datastream",
  "system@link": { "href": "systems/<uuid>" },
  "observedProperties": [...],
  "schema": { ... }
}

Expected behavior:

One of:

  1. Server auto-generates a UUID for uid when omitted (preferred — matches how id is auto-generated)
  2. Server treats omitted uid as NULL (not empty string), allowing multiple datastreams without UIDs
  3. Server returns a 400 Bad Request with a clear error message indicating uid is required

Actual behavior:

  • First POST succeeds — datastream created with uid = ""
  • Second POST fails with a PostgreSQL unique constraint violation error on the uid column
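
One plausible cause, sketched below, is that the request body is decoded into a Go struct whose UID field is a plain string, so an omitted key becomes the zero value "". This is an assumption: the issue does not show the server code, and the datastreamRequest type and decodeUID helper are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// datastreamRequest is a hypothetical request model, not the actual
// connected-systems-go type. UID is a plain string, so an omitted "uid"
// key decodes to the zero value "".
type datastreamRequest struct {
	Name string `json:"name"`
	UID  string `json:"uid"`
}

// decodeUID parses a request body and returns the UID exactly as decoded.
func decodeUID(body string) (string, error) {
	var req datastreamRequest
	if err := json.Unmarshal([]byte(body), &req); err != nil {
		return "", err
	}
	return req.UID, nil
}

func main() {
	uid, err := decodeUID(`{"name": "My Datastream"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("uid=%q\n", uid) // prints uid=""
}
```

If the real model behaves this way, every request that omits uid would try to insert the same "" value, which matches the failure on the second create.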

Workaround: Clients must always provide an explicit uid value. Our publishers now include UIDs like "uid": "urn:os4csapi:datastream:usgs-eq-feed:earthquakeEvent:v1" on all datastream creation requests.

Impact: Any client that creates multiple datastreams without explicit UIDs will fail on the second create. The error message references a PostgreSQL constraint name rather than a user-friendly API error, making it difficult to diagnose.

Proposed Solutions

Option A: Auto-generate UID when omitted (Recommended)

If the request body does not include uid, generate one server-side (e.g., a UUID or URN). This matches the behavior for id, which is already auto-generated.

Pros: Most user-friendly; clients don't need to know about UIDs to get started
Cons: Auto-generated UIDs are less meaningful than client-provided ones
Effort: Small | Risk: None
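
A minimal sketch of Option A, assuming the UID reaches the creation path as a string. The ensureUID and newUUIDv4 names are hypothetical, and the urn:uuid: form is one possible choice for the generated value, not a format the project has confirmed. The UUID is built with the standard library only.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"strings"
)

// newUUIDv4 returns a random RFC 4122 version 4 UUID in canonical text form.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// ensureUID keeps a client-provided UID and generates one when the UID is
// omitted or blank. The "urn:uuid:" prefix is an assumption for this sketch.
func ensureUID(uid string) (string, error) {
	if strings.TrimSpace(uid) != "" {
		return uid, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	return "urn:uuid:" + id, nil
}

func main() {
	generated, err := ensureUID("")
	if err != nil {
		panic(err)
	}
	fmt.Println(generated) // a fresh urn:uuid:... value on each run

	explicit, _ := ensureUID("urn:example:ds:1")
	fmt.Println(explicit) // the client-provided value, unchanged
}
```

Calling such a helper before the insert would keep the unique constraint meaningful for explicit UIDs while removing the "" collision.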

Option B: Treat omitted UID as NULL

Allow NULL in the uid column and store NULL instead of "" when uid is omitted. The unique constraint itself should not need to change: PostgreSQL UNIQUE constraints treat NULLs as distinct by default, so multiple rows with a NULL uid are allowed.

Pros: Simple database-level fix
Cons: Loses the ability to look up resources by UID when one wasn't provided
Effort: Small | Risk: Low
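
On the Go side, Option B could look roughly like the sketch below, which maps an empty UID to SQL NULL using the standard library's sql.NullString. The uidColumn helper is a hypothetical name, and how the project actually binds query parameters is not shown in this issue.

```go
package main

import (
	"database/sql"
	"fmt"
)

// uidColumn converts an optional UID into a nullable column value.
// An empty UID becomes SQL NULL rather than the string "".
func uidColumn(uid string) sql.NullString {
	return sql.NullString{String: uid, Valid: uid != ""}
}

func main() {
	fmt.Println(uidColumn("").Valid)                 // false: written as NULL
	fmt.Println(uidColumn("urn:example:ds:1").Valid) // true: written as text
}
```

Passing the returned value as the uid parameter of the INSERT would store NULL for omitted UIDs, provided the column is nullable.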

Option C: Require UID explicitly

Return 400 Bad Request with a clear message like "uid is required" when omitted.

Pros: Forces clients to be explicit
Cons: Increases friction; not all OGC spec examples include uid
Effort: Small | Risk: Low (but may break existing clients)
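
For comparison, Option C could be sketched as a validation step in the handler. The createDatastream type, createHandler, and postStatus are hypothetical names, and persistence is omitted; only the 400 response and the "uid is required" message come from the option as described.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// createDatastream is a hypothetical request model for this sketch.
type createDatastream struct {
	Name string `json:"name"`
	UID  string `json:"uid"`
}

// createHandler rejects requests whose uid is missing or blank.
func createHandler(w http.ResponseWriter, r *http.Request) {
	var req createDatastream
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	if strings.TrimSpace(req.UID) == "" {
		http.Error(w, "uid is required", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusCreated) // persistence omitted in this sketch
}

// postStatus sends body to the handler in-process and returns the status code.
func postStatus(body string) int {
	req := httptest.NewRequest(http.MethodPost, "/api/datastreams", strings.NewReader(body))
	rec := httptest.NewRecorder()
	createHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(postStatus(`{"name": "My Datastream"}`))                            // 400
	fmt.Println(postStatus(`{"name": "My Datastream", "uid": "urn:example:ds:1"}`)) // 201
}
```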

Scope — What NOT to Touch

  • ❌ Do NOT change the unique constraint behavior for other resource types unless they have the same issue
  • ❌ Do NOT modify the id auto-generation mechanism

Acceptance Criteria

  • Creating a datastream without uid does not produce a database constraint error
  • Creating multiple datastreams without uid succeeds
  • Creating a datastream with an explicit uid still works as before
  • Duplicate uid values are still rejected (when explicitly provided)
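
For the last criterion, the rejection could be reported as an API-level error rather than a raw constraint name. The sketch below maps PostgreSQL's unique_violation SQLSTATE (23505) to 409 Conflict. The choice of 409, the sqlStateError interface, and the fakePgError type are all assumptions for illustration; the issue does not say which database driver the server uses or how it exposes error codes.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// sqlStateError is a hypothetical interface for driver errors that expose
// a SQLSTATE code. The real driver's error type may differ.
type sqlStateError interface {
	error
	SQLState() string
}

// uniqueViolation is the PostgreSQL SQLSTATE code for unique_violation.
const uniqueViolation = "23505"

// statusForInsertError maps an insert error to an HTTP status and message.
func statusForInsertError(err error) (int, string) {
	if err == nil {
		return http.StatusCreated, ""
	}
	var se sqlStateError
	if errors.As(err, &se) && se.SQLState() == uniqueViolation {
		return http.StatusConflict, "a datastream with this uid already exists"
	}
	return http.StatusInternalServerError, "internal error"
}

// fakePgError stands in for a driver error in this sketch.
type fakePgError struct{ code string }

func (e fakePgError) Error() string    { return "pg error " + e.code }
func (e fakePgError) SQLState() string { return e.code }

func main() {
	err := fmt.Errorf("insert datastream: %w", fakePgError{code: "23505"})
	status, msg := statusForInsertError(err)
	fmt.Println(status, msg) // 409 a datastream with this uid already exists
}
```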

Dependencies

Blocked by: Nothing
Blocks: Nothing


References

  1. OGC 23-002 §9.2: Datastream resource definition
  2. ogc-csapi-explorer: Client that discovered the issue
  3. OSHConnect-Python publishers: Publishers that work around the issue by providing explicit UIDs

Labels: bug
