Skip to content

ETL: sync planets and moons with orbit name resolution #192

@GitAddRemote

Description

@GitAddRemote

User Story

As a backend engineer, I need station_planets and station_moons populated from UEX API data with their parent orbit FKs resolved by orbit name when the direct orbit ID is unavailable, so that the location hierarchy is complete and human-readable orbit names are preserved for display.

Definition of Done

  • PlanetsSyncStep ETL step created and registered at tier-4 position (after orbits)
  • MoonsSyncStep ETL step created and registered at tier-4 position (after orbits)
  • station_planets populated with FK orbit_id resolved from station_orbits
  • station_moons populated with FK orbit_id resolved from station_orbits
  • When id_orbit is null but name_orbit is present, resolve orbit UUID by exact name match in station_orbits
  • Failed name resolutions emit severity='warn' to station_etl_warnings with raw_payload containing the planet/moon record
  • orbit_name VARCHAR(255) column preserved on both tables for display fallback
  • Unit tests cover: direct FK resolution, name-based resolution, unresolvable orbit warning
  • pnpm test passes

Acceptance Criteria

  • After ETL: all station_planets rows have non-null orbit_id OR have a warning row in station_etl_warnings explaining the resolution failure
  • After ETL: orbit_name column is populated for every row regardless of resolution outcome
  • Re-running ETL is idempotent — no duplicate rows
  • A planet with neither id_orbit nor name_orbit emits a severity='error' warning and is skipped

Technical Elaboration

UEX Endpoints

  • GET /planets{ id, name, code, id_orbit, name_orbit, id_star_system, is_available, is_available_live, ... }
  • GET /moons{ id, name, code, id_orbit, name_orbit, id_planet, is_available, is_available_live, ... }

Target Schema

station_planets: uex_id INT UNIQUE, name VARCHAR(255), code VARCHAR(50), orbit_id UUID FK → station_orbits, orbit_name VARCHAR(255), star_system_id UUID FK → station_star_systems, is_available BOOLEAN, is_available_live BOOLEAN

station_moons: uex_id INT UNIQUE, name VARCHAR(255), code VARCHAR(50), orbit_id UUID FK → station_orbits, orbit_name VARCHAR(255), parent_planet_id UUID FK → station_planets NULLABLE, is_available BOOLEAN, is_available_live BOOLEAN

Two-Pass Resolution

Pass 1: Build lookup maps from in-memory:

const orbitByUexId = new Map<number, string>();  // uex_id → uuid
const orbitByName  = new Map<string, string>();  // name   → uuid

Pass 2: For each planet/moon:

  1. If id_orbit present → look up orbitByUexId
  2. Else if name_orbit present → look up orbitByName (case-insensitive)
  3. Else → emit severity='error' warning, skip row

Store the raw name_orbit string in orbit_name regardless of resolution outcome.

Moon Parent Resolution

id_planet on moons may be null (some moons orbit directly without a parent planet record). Set parent_planet_id to null in those cases — do not warn, this is expected data.

Warning Row Format

{
  runId: currentRunId,
  stepName: 'planets-sync',
  severity: 'warn',
  message: `Could not resolve orbit for planet uex_id=${planet.id}: name_orbit='${planet.name_orbit}'`,
  rawPayload: planet,
}

Design Elaboration

UEX API orbit data is inconsistent — some planet records carry id_orbit (integer), others carry only name_orbit (string), and rarely both are absent. The two-pass name resolution approach handles all observed cases without requiring a separate API call per planet. The orbit_name column is preserved even when resolution succeeds because it is the human-readable name used in UI display — querying station_orbits.name by UUID join adds latency unnecessarily for a column that changes infrequently.

Warnings for unresolvable orbits are surfaced via station_etl_warnings so operators can identify data gaps without log tailing. A planet with a warning is excluded from the catalog until the gap is resolved (either by UEX fixing their data or by a manual mapping entry).


Depends on: #191

Metadata

Metadata

Assignees

No one assigned

    Labels

    backendBackend services and logicdatabaseSchema, migrations, indexingenhancementNew feature or requesttech-storyTechnical implementation storyuex-syncUEXcorp API sync and integration

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions