Skip to content

ETL: sync categories and category attributes into station_* tables #196

@GitAddRemote

Description

@GitAddRemote

User Story

As a backend engineer, I need station_categories and station_category_attributes populated from UEX API data with synthetic section-level parent rows created for each top-level grouping, so that the frontend can render a two-level category hierarchy (section → category) for the catalog browser.

Definition of Done

  • CategoriesSyncStep and CategoryAttributesSyncStep ETL step classes created
  • Steps registered in CatalogEtlService at tier-8 (before items/vehicles/commodities)
  • UEX categories upserted into station_categories by uex_id
  • Synthetic is_section = TRUE parent rows created for each unique section grouping present in UEX category data
  • Synthetic rows use deterministic UUIDs (uuidv5 keyed on section name)
  • parent_id FK on leaf category rows references the correct synthetic section row
  • station_category_attributes upserted per (category_id, attribute_key) composite
  • Unit tests: section row creation, leaf row parent FK, attribute upsert, idempotency
  • pnpm test passes

Acceptance Criteria

  • After ETL: SELECT count(*) FROM station_categories WHERE is_section = TRUE equals number of distinct section groupings in UEX data
  • After ETL: all non-section categories have a non-null parent_id referencing a section row
  • Re-running ETL does not duplicate section rows or leaf rows
  • station_category_attributes rows are keyed by (category_id, attribute_key) with no duplicates

Technical Elaboration

UEX Endpoint

GET /categories{ id, name, code, section, attributes: [{ key, label, type, options }] }

The section field is a string grouping (e.g., "Weapons", "Ships", "Consumables"). UEX does not expose section IDs — they are free-text labels.

Two-Level Hierarchy Construction

Step 1 — Collect all unique section strings:

const sections = [...new Set(categories.map(c => c.section).filter(Boolean))];

Step 2 — Upsert synthetic section rows:

for (const section of sections) {
  const sectionId = uuidv5(`section-${section}`, CATEGORY_NAMESPACE);
  await this.categoriesRepo.upsert({
    id: sectionId,
    name: section,
    code: slugify(section),
    isSection: true,
    parentId: null,
    uexId: null,
  }, ['id']);
}

Step 3 — Upsert real category rows with parent_id set to the section UUID:

for (const cat of categories) {
  const parentId = sectionId(cat.section);
  await this.categoriesRepo.upsert({
    uexId: cat.id,
    name: cat.name,
    code: cat.code,
    isSection: false,
    parentId,
  }, ['uex_id']);
}

Category Attributes

Each category may carry an attributes array. Upsert each as:

INSERT INTO station_category_attributes (category_id, attribute_key, label, data_type, options_json)
VALUES (...)
ON CONFLICT (category_id, attribute_key) DO UPDATE SET label = EXCLUDED.label, ...

options_json JSONB stores the options array for enum-type attributes (e.g., ["rifle", "pistol", "smg"] for weapon sub-type).

Self-Referencing FK

Because station_categories.parent_id references station_categories.id, section rows must be upserted before leaf rows. Run section upsert as a batch first, then run leaf upsert in a second pass within the same step.

Design Elaboration

UEX category data is flat — it has no explicit hierarchy, just a section string on each category. Station needs a two-level hierarchy for the catalog browser UI. Synthetic section rows bridge this gap without modifying the UEX data model.

Deterministic UUIDs for section rows (uuidv5 keyed on section name) ensure idempotent re-runs: the same section name always produces the same UUID, so the upsert updates the existing row rather than inserting a duplicate.

The is_section boolean distinguishes synthetic parent rows from real UEX category data, enabling queries that filter to only leaf categories when building item/commodity selectors.


Depends on: #188, #189

Metadata

Metadata

Assignees

No one assigned

    Labels

    backendBackend services and logicdatabaseSchema, migrations, indexingenhancementNew feature or requesttech-storyTechnical implementation storyuex-syncUEXcorp API sync and integration

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions