test(parity): batch port — tile / math / distance / similarity (4 manifest files)#23

Merged
nhungoc1508 merged 2 commits into main from feat/parity-batch-tile-math-distance-similarity
May 4, 2026
@estebanzimanyi

Member

Summary

Four parity files added in one commit, each as a wholesale-skip manifest because the underlying MobilityDB surface is entirely unbound in MobilityDuck. The files stand as tracked gap inventories rather than live test assertions.

| File | Surface | MEOS symbols |
|---|---|---|
| 025_temporal_tile.test | bins(<span>, <size>[, <origin>]), getBin, valueSplit, timeSplit, valueTimeSplit | span_bins, value_bin, tnumber_value_split, temporal_time_split, tnumber_value_time_split |
| 026_tnumber_mathfuncs.test | Arithmetic + - * / for tnumber × {value, tnumber}; unary abs, deg, rad, atan, derivative | tnumber_add_*, tnumber_sub_*, tnumber_mult_*, tnumber_div_* (96 ScalarFunction registrations needed) |
| 036_tnumber_distance.test | <-> for tnumber types, nearestApproachDistance, nad, shortestLine | distance_tnumber_value, distance_tnumber_tnumber, nad_* |
| 038_temporal_similarity.test | frechetDistance, discreteFrechet, dynTimeWarp, hausdorffDistance, similarityPath | temporal_frechet_distance, temporal_dyntimewarp_distance, temporal_hausdorff_distance, temporal_similarity_path |

Architectural callouts

  • TableFunction infrastructure: valueSplit / timeSplit / valueTimeSplit (in 025) and similarityPath (in 038) all return tables, not scalars. MobilityDuck has no TableFunction registrations anywhere in the temporal/geo code; that's a separate infrastructure landing.
  • Aggregate infrastructure: not needed for these four files; that is a separate gap, tracked in PR #21 (the 015_span_aggfuncs.test aggregate surface manifest).

Test plan

  • make release then TZ=UTC ./build/release/test/unittest "<proj>/test/*" — full suite passes (747 assertions across 17 test cases).

Drafted because every file is wholesale-skipped; the value is the manifest, not the assertions.

estebanzimanyi added a commit that referenced this pull request Apr 25, 2026
Adds 8 ScalarFunction registrations for temporal similarity:

  frechetDistance(t1, t2)      -> DOUBLE
  discreteFrechet(t1, t2)      -> DOUBLE  (alias of frechetDistance)
  dynTimeWarp(t1, t2)          -> DOUBLE
  hausdorffDistance(t1, t2)    -> DOUBLE

For each: (tint, tint) and (tfloat, tfloat) registrations.

MEOS bindings:
  temporal_frechet_distance, temporal_dyntimewarp_distance,
  temporal_hausdorff_distance.

Implementation uses 3 thin wrappers + a single TempTempDoublePred
templated helper.

similarityPath (table-returning, alignment) is intentionally NOT
included — needs DuckDB TableFunction infrastructure (separate
follow-up tracked in PR #23).

Smoke test:
  SELECT frechetDistance(tfloat '[1@01-01, 2@01-02]', tfloat '[2@01-01, 3@01-02]');     -- 1.0
  SELECT discreteFrechet(tfloat '[1@01-01, 2@01-02]', tfloat '[2@01-01, 3@01-02]');     -- 1.0
  SELECT dynTimeWarp(tfloat '[1@01-01, 2@01-02]', tfloat '[2@01-01, 3@01-02]');         -- 2.0
  SELECT hausdorffDistance(tfloat '[1@01-01, 5@01-05]', tfloat '[3@01-01, 3@01-05]');   -- 2.0

Stacked on PR #35 (ever/always ordering).

Full suite passes (747 assertions, 13 test cases).
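
The smoke-test values above can be cross-checked by hand. The following is a minimal sketch, assuming the three distances are computed over the instants' values only, with the absolute difference as the point metric. That assumption is not confirmed by the commit message, and the Python function names are illustrative, not MEOS or MobilityDuck symbols.

```python
from typing import Sequence


def discrete_frechet(a: Sequence[float], b: Sequence[float]) -> float:
    """Discrete Frechet distance: minimise the maximum pairwise gap over couplings."""
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(a[i] - b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


def dyn_time_warp(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping: minimise the sum of pairwise gaps over couplings."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def hausdorff(a: Sequence[float], b: Sequence[float]) -> float:
    """Symmetric Hausdorff distance between two finite sets of values."""
    def directed(x: Sequence[float], y: Sequence[float]) -> float:
        return max(min(abs(p - q) for q in y) for p in x)
    return max(directed(a, b), directed(b, a))


# Values taken from the smoke-test literals (timestamps dropped).
print(discrete_frechet([1.0, 2.0], [2.0, 3.0]))  # 1.0
print(dyn_time_warp([1.0, 2.0], [2.0, 3.0]))     # 2.0
print(hausdorff([1.0, 5.0], [3.0, 3.0]))         # 2.0
```

Under that assumption the three results agree with the values annotated in the smoke test; the sketch says nothing about how MEOS treats interpolation between instants.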
Four parity files added in one commit, each as a wholesale-skip
manifest because the underlying surface is unbound in MobilityDuck.

- 025_temporal_tile.test
    Skip reason: bins / getBin / valueSplit / timeSplit /
    valueTimeSplit not registered. MEOS symbols: span_bins, value_bin,
    tnumber_value_split, temporal_time_split,
    tnumber_value_time_split. Note: the table-returning split
    variants need DuckDB TableFunction infrastructure (not currently
    used anywhere in MobilityDuck).

- 026_tnumber_mathfuncs.test
    Skip reason: arithmetic operators (+, -, *, /) not registered for
    tnumber types — value op tnumber, tnumber op value, tnumber op
    tnumber across both base types (tint, tfloat) = 96 missing
    ScalarFunction registrations. Plus unary functions abs, deg, rad,
    atan, derivative.

- 036_tnumber_distance.test
    Skip reason: <-> for tnumber types not registered.
    nearestApproachDistance / nad not registered. MEOS symbols:
    distance_tnumber_value, distance_tnumber_tnumber, nad_*.

- 038_temporal_similarity.test
    Skip reason: frechetDistance, discreteFrechet, dynTimeWarp,
    hausdorffDistance, similarityPath all unregistered. Similarity
    path returns table — needs TableFunction infrastructure.

Suite: 747 assertions across 17 test cases (4 new files, 0 new
active assertions; each file is a tracked gap manifest).
@estebanzimanyi

Member Author

Refreshed against current main:

  • 026_tnumber_mathfuncs: removed top-level skip; unary + arithmetic now work. Surfaced two MEOS-vs-MobilityDB divergences:
    • abs(tfloat) introduces a synthetic instant at the zero-crossing (mathematically correct but the MobilityDB upstream test omits it). Updated expected.
    • derivative(tfloat) returns value/second; MobilityDB returns value/day. Same MEOS symbol, different display unit. Skipped with annotation.
  • 036_tnumber_distance: split the skip. The value-tnumber and tnumber-value forms of <-> stay skipped (MEOS upstream bug: tdistance_tfloat_float returns t.value rather than |t.value-v|); tnumber-tnumber and nad/nearestApproachDistance are unskipped.
  • 038_temporal_similarity: removed skip from frechetDistance / discreteFrechet / dynTimeWarp / hausdorffDistance; kept similarityPath skipped (needs TableFunction infra).
  • 025_temporal_tile: kept skipped (bins() over span needs TableFunction infra).
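
The abs(tfloat) point in the list above can be illustrated with a short sketch. It assumes a linearly interpolated sequence given as (seconds, value) pairs; the function name and the input are hypothetical and are not taken from the parity file or from MEOS.

```python
from typing import List, Tuple

Instant = Tuple[float, float]  # (time in seconds, value)


def abs_linear(instants: List[Instant]) -> List[Instant]:
    """Absolute value of a linearly interpolated sequence.

    Whenever two consecutive values have strictly opposite signs, the
    segment crosses zero, so an extra instant with value 0 is inserted
    at the crossing time. Without it, interpolating between the two
    absolute values would miss the dip to zero.
    """
    out: List[Instant] = []
    for k, (t, v) in enumerate(instants):
        if k > 0:
            t0, v0 = instants[k - 1]
            if v0 * v < 0:
                # Solve v0 + (v - v0) * s = 0 for the fraction s of the segment.
                s = v0 / (v0 - v)
                out.append((t0 + (t - t0) * s, 0.0))
        out.append((t, abs(v)))
    return out


# Hypothetical input: the value rises from -1 to 1 over ten seconds.
print(abs_linear([(0.0, -1.0), (10.0, 1.0)]))
# [(0.0, 1.0), (5.0, 0.0), (10.0, 1.0)]
```

The middle instant at 5 seconds is the synthetic one: it appears in neither input instant but is needed for the result to stay piecewise linear.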

@estebanzimanyi estebanzimanyi force-pushed the feat/parity-batch-tile-math-distance-similarity branch from ce5ef06 to df5744a Compare April 28, 2026 21:32
estebanzimanyi added a commit that referenced this pull request Apr 28, 2026
Single-file inventory of MobilityDB's mobilitydb/test/geo/queries/
surface. Rather than one parity file per upstream regression file,
this lists what each file covers, what's bound in MobilityDuck
today, and what's missing.

Bound surface (per direct audit of src/geo/tgeompoint.cpp):
- tgeompoint / tgeometry I/O for instant / discrete / continuous /
  sequence-set
- asText / asEWKT / memSize / interp / round / transform
- Constructors (TGEOMPOINT, tgeompointInst, tgeompointSeq,
  tgeompointSeqSet)
- Spatial accessors: getX/Y/Z, length, cumulativeLength, speed,
  direction, azimuth, angularDifference, trajectory
- Topological predicates: e/a/t variants of Contains, Disjoint,
  Dwithin, Intersects, Touches
- Set ops: makeSimple, isSimple, stops
- Restrictions: atGeometry, atStbox, atValues, atTime and the
  matching minus*
- Modification: appendInstant, appendSequence, insert, update,
  deleteTime, merge
- Comparison: temporal_eq / temporal_ne / etc.
- twCentroid, shortestLine, distance_gs, collect_gs

Unbound surface (per upstream regression file, with cross-references
to the temporal-side parity-batch PRs that cover the same gap):
- 051_stbox: stbox tests (whole file currently skipped in
  test/sql/stbox.test for DuckDB 1.4 signature issues).
- 052_tgeo / 052_tpoint: mostly bound — per-type ports would mirror
  PRs #13 / #17 / #18 patterns.
- 053_*_inout: asMFJSON / asWKB / asHexWKB / asGeoJSON — verify
  which are bound.
- 054_*_compops: same parser blocker as PR #24's 030 (?= / #=).
- 056_*_spatialfuncs: bulk bound; setSRID-on-temporal and
  reference-system accessors missing.
- 058_*_tile: same gap as PR #23's 025 but for tspatial.
- 060_*_topops: same pattern as PR #25's 032_temporal_topops.
- 062_*_posops: same pattern as PR #24's 034_temporal_posops plus
  spatial-direction operators.
- 064_*_distance: <-> for tgeo / tpoint; partially bound
  (shortestLine, distance_gs).
- 066_tpoint_similarity: specialisation of PR #23's 038.
- 068_*_aggfuncs: same architectural blocker as PR #21 (no
  AggregateFunction infra).

Suite: 747 assertions, 23 test cases.
The skip block claimed MobilityDB reports `derivative` in value/day while
MEOS uses value/second. That was wrong: MobilityDB's SQL wrapper
`Temporal_derivative` is a thin pass-through to MEOS with no scaling, and
its own regression `mobilitydb/test/temporal/expected/026_tnumber_mathfuncs.test.out`
shows `0.000012@...` (value/second). MobilityDuck already matched.

Replace the skip block with the correct expectation, mirroring
MobilityDB's regression style (`round(derivative(...), 6)`).
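
To make the unit question concrete, here is a small sketch of the arithmetic. The input (a value rising by 1 over exactly one day) is a hypothetical chosen for illustration; it is not claimed to be the input of the upstream regression test.

```python
SECONDS_PER_DAY = 86_400


def derivative_per_second(v0: float, v1: float, seconds: float) -> float:
    """Slope of a linear segment, expressed in value units per second."""
    return (v1 - v0) / seconds


rate = derivative_per_second(1.0, 2.0, SECONDS_PER_DAY)

# Rounded to six decimals, as in round(derivative(...), 6).
# The result is 0.000012, which Python prints as 1.2e-05.
print(round(rate, 6))

# The same slope in value/day differs by a factor of 86400.
print(rate * SECONDS_PER_DAY)
```

A rounded value on the order of 0.000012 is therefore what a per-second rate looks like for a change of about one unit per day, which is consistent with the `0.000012@...` output cited above.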
@estebanzimanyi estebanzimanyi marked this pull request as ready for review May 1, 2026 19:13
@nhungoc1508 nhungoc1508 merged commit 515bb8c into main May 4, 2026
17 checks passed