Commit d71d3d3
pluggable registry for input/export arrow kernels (#7824)
## Summary
Adds a pluggable `ArrowSession` registry on `VortexSession` for
round-tripping Vortex extension types in and out of Arrow extension
types. Unblocks Arrow round-trip for `arrow.uuid` today, with
`arrow.parquet.variant`, GeoArrow, and tensor types as the next
consumers.
Part of #7686.
## API changes
The session exposes two trait-driven plugin slots:
- `ArrowExportVTable` — dispatched by **target Arrow extension name**
(`ARROW:extension:name`). Implementations turn a Vortex `ArrayRef` into
an Arrow `ArrayRef` shaped to the requested `Field`. Also provides
`to_arrow_field` for schema inference when only a Vortex `DType` is in
hand.
- `ArrowImportVTable` — dispatched by **source Arrow extension name**
carried on the incoming `Field`. Implementations turn an Arrow
`ArrayRef` back into a Vortex `ArrayRef`, including any storage
re-encoding (e.g. `FixedSizeBinary[16]` → `FixedSizeList<u8; 16>` for
UUID).
Implementations of either trait can return `Unsupported(input)` to defer
to the next plugin or to the canonical fallback, so multiple plugins can
register against the same key and are probed in registration order.
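The probe-and-defer behaviour can be sketched with toy types. Everything below is an illustrative assumption, not the real API: `ToyArray`, `Outcome`, `ExportPlugin`, `Registry`, `AlwaysDecline`, `UuidExport`, and `canonical_export` are hypothetical stand-ins for the Vortex/Arrow `ArrayRef` types, the `ArrowExportVTable` trait, and the session registry.

```rust
use std::collections::HashMap;

/// Toy stand-in for an array (assumption: the real code uses `ArrayRef`).
#[derive(Debug, Clone, PartialEq)]
pub struct ToyArray {
    pub storage: String,
    pub len: usize,
}

/// Result of asking a single plugin to convert an input.
pub enum Outcome {
    /// The plugin handled the input.
    Converted(ToyArray),
    /// The plugin declined and hands the input back untouched.
    Unsupported(ToyArray),
}

/// Hypothetical, simplified export trait (not the real `ArrowExportVTable`).
pub trait ExportPlugin {
    fn export(&self, input: ToyArray) -> Outcome;
}

/// Plugins keyed by target Arrow extension name, kept in registration order.
#[derive(Default)]
pub struct Registry {
    plugins: HashMap<String, Vec<Box<dyn ExportPlugin>>>,
}

impl Registry {
    pub fn register(&mut self, name: &str, plugin: Box<dyn ExportPlugin>) {
        self.plugins.entry(name.to_string()).or_default().push(plugin);
    }

    /// Probe the plugins registered for `name` in order. If none is
    /// registered, or all of them decline, use the canonical fallback.
    pub fn export(&self, name: Option<&str>, mut input: ToyArray) -> ToyArray {
        if let Some(plugins) = name.and_then(|n| self.plugins.get(n)) {
            for plugin in plugins {
                match plugin.export(input) {
                    Outcome::Converted(out) => return out,
                    Outcome::Unsupported(back) => input = back,
                }
            }
        }
        canonical_export(input)
    }
}

/// Placeholder canonical path: returns the input unchanged.
fn canonical_export(input: ToyArray) -> ToyArray {
    input
}

/// A plugin that always defers to the next one.
pub struct AlwaysDecline;

impl ExportPlugin for AlwaysDecline {
    fn export(&self, input: ToyArray) -> Outcome {
        Outcome::Unsupported(input)
    }
}

/// A plugin that only handles the UUID storage layout.
pub struct UuidExport;

impl ExportPlugin for UuidExport {
    fn export(&self, input: ToyArray) -> Outcome {
        if input.storage == "FixedSizeList<u8; 16>" {
            Outcome::Converted(ToyArray {
                storage: "FixedSizeBinary[16]".to_string(),
                len: input.len,
            })
        } else {
            Outcome::Unsupported(input)
        }
    }
}
```

Handing the input back inside `Unsupported` lets the next plugin reuse it without a clone, which is presumably why the variant carries its argument.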
New session entry points (`vortex-array/src/arrow/session.rs`):
- `ArrowSession::to_arrow_field` / `to_arrow_schema` — Vortex `DType` →
Arrow `Field`/`Schema`, recursing into containers so nested extension
fields go through the registered plugin.
- `ArrowSession::from_arrow_field` / `from_arrow_schema` — inverse
direction, plugin-aware.
- `ArrowSession::from_arrow_record_batch` / `execute_record_batch` —
`RecordBatch` round-trip.
- `ArrowSessionExt` extension trait so any `SessionExt` can call
`session.arrow().…`.
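A minimal sketch of the recursive schema inference, assuming toy types rather than the real `DType` and Arrow `Field`. `ToyDType`, `ToyArrowType`, `ToyField`, `plugin_field`, and the `"vortex.uuid"` identifier are hypothetical names introduced for illustration. Falling back to the storage type for an unrecognised extension is also an assumption, not confirmed behaviour of `ArrowSession::to_arrow_field`.

```rust
use std::collections::HashMap;

pub const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Toy model of a Vortex dtype (assumption, much smaller than the real `DType`).
#[derive(Debug, Clone, PartialEq)]
pub enum ToyDType {
    Primitive(&'static str),
    Extension { id: &'static str, storage: Box<ToyDType> },
    Struct(Vec<(String, ToyDType)>),
    List(Box<ToyDType>),
}

/// Toy model of an Arrow data type.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyArrowType {
    Primitive(&'static str),
    FixedSizeBinary(usize),
    Struct(Vec<ToyField>),
    List(Box<ToyField>),
}

/// Toy model of an Arrow field: name, type, and string metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyField {
    pub name: String,
    pub data_type: ToyArrowType,
    pub metadata: HashMap<String, String>,
}

fn plain(name: &str, data_type: ToyArrowType) -> ToyField {
    ToyField { name: name.to_string(), data_type, metadata: HashMap::new() }
}

/// Hypothetical plugin hook: only the (assumed) UUID extension id is known.
fn plugin_field(name: &str, id: &str) -> Option<ToyField> {
    if id != "vortex.uuid" {
        return None;
    }
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), "arrow.uuid".to_string());
    Some(ToyField {
        name: name.to_string(),
        data_type: ToyArrowType::FixedSizeBinary(16),
        metadata,
    })
}

/// Convert a dtype to a field, recursing into containers so that nested
/// extension dtypes also pass through the plugin hook.
pub fn to_field(name: &str, dtype: &ToyDType) -> ToyField {
    match dtype {
        ToyDType::Primitive(p) => plain(name, ToyArrowType::Primitive(*p)),
        // Assumption: unknown extensions fall back to their storage dtype.
        ToyDType::Extension { id, storage } => {
            plugin_field(name, id).unwrap_or_else(|| to_field(name, storage))
        }
        ToyDType::Struct(fields) => plain(
            name,
            ToyArrowType::Struct(fields.iter().map(|(n, dt)| to_field(n, dt)).collect()),
        ),
        ToyDType::List(elem) => {
            plain(name, ToyArrowType::List(Box::new(to_field("item", elem))))
        }
    }
}
```

Because the container arms call `to_field` on their children, a UUID nested inside a struct keeps its extension metadata without the struct arm knowing anything about UUIDs.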
The default session pre-registers the builtin UUID plugin
(`vortex-array/src/extension/uuid/arrow.rs`).
## What's *not* in the plugin layer
`Date`, `Time`, and `Timestamp` are Vortex builtin extensions that map
directly to native Arrow temporal types, so they continue to go through
the canonical executor (`vortex-array/src/arrow/executor/temporal.rs`)
rather than the plugin registry. The plugin layer is reserved for
**Arrow extension types** that the canonical path can't express.
## DataFusion wiring
`vortex-datafusion` now goes through the session for schema/array
conversion:
- `convert/schema.rs::calculate_physical_schema` uses
`ArrowSession::to_arrow_field` so extension metadata survives
projection.
- `persistent/format.rs` and `persistent/opener.rs` route schema
inference through the session.
- `persistent/sink.rs` uses `from_arrow_record_batch`, passing the
original schema separately from `RecordBatch::schema()` to preserve
`ARROW:extension:name` metadata that DataFusion strips at runtime.
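Why the sink passes the original schema separately can be illustrated with toy types. This is a sketch under stated assumptions: `ToyField`, `ToyBatch`, `toy_field`, `extension_name`, and `import_fields` are hypothetical and do not mirror the real `from_arrow_record_batch` signature. The idea shown is only that field metadata is read from the caller's schema, while the batch supplies the column data.

```rust
use std::collections::HashMap;

pub const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Toy field: a name plus string metadata (assumption, not Arrow's `Field`).
#[derive(Debug, Clone, PartialEq)]
pub struct ToyField {
    pub name: String,
    pub metadata: HashMap<String, String>,
}

/// Toy batch whose own schema may have lost its field metadata at runtime.
pub struct ToyBatch {
    pub fields: Vec<ToyField>,
    pub num_rows: usize,
}

/// Build a field, optionally tagged with an extension name.
pub fn toy_field(name: &str, extension: Option<&str>) -> ToyField {
    let mut metadata = HashMap::new();
    if let Some(ext) = extension {
        metadata.insert(EXTENSION_NAME_KEY.to_string(), ext.to_string());
    }
    ToyField { name: name.to_string(), metadata }
}

/// Read the extension name that plugin dispatch would key on.
pub fn extension_name(field: &ToyField) -> Option<&str> {
    field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str)
}

/// Pair each batch column with the field from the ORIGINAL schema, so the
/// extension metadata is still available even if the batch schema lost it.
pub fn import_fields<'a>(
    original: &'a [ToyField],
    batch: &ToyBatch,
) -> Result<Vec<&'a ToyField>, String> {
    if original.len() != batch.fields.len() {
        return Err(format!(
            "schema has {} fields but batch has {} columns",
            original.len(),
            batch.fields.len()
        ));
    }
    original
        .iter()
        .zip(&batch.fields)
        .map(|(orig, runtime)| {
            if orig.name == runtime.name {
                Ok(orig)
            } else {
                Err(format!("field mismatch: {} vs {}", orig.name, runtime.name))
            }
        })
        .collect()
}
```

If the batch schema alone were consulted, `extension_name` would return `None` for the stripped column and the import would take the canonical path instead of the UUID plugin.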
## Tests
Two new end-to-end tests in `vortex-datafusion/src/persistent/tests.rs`:
- `arrow_uuid_extension_roundtrip` — write Arrow UUID column to a Vortex
file via the session, `SELECT *` it back, assert the field still carries
the `Uuid` extension type and the values match.
- `arrow_uuid_extension_roundtrip_nested_struct` — same flow with the
UUID nested in a top-level `Struct`, exercising recursive session-aware
schema inference.
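The value-level property these tests assert, that UUID bytes survive the trip, can be shown in isolation on the storage re-encoding between `FixedSizeBinary[16]` and `FixedSizeList<u8; 16>`. This is a standalone sketch over plain byte buffers. `import_uuid_storage` and `export_uuid_storage` are hypothetical helpers, and neither the actual tests nor the real plugin are assumed to look like this.

```rust
/// Split a flat `FixedSizeBinary[16]`-style buffer into 16-byte values,
/// mimicking the `FixedSizeList<u8; 16>` layout on the Vortex side.
pub fn import_uuid_storage(bytes: &[u8]) -> Result<Vec<[u8; 16]>, String> {
    if bytes.len() % 16 != 0 {
        return Err(format!("buffer length {} is not a multiple of 16", bytes.len()));
    }
    Ok(bytes
        .chunks_exact(16)
        .map(|chunk| {
            let mut value = [0u8; 16];
            value.copy_from_slice(chunk);
            value
        })
        .collect())
}

/// Flatten the 16-byte values back into one contiguous buffer.
pub fn export_uuid_storage(values: &[[u8; 16]]) -> Vec<u8> {
    values.iter().flatten().copied().collect()
}
```

Exporting the result of an import yields the original buffer, which is the byte-level analogue of the "values match" assertion.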
---------
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Baris Palaska <barispalaska@gmail.com>
Co-authored-by: Baris Palaska <barispalaska@gmail.com>
1 parent 96dda71 commit d71d3d3
50 files changed
Lines changed: 2753 additions & 710 deletions
File tree
- encodings
- parquet-variant/src
- pco/src
- runend/src
- sparse/src
- vortex-array
- benches
- src
- arrays
- constant/compute
- filter/execute
- arrow
- executor
- dtype
- extension/uuid
- scalar_fn/fns
- binary
- zip
- vortex-bench/src/datasets
- vortex-cxx/src
- vortex-datafusion/src
- convert
- persistent
- v2
- vortex-ffi/src
- vortex-jni/src
- vortex-layout/src/scan
- vortex-python/src
- arrays
- dtype
- iter
- vortex-tensor
- src
- types/vector