feat: scaffold the OpenXLA backend crate and seam (#449 Phase 3 M1) by inureyes · Pull Request #458 · lablup/mlxcel

inureyes · 2026-06-27T00:53:47Z

Phase 3 M1 (#449): adds the default-off mlxcel-xla crate and wires it into the compute-backend seam. This is the first step of integrating the OpenXLA / StableHLO backend behind the #448 inference-session contract.

New crate mlxcel-xla provides XlaInferenceSession. It implements the engine-neutral InferenceSession contract (capabilities / prefill / decode_step) plus a self-contained greedy drive loop (generate_greedy / generate_streaming_greedy). A backend that owns its KV cache and samples on-device uses this loop instead of threading an MLX model through generation.
The seam gains a cfg-gated Backend::Xla / Session::Xla behind a new xla-backend feature; select_backend recognizes MLXCEL_BACKEND=xla, mirroring the experimental scaffold. XlaBackend::load_model returns an error, because the OpenXLA path drives generation through the session rather than through the MLX load boundary.
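The feature-gated selection described above could look roughly like the sketch below. This is not the code from the PR: the `Backend` enum shape, the `SelectError` type, the `select_backend_from` helper, and the choice to default to MLX and to reject `xla` when the feature is off are all assumptions made for illustration. Only the `xla-backend` feature name, `MLXCEL_BACKEND=xla`, and the `select_backend` / `Backend::Xla` names come from the description.

```rust
/// Hypothetical stand-in for the seam's backend enum; the real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Mlx,
    /// Only exists when the crate is built with `--features xla-backend`.
    #[cfg(feature = "xla-backend")]
    Xla,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SelectError {
    /// The requested backend is known but its feature was not compiled in.
    NotCompiledIn(&'static str),
    /// The requested backend name is not recognized at all.
    Unknown(String),
}

#[cfg(feature = "xla-backend")]
fn xla_backend() -> Result<Backend, SelectError> {
    Ok(Backend::Xla)
}

#[cfg(not(feature = "xla-backend"))]
fn xla_backend() -> Result<Backend, SelectError> {
    // Assumption: a default build reports the missing feature instead of
    // silently falling back to MLX.
    Err(SelectError::NotCompiledIn("xla-backend"))
}

/// Pure selection logic, split out so it can be tested without touching
/// the process environment. Assumes MLX is the default backend.
pub fn select_backend_from(value: Option<&str>) -> Result<Backend, SelectError> {
    match value.map(str::trim) {
        None | Some("") | Some("mlx") => Ok(Backend::Mlx),
        Some("xla") => xla_backend(),
        Some(other) => Err(SelectError::Unknown(other.to_string())),
    }
}

/// Reads MLXCEL_BACKEND and delegates to the pure helper above.
pub fn select_backend() -> Result<Backend, SelectError> {
    let value = std::env::var("MLXCEL_BACKEND").ok();
    select_backend_from(value.as_deref())
}
```

Because the `Xla` variant is removed by `cfg` in a default build, code that matches on `Backend` does not need an arm for it unless the feature is enabled, which is one way a default-off backend can leave the default build unaffected.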
Execution is stubbed: prefill / decode_step return a clear not-wired error so the drive loop surfaces it rather than panicking. Binding the IREE runtime C API (load the compiled StableHLO vmfb, run prefill / decode_step) is the next milestone, together with threading the model directory through session creation.
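To illustrate how a stubbed session lets the drive loop surface an error rather than panic, here is a minimal sketch. The trait below is a simplified stand-in, not the real InferenceSession contract: `capabilities` is omitted, and the signatures, the `TokenId` alias, the `SessionError` type, the `StubXlaSession` name, and the EOS handling are assumptions. It also assumes that `prefill` and `decode_step` each return the next token already sampled by the backend, which is one reading of "samples on-device".

```rust
/// Hypothetical token type for this sketch.
pub type TokenId = u32;

/// Hypothetical error type; the real crate's error may look different.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SessionError {
    /// Execution for the named entry point is not wired up yet.
    NotWired(&'static str),
}

/// Simplified stand-in for the engine-neutral session contract.
pub trait InferenceSession {
    /// Consumes the prompt and returns the first sampled token.
    fn prefill(&mut self, prompt: &[TokenId]) -> Result<TokenId, SessionError>;
    /// Feeds the previous token and returns the next sampled token.
    fn decode_step(&mut self, last: TokenId) -> Result<TokenId, SessionError>;
}

/// Stub session: every execution entry point reports "not wired".
pub struct StubXlaSession;

impl InferenceSession for StubXlaSession {
    fn prefill(&mut self, _prompt: &[TokenId]) -> Result<TokenId, SessionError> {
        Err(SessionError::NotWired("prefill"))
    }

    fn decode_step(&mut self, _last: TokenId) -> Result<TokenId, SessionError> {
        Err(SessionError::NotWired("decode_step"))
    }
}

/// Greedy drive loop over any session. Errors from the session are
/// propagated with `?`, so a stubbed backend yields `Err`, not a panic.
/// Generation stops at `eos` (not included in the output) or after
/// `max_new_tokens` tokens.
pub fn generate_greedy<S: InferenceSession>(
    session: &mut S,
    prompt: &[TokenId],
    max_new_tokens: usize,
    eos: TokenId,
) -> Result<Vec<TokenId>, SessionError> {
    let mut out = Vec::new();
    if max_new_tokens == 0 {
        return Ok(out);
    }
    let mut next = session.prefill(prompt)?;
    while next != eos {
        out.push(next);
        if out.len() == max_new_tokens {
            break;
        }
        next = session.decode_step(next)?;
    }
    Ok(out)
}
```

With the stub, `generate_greedy(&mut StubXlaSession, &[1, 2, 3], 4, 0)` returns `Err(SessionError::NotWired("prefill"))`. Under this sketch, wiring a real runtime would mean replacing only the two trait methods; the loop itself stays unchanged.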

Verified: the default build is unaffected (mlxcel-xla is an optional dep gated off, not compiled), cargo check --features xla-backend builds the lib and tests, fmt and clippy are clean, and the crate plus seam unit tests pass.

Refs #449.


inureyes merged commit add7f1b into main Jun 27, 2026

inureyes deleted the feat/openxla-backend-scaffold-449 branch June 27, 2026 00:53
