Add ModelCache v1alpha1 API by dennis-upbound · Pull Request #80 · modelplaneai/modelplane

dennis-upbound · 2026-05-20T18:55:28Z

Follow-up to #82. Builds on the cluster-level spec.storage.rwxCache shape.

Adds the ML-team-facing opt-in for cross-deployment sharing, independent cache lifecycle, and proactive pre-staging on top of the invisible per-replica caching Modelplane already provides for multi-node deployments.

API surface + docs + examples only. Composition function follows in a separate MR.

What this adds

- apis/modelcaches/definition.yaml: ModelCache XRD. Artifact source + mount path + optional size override + cluster selector. Storage class is always inherited from the target cluster's rwxCache; ML teams don't pick it.
- apis/modeldeployments/definition.yaml: spec.modelCacheRef (singular, matching the existing inferenceClusterRef pattern).
- apis/modelreplicas/definition.yaml: same spec.modelCacheRef, inherited from the parent ModelDeployment.
- docs/concepts.md: `## ModelCache` section positioned as the optional shared/lifecycled overlay on top of the cluster-level invisible caching; mermaid diagram updated.
- examples/cache/model-cache-qwen.yaml: Qwen 0.5B cache example.
- examples/deployment/model-deployment-cached.yaml: deployment referencing the cache via spec.modelCacheRef.
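
For orientation, a ModelCache resource with the fields listed above might look roughly like the sketch below. This is an illustration, not the contents of examples/cache/model-cache-qwen.yaml: the API group, the resource name, the artifact URI, the label, and every field name under `spec` are assumptions. Only the kind, the `v1alpha1` version, and the general field set (artifact source, mount path, optional size override, cluster selector) come from this PR.

```yaml
# Hypothetical sketch. The authoritative schema is apis/modelcaches/definition.yaml.
apiVersion: modelplane.example.org/v1alpha1   # group is a placeholder, not the real one
kind: ModelCache
metadata:
  name: qwen-cache                  # illustrative name
spec:
  source:                           # artifact source (field name assumed)
    uri: example://qwen-0.5b        # placeholder URI, not a real artifact location
  mountPath: /models                # mount path (field name and value assumed)
  size: 10Gi                        # optional size override (field name and value assumed)
  clusterSelector:                  # cluster selector (label-selector shape assumed)
    matchLabels:
      example.org/pool: gpu         # placeholder label
  # No storage class field: per this PR it is always inherited from the
  # target cluster's rwxCache.
```
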

How this relates to #82

| Scenario | #82 (this PR's base) | This PR adds |
| --- | --- | --- |
| Multi-node, no cache reference | Auto-provision per-replica PVC from cluster `rwxCache` | (no change) |
| Multi-node + `modelCacheRef` set | n/a | Mount shared cache PVC across replicas; lifecycle decoupled from deployment |
| Single-node + `modelCacheRef` set | n/a | Mount shared cache PVC; cold-start optimization |
| Single-node, no cache reference | Ephemeral fetch in engine container | (no change) |
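
The two `modelCacheRef` rows correspond to a deployment that opts in by naming a cache. A minimal sketch of that opt-in follows. The API group and version, the resource names, and the `name`-only shape of both references are assumptions. Only the field names `spec.modelCacheRef` and `spec.inferenceClusterRef` come from this PR.

```yaml
# Hypothetical sketch. See examples/deployment/model-deployment-cached.yaml for the real example.
apiVersion: modelplane.example.org/v1alpha1   # group and version are placeholders
kind: ModelDeployment
metadata:
  name: qwen-cached                 # illustrative name
spec:
  inferenceClusterRef:
    name: example-cluster           # placeholder cluster name; reference shape assumed
  modelCacheRef:
    name: qwen-cache                # placeholder ModelCache name; reference shape assumed
```

Omitting `modelCacheRef` leaves the deployment on the "no cache reference" rows, where behavior is unchanged from #82.
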

ML teams who don't care about sharing or pre-staging never see this surface — the cluster default still handles multi-node invisibly.

What this does NOT change

examples/deployment/model-deployment.yaml (the canonical single-node example) — untouched.
docs/getting-started.md — untouched. ModelCache stays an opt-in feature; the basic flow doesn't use it.
No composition function logic.

… caching)

Builds on the cluster-level rwxCache shape: ModelCache is the ML-team-facing opt-in for cross-deployment sharing, independent lifecycle, and proactive pre-staging on top of the invisible per-replica caching that Modelplane already provides for multi-node deployments.

API surface only; composition function follows in a separate MR.

- apis/modelcaches/definition.yaml: XRD. Artifact source + mount path + optional size override + clusterSelector. Storage class is always inherited from the target cluster's rwxCache; ML teams don't pick it.
- apis/modeldeployments/definition.yaml: spec.modelCacheRef (singular, matching the existing inferenceClusterRef pattern)
- apis/modelreplicas/definition.yaml: spec.modelCacheRef, inherited verbatim from the parent ModelDeployment
- docs/concepts.md: ModelCache section positioned as the optional shared/lifecycled overlay on the cluster-level invisible caching
- examples/cache/model-cache-qwen.yaml: Qwen 0.5B cache example
- examples/deployment/model-deployment-cached.yaml: deployment referencing the cache via spec.modelCacheRef

dennis-upbound · 2026-05-21T15:37:28Z

Closing in favor of consolidated cache design — superseded by the new branch combining the final API + design doc.

dennis-upbound force-pushed the dennis/modelcache-api branch 5 times, most recently from e8c73b0 to c76fb0c, on May 20, 2026 23:49

dennis-upbound changed the base branch from main to dennis/cluster-storage on May 20, 2026 23:49

dennis-upbound force-pushed the dennis/cluster-storage branch from 58b3a20 to 46f0f17 on May 20, 2026 23:51

dennis-upbound force-pushed the dennis/modelcache-api branch from c76fb0c to 5a177a4 on May 20, 2026 23:52

dennis-upbound force-pushed the dennis/cluster-storage branch from 46f0f17 to ab8dad1 on May 21, 2026 05:15

dennis-upbound force-pushed the dennis/modelcache-api branch from 5a177a4 to 573b410 on May 21, 2026 05:16

dennis-upbound closed this on May 21, 2026

dennis-upbound deleted the dennis/modelcache-api branch on June 19, 2026 17:26

dennis-upbound wants to merge 1 commit into dennis/cluster-storage from dennis/modelcache-api
