feat(maru): add Prometheus observability metrics to MaruBackend #8

Draft
hyunyul-XCENA wants to merge 2 commits into xcena-dev:feat/maru-backend from hyunyul-XCENA:feat/maru-observability

@hyunyul-XCENA

Summary

  • Add 6 Prometheus metrics to MaruBackend (1 gauge, 3 counters, 2 histograms)
  • No changes to LMCache core observability modules; everything is self-contained in maru_backend.py
  • Metrics are exposed via the vLLM /metrics endpoint when PROMETHEUS_MULTIPROC_DIR is set

Metrics

| Metric | Type | Description |
|---|---|---|
| lmcache:maru_put_task_num | Gauge | In-flight put tasks |
| lmcache:maru_put_failed_count | Counter | Store RPC failures |
| lmcache:maru_get_blocking_failed_count | Counter | Retrieve failures |
| lmcache:maru_alloc_failed_count | Counter | CXL memory allocation failures |
| lmcache:maru_store_latency_seconds | Histogram | Store RPC latency |
| lmcache:maru_retrieve_latency_seconds | Histogram | CXL read latency |
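
As a rough illustration of how the metrics listed above could be declared with prometheus_client, here is a minimal sketch. It is not the code from maru_backend.py: the dedicated registry, the variable names, and the `timed_store` helper are assumptions made for this example. One detail worth knowing when grepping the output: prometheus_client appends `_total` to counter names, so a counter declared as `lmcache:maru_put_failed_count` is scraped as `lmcache:maru_put_failed_count_total`.

```python
import time

from prometheus_client import CollectorRegistry, Counter, Gauge, Histogram

# A private registry keeps this sketch isolated from any global metrics.
# (Assumption: the real backend may register against a different registry.)
registry = CollectorRegistry()

put_task_num = Gauge(
    "lmcache:maru_put_task_num", "In-flight put tasks", registry=registry
)
put_failed = Counter(
    "lmcache:maru_put_failed_count", "Store RPC failures", registry=registry
)
get_blocking_failed = Counter(
    "lmcache:maru_get_blocking_failed_count", "Retrieve failures", registry=registry
)
alloc_failed = Counter(
    "lmcache:maru_alloc_failed_count",
    "CXL memory allocation failures",
    registry=registry,
)
store_latency = Histogram(
    "lmcache:maru_store_latency_seconds", "Store RPC latency", registry=registry
)
retrieve_latency = Histogram(
    "lmcache:maru_retrieve_latency_seconds", "CXL read latency", registry=registry
)


def timed_store(store_fn):
    """Hypothetical wrapper: run store_fn and record latency and failures."""
    put_task_num.inc()
    start = time.perf_counter()
    try:
        return store_fn()
    except Exception:
        put_failed.inc()
        raise
    finally:
        # Latency is recorded for both successful and failed calls.
        store_latency.observe(time.perf_counter() - start)
        put_task_num.dec()
```

Calling `timed_store(lambda: backend_put(...))` with some hypothetical `backend_put` would then update the gauge, the failure counter, and the latency histogram in one place.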

Test plan

  • Verified metrics appear in the output of curl localhost:$PORT/metrics | grep lmcache:maru
  • Confirmed store/retrieve latency values update after queries
  • Confirmed failure counters stay at 0 during normal operation
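
The manual checks above could also be scripted. The sketch below parses Prometheus text-format output, keeps only the lmcache:maru samples, and checks that the failure counters are zero. The `SAMPLE` text is made up for illustration and is not output captured from a real run; the function names are hypothetical, and the parser assumes sample lines carry no trailing timestamp.

```python
# Made-up exposition text standing in for the body of GET /metrics.
SAMPLE = """\
# HELP lmcache:maru_put_failed_count_total Store RPC failures
# TYPE lmcache:maru_put_failed_count_total counter
lmcache:maru_put_failed_count_total 0.0
lmcache:maru_store_latency_seconds_count 3.0
lmcache:maru_store_latency_seconds_sum 0.012
some_other_metric 1.0
"""


def parse_maru_samples(exposition_text, prefix="lmcache:maru"):
    """Return {sample name (with labels): value} for samples matching prefix."""
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field (no timestamp assumed).
        name_and_labels, _, value = line.rpartition(" ")
        if name_and_labels.startswith(prefix):
            samples[name_and_labels] = float(value)
    return samples


def failure_counters_are_zero(samples):
    """True if every sample whose name contains 'failed_count' is 0."""
    return all(
        value == 0.0 for name, value in samples.items() if "failed_count" in name
    )


if __name__ == "__main__":
    maru = parse_maru_samples(SAMPLE)
    print(sorted(maru))
    print("failure counters zero:", failure_counters_are_zero(maru))
```

In practice the text would come from the endpoint itself (for example via `urllib.request.urlopen`) rather than from a constant.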

Add 6 self-contained Prometheus metrics using prometheus_client directly,
with zero changes to LMCache core observability modules:

- maru_put_task_num (Gauge): in-flight put tasks
- maru_put_failed_count (Counter): store RPC failures
- maru_get_blocking_failed_count (Counter): retrieve failures
- maru_alloc_failed_count (Counter): CXL memory allocation failures
- maru_store_latency_seconds (Histogram): store RPC latency
- maru_retrieve_latency_seconds (Histogram): CXL read latency

Metrics are exposed via the vLLM /metrics endpoint when
PROMETHEUS_MULTIPROC_DIR is set. The gauge uses multiprocess_mode
and set_function, so its value is computed at scrape time rather
than updated on every put.
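
To make the set_function idea concrete, here is a minimal single-process sketch. The `pending_puts` set, the `livesum` mode, and the private registry are assumptions chosen for illustration, not details taken from this PR. The sketch does not set PROMETHEUS_MULTIPROC_DIR, and how a function-backed gauge behaves under multiprocess collection is not verified here.

```python
from prometheus_client import CollectorRegistry, Gauge

# Private registry so the sketch does not touch global state (assumption).
registry = CollectorRegistry()

# Hypothetical stand-in for the backend's collection of in-flight put tasks.
pending_puts = set()

put_task_num = Gauge(
    "lmcache:maru_put_task_num",
    "In-flight put tasks",
    registry=registry,
    # Only consulted when multiprocess mode is active; "livesum" is an
    # assumed choice for this example.
    multiprocess_mode="livesum",
)

# The callback runs when the registry is collected (i.e. on scrape), so the
# put path itself never has to call inc()/dec() on the gauge.
put_task_num.set_function(lambda: len(pending_puts))

if __name__ == "__main__":
    pending_puts.add("task-1")
    pending_puts.add("task-2")
    print(registry.get_sample_value("lmcache:maru_put_task_num"))
```

The trade-off of this design is that the gauge reflects whatever the callback sees at scrape time, so the callback should be cheap and safe to call from another thread.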
hyunyul-XCENA marked this pull request as draft on March 18, 2026, 06:38