Skip to content

refactor eloq metric repo and register eloqstore metrics#345

Merged
lokax merged 19 commits into
mainfrom
yf-metrics
Jan 16, 2026
Merged

refactor eloq metric repo and register eloqstore metrics#345
lokax merged 19 commits into
mainfrom
yf-metrics

Conversation

@lokax

@lokax lokax commented Jan 8, 2026

Copy link
Copy Markdown
Collaborator

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • New Features

    • Optional TX-service metrics integration for the data store; store can initialize and register metrics at startup.
    • Metrics API now uses opaque metric handles and exposes an explicit collector accessor.
  • Refactor

    • Startup sequence reordered: metrics → storage → tx-service.
    • Metrics subsystem reworked with shared registry lifetime and handle-based collection.
  • Tests

    • Benchmarks/tests updated to use metric handles.
  • Documentation

    • Clarified that metric handles are thread-safe and usable after registration.
  • Bug Fixes

    • Improved logging for KV store metrics registration failures.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings January 8, 2026 06:06
@coderabbitai

coderabbitai Bot commented Jan 8, 2026

Copy link
Copy Markdown

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Reorders DataSubstrate startup to obtain a shared metrics registry earlier, adds DataSubstrate::RegisterKvStoreMetrics(), moves KV-store metrics registration after storage init, replaces CollectorWrapper/MetricKey APIs with MetricHandle/GetCollector, adds ELOQSTORE_WITH_TXSERVICE build flag, and wires EloqStore metrics initialization guarded by that flag. (50 words)

Changes

Cohort / File(s) Summary
Build configuration
CMakeLists.txt, store_handler/eloq_data_store_service/build_eloq_store.cmake
Add ELOQSTORE_WITH_TXSERVICE compile define; when WITH_TXSERVICE ON link eloq-metrics, add include dir, and print status.
Core DataSubstrate
core/include/data_substrate.h, core/src/data_substrate.cpp, core/src/metrics_init.cpp
metrics_registry_ -> std::shared_ptr; add public/private RegisterKvStoreMetrics(); reorder init phases to obtain registry earlier, initialize storage, then register KV metrics, then init TxService; adjust logs.
EloqStore integration
store_handler/eloq_data_store_service/eloq_store_data_store.h, .../eloq_store_data_store.cpp
Make Initialize() out-of-line; under ELOQSTORE_WITH_TXSERVICE obtain registry and call EloqStore::InitializeMetrics(registry, common_labels); add guarded includes.
Metrics public API
eloq_metrics/include/metrics.h, eloq_metrics/include/meter.h
Introduce metrics::MetricHandle (key, type, collector_data); MetricsRegistry::Register() returns MetricHandle; Collect() accepts MetricHandle; meter maps store handles.
Metrics manager & impl
eloq_metrics/include/metrics_manager.h, eloq_metrics/src/metrics_manager.cc, eloq_metrics/include/metrics_registry_impl.h, eloq_metrics/src/metrics_registry_impl.cpp
Remove CollectorWrapper; MetricsRegistry() returns MetricHandle; add GetCollector() returning std::shared_ptr<MetricsCollector>; registry singleton uses shared_ptr; Collect() forwards to central collector via MetricHandle.
Collectors & Prometheus
eloq_metrics/include/metrics_collector.h, eloq_metrics/include/prometheus_collector.h, eloq_metrics/src/prometheus_collector.cc
Collector API becomes handle-centric: SetMetric() returns MetricHandle; Collect() and CollectClientMetrics() accept MetricHandle; add PrometheusMetricData to contain per-metric Prometheus objects; remove per-type maps.
Benchmarks & tests
eloq_metrics/benchmark/*_bench.cpp, eloq_metrics/tests/metrics_collector_test.cc
Replace CollectorWrapper usage with MetricHandle + MetricsCollector via GetCollector(); update Collect calls to use (MetricHandle, Value); tests updated accordingly.
Docs
eloq_metrics/README.md
Note that MetricHandle contains all info needed for thread-safe, lock-free collection after registration.
Tx service flag
tx_service/include/tx_service_metrics.h
Add inline bool enable_eloqstore_metrics{false}; toggle.
Submodule update
store_handler/eloq_data_store_service/eloqstore
Submodule commit bumped (no code-level API changes).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant DS as DataSubstrate
  participant MR as MetricsRegistryImpl
  participant SH as StorageHandler
  participant ES as EloqStore
  participant TS as TxService

  DS->>MR: GetRegistry()
  MR-->>DS: shared_ptr<MetricsRegistry>
  DS->>SH: Initialize storage handler
  SH-->>DS: storage ready
  DS->>DS: RegisterKvStoreMetrics() (uses registry + store_hd_)
  DS->>ES: EloqStore::InitializeMetrics(registry, common_labels)
  ES-->>DS: shard metrics initialized
  DS->>TS: Initialize TxService
  TS-->>DS: TxService ready
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related PRs

Suggested reviewers

  • liunyl
  • MrGuin
  • xiexiaoy

Poem

🐇 I hopped through headers, swapped wrappers for a handle,
Shared pointers snug so no pointers will dandle,
Metrics wake after the store's first hum,
Shards count and report while build flags softly drum,
I nibble a carrot — CI green, I prance and dandle.

🚥 Pre-merge checks | ✅ 1 | ❌ 4
❌ Failed checks (3 warnings, 1 inconclusive)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description only contains an unchecked reminder checklist with no implementation details, context, or explanation of the changes. Provide a detailed description explaining the refactoring objectives, key architectural changes (handle-based metrics API, registry lifecycle management), testing status, and relevant issue/RFC links.
Linked Issues check ⚠️ Warning The PR description does not reference any issue or RFC links despite the template explicitly requesting them. Add issue reference using 'fixes eloqdb/tx_service#issue_id' or RFC link in the PR description.
Docstring Coverage ⚠️ Warning Docstring coverage is 12.24% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Out of Scope Changes check ❓ Inconclusive The submodule update appears unrelated to the metrics refactoring and requires clarification on its necessity. Clarify whether the submodule commit update (eloqstore 144d411 → e160555) is required for this refactoring and verify no unintended out-of-scope changes are included.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main changes: refactoring the metrics repository and registering eloqstore metrics, which aligns with the extensive changes across metrics and data store files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch yf-metrics


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a2f329 and 24dacdf.

📒 Files selected for processing (1)
  • store_handler/eloq_data_store_service/eloqstore

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
eloq_metrics/include/metrics_registry_impl.h (1)

33-49: API changes require full propagation through callers and tests.

The shift from unique_ptr to shared_ptr and the introduction of MetricHandle are sound architectural changes, but verification reveals incomplete migration:

  • Register callers (dummy_executor.cc, tx_service.h) invoke with 2 parameters (Name, Type) but the new signature requires 3 (Name, Type, Labels)
  • Collect in tests (metrics_collector_test.cc) use the old signature with hash_value and Type parameters instead of MetricHandle
  • Documentation (meter.h) shows outdated Collect examples

These breaking API changes must be fully propagated across all callers, tests, and documentation before this refactoring is complete.

🤖 Fix all issues with AI agents
In @CMakeLists.txt:
- Around line 129-130: The compile definition ELOQSTORE_WITH_TXSERVICE is added
unconditionally; make it conditional on the WITH_TXSERVICE option by checking
the CMake option variable (WITH_TXSERVICE) and only call
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE) when that variable is true
(use the if(WITH_TXSERVICE) ... endif() pattern around the
add_compile_definitions call).

In @eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp:
- Around line 17-22: The static initialization dereferences
GetMetricMgrInstance().mgr_ (risking null) for benchmark_gauge_metric_handle and
benchmark_gauge_collector; instead, defer initialization and null-check the
manager: replace the file-level statics with function-local statics (or an init
function) that call MetricsMgr::GetMetricMgrInstance(), verify mgr_ is non-null,
then call mgr_->MetricsRegistry(std::move(benchmark_gauge_metric_ptr)) and
mgr_->GetCollector() to create and return the MetricHandle/collector (refer to
benchmark_gauge_metric_handle, benchmark_gauge_collector,
MetricsMgr::GetMetricMgrInstance, mgr_, MetricsRegistry, and GetCollector).

In @eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp:
- Around line 17-22: The static globals benchmark_histogram_metric_handle and
benchmark_histogram_collector dereference
MetricsMgr::GetMetricMgrInstance().mgr_ during static initialization which can
be nullptr; change to lazy initialization or add null guards: replace the static
initialization with function-local statics or getter functions (e.g.,
GetBenchmarkHistogramMetricHandle() and GetBenchmarkHistogramCollector()) that
call MetricsMgr::GetMetricMgrInstance(), check that .mgr_ is non-null before
calling MetricsRegistry/GetCollector, and return a safe fallback (nullptr/shared
empty handle) or throw a controlled error if mgr_ is null to avoid dereferencing
during startup.

In @eloq_metrics/README.md:
- Around line 113-114: The README's "thread-safe and lock-free" claim is
incorrect because PrometheusCollector::Collect() reads shared containers
histograms_, gauges_, counters_ without synchronization while
SetMetric()/PutMetricsIfAbsent() can modify them, causing data races and TOCTOU
issues; fix by adding a std::shared_mutex (e.g., metrics_mutex_) to
PrometheusCollector, take a shared_lock in Collect() when reading the maps and a
unique_lock in SetMetric()/PutMetricsIfAbsent() when inserting or mutating
entries, and remove or replace any assert() checks that assume pre-registration
with guarded checks under the same locks; alternatively, if you prefer
documentation-only, remove the "thread-safe and lock-free" sentence and state
that external synchronization is required.

In @store_handler/eloq_data_store_service/eloq_store_data_store.cpp:
- Line 112: The LOG(INFO) debug artifact in EloqStoreDataStore::Initialize
contains the developer prefix "yf: " and should be removed or reworded for
production; update the LOG(INFO) invocation in EloqStoreDataStore::Initialize to
use a concise, production-ready message (e.g., "EloqStoreDataStore::Initialize:
get metric registry") without the "yf: " prefix.
🧹 Nitpick comments (3)
core/src/metrics_init.cpp (1)

182-185: Minor typo in variable name.

The variable kv_common_common_labels contains a redundant "common". Consider renaming to kv_common_labels for consistency with similar patterns elsewhere (e.g., mysql_common_labels, redis_common_labels).

Suggested fix
-        metrics::CommonLabels kv_common_common_labels{};
-        kv_common_common_labels["node_ip"] = network_config_.local_ip;
-        kv_common_common_labels["node_port"] =
+        metrics::CommonLabels kv_common_labels{};
+        kv_common_labels["node_ip"] = network_config_.local_ip;
+        kv_common_labels["node_port"] =
             std::to_string(network_config_.local_port);
-        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
-                                     kv_common_common_labels);
+        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
+                                     kv_common_labels);
eloq_metrics/src/metrics_registry_impl.cpp (2)

59-67: Consider avoiding the intermediate copy.

Line 63 creates a local Metric, then line 65-66 creates a new Metric from it. You could construct directly:

-    auto metric = metrics::Metric(name.GetName(), type, labels);
-
-    return metrics_mgr_result_.mgr_->MetricsRegistry(
-        std::make_unique<metrics::Metric>(metric));
+    return metrics_mgr_result_.mgr_->MetricsRegistry(
+        std::make_unique<metrics::Metric>(name.GetName(), type, labels));

69-77: Good null check, but consider adding observability for failures.

The null check on collector prevents crashes, but silently dropping metrics when the collector is null could hide initialization problems. Consider adding a debug log or incrementing an error counter when this path is taken.

     auto collector = metrics_mgr_result_.mgr_->GetCollector();
     if (collector != nullptr)
     {
         collector->Collect(handle.key, val, handle.type);
     }
+    // Optional: Add debug logging for dropped metrics
+    // else { LOG(DEBUG) << "Metrics collector not available, dropping metric"; }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9836172 and 719cdf3.

📒 Files selected for processing (17)
  • CMakeLists.txt
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/metrics_init.cpp
  • eloq_metrics/README.md
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/include/metrics.h
  • eloq_metrics/include/metrics_manager.h
  • eloq_metrics/include/metrics_registry_impl.h
  • eloq_metrics/src/metrics_manager.cc
  • eloq_metrics/src/metrics_registry_impl.cpp
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
🧰 Additional context used
🧬 Code graph analysis (6)
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
core/src/metrics_init.cpp (1)
eloq_metrics/include/meter.h (1)
  • metrics_registry_ (347-350)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
  • Initialize (95-128)
  • Initialize (95-95)
eloq_metrics/include/metrics_manager.h (2)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (153-160)
eloq_metrics/src/metrics_manager.cc (2)
  • MetricsRegistry (41-47)
  • MetricsRegistry (41-41)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🔇 Additional comments (19)
eloq_metrics/include/meter.h (3)

175-186: LGTM: Consistent refactoring to MetricHandle.

The refactoring from MetricKey to MetricHandle is applied consistently throughout the registration logic. The variable naming (metric_handle) and usage align with the broader API changes in the metrics system.


206-218: LGTM: Collection logic correctly uses MetricHandle.

The collection path has been properly updated to use it->second (the MetricHandle) when calling metrics_registry_->Collect(), which aligns with the new API signature expecting a MetricHandle parameter.


349-349: LGTM: Map type updated to store MetricHandle.

The internal map correctly stores MetricHandle as the value type, completing the refactoring from MetricKey to MetricHandle throughout this class.

eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (2)

17-22: LGTM: Correctly adapted to the new MetricHandle-based API.

The benchmark has been properly updated to:

  1. Store the MetricHandle returned from MetricsRegistry()
  2. Obtain the shared collector via GetCollector()

This aligns with the new API pattern introduced in metrics_manager.h.


66-69: LGTM: Collection call correctly uses MetricHandle fields.

The Collect() call properly uses the three components from the MetricHandle: handle.key, the value, and handle.type. This matches the expected signature after the API refactoring.

eloq_metrics/include/metrics_manager.h (1)

50-55: Verify all call sites have been updated for the breaking API change.

The public API has changed significantly:

  • MetricsRegistry() now returns MetricHandle instead of std::unique_ptr<CollectorWrapper>
  • New GetCollector() method provides access to the shared collector

This is a breaking change that requires all consumers to be updated to the new pattern: obtain a MetricHandle from registration, get the collector via GetCollector(), and use both for collection.

core/include/data_substrate.h (1)

263-263: LGTM!

The addition of RegisterKvStoreMetrics() as a private method and the change from std::unique_ptr to std::shared_ptr for metrics_registry_ are consistent with the refactoring goals. The shared ownership model aligns with MetricsRegistryImpl::GetRegistry() returning a shared_ptr, enabling the registry to be shared across components.

Also applies to: 297-297

core/src/metrics_init.cpp (2)

139-139: LGTM!

The direct assignment from metrics_registry_result.metrics_registry_ to the class member correctly leverages the shared_ptr ownership model.


173-191: LGTM on the overall implementation.

The early-return pattern for disabled metrics or uninitialized registry is appropriate. The function correctly delegates to store_hd_->RegisterKvMetrics() when conditions are met.

store_handler/eloq_data_store_service/eloq_store_data_store.h (1)

169-169: LGTM!

Moving the Initialize() implementation from the header to the .cpp file is a good practice. This properly separates interface from implementation and avoids exposing conditional compilation details (ELOQSTORE_WITH_TXSERVICE) in the header.

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (1)

95-128: Implementation is sound with one caveat about empty labels.

The initialization logic correctly handles the conditional metrics setup. The comment on lines 113-115 notes that node_ip and node_port may not be available at Initialize() time, resulting in empty common labels. This differs from DataSubstrate::RegisterKvStoreMetrics() in metrics_init.cpp which does populate these labels.

Ensure this discrepancy is intentional — if both paths should produce consistent labels, consider aligning them.

core/src/data_substrate.cpp (1)

181-208: LGTM!

The reordered initialization sequence is logically sound:

  1. Phase 2: Initialize metrics infrastructure first
  2. Phase 3: Initialize storage handler (requires metrics to be available)
  3. Phase 4: Register KV store metrics (requires storage handler to be initialized)
  4. Phase 5: Initialize TxService

This ordering correctly addresses the dependency chain where KV store metrics registration depends on both the metrics registry and the storage handler being available.

eloq_metrics/include/metrics.h (2)

74-81: LGTM!

The MetricHandle struct is a well-designed abstraction that bundles the metric key with its type. Deleting the default constructor enforces proper initialization, preventing invalid handles from being created.


157-158: Breaking API change at interface level - but properly encapsulated via Meter adapter.

The MetricsRegistry interface has been changed to return MetricHandle from Register() and accept const MetricHandle& in Collect(). However, this breaking change is internal and properly handled: MetricsRegistryImpl correctly implements the new signatures, and the Meter adapter seamlessly bridges the gap by capturing MetricHandle values from Register() calls and looking them up during Collect() operations. All callers using the public Meter API (tx_service, tx_execution, cc_shard) remain unaffected and require no updates.

eloq_metrics/src/metrics_manager.cc (1)

41-47: LGTM!

The refactoring from CollectorWrapper to MetricHandle aligns with the updated MetricsRegistry interface. The new return value correctly bundles the metric key with its type, enabling type-safe metric collection.

eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

66-75: Collect call updated correctly to new API.

The benchmark correctly uses the new three-argument Collect signature with handle.key, Value, and handle.type. This aligns with the refactored metrics API pattern.

Note: Add a null check for benchmark_histogram_collector if the static initialization issue is addressed with lazy init.

eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

62-71: LGTM!

The benchmark correctly uses the new three-argument Collect signature, consistent with the refactored metrics API.

store_handler/eloq_data_store_service/build_eloq_store.cmake (1)

129-146: Metrics integration looks correct.

The conditional linking and include path setup for WITH_TXSERVICE is well-structured. The comment explaining why TARGET eloq-metrics check isn't used is helpful for maintainability.

Minor observation: Lines 142-143 mention ELOQSTORE_WITH_TXSERVICE is already defined in the parent, but it's added again explicitly. This is safe (duplicates are harmless) but you could consider removing the redundant definition if the parent already handles it.

eloq_metrics/src/metrics_registry_impl.cpp (1)

28-50: Singleton pattern implementation is correct.

The make_registry_shared helper struct is a well-known pattern to use std::make_shared with a private constructor. Static local initialization provides thread-safe lazy initialization (C++11 guarantee).

The shared_ptr return ensures the singleton remains alive while callers hold references.

Comment on lines +17 to +22
static metrics::MetricHandle benchmark_histogram_metric_handle =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
.MetricsRegistry(std::move(benchmark_histogram_metric_ptr));

static std::shared_ptr<metrics::MetricsCollector> benchmark_histogram_collector =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential null pointer dereference during static initialization.

The code dereferences GetMetricMgrInstance().mgr_ without checking if mgr_ is null. Based on the relevant snippet from metrics_manager.cc, GetMetricMgrInstance() can return a result with mgr_ as nullptr when initialization fails (e.g., unsupported collector type or port conflict). This would cause a crash during static initialization.

Since this is benchmark code, the impact is limited, but consider adding a guard or using lazy initialization.

Suggested approach using lazy initialization
-static metrics::MetricHandle benchmark_histogram_metric_handle =
-        (*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
-            .MetricsRegistry(std::move(benchmark_histogram_metric_ptr));
-
-static std::shared_ptr<metrics::MetricsCollector> benchmark_histogram_collector =
-    (*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();
+static metrics::MetricHandle benchmark_histogram_metric_handle;
+static std::shared_ptr<metrics::MetricsCollector> benchmark_histogram_collector;
+
+static void InitMetrics() {
+    static bool initialized = false;
+    if (!initialized) {
+        auto result = metrics::MetricsMgr::GetMetricMgrInstance();
+        if (result.mgr_ != nullptr) {
+            benchmark_histogram_metric_handle = 
+                result.mgr_->MetricsRegistry(std::move(benchmark_histogram_metric_ptr));
+            benchmark_histogram_collector = result.mgr_->GetCollector();
+        }
+        initialized = true;
+    }
+}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp around lines 17 -
22, The static globals benchmark_histogram_metric_handle and
benchmark_histogram_collector dereference
MetricsMgr::GetMetricMgrInstance().mgr_ during static initialization which can
be nullptr; change to lazy initialization or add null guards: replace the static
initialization with function-local statics or getter functions (e.g.,
GetBenchmarkHistogramMetricHandle() and GetBenchmarkHistogramCollector()) that
call MetricsMgr::GetMetricMgrInstance(), check that .mgr_ is non-null before
calling MetricsRegistry/GetCollector, and return a safe fallback (nullptr/shared
empty handle) or throw a controlled error if mgr_ is null to avoid dereferencing
during startup.

Comment thread eloq_metrics/README.md
Comment thread store_handler/eloq_data_store_service/eloq_store_data_store.cpp Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors the metrics repository infrastructure and adds support for registering eloqstore metrics. The refactoring simplifies the metrics API by introducing a MetricHandle struct that encapsulates both the metric key and type, eliminating the need for the CollectorWrapper class. Additionally, it changes the initialization sequence to enable metrics registration for eloqstore components.

Key changes:

  • Introduced MetricHandle struct to replace the wrapper-based approach, simplifying the metrics collection API
  • Changed MetricsRegistry ownership from unique_ptr to shared_ptr to support shared access across components
  • Reordered DataSubstrate initialization to allow metrics registry setup before storage handler initialization, enabling eloqstore to register its metrics during initialization

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
eloq_metrics/include/metrics.h Added MetricHandle struct and updated MetricsRegistry interface to use MetricHandle instead of MetricKey
eloq_metrics/include/metrics_manager.h Removed CollectorWrapper class and updated MetricsRegistry to return MetricHandle; added GetCollector method
eloq_metrics/include/metrics_registry_impl.h Changed registry result to return shared_ptr and updated method signatures to use MetricHandle
eloq_metrics/src/metrics_manager.cc Removed CollectorWrapper implementation and simplified MetricsRegistry to return MetricHandle directly
eloq_metrics/src/metrics_registry_impl.cpp Updated singleton to use shared_ptr, removed collectors_ map, updated Register and Collect methods
eloq_metrics/include/meter.h Updated internal storage from MetricKey to MetricHandle and fixed comment references
eloq_metrics/benchmark/*.cpp Updated benchmarks to use new MetricHandle-based API
store_handler/eloq_data_store_service/eloq_store_data_store.h Changed Initialize from inline implementation to declaration
store_handler/eloq_data_store_service/eloq_store_data_store.cpp Implemented Initialize method to register eloqstore metrics when compiled with ELOQSTORE_WITH_TXSERVICE
store_handler/eloq_data_store_service/build_eloq_store.cmake Added conditional linking and compile definitions for eloq-metrics when WITH_TXSERVICE is enabled
core/src/data_substrate.cpp Reordered initialization sequence: metrics before storage handler, added RegisterKvStoreMetrics phase
core/include/data_substrate.h Changed metrics_registry_ from unique_ptr to shared_ptr and added RegisterKvStoreMetrics method declaration
core/src/metrics_init.cpp Extracted KV store metrics registration into separate RegisterKvStoreMetrics method
CMakeLists.txt Added ELOQSTORE_WITH_TXSERVICE compile definition for eloqstore builds
eloq_metrics/README.md Added note about metrics collection being possible at any time after registration

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread CMakeLists.txt
set(KV_STORAGE_VAL 3 CACHE STRING "eloq_ds" FORCE)
add_compile_definitions(DATA_STORE_TYPE_ELOQDSS_ELOQSTORE)
option(WITH_TXSERVICE "Whether compile eloqstore with txservice" ON)
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)

Copilot AI Jan 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ELOQSTORE_WITH_TXSERVICE compile definition is being added unconditionally when WITH_DATA_STORE is set to "ELOQDSS_ELOQSTORE", but line 129 defines WITH_TXSERVICE as an option that can be ON or OFF. This should be conditional on the WITH_TXSERVICE option to properly handle the case when it's set to OFF.

Suggested change
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)
if(WITH_TXSERVICE)
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)
endif()

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +76
auto collector = metrics_mgr_result_.mgr_->GetCollector();
if (collector != nullptr)
{
collector->Collect(handle.key, val, handle.type);
}

Copilot AI Jan 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Collect method accesses metrics_mgr_result_.mgr_->GetCollector() without checking if metrics_mgr_result_.not_ok_ is null first. If metrics initialization failed (not_ok_ is non-null), the mgr_ pointer could be null, leading to a potential null pointer dereference. Should add a check similar to the one in Register method.

Copilot uses AI. Check for mistakes.
Comment thread core/src/metrics_init.cpp
Comment on lines +182 to +187
metrics::CommonLabels kv_common_common_labels{};
kv_common_common_labels["node_ip"] = network_config_.local_ip;
kv_common_common_labels["node_port"] =
std::to_string(network_config_.local_port);
store_hd_->RegisterKvMetrics(metrics_registry_.get(),
kv_common_common_labels);

Copilot AI Jan 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable name has a duplicated word: "kv_common_common_labels" should likely be "kv_common_labels" for consistency with similar variables like "mysql_common_labels" and "redis_common_labels" used elsewhere in the codebase.

Suggested change
metrics::CommonLabels kv_common_common_labels{};
kv_common_common_labels["node_ip"] = network_config_.local_ip;
kv_common_common_labels["node_port"] =
std::to_string(network_config_.local_port);
store_hd_->RegisterKvMetrics(metrics_registry_.get(),
kv_common_common_labels);
metrics::CommonLabels kv_common_labels{};
kv_common_labels["node_ip"] = network_config_.local_ip;
kv_common_labels["node_port"] =
std::to_string(network_config_.local_port);
store_hd_->RegisterKvMetrics(metrics_registry_.get(),
kv_common_labels);

Copilot uses AI. Check for mistakes.
@lokax lokax linked an issue Jan 8, 2026 that may be closed by this pull request

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp:
- Around line 17-22: The static initializers dereference
MetricsMgr::GetMetricMgrInstance().mgr_ without null checks and call
GetMetricMgrInstance() twice; cache the result into a local variable (e.g., auto
mgrResult = metrics::MetricsMgr::GetMetricMgrInstance()), check mgrResult.mgr_
for nullptr before dereferencing, and only initialize
benchmark_gauge_metric_handle and benchmark_gauge_collector when mgrResult.mgr_
is valid (otherwise set them to safe defaults or leave uninitialized in a
guarded way); reference the symbols benchmark_gauge_metric_handle,
benchmark_gauge_collector, benchmark_gauge_metric_ptr, and
MetricsMgr::GetMetricMgrInstance() when making the changes.

In @eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp:
- Around line 17-23: The static initialization dereferences
MetricsMgr::GetMetricMgrInstance().mgr_ without checking for nullptr and calls
GetMetricMgrInstance() twice; change to call MetricsMgr::GetMetricMgrInstance()
once, cache the result (e.g., auto mgrRes = MetricsMgr::GetMetricMgrInstance()),
check that mgrRes.mgr_ is non-null before using (*(mgrRes.mgr_)), and only then
call MetricsRegistry(...) to initialize benchmark_histogram_metric_handle and
GetCollector() to initialize benchmark_histogram_collector; if mgr_ is null,
initialize these static variables to safe no-op defaults (or leave
null/shared_ptr empty) to avoid startup crash.

In @store_handler/eloq_data_store_service/eloq_store_data_store.cpp:
- Around line 113-114: The log message in EloqStoreDataStore::Initialize
contains a developer debug prefix "yf:"; update the LOG(INFO) message (the
stream after LOG(INFO) in EloqStoreDataStore::Initialize) to remove the "yf:"
prefix so it reads a clean, production-ready message such as
"EloqStoreDataStore::Initialize: get metric registry" without any developer
trace text.
🧹 Nitpick comments (2)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

71-74: Consider if MetricHandle could encapsulate Collect parameters.

The new Collect API requires explicitly passing handle.key, Value, and handle.type separately. Since all three parameters are already part of the MetricHandle, consider whether a simpler API like Collect(handle, Value(1)) would be more ergonomic and reduce verbosity.

eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

1-85: Consider extracting common benchmark initialization pattern.

The initialization pattern in lines 17-22 is duplicated across multiple benchmark files (histogram, gauge, and likely counter based on the AI summary). Consider extracting this pattern into a helper function or template to reduce duplication and ensure consistent error handling across all benchmarks.

💡 Example helper approach

Create a helper in a common header:

// In a common benchmark utilities header
namespace metrics::benchmark {

template<typename MetricType>
struct BenchmarkMetricSetup {
    metrics::MetricHandle handle;
    std::shared_ptr<metrics::MetricsCollector> collector;
    
    static BenchmarkMetricSetup Create(std::unique_ptr<metrics::Metric> metric_ptr) {
        auto mgr_instance = metrics::MetricsMgr::GetMetricMgrInstance();
        if (!mgr_instance.mgr_) {
            std::cerr << "Failed to initialize MetricsMgr" << std::endl;
            std::abort();
        }
        
        return BenchmarkMetricSetup{
            .handle = (*mgr_instance.mgr_).MetricsRegistry(std::move(metric_ptr)),
            .collector = (*mgr_instance.mgr_).GetCollector()
        };
    }
};

} // namespace

Then use it in benchmarks:

static auto benchmark_gauge_setup = 
    metrics::benchmark::BenchmarkMetricSetup<metrics::Type::Gauge>::Create(
        std::move(benchmark_gauge_metric_ptr));
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 719cdf3 and 17f9087.

📒 Files selected for processing (8)
  • core/src/data_substrate.cpp
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/include/metrics.h
  • eloq_metrics/src/metrics_registry_impl.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/data_substrate.cpp
🧰 Additional context used
🧬 Code graph analysis (3)
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
  • Initialize (169-178)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • GetRegistry (28-50)
  • GetRegistry (28-28)
🔇 Additional comments (8)
eloq_metrics/include/metrics.h (2)

74-83: LGTM! Well-designed MetricHandle struct.

The deleted default constructor prevents uninitialized handles, and the converting constructor provides a clean way to create handles from key-type pairs.


159-162: LGTM! API changes are consistent.

The shift from MetricKey to MetricHandle in the MetricsRegistry interface improves type safety by bundling the key with its associated type.

eloq_metrics/include/meter.h (1)

175-184: LGTM! Meter implementation correctly updated for MetricHandle.

The changes consistently replace MetricKey with MetricHandle throughout the meter's internal mapping and collection logic.

Also applies to: 198-217, 349-349

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (1)

95-130: LGTM! Well-structured metrics initialization.

The conditional compilation, error handling, and empty common_labels (with explanation) are all appropriate. The method correctly initializes eloqstore metrics when the registry is available.

eloq_metrics/src/metrics_registry_impl.cpp (3)

33-41: LGTM! Singleton lifetime management improved.

The shift from unique_ptr to shared_ptr enables safe shared ownership of the metrics registry singleton. C++11 guarantees thread-safe initialization of static local variables.


59-67: LGTM! Register implementation is clean.

The method directly returns the MetricHandle from the underlying metrics manager, maintaining a simple and consistent API.


70-78: Silent metrics dropping is intentional graceful degradation.

The null check is part of the design: if the metrics collector fails to initialize (e.g., port conflict), mgr_init_err_ is set to CollectorOpenErr and logged, but the service continues operating without metrics collection. Callers can check the error status via MetricsMgrResult::not_ok_. This is correct behavior.

eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)

17-22: LGTM! Benchmark correctly updated for new API.

The benchmark now uses MetricHandle and a separate MetricsCollector, aligning with the broader refactor away from CollectorWrapper.

Also applies to: 66-69

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @core/src/metrics_init.cpp:
- Around line 185-193: Rename the accidentally duplicated variable name
kv_common_common_labels to kv_common_labels where it is declared and used;
update the declaration line and the RegisterKvMetrics call that passes it
(store_hd_->RegisterKvMetrics(metrics_registry_.get(), kv_common_labels)) so the
variable name is consistent and the redundant "common" is removed.

In @store_handler/eloq_data_store_service/eloq_store_data_store.cpp:
- Around line 113-114: The log message in EloqStoreDataStore::Initialize
contains a debug prefix "yf: " that should be removed; update the LOG(INFO)
invocation in the Initialize method to emit a clean message like
"EloqStoreDataStore::Initialize: get metric registry" (remove the "yf: " prefix)
so the production log contains no developer debug artifact.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c478b0 and 51fd43e.

📒 Files selected for processing (3)
  • core/src/metrics_init.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
  • tx_service/include/tx_service_metrics.h
🧰 Additional context used
🧬 Code graph analysis (1)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
  • Initialize (169-178)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • GetRegistry (28-50)
  • GetRegistry (28-28)
🔇 Additional comments (5)
tx_service/include/tx_service_metrics.h (1)

77-77: LGTM! The new metric toggle flag follows the existing pattern and naming convention perfectly.

The flag is properly integrated into the codebase: it's loaded from configuration in core/src/metrics_init.cpp and used to conditionally initialize metrics in store_handler/eloq_data_store_service/eloq_store_data_store.cpp within the ELOQSTORE_WITH_TXSERVICE compilation guard.

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)

29-32: LGTM!

The conditional includes are properly guarded with #ifdef ELOQSTORE_WITH_TXSERVICE, ensuring the metrics headers are only included when the txservice integration is enabled.


95-130: Pointer lifetime from .get() is safe; verify concurrent call scenarios for InitializeMetrics() if needed.

The raw pointer from .get() is valid during the InitializeMetrics() call since metrics_registry_result.metrics_registry_ (a shared_ptr) persists as a local variable throughout the call.

The method is called during the sequential startup phase in DataSubstrate::Start() (Phase 3), so concurrent invocation is not a concern in the current initialization flow. However, if Initialize() may be called from multiple threads or concurrently in the future, verify that eloq_store_service_->InitializeMetrics() is thread-safe, as there is no synchronization protecting this call.

core/src/metrics_init.cpp (2)

96-100: LGTM!

The new enable_eloqstore_metrics configuration follows the existing pattern and defaults to false, which is a safe default for a new feature.


178-196: LGTM on overall structure.

The method correctly:

  • Early-returns when metrics are disabled or registry is uninitialized
  • Guards registration behind enable_data_store and null check for store_hd_
  • Always returns true since this is optional functionality

Comment thread core/src/metrics_init.cpp
Comment on lines +185 to +193
if (core_config_.enable_data_store && store_hd_ != nullptr)
{
metrics::CommonLabels kv_common_common_labels{};
kv_common_common_labels["node_ip"] = network_config_.local_ip;
kv_common_common_labels["node_port"] =
std::to_string(network_config_.local_port);
store_hd_->RegisterKvMetrics(metrics_registry_.get(),
kv_common_common_labels);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix variable name typo.

The variable kv_common_common_labels has a redundant "common" in the name.

Suggested fix
     if (core_config_.enable_data_store && store_hd_ != nullptr)
     {
-        metrics::CommonLabels kv_common_common_labels{};
-        kv_common_common_labels["node_ip"] = network_config_.local_ip;
-        kv_common_common_labels["node_port"] =
+        metrics::CommonLabels kv_common_labels{};
+        kv_common_labels["node_ip"] = network_config_.local_ip;
+        kv_common_labels["node_port"] =
             std::to_string(network_config_.local_port);
-        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
-                                     kv_common_common_labels);
+        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
+                                     kv_common_labels);
     }
🤖 Prompt for AI Agents
In @core/src/metrics_init.cpp around lines 185 - 193, Rename the accidentally
duplicated variable name kv_common_common_labels to kv_common_labels where it is
declared and used; update the declaration line and the RegisterKvMetrics call
that passes it (store_hd_->RegisterKvMetrics(metrics_registry_.get(),
kv_common_labels)) so the variable name is consistent and the redundant "common"
is removed.

Comment on lines +113 to +114
LOG(INFO)
<< "yf: EloqStoreDataStore::Initialize: get metric registry";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove debug log prefix.

The "yf: " prefix appears to be a developer debug artifact. Clean this up before merging.

Suggested fix
-            LOG(INFO)
-                << "yf: EloqStoreDataStore::Initialize: get metric registry";
+            LOG(INFO) << "EloqStoreDataStore::Initialize: metrics registry "
+                         "acquired, initializing eloqstore metrics";
🤖 Prompt for AI Agents
In @store_handler/eloq_data_store_service/eloq_store_data_store.cpp around lines
113 - 114, The log message in EloqStoreDataStore::Initialize contains a debug
prefix "yf: " that should be removed; update the LOG(INFO) invocation in the
Initialize method to emit a clean message like "EloqStoreDataStore::Initialize:
get metric registry" (remove the "yf: " prefix) so the production log contains
no developer debug artifact.

@lokax lokax linked an issue Jan 12, 2026 that may be closed by this pull request

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @CMakeLists.txt:
- Around line 129-130: The add_compile_definitions(ELOQSTORE_WITH_TXSERVICE) is
applied unconditionally even though the option(WITH_TXSERVICE ...) can be OFF;
change the CMake logic so the compile definition is only added when
WITH_TXSERVICE is true (e.g., wrap
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE) in an if(WITH_TXSERVICE) ...
endif() or use target_compile_definitions conditional on WITH_TXSERVICE) so the
ELOQSTORE_WITH_TXSERVICE macro matches the option state.

In @eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp:
- Around line 17-22: The static initialization dereferences
metrics::MetricsMgr::GetMetricMgrInstance().mgr_ (used to build
benchmark_gauge_metric_handle and benchmark_gauge_collector) which can be null
and cause UB; update the code to avoid direct dereference during static init by
performing a null-check or using a safe accessor that returns a valid manager
before creating benchmark_gauge_metric_handle and benchmark_gauge_collector
(e.g., delay initialization until after MetricsMgr is guaranteed initialized or
call a GetOrCreateManager-like function), and refactor the repeated pattern into
a shared utility function (used by the gauge and histogram benchmarks) that
performs the null-safe retrieval/creation of the manager and then returns the
MetricHandle and MetricsCollector.
🧹 Nitpick comments (3)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (1)

96-129: Consider propagating registry failure as an error when metrics are explicitly enabled.

When enable_eloqstore_metrics is true, a failure to obtain the metrics registry (Lines 106-111) logs a warning but still returns true. If metrics are explicitly enabled, this silent degradation may mask configuration issues.

Consider returning false when registry acquisition fails while metrics are enabled, or document that metrics initialization is best-effort:

💡 Optional: Return false on registry failure when metrics are enabled
         if (metrics_registry_result.not_ok_ != nullptr)
         {
             LOG(WARNING) << "Failed to get metrics registry, skipping "
                             "eloqstore metrics initialization: "
                          << metrics_registry_result.not_ok_;
+            // Optionally: return false if metrics are mandatory
         }
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)

17-22: Consider potential Static Initialization Order Fiasco (SIOF) between these globals.

Both benchmark_counter_metric_handle and benchmark_counter_collector depend on MetricsMgr::GetMetricMgrInstance(), which internally uses a function-local static. This should be safe since both initializations call the same singleton accessor.

However, the initialization of benchmark_counter_metric_handle (Line 17-19) moves ownership of benchmark_counter_metric_ptr (Line 14-15), so ensure the order between Lines 14-15 and 17-19 is guaranteed (it is, since benchmark_counter_metric_ptr is defined first in the same translation unit).

eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

67-75: Consider adding a null check for the collector.

If benchmark_histogram_collector is null due to initialization failure, calling Collect will crash. For benchmark code this may be acceptable, but a guard would make failures more graceful.

♻️ Optional null guard
 static void BM_Histogram_MonoWrapper_Collect(benchmark::State &state)
 {
+    if (!benchmark_histogram_collector)
+    {
+        state.SkipWithError("Metrics collector not initialized");
+        return;
+    }
     for (auto _ : state)
     {
         benchmark_histogram_collector->Collect(
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad2c251 and 9476fbf.

📒 Files selected for processing (18)
  • CMakeLists.txt
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/metrics_init.cpp
  • eloq_metrics/README.md
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/include/metrics.h
  • eloq_metrics/include/metrics_manager.h
  • eloq_metrics/include/metrics_registry_impl.h
  • eloq_metrics/src/metrics_manager.cc
  • eloq_metrics/src/metrics_registry_impl.cpp
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • tx_service/include/tx_service_metrics.h
🚧 Files skipped from review as they are similar to previous changes (3)
  • tx_service/include/tx_service_metrics.h
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • core/src/metrics_init.cpp
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • eloq_metrics/README.md
🧬 Code graph analysis (8)
eloq_metrics/src/metrics_manager.cc (1)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
  • Initialize (169-178)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • GetRegistry (28-50)
  • GetRegistry (28-28)
eloq_metrics/include/metrics_manager.h (2)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
eloq_metrics/src/metrics_manager.cc (2)
  • MetricsRegistry (41-47)
  • MetricsRegistry (41-41)
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
eloq_metrics/include/metrics_registry_impl.h (4)
eloq_metrics/include/metrics.h (2)
  • MetricsRegistry (155-164)
  • Name (38-50)
eloq_metrics/include/meter.h (2)
  • metrics_registry_ (347-350)
  • metrics (32-351)
core/include/data_substrate.h (1)
  • metrics (103-106)
eloq_metrics/src/metrics_registry_impl.cpp (4)
  • Register (59-68)
  • Register (59-62)
  • Collect (70-78)
  • Collect (70-71)
core/include/data_substrate.h (2)
core/src/metrics_init.cpp (2)
  • RegisterKvStoreMetrics (178-196)
  • RegisterKvStoreMetrics (178-178)
eloq_metrics/include/meter.h (2)
  • metrics (32-351)
  • metrics_registry_ (347-350)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
🔇 Additional comments (21)
eloq_metrics/README.md (1)

113-114: Good documentation addition.

The note clearly explains the thread-safety guarantees and the rationale behind the MetricHandle design. This helps users understand they can safely collect metrics concurrently after registration.

store_handler/eloq_data_store_service/build_eloq_store.cmake (1)

129-146: LGTM - conditional metrics integration.

The conditional block correctly gates metrics integration on WITH_TXSERVICE. The redundant ELOQSTORE_WITH_TXSERVICE definition (already set globally in CMakeLists.txt) is acceptable as defensive coding to ensure the target gets the definition regardless of inclusion context.

core/include/data_substrate.h (2)

264-264: LGTM - new private method for KV store metrics registration.

The RegisterKvStoreMetrics() method decouples KV store metrics registration from the general metrics initialization, allowing it to be called after the storage handler is fully initialized.


296-296: Appropriate ownership change to shared_ptr.

Changing from unique_ptr to shared_ptr enables shared ownership of the metrics registry, which aligns with the broader refactoring where the registry may need to be shared across components (e.g., with eloqstore when metrics are enabled). The GetMetricsRegistry() accessor returning a raw pointer remains correct since callers don't take ownership.

eloq_metrics/include/metrics.h (2)

74-83: Well-designed MetricHandle struct.

The MetricHandle encapsulates the metric key and type together, enabling the registry to perform type-safe collection without additional lookups. Deleting the default constructor ensures handles are always properly initialized. The struct is implicitly copyable, which is appropriate for storing handles in containers.


159-162: Breaking API change - properly implemented.

The interface changes from MetricKey to MetricHandle in Register() and Collect() methods are complete. The only implementation (MetricsRegistryImpl) has been updated with the new signatures, and there are no lingering references to the old MetricKey-based API. Callers use the Meter wrapper class which properly handles the MetricHandle abstraction internally.

eloq_metrics/include/meter.h (3)

175-184: LGTM! Clean migration from MetricKey to MetricHandle.

The registration logic correctly stores the MetricHandle returned by the registry, and the map is updated to use the handle directly for efficient collection.


210-217: LGTM! Proper use of stored MetricHandle for collection.

The iterator-based access (it->second) correctly retrieves the stored MetricHandle and passes it to the registry's Collect method. The defensive check before the assertion prevents crashes from missing meter keys.


349-349: LGTM! Map type correctly updated to store MetricHandle.

The type change from Map<MeterKey, MetricKey> to Map<MeterKey, MetricHandle> aligns with the handle-based metrics API.

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)

29-33: LGTM! Conditional includes properly guarded.

The metrics-related headers are correctly guarded by ELOQSTORE_WITH_TXSERVICE, ensuring clean compilation when txservice integration is disabled.


114-123: This design is intentional and properly documented.

The code correctly passes empty CommonLabels{} because node_ip and node_port are unavailable at initialization time. The comment clearly states that EloqStore::InitializeMetrics() will add the shard_id label, and this pattern is consistent with how CommonLabels is used elsewhere in the codebase (e.g., register_mysql_metrics accepts empty CommonLabels as a default parameter). No changes needed.

core/src/data_substrate.cpp (3)

173-179: LGTM! Metrics initialization correctly moved before storage handler.

Moving metrics initialization to Phase 2 ensures the registry is available when the storage handler (and eloqstore) attempts to register its metrics in Phase 4.


196-201: LGTM! TxService initialization remains as final phase.

Phase 5 correctly initializes TxService after all dependencies (metrics, storage, KV metrics) are ready.


181-194: RegisterKvStoreMetrics() already handles disabled metrics and missing storage gracefully.

The implementation always returns true: it returns true early if metrics are disabled or the registry is uninitialized, and returns true at the end regardless of whether enable_data_store is false or the storage handler is missing. Phase 4 cannot fail due to this method.

Likely an incorrect or invalid review comment.

eloq_metrics/src/metrics_manager.cc (1)

41-47: LGTM! Clean API simplification returning MetricHandle directly.

Replacing std::unique_ptr<CollectorWrapper> with the lightweight MetricHandle struct avoids unnecessary heap allocations and provides a cleaner API. The handle encapsulates both the key and type needed for subsequent Collect calls.

eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)

66-69: LGTM! Benchmark correctly uses the new handle-based Collect API.

The benchmark now exercises the intended collection path using MetricHandle and the shared collector, aligning with the refactored metrics API.

eloq_metrics/include/metrics_manager.h (1)

50-55: LGTM! Clean API migration to handle-based metrics.

The transition from CollectorWrapper to MetricHandle and the addition of GetCollector() accessor aligns well with the broader refactoring pattern. Returning std::shared_ptr by value is appropriate for thread-safe sharing.

eloq_metrics/src/metrics_registry_impl.cpp (2)

28-50: LGTM! Proper singleton pattern with shared_ptr.

The make_registry_shared helper struct idiom correctly enables std::make_shared with a private constructor. Static local initialization provides thread-safe lazy initialization in C++11 and later.


70-78: Good defensive null check on collector.

The null check before calling Collect at line 74 handles the case where the metrics manager failed to initialize properly.

eloq_metrics/include/metrics_registry_impl.h (1)

33-50: LGTM! Header correctly reflects the API migration.

The changes properly align with the base MetricsRegistry interface (from metrics.h):

  • Register returns MetricHandle
  • Collect takes const MetricHandle&
  • MetricsRegistryResult uses shared_ptr for shared ownership
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

62-69: LGTM on the API usage, but add null guard for robustness.

The Collect call correctly uses the new handle-based API. Same optional improvement as histogram: add a null check on the collector for graceful failure handling.

Comment thread CMakeLists.txt
Comment on lines 129 to +130
option(WITH_TXSERVICE "Whether compile eloqstore with txservice" ON)
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential issue: compile definition added unconditionally.

ELOQSTORE_WITH_TXSERVICE is added unconditionally, but WITH_TXSERVICE is an option that can be disabled. This could lead to inconsistent behavior where the compile definition is present even when WITH_TXSERVICE=OFF.

Suggested fix
     option(WITH_TXSERVICE "Whether compile eloqstore with txservice" ON)
-    add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)
+    if(WITH_TXSERVICE)
+        add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)
+    endif()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
option(WITH_TXSERVICE "Whether compile eloqstore with txservice" ON)
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)
option(WITH_TXSERVICE "Whether compile eloqstore with txservice" ON)
if(WITH_TXSERVICE)
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE)
endif()
🤖 Prompt for AI Agents
In @CMakeLists.txt around lines 129 - 130, The
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE) is applied unconditionally
even though the option(WITH_TXSERVICE ...) can be OFF; change the CMake logic so
the compile definition is only added when WITH_TXSERVICE is true (e.g., wrap
add_compile_definitions(ELOQSTORE_WITH_TXSERVICE) in an if(WITH_TXSERVICE) ...
endif() or use target_compile_definitions conditional on WITH_TXSERVICE) so the
ELOQSTORE_WITH_TXSERVICE macro matches the option state.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
eloq_metrics/src/metrics_manager.cc (1)

75-77: Missing null terminator causes undefined behavior.

The buffer is allocated with err_msg.size() which doesn't include space for the null terminator, and std::copy doesn't append one. When msg_char_ptr is used as a C-string elsewhere, this causes buffer over-read (undefined behavior).

🐛 Proposed fix to allocate buffer with null terminator
-        char *msg_char_ptr = new char[err_msg.size()];
-        std::copy(err_msg.begin(), err_msg.end(), msg_char_ptr);
+        char *msg_char_ptr = new char[err_msg.size() + 1];
+        std::copy(err_msg.begin(), err_msg.end(), msg_char_ptr);
+        msg_char_ptr[err_msg.size()] = '\0';

Alternatively, consider using strdup or returning std::string to avoid manual memory management:

-        char *msg_char_ptr = new char[err_msg.size()];
-        std::copy(err_msg.begin(), err_msg.end(), msg_char_ptr);
+        char *msg_char_ptr = strdup(err_msg.c_str());
eloq_metrics/include/metrics_manager.h (1)

84-90: Undefined behavior: std::transform writes to empty string.

collector_provider_normalize is default-constructed (empty), so writing to .begin() is undefined behavior. The destination must have sufficient capacity.

🐛 Proposed fix
         else
         {
             auto collector_provider_str = std::string(collector_provider);
+            collector_provider_normalize.resize(collector_provider_str.size());
             std::transform(collector_provider_str.begin(),
                            collector_provider_str.end(),
                            collector_provider_normalize.begin(),
                            ::tolower);
         }

Alternatively, use a simpler approach:

else
{
    collector_provider_normalize = std::string(collector_provider);
    std::transform(collector_provider_normalize.begin(),
                   collector_provider_normalize.end(),
                   collector_provider_normalize.begin(),
                   ::tolower);
}
🤖 Fix all issues with AI agents
In @eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp:
- Around line 17-23: The static initializers for
benchmark_histogram_metric_handle and benchmark_histogram_collector dereference
MetricsMgr::GetMetricMgrInstance().mgr_ without checking for null, risking
undefined behavior during static init; fix by replacing the inline static
initialization with lazy-initializer functions (e.g., GetHistogramHandle() and
GetHistogramCollector()) that call MetricsMgr::GetMetricMgrInstance(), verify
result.mgr_ != nullptr, and either throw a descriptive exception or return a
safe dummy handle/collector before assigning to the static variables; ensure you
call mgr_->MetricsRegistry(...) and mgr_->GetCollector() only after the null
check and add a brief comment documenting the chosen failure behavior.

In @store_handler/eloq_data_store_service/eloq_store_data_store.cpp:
- Around line 95-128: EloqStoreDataStore::Initialize() currently only sets up
metrics and always returns true, violating the base-class contract that
Initialize() should "return True if connect successfully, otherwise false";
either implement a real connection attempt here and return its success or update
the contract to reflect connection happens in StartDB(). Fix options: (A) move
or call the DB connection logic from StartDB() into
EloqStoreDataStore::Initialize(), perform the connect attempt, and return
true/false based on that result (while still running the existing metrics
initialization via eloq_store_service_->InitializeMetrics), or (B) update the
base-class Initialize() documentation to state that Initialize() only performs
non-connection setup (metrics via metrics::CommonLabels and
eloq_store_service_->InitializeMetrics) and that StartDB() is responsible for
connecting; ensure behavior of StartDB() remains unchanged. Ensure you reference
and adjust any callers/tests that expect Initialize() to indicate connection
success if you choose option A.
🧹 Nitpick comments (2)
eloq_metrics/include/metrics_manager.h (1)

110-110: Unused variable: registry is created but never used.

The prometheus::Registry instance is constructed but appears to serve no purpose. If PrometheusCollector creates its own internal registry, this line should be removed.

♻️ Proposed fix
-        auto registry = std::make_shared<prometheus::Registry>();
-
         metrics_collector_ =
             std::make_shared<PrometheusCollector>(bind_addr, bind_port);
eloq_metrics/src/metrics_registry_impl.cpp (1)

59-68: Minor inefficiency in metric creation.

A Metric is constructed locally, then wrapped in a unique_ptr copy. This can be simplified.

♻️ Optional simplification
 metrics::MetricHandle MetricsRegistryImpl::Register(
     const metrics::Name &name,
     metrics::Type type,
     const metrics::Labels &labels)
 {
-    auto metric = metrics::Metric(name.GetName(), type, labels);
-
     return metrics_mgr_result_.mgr_->MetricsRegistry(
-        std::make_unique<metrics::Metric>(metric));
+        std::make_unique<metrics::Metric>(name.GetName(), type, labels));
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9476fbf and 1e631d9.

📒 Files selected for processing (18)
  • CMakeLists.txt
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/metrics_init.cpp
  • eloq_metrics/README.md
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/include/metrics.h
  • eloq_metrics/include/metrics_manager.h
  • eloq_metrics/include/metrics_registry_impl.h
  • eloq_metrics/src/metrics_manager.cc
  • eloq_metrics/src/metrics_registry_impl.cpp
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • tx_service/include/tx_service_metrics.h
🚧 Files skipped from review as they are similar to previous changes (6)
  • CMakeLists.txt
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • core/src/metrics_init.cpp
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/README.md
🧰 Additional context used
🧬 Code graph analysis (7)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
  • Initialize (169-178)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • GetRegistry (28-50)
  • GetRegistry (28-28)
eloq_metrics/include/metrics_registry_impl.h (2)
eloq_metrics/include/metrics.h (2)
  • MetricsRegistry (155-164)
  • Name (38-50)
eloq_metrics/src/metrics_registry_impl.cpp (4)
  • Register (59-68)
  • Register (59-62)
  • Collect (70-78)
  • Collect (70-71)
eloq_metrics/include/metrics_manager.h (2)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
eloq_metrics/src/metrics_manager.cc (2)
  • MetricsRegistry (41-47)
  • MetricsRegistry (41-41)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
eloq_metrics/src/metrics_manager.cc (1)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
core/include/data_substrate.h (2)
core/src/metrics_init.cpp (2)
  • RegisterKvStoreMetrics (178-196)
  • RegisterKvStoreMetrics (178-178)
eloq_metrics/include/meter.h (2)
  • metrics (32-351)
  • metrics_registry_ (347-350)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
🔇 Additional comments (16)
tx_service/include/tx_service_metrics.h (1)

77-77: LGTM!

The new enable_eloqstore_metrics flag follows the established pattern for inline metric enable flags in this namespace. Default value of false ensures metrics collection is opt-in.

core/include/data_substrate.h (2)

264-264: LGTM!

New private method RegisterKvStoreMetrics() appropriately separates KV store metrics registration from the general InitializeMetrics() call, allowing it to be invoked after storage handler initialization.


296-296: Appropriate change from unique_ptr to shared_ptr for metrics registry sharing.

The change enables external components like store handlers to safely use the metrics registry. The metrics registry is passed as a raw pointer to RegisterKvMetrics(), which creates a global metrics::kv_meter that persists beyond individual DataSubstrate lifecycle events. Using shared_ptr ensures the registry remains valid while it's actively used by these external components across the codebase.

Note: The GetMetricsRegistry() accessor is currently unused in the codebase, but the underlying shared_ptr change is still sound for the component sharing use case.

eloq_metrics/src/metrics_manager.cc (1)

41-47: LGTM - Handle-based API refactor.

The change from returning CollectorWrapper to returning MetricHandle aligns with the new metrics API design shown in the interface definition.

core/src/data_substrate.cpp (1)

173-194: LGTM - Proper phase reordering for metrics initialization.

The new initialization order correctly ensures:

  1. Metrics registry is created before storage handler (Phase 2)
  2. Storage handler is initialized (Phase 3)
  3. KV store metrics are registered after both dependencies are ready (Phase 4)

This dependency ordering is necessary because RegisterKvStoreMetrics() requires both metrics_registry_ and store_hd_ to be initialized.

store_handler/eloq_data_store_service/build_eloq_store.cmake (1)

129-146: LGTM - CMake integration for metrics when built with txservice.

The conditional linking and include path setup follows CMake best practices. The comment explaining why TARGET eloq-metrics cannot be checked is helpful for future maintainers. The relative include path ${CMAKE_CURRENT_LIST_DIR}/../../eloq_metrics/include correctly resolves to the metrics headers directory.

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (1)

29-33: LGTM: Conditional includes properly guarded.

The metrics-related headers are correctly guarded behind the ELOQSTORE_WITH_TXSERVICE preprocessor flag, ensuring clean compilation when the TX service integration is not enabled.

eloq_metrics/include/metrics_manager.h (1)

50-55: LGTM: Clean API surface for handle-based metrics.

The refactored API correctly:

  • Returns MetricHandle from MetricsRegistry() for type-safe metric references
  • Exposes GetCollector() for direct collector access when needed

This aligns with the broader handle-based refactoring across the metrics subsystem.

eloq_metrics/include/metrics.h (2)

74-83: LGTM: Well-designed handle type.

The MetricHandle struct is appropriately designed:

  • Deleted default constructor prevents accidental uninitialized handles
  • Simple value semantics with public fields suitable for a lightweight handle type
  • Bundles the key and type together, ensuring they stay consistent

159-162: LGTM: Interface updated consistently.

The MetricsRegistry interface correctly adopts the handle-based API:

  • Register returns MetricHandle for subsequent collection calls
  • Collect accepts const MetricHandle& maintaining the key-type coupling
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

71-74: LGTM: Correct usage of the new handle-based collection API.

The benchmark correctly extracts key and type from the handle for the Collect call, matching the updated MetricsCollector::Collect signature.

eloq_metrics/src/metrics_registry_impl.cpp (2)

28-50: LGTM: Proper singleton pattern with shared_ptr.

The make_registry_shared local struct pattern correctly enables std::make_shared access to the protected constructor. The shared_ptr-based singleton ensures the registry remains alive as long as any caller holds a reference.


70-78: LGTM: Proper null-check before collection.

The collector null-check ensures graceful handling if the metrics manager failed to initialize. The handle-based Collect call correctly uses handle.key and handle.type.

eloq_metrics/include/metrics_registry_impl.h (2)

35-35: LGTM on the shared_ptr change.

The change from std::unique_ptr to std::shared_ptr for metrics_registry_ enables sharing the registry across multiple components (e.g., DataSubstrate and EloqStore metrics initialization), which aligns with the PR's goal to refactor metrics and enable EloqStore metrics.


46-50: LGTM on the API signature updates.

The updated Register and Collect signatures correctly align with the base class metrics::MetricsRegistry interface that now uses MetricHandle instead of raw MetricKey.

eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

66-68: LGTM on the updated Collect call.

The Collect call correctly uses the new three-argument form with handle.key, Value, and handle.type, aligning with the refactored MetricsCollector::Collect API.

Comment on lines +17 to +23
static metrics::MetricHandle benchmark_histogram_metric_handle =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
.MetricsRegistry(std::move(benchmark_histogram_metric_ptr));

static std::shared_ptr<metrics::MetricsCollector>
benchmark_histogram_collector =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential null dereference during static initialization.

Both static initializers dereference GetMetricMgrInstance().mgr_ without checking if it's null. If MetricsMgr initialization fails (e.g., port conflict), mgr_ will be nullptr, causing undefined behavior at program startup.

🐛 Suggested safer pattern

Consider lazy initialization or explicit null checks:

static metrics::MetricHandle GetHistogramHandle()
{
    auto result = metrics::MetricsMgr::GetMetricMgrInstance();
    if (result.mgr_ == nullptr)
    {
        // Handle error or use a dummy handle
        throw std::runtime_error("MetricsMgr initialization failed");
    }
    return result.mgr_->MetricsRegistry(
        std::make_unique<metrics::Metric>(benchmark_histogram_metric));
}

static metrics::MetricHandle benchmark_histogram_metric_handle = 
    GetHistogramHandle();

Alternatively, if this is acceptable for benchmarks (crash on misconfiguration), add a comment documenting this assumption.

🤖 Prompt for AI Agents
In @eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp around lines 17 -
23, The static initializers for benchmark_histogram_metric_handle and
benchmark_histogram_collector dereference
MetricsMgr::GetMetricMgrInstance().mgr_ without checking for null, risking
undefined behavior during static init; fix by replacing the inline static
initialization with lazy-initializer functions (e.g., GetHistogramHandle() and
GetHistogramCollector()) that call MetricsMgr::GetMetricMgrInstance(), verify
result.mgr_ != nullptr, and either throw a descriptive exception or return a
safe dummy handle/collector before assigning to the static variables; ensure you
call mgr_->MetricsRegistry(...) and mgr_->GetCollector() only after the null
check and add a brief comment documenting the chosen failure behavior.

Comment on lines +95 to +128
bool EloqStoreDataStore::Initialize()
{
#ifdef ELOQSTORE_WITH_TXSERVICE
if (metrics::enable_eloqstore_metrics)
{
// Initialize metrics for eloqstore
eloq_metrics_app::MetricsRegistryImpl::MetricsRegistryResult
metrics_registry_result =
eloq_metrics_app::MetricsRegistryImpl::GetRegistry();

if (metrics_registry_result.not_ok_ != nullptr)
{
LOG(WARNING) << "Failed to get metrics registry, skipping "
"eloqstore metrics initialization: "
<< metrics_registry_result.not_ok_;
}
else if (metrics_registry_result.metrics_registry_ != nullptr)
{
// Create common labels
// Note: node_ip and node_port may not be available at Initialize()
// time For now, we use empty labels. The shard_id label will be
// added by EloqStore::InitializeMetrics()
metrics::CommonLabels common_labels{};

// Call eloqstore's metrics initialization
// This will initialize metrics for all shards with shard_id labels
eloq_store_service_->InitializeMetrics(
metrics_registry_result.metrics_registry_.get(), common_labels);
}
}
#endif

return true;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the Initialize() contract in the base DataStore class
ast-grep --pattern $'class DataStore {
  $$$
  virtual bool Initialize() $$$
  $$$
}'

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Search for DataStore class definition
rg "class DataStore" --type cpp --type h -A 5

Repository: eloqdata/tx_service

Length of output: 8641


🏁 Script executed:

#!/bin/bash
# Read the DataStore class header file to find Initialize() method
cat store_handler/eloq_data_store_service/data_store.h

Repository: eloqdata/tx_service

Length of output: 4282


🏁 Script executed:

#!/bin/bash
# Find and read the EloqStoreDataStore::Initialize() method in full
rg "bool EloqStoreDataStore::Initialize" -A 50 store_handler/eloq_data_store_service/eloq_store_data_store.cpp

Repository: eloqdata/tx_service

Length of output: 1941


🏁 Script executed:

#!/bin/bash
# Search for StartDB implementation and EloqStore class
rg "bool EloqStoreDataStore::StartDB" -A 30 store_handler/eloq_data_store_service/eloq_store_data_store.cpp

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Find EloqStoreDataStore class header file
fd "eloq_store_data_store.h" -type f

Repository: eloqdata/tx_service

Length of output: 233


🏁 Script executed:

#!/bin/bash
# Find EloqStoreDataStore class header file
fd "eloq_store_data_store.h"

Repository: eloqdata/tx_service

Length of output: 124


🏁 Script executed:

#!/bin/bash
# Read the EloqStoreDataStore class header
cat store_handler/eloq_data_store_service/eloq_store_data_store.h

Repository: eloqdata/tx_service

Length of output: 7470


The Initialize() implementation violates the base class contract.

The base class documents Initialize() to "return True if connect successfully, otherwise false," but the implementation:

  • Never attempts a database connection
  • Only performs metrics initialization
  • Always returns true

The actual connection is deferred to StartDB(). Either:

  1. Initialize() should attempt the database connection and return the result accordingly, or
  2. The base class contract documentation needs updating to reflect that connection is handled by StartDB()

Clarify the intended contract and align the implementation accordingly.

🤖 Prompt for AI Agents
In @store_handler/eloq_data_store_service/eloq_store_data_store.cpp around lines
95 - 128, EloqStoreDataStore::Initialize() currently only sets up metrics and
always returns true, violating the base-class contract that Initialize() should
"return True if connect successfully, otherwise false"; either implement a real
connection attempt here and return its success or update the contract to reflect
connection happens in StartDB(). Fix options: (A) move or call the DB connection
logic from StartDB() into EloqStoreDataStore::Initialize(), perform the connect
attempt, and return true/false based on that result (while still running the
existing metrics initialization via eloq_store_service_->InitializeMetrics), or
(B) update the base-class Initialize() documentation to state that Initialize()
only performs non-connection setup (metrics via metrics::CommonLabels and
eloq_store_service_->InitializeMetrics) and that StartDB() is responsible for
connecting; ensure behavior of StartDB() remains unchanged. Ensure you reference
and adjust any callers/tests that expect Initialize() to indicate connection
success if you choose option A.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp:
- Around line 17-22: The static initialization currently dereferences
metrics::MetricsMgr::GetMetricMgrInstance()->mgr_ when initializing
benchmark_gauge_metric_handle and benchmark_gauge_collector, risking a crash if
GetMetricMgrInstance() returns a null mgr_; change to lazy or guarded
initialization: replace the static direct dereferences with functions or local
statics that call MetricsMgr::GetMetricMgrInstance(), check that the returned
instance and its mgr_ are non-null, handle error cases (e.g., return an
empty/invalid MetricHandle or nullptr collector or log/throw) and only call
MetricsRegistry and GetCollector when mgr_ is valid; ensure symbols to update
are benchmark_gauge_metric_handle, benchmark_gauge_collector, MetricsRegistry,
GetCollector and the call site(s) now use the safe accessor or lazy-initializer.
🧹 Nitpick comments (1)
core/src/metrics_init.cpp (1)

187-192: Minor naming issue: kv_common_common_labels has redundant "common".

The variable name appears to have a typo with duplicated "common".

✏️ Suggested fix
-        metrics::CommonLabels kv_common_common_labels{};
-        kv_common_common_labels["node_ip"] = network_config_.local_ip;
-        kv_common_common_labels["node_port"] =
+        metrics::CommonLabels kv_common_labels{};
+        kv_common_labels["node_ip"] = network_config_.local_ip;
+        kv_common_labels["node_port"] =
             std::to_string(network_config_.local_port);
-        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
-                                     kv_common_common_labels);
+        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
+                                     kv_common_labels);
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e631d9 and a17d28a.

📒 Files selected for processing (18)
  • CMakeLists.txt
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/metrics_init.cpp
  • eloq_metrics/README.md
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/include/metrics.h
  • eloq_metrics/include/metrics_manager.h
  • eloq_metrics/include/metrics_registry_impl.h
  • eloq_metrics/src/metrics_manager.cc
  • eloq_metrics/src/metrics_registry_impl.cpp
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • tx_service/include/tx_service_metrics.h
✅ Files skipped from review due to trivial changes (1)
  • eloq_metrics/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • CMakeLists.txt
  • core/src/data_substrate.cpp
  • tx_service/include/tx_service_metrics.h
🧰 Additional context used
🧬 Code graph analysis (10)
eloq_metrics/src/metrics_manager.cc (1)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
core/src/metrics_init.cpp (2)
store_handler/eloq_data_store_service/main.cpp (1)
  • config_reader (163-163)
eloq_metrics/include/meter.h (1)
  • metrics_registry_ (347-350)
eloq_metrics/include/metrics_registry_impl.h (4)
eloq_metrics/include/metrics.h (2)
  • MetricsRegistry (155-164)
  • Name (38-50)
eloq_metrics/src/metrics_manager.cc (2)
  • MetricsRegistry (41-47)
  • MetricsRegistry (41-41)
core/include/data_substrate.h (1)
  • metrics (103-106)
eloq_metrics/src/metrics_registry_impl.cpp (4)
  • Register (59-68)
  • Register (59-62)
  • Collect (70-78)
  • Collect (70-71)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
  • Initialize (95-128)
  • Initialize (95-95)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
  • Initialize (169-178)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • GetRegistry (28-50)
  • GetRegistry (28-28)
eloq_metrics/include/metrics.h (2)
eloq_metrics/src/metrics_registry_impl.cpp (4)
  • Register (59-68)
  • Register (59-62)
  • Collect (70-78)
  • Collect (70-71)
eloq_metrics/include/meter.h (1)
  • metrics (32-351)
eloq_metrics/include/metrics_manager.h (2)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
eloq_metrics/src/metrics_manager.cc (2)
  • MetricsRegistry (41-47)
  • MetricsRegistry (41-41)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
core/include/data_substrate.h (2)
core/src/metrics_init.cpp (2)
  • RegisterKvStoreMetrics (178-196)
  • RegisterKvStoreMetrics (178-178)
eloq_metrics/include/meter.h (2)
  • metrics (32-351)
  • metrics_registry_ (347-350)
🔇 Additional comments (23)
core/include/data_substrate.h (2)

296-296: LGTM!

The change from std::unique_ptr to std::shared_ptr for metrics_registry_ is appropriate for sharing the registry across components, as seen in MetricsRegistryImpl::GetRegistry() which returns a shared_ptr.


264-264: No action needed — RegisterKvStoreMetrics() visibility is correct.

The method is correctly declared as private at line 264 and is only called internally during DataSubstrate::Start() (line 190 in data_substrate.cpp) as part of the initialization sequence. The AI summary's characterization as a "public method" was inaccurate; the actual code is fine.

Likely an incorrect or invalid review comment.

core/src/metrics_init.cpp (1)

96-100: LGTM!

The new enable_eloqstore_metrics configuration flag follows the same pattern as other metrics flags in this file, with a conservative default of false.

eloq_metrics/include/metrics_manager.h (1)

50-55: LGTM on API changes!

The switch to returning MetricHandle and adding GetCollector() accessor aligns with the handle-based API design seen in eloq_metrics/include/metrics.h.

store_handler/eloq_data_store_service/eloq_store_data_store.h (1)

169-169: LGTM!

Moving the Initialize() implementation to the .cpp file is appropriate since it now contains conditional compilation logic for ELOQSTORE_WITH_TXSERVICE metrics initialization.

eloq_metrics/src/metrics_manager.cc (1)

41-47: LGTM!

The refactored MetricsRegistry method correctly returns a MetricHandle directly instead of wrapping in CollectorWrapper. This aligns with the updated MetricsRegistry interface in eloq_metrics/include/metrics.h where Register returns MetricHandle and Collect accepts MetricHandle.

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)

29-33: LGTM on conditional includes.

The conditional includes for metrics headers under ELOQSTORE_WITH_TXSERVICE are properly structured and include the necessary headers for the metrics initialization in the Initialize() method.


95-128: LGTM on Initialize() implementation with one minor suggestion.

The metrics initialization logic is well-structured with proper error handling. The method correctly:

  1. Checks the enable_eloqstore_metrics flag before proceeding
  2. Handles the case where registry retrieval fails with a warning log
  3. Initializes metrics only when a valid registry is available
  4. Returns true regardless of metrics success (metrics are non-critical)

One minor observation: The conditional check on line 111 (else if (metrics_registry_result.metrics_registry_ != nullptr)) is redundant since if not_ok_ is nullptr, the registry should always be valid based on GetRegistry() implementation. However, defensive programming here is acceptable.

eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (2)

17-23: LGTM on MetricHandle API migration.

The refactoring correctly migrates to the new handle-based API:

  • MetricHandle stores both key and type from registration
  • GetCollector() provides the shared collector instance
  • Both calls to GetMetricMgrInstance() are safe as it returns a singleton

67-76: LGTM on updated Collect call.

The Collect invocation correctly uses the new signature with handle.key, metrics::Value(1), and handle.type, matching the updated MetricsCollector::Collect interface.

eloq_metrics/include/metrics.h (2)

74-83: Clean design for MetricHandle.

The MetricHandle struct effectively bundles MetricKey and Type together, simplifying the API by:

  1. Eliminating the need for callers to track the type separately after registration
  2. Ensuring proper initialization by deleting the default constructor
  3. Making the collection path type-safe

155-164: LGTM on MetricsRegistry interface update.

The interface changes are consistent and well-designed:

  • Register returning MetricHandle provides both key and type to the caller in one operation
  • Collect accepting const MetricHandle& simplifies the call site and ensures type consistency
eloq_metrics/include/meter.h (3)

173-186: LGTM on registration flow update.

The registration flow correctly:

  1. Captures the MetricHandle returned from metrics_registry_->Register()
  2. Stores the handle (containing both key and type) in meter_to_metric_map_
  3. Uses std::move for efficient transfer

202-218: LGTM on Collect implementation.

The Collect method correctly retrieves the MetricHandle from the map and passes it to metrics_registry_->Collect(). The defensive check on line 214 appropriately prevents crashes from missing meter keys while the assertion on line 211 aids debugging.


349-349: LGTM on map type change.

The type change from Map<MeterKey, MetricKey> to Map<MeterKey, MetricHandle> aligns with the new API where registration returns a handle containing both key and type.

eloq_metrics/src/metrics_registry_impl.cpp (3)

28-50: LGTM on singleton pattern with shared_ptr.

The implementation correctly uses std::shared_ptr for the singleton:

  1. The make_registry_shared struct trick is a well-known pattern to enable make_shared with private constructors
  2. The static shared_ptr ensures thread-safe initialization (C++11 guarantees)
  3. Returning shared_ptr in MetricsRegistryResult ensures the singleton remains alive while in use

59-68: LGTM on Register returning MetricHandle.

The Register method correctly creates a Metric and delegates to the manager's MetricsRegistry method, which returns the MetricHandle containing both key and type.


70-78: LGTM on Collect with MetricHandle.

The implementation correctly:

  1. Retrieves the collector via GetCollector()
  2. Performs a null check before calling Collect
  3. Extracts key and type from the handle for the underlying collection
eloq_metrics/include/metrics_registry_impl.h (2)

33-37: LGTM on MetricsRegistryResult with shared_ptr.

The change from unique_ptr to shared_ptr in MetricsRegistryResult correctly supports the new lifetime management pattern where callers can hold references to the singleton registry.


45-50: LGTM on interface declarations.

The Register and Collect method declarations correctly match:

  1. The base class MetricsRegistry interface in metrics.h
  2. The implementations in metrics_registry_impl.cpp
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (2)

17-22: LGTM on MetricHandle API migration.

The refactoring follows the same correct pattern as the histogram benchmark:

  • Registration returns MetricHandle with both key and type
  • Collector is obtained via GetCollector() from the singleton manager

62-71: LGTM on updated Collect call.

The benchmark correctly uses the new Collect signature with handle.key, metrics::Value(1), and handle.type.

eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

66-68: LGTM!

The Collect call is correctly updated to use the new three-argument API with benchmark_gauge_metric_handle.key, metrics::Value(1), and benchmark_gauge_metric_handle.type.

Comment on lines +17 to +22
static metrics::MetricHandle benchmark_gauge_metric_handle =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
.MetricsRegistry(std::move(benchmark_gauge_metric_ptr));

static std::shared_ptr<metrics::MetricsCollector> benchmark_gauge_collector =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential null pointer dereference during static initialization.

GetMetricMgrInstance() can return nullptr for mgr_ on initialization errors (e.g., UnSupportMetricCollector, CollectorOpenErr). Dereferencing without a null check will crash the benchmark at startup.

Consider adding defensive checks or using a lazy initialization pattern that can handle errors gracefully.

Proposed fix
-static metrics::MetricHandle benchmark_gauge_metric_handle =
-    (*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
-        .MetricsRegistry(std::move(benchmark_gauge_metric_ptr));
-
-static std::shared_ptr<metrics::MetricsCollector> benchmark_gauge_collector =
-    (*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();
+static metrics::MetricHandle benchmark_gauge_metric_handle = []() {
+    auto result = metrics::MetricsMgr::GetMetricMgrInstance();
+    if (result.mgr_ == nullptr)
+    {
+        throw std::runtime_error(result.err_msg_ ? result.err_msg_ : "MetricsMgr initialization failed");
+    }
+    return result.mgr_->MetricsRegistry(std::move(benchmark_gauge_metric_ptr));
+}();
+
+static std::shared_ptr<metrics::MetricsCollector> benchmark_gauge_collector = []() {
+    auto result = metrics::MetricsMgr::GetMetricMgrInstance();
+    if (result.mgr_ == nullptr)
+    {
+        throw std::runtime_error(result.err_msg_ ? result.err_msg_ : "MetricsMgr initialization failed");
+    }
+    return result.mgr_->GetCollector();
+}();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static metrics::MetricHandle benchmark_gauge_metric_handle =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
.MetricsRegistry(std::move(benchmark_gauge_metric_ptr));
static std::shared_ptr<metrics::MetricsCollector> benchmark_gauge_collector =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();
static metrics::MetricHandle benchmark_gauge_metric_handle = []() {
auto result = metrics::MetricsMgr::GetMetricMgrInstance();
if (result.mgr_ == nullptr)
{
throw std::runtime_error(result.err_msg_ ? result.err_msg_ : "MetricsMgr initialization failed");
}
return result.mgr_->MetricsRegistry(std::move(benchmark_gauge_metric_ptr));
}();
static std::shared_ptr<metrics::MetricsCollector> benchmark_gauge_collector = []() {
auto result = metrics::MetricsMgr::GetMetricMgrInstance();
if (result.mgr_ == nullptr)
{
throw std::runtime_error(result.err_msg_ ? result.err_msg_ : "MetricsMgr initialization failed");
}
return result.mgr_->GetCollector();
}();
🤖 Prompt for AI Agents
In @eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp around lines 17 - 22,
The static initialization currently dereferences
metrics::MetricsMgr::GetMetricMgrInstance()->mgr_ when initializing
benchmark_gauge_metric_handle and benchmark_gauge_collector, risking a crash if
GetMetricMgrInstance() returns a null mgr_; change to lazy or guarded
initialization: replace the static direct dereferences with functions or local
statics that call MetricsMgr::GetMetricMgrInstance(), check that the returned
instance and its mgr_ are non-null, handle error cases (e.g., return an
empty/invalid MetricHandle or nullptr collector or log/throw) and only call
MetricsRegistry and GetCollector when mgr_ is valid; ensure symbols to update
are benchmark_gauge_metric_handle, benchmark_gauge_collector, MetricsRegistry,
GetCollector and the call site(s) now use the safe accessor or lazy-initializer.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@eloq_metrics/benchmark/mono_metrics_counter_bench.cpp`:
- Around line 17-22: The static initialization dereferences
MetricsMgr::GetMetricMgrInstance().mgr_ and assumes MetricsRegistry/GetCollector
succeed (affecting symbols benchmark_counter_metric_handle and
benchmark_counter_collector); instead call MetricsMgr::GetMetricMgrInstance()
once into a local MetricsMgrResult (or similar) variable, validate that mgr_ and
returned collector are non-null before using them, and if validation fails set
the static handles/collector to a safe null/empty state and make the benchmark
skip or early-return when collector is unavailable (apply the same defensive
pattern to the other static inits around benchmark setup and the code at the
second occurrence).
♻️ Duplicate comments (5)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (1)

95-128: Previous concern about base class contract still applies.

The metrics initialization logic is well-structured with appropriate null checks and warning logs. However, note that the base class contract concern from previous reviews (Initialize should indicate connection success) remains relevant. The method always returns true regardless of metrics initialization outcome.

If this is intentional (metrics initialization is non-critical), consider adding a brief comment documenting this design decision.

eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

17-23: Null pointer dereference risk during static initialization persists.

Both static initializers dereference GetMetricMgrInstance().mgr_ without null checks. As noted in previous reviews, if MetricsMgr initialization fails (e.g., port conflict, unsupported collector), mgr_ will be nullptr, causing undefined behavior at program startup.

Since this is benchmark code, consider either:

  1. Adding explicit null checks with lazy initialization as previously suggested
  2. Adding a comment documenting that this assumes successful MetricsMgr initialization
core/src/metrics_init.cpp (1)

187-192: Fix duplicated word in variable name.

🧹 Proposed rename
-        metrics::CommonLabels kv_common_common_labels{};
-        kv_common_common_labels["node_ip"] = network_config_.local_ip;
-        kv_common_common_labels["node_port"] =
+        metrics::CommonLabels kv_common_labels{};
+        kv_common_labels["node_ip"] = network_config_.local_ip;
+        kv_common_labels["node_port"] =
             std::to_string(network_config_.local_port);
-        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
-                                     kv_common_common_labels);
+        store_hd_->RegisterKvMetrics(metrics_registry_.get(),
+                                     kv_common_labels);
eloq_metrics/src/metrics_registry_impl.cpp (1)

59-77: Guard against null MetricsMgr in Register/Collect.
If metrics_mgr_result_.not_ok_ is set, mgr_ can be null, and both Register and Collect will dereference it. Please add a guard (or a fatal check) to avoid crashes.

🛡️ Suggested guard
 metrics::MetricHandle MetricsRegistryImpl::Register(
     const metrics::Name &name,
     metrics::Type type,
     const metrics::Labels &labels)
 {
+    if (metrics_mgr_result_.not_ok_ != nullptr ||
+        metrics_mgr_result_.mgr_ == nullptr)
+    {
+        return metrics::MetricHandle{0, type};
+    }
     auto metric = metrics::Metric(name.GetName(), type, labels);

     return metrics_mgr_result_.mgr_->MetricsRegistry(
         std::make_unique<metrics::Metric>(metric));
 }

 void MetricsRegistryImpl::Collect(const metrics::MetricHandle &handle,
                                   const metrics::Value &val)
 {
+    if (metrics_mgr_result_.not_ok_ != nullptr ||
+        metrics_mgr_result_.mgr_ == nullptr)
+    {
+        return;
+    }
     auto collector = metrics_mgr_result_.mgr_->GetCollector();
     if (collector != nullptr)
     {
         collector->Collect(handle.key, val, handle.type);
     }
 }
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

17-22: Guard MetricsMgr initialization before dereference (duplicate).

Line 17-22 still dereferences GetMetricMgrInstance().mgr_ during static initialization. This repeats the prior concern about a possible null return on init failure.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a17d28a and 01539e2.

📒 Files selected for processing (18)
  • CMakeLists.txt
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/metrics_init.cpp
  • eloq_metrics/README.md
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/meter.h
  • eloq_metrics/include/metrics.h
  • eloq_metrics/include/metrics_manager.h
  • eloq_metrics/include/metrics_registry_impl.h
  • eloq_metrics/src/metrics_manager.cc
  • eloq_metrics/src/metrics_registry_impl.cpp
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • store_handler/eloq_data_store_service/eloq_store_data_store.cpp
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • tx_service/include/tx_service_metrics.h
🚧 Files skipped from review as they are similar to previous changes (7)
  • eloq_metrics/README.md
  • tx_service/include/tx_service_metrics.h
  • store_handler/eloq_data_store_service/eloq_store_data_store.h
  • CMakeLists.txt
  • store_handler/eloq_data_store_service/build_eloq_store.cmake
  • core/src/data_substrate.cpp
  • eloq_metrics/include/metrics.h
🧰 Additional context used
🧬 Code graph analysis (8)
eloq_metrics/src/metrics_manager.cc (1)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
core/src/metrics_init.cpp (2)
store_handler/eloq_data_store_service/main.cpp (1)
  • config_reader (163-163)
eloq_metrics/include/meter.h (1)
  • metrics_registry_ (347-350)
store_handler/eloq_data_store_service/eloq_store_data_store.cpp (2)
store_handler/eloq_data_store_service/eloq_store_data_store.h (1)
  • Initialize (169-178)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • GetRegistry (28-50)
  • GetRegistry (28-28)
eloq_metrics/include/metrics_manager.h (3)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (155-164)
eloq_metrics/src/metrics_manager.cc (2)
  • MetricsRegistry (41-47)
  • MetricsRegistry (41-41)
core/include/redis_metrics.h (1)
  • unique_ptr (47-68)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (49-79)
  • GetMetricMgrInstance (49-49)
core/include/data_substrate.h (2)
core/src/metrics_init.cpp (2)
  • RegisterKvStoreMetrics (178-196)
  • RegisterKvStoreMetrics (178-178)
eloq_metrics/include/meter.h (2)
  • metrics (32-351)
  • metrics_registry_ (347-350)
🔇 Additional comments (13)
eloq_metrics/src/metrics_manager.cc (1)

41-47: LGTM! Clean API simplification.

The return type change from std::unique_ptr<CollectorWrapper> to MetricHandle aligns well with the new metrics interface pattern. The order of operations correctly captures metric_type before the unique_ptr is moved to SetMetric.

eloq_metrics/include/metrics_manager.h (1)

50-55: LGTM! Well-designed API surface.

The new GetCollector() method appropriately exposes the internal MetricsCollector via shared_ptr, enabling external code (benchmarks, store integrations) to perform metric collection while maintaining safe shared ownership semantics.

store_handler/eloq_data_store_service/eloq_store_data_store.cpp (1)

29-33: LGTM! Appropriate conditional compilation.

The metrics-related includes are correctly guarded by ELOQSTORE_WITH_TXSERVICE, ensuring the build doesn't fail when the txservice integration is disabled.

core/include/data_substrate.h (2)

264-264: LGTM! Clean separation of concerns.

Adding RegisterKvStoreMetrics() as a dedicated method for KV store metrics registration provides a clear initialization pathway, separate from the general metrics setup.


296-296: Appropriate change to shared ownership.

Changing metrics_registry_ from unique_ptr to shared_ptr is correct given that Meter and other components use raw pointers to MetricsRegistry. The shared ownership ensures the registry remains valid while consumers hold references.

eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

67-76: LGTM! Correct API usage.

The updated Collect call correctly uses the new MetricHandle-based API, passing handle.key, the value, and handle.type to the collector. This is consistent with the pattern in other benchmark files.

eloq_metrics/include/meter.h (2)

44-44: Doc update aligns with handle-based flow.


175-185: Handle-based registration/collection is consistent.

Also applies to: 198-216, 349-349

core/src/metrics_init.cpp (3)

96-100: New eloqstore metrics flag is wired and logged.


144-144: Registry assignment in init looks correct.


178-186: Verify KV-store metrics honor per-feature flags.
If enable_kv_metrics or enable_eloqstore_metrics are intended to gate registration, please confirm RegisterKvStoreMetrics() (or store_hd_->RegisterKvMetrics) respects those flags; otherwise consider adding explicit guards here.

eloq_metrics/src/metrics_registry_impl.cpp (1)

33-41: Shared ownership singleton is fine.

eloq_metrics/include/metrics_registry_impl.h (1)

35-50: Header updates align with handle-based API.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +17 to +22
static metrics::MetricHandle benchmark_counter_metric_handle =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
.MetricsRegistry(std::move(benchmark_counter_metric_ptr));

static std::shared_ptr<metrics::MetricsCollector> benchmark_counter_collector =
(*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid nullptr deref if MetricsMgr/collector init fails.
Consider caching MetricsMgrResult once, validating it, and skipping the benchmark if the collector is unavailable (e.g., port conflict).

🧪 Suggested defensive init
-static metrics::MetricHandle benchmark_counter_metric_handle =
-    (*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_))
-        .MetricsRegistry(std::move(benchmark_counter_metric_ptr));
+static auto benchmark_mgr_result = metrics::MetricsMgr::GetMetricMgrInstance();
+static metrics::MetricHandle benchmark_counter_metric_handle =
+    (benchmark_mgr_result.mgr_ != nullptr)
+        ? benchmark_mgr_result.mgr_->MetricsRegistry(
+              std::move(benchmark_counter_metric_ptr))
+        : metrics::MetricHandle{0, metrics::Type::Counter};

-static std::shared_ptr<metrics::MetricsCollector> benchmark_counter_collector =
-    (*(metrics::MetricsMgr::GetMetricMgrInstance().mgr_)).GetCollector();
+static std::shared_ptr<metrics::MetricsCollector> benchmark_counter_collector =
+    (benchmark_mgr_result.mgr_ != nullptr)
+        ? benchmark_mgr_result.mgr_->GetCollector()
+        : nullptr;

 static void BM_Counter_MonoWrapper_Collect(benchmark::State &state)
 {
+    if (benchmark_counter_collector == nullptr)
+    {
+        state.SkipWithError("metrics collector unavailable");
+        return;
+    }
     for (auto _ : state)
     {
         benchmark_counter_collector->Collect(
             benchmark_counter_metric_handle.key,
             metrics::Value(1),
             benchmark_counter_metric_handle.type);
     }
 }

Also applies to: 66-69

🤖 Prompt for AI Agents
In `@eloq_metrics/benchmark/mono_metrics_counter_bench.cpp` around lines 17 - 22,
The static initialization dereferences MetricsMgr::GetMetricMgrInstance().mgr_
and assumes MetricsRegistry/GetCollector succeed (affecting symbols
benchmark_counter_metric_handle and benchmark_counter_collector); instead call
MetricsMgr::GetMetricMgrInstance() once into a local MetricsMgrResult (or
similar) variable, validate that mgr_ and returned collector are non-null before
using them, and if validation fails set the static handles/collector to a safe
null/empty state and make the benchmark skip or early-return when collector is
unavailable (apply the same defensive pattern to the other static inits around
benchmark setup and the code at the second occurrence).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@eloq_metrics/src/prometheus_collector.cc`:
- Around line 130-176: PrometheusCollector::SetMetric currently always calls
prometheus::Build*().Name(...).Register(*registry_) and ignores the computed
MetricHash/metric_key (and the previously-removed PutMetricsIfAbsent), causing
duplicate registrations and potential silent failures; reintroduce a local cache
(e.g., std::unordered_map<MetricHashKey, std::shared_ptr<PrometheusMetricData>>
metrics_cache) keyed by metric_key inside PrometheusCollector, protect it with a
mutex, check metrics_cache for metric_key before calling Register(), and if
present return a MetricHandle with the existing PrometheusMetricData; only call
Register() and create a new PrometheusMetricData when the key is absent, insert
it into metrics_cache, and then return the new MetricHandle (or alternatively
document and assert idempotency of prometheus::Registry::Register if you choose
not to cache).
♻️ Duplicate comments (4)
eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp (1)

17-23: Guard static initialization against a null MetricsMgr.
Static globals dereference GetMetricMgrInstance().mgr_ during static init; if init fails, this can crash before benchmarks run. Consider lazy init with a null check and skip/abort the benchmark when the collector/handle isn’t available.

Also applies to: 71-72

eloq_metrics/src/metrics_registry_impl.cpp (1)

70-80: Guard Collect when MetricsMgr initialization failed.
metrics_mgr_result_.mgr_ can be null if initialization failed, which would crash here. Add a simple guard before dereferencing.

🐛 Suggested guard
 void MetricsRegistryImpl::Collect(const metrics::MetricHandle &handle,
                                   const metrics::Value &val)
 {
+    if (metrics_mgr_result_.not_ok_ != nullptr ||
+        metrics_mgr_result_.mgr_ == nullptr)
+    {
+        return;
+    }
     // Get collector from MetricsMgr and delegate to it
     // This ensures collector-specific checks (e.g., exposer_ validation) are
     // performed
     auto collector = metrics_mgr_result_.mgr_->GetCollector();
     if (collector != nullptr)
     {
         collector->Collect(handle, val);
     }
 }
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)

17-22: Guard against nullptr MetricsMgr in static init.

Line 17-22 and Line 66-67 still dereference mgr_ directly; GetMetricMgrInstance() can return nullptr on init errors, which will crash the benchmark. Please gate init and skip the benchmark when the collector is unavailable.

Also applies to: 66-67

eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)

17-22: Protect static init from null MetricsMgr.

Line 17-22 and Line 66-67 dereference mgr_ without checks; if initialization fails, the benchmark will crash at startup. Guard init and skip when the collector is unavailable.

Also applies to: 66-67

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01539e2 and 9a2f329.

📒 Files selected for processing (11)
  • eloq_metrics/benchmark/mono_metrics_counter_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp
  • eloq_metrics/benchmark/mono_metrics_histogram_bench.cpp
  • eloq_metrics/include/metrics.h
  • eloq_metrics/include/metrics_collector.h
  • eloq_metrics/include/prometheus_collector.h
  • eloq_metrics/src/metrics_manager.cc
  • eloq_metrics/src/metrics_registry_impl.cpp
  • eloq_metrics/src/prometheus_collector.cc
  • eloq_metrics/tests/metrics_collector_test.cc
  • store_handler/eloq_data_store_service/eloqstore
✅ Files skipped from review due to trivial changes (1)
  • store_handler/eloq_data_store_service/eloqstore
🧰 Additional context used
🧬 Code graph analysis (8)
eloq_metrics/src/metrics_registry_impl.cpp (1)
eloq_metrics/src/prometheus_collector.cc (4)
  • Collect (38-86)
  • Collect (38-39)
  • Collect (179-196)
  • Collect (179-180)
eloq_metrics/include/prometheus_collector.h (3)
eloq_metrics/include/metrics.h (1)
  • MetricCollectorData (152-164)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • Collect (70-81)
  • Collect (70-71)
eloq_metrics/src/prometheus_collector.cc (12)
  • Collect (38-86)
  • Collect (38-39)
  • Collect (179-196)
  • Collect (179-180)
  • SetMetric (130-177)
  • SetMetric (130-130)
  • CollectClientMetrics (198-239)
  • CollectClientMetrics (198-199)
  • PrometheusCollector (88-91)
  • PrometheusCollector (122-128)
  • Convert2Labels (241-251)
  • Convert2Labels (241-241)
eloq_metrics/include/metrics.h (5)
eloq_metrics/include/metrics_collector.h (1)
  • metrics (28-64)
eloq_metrics/include/prometheus_collector.h (1)
  • metrics (40-85)
eloq_metrics/include/meter.h (1)
  • metrics (32-351)
eloq_metrics/src/metrics_registry_impl.cpp (4)
  • Collect (70-81)
  • Collect (70-71)
  • Register (59-68)
  • Register (59-62)
eloq_metrics/src/prometheus_collector.cc (4)
  • Collect (38-86)
  • Collect (38-39)
  • Collect (179-196)
  • Collect (179-180)
eloq_metrics/src/prometheus_collector.cc (2)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • Collect (70-81)
  • Collect (70-71)
eloq_metrics/tests/metrics_collector_test.cc (1)
  • handle (41-41)
eloq_metrics/src/metrics_manager.cc (1)
eloq_metrics/include/metrics.h (1)
  • MetricsRegistry (181-190)
eloq_metrics/benchmark/mono_metrics_counter_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (47-77)
  • GetMetricMgrInstance (47-47)
eloq_metrics/include/metrics_collector.h (2)
eloq_metrics/src/prometheus_collector.cc (8)
  • SetMetric (130-177)
  • SetMetric (130-130)
  • Collect (38-86)
  • Collect (38-39)
  • Collect (179-196)
  • Collect (179-180)
  • CollectClientMetrics (198-239)
  • CollectClientMetrics (198-199)
eloq_metrics/src/metrics_registry_impl.cpp (2)
  • Collect (70-81)
  • Collect (70-71)
eloq_metrics/benchmark/mono_metrics_gauge_bench.cpp (1)
eloq_metrics/src/metrics_manager.cc (2)
  • GetMetricMgrInstance (47-77)
  • GetMetricMgrInstance (47-47)
🔇 Additional comments (12)
eloq_metrics/src/metrics_manager.cc (1)

41-45: MetricHandle return path aligns with the new API.

eloq_metrics/tests/metrics_collector_test.cc (2)

37-45: Negative-path test matches new handle semantics.


102-106: Handle-based collect path is covered cleanly.

eloq_metrics/src/metrics_registry_impl.cpp (2)

33-41: Shared_ptr singleton keeps the registry alive as intended.


59-68: Register now returns MetricHandle consistently.

eloq_metrics/include/metrics.h (4)

31-39: Forward declaration and include are appropriate.


77-94: MetricHandle structure is clear and safe.


151-164: Collector-specific data abstraction is clean.


185-188: Registry interface update is consistent with handle-based collection.

eloq_metrics/include/metrics_collector.h (1)

51-59: MetricHandle-based collector interface looks good.

SetMetric/Collect/CollectClientMetrics now uniformly take MetricHandle, aligning with the new handle-centric flow.

eloq_metrics/src/prometheus_collector.cc (1)

30-85: Payload-based collect flow reads clean.

The handle-based Collect/CollectClientMetrics path is consistent end-to-end with the new data payload.

Also applies to: 179-254

eloq_metrics/include/prometheus_collector.h (1)

43-119: Header updates align with handle-centric design.

MetricCollectorData-derived payload and the updated SetMetric/Collect signatures match the new flow.

Also applies to: 130-132

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +130 to +176
MetricHandle PrometheusCollector::SetMetric(std::unique_ptr<Metric> &metric_ptr)
{
auto option = metric_ptr.get();
MetricHash metric_hash;
auto metric_key = metric_hash(*option);
PutMetricsIfAbsent(metric_ptr, metric_key);
return metric_key;
}

bool PrometheusCollector::Collect(MetricKey metric_key,
const Value &metric_value,
const Type &metric_type)
{
if (!this->exposer_)
{
#ifdef WITH_GLOG
LOG(WARNING) << " Prometheus Collector found exposer_ is null";
#endif
// Warning when exposer_ is null, but still collect the metrics for the
// TxThreadScaleService
return false;
}
auto prometheus_labels = Convert2Labels(metric_ptr->labels_);
Type metric_type = metric_ptr->type_;

std::shared_ptr<PrometheusMetricData> data;

switch (metric_type)
{
case Type::Histogram:
{
assert(histograms_.find(metric_key) != histograms_.end());
histograms_.find(metric_key)->second.get().Observe(metric_value.value_);
auto &histogram_family = prometheus::BuildHistogram()
.Name(metric_ptr->name_)
.Register(*registry_);
auto &histogram = histogram_family.Add(
prometheus_labels, PROMETHEUS_HISTOGRAM_DEF_BUCKETS);
data = std::make_shared<PrometheusMetricData>(histogram);
break;
}
case Type::Gauge:
{
assert(gauges_.find(metric_key) != gauges_.end());
switch (metric_value.inc_value_)
{
case Value::IncDecValue::Increment:
gauges_.find(metric_key)->second.get().Increment();
break;
case Value::IncDecValue::Decrement:
gauges_.find(metric_key)->second.get().Decrement();
break;
default:
gauges_.find(metric_key)->second.get().Set(metric_value.value_);
break;
}
auto &gauge_family = prometheus::BuildGauge()
.Name(metric_ptr->name_)
.Register(*registry_);
auto &gauge = gauge_family.Add(prometheus_labels);
data = std::make_shared<PrometheusMetricData>(gauge);
break;
}
case Type::Counter:
{
assert(counters_.find(metric_key) != counters_.end());
auto &counter = counters_.find(metric_key)->second.get();
if (metric_value.HasValue())
{
counter.Increment(metric_value.value_);
}
else
{
counter.Increment();
}
auto &counter_family = prometheus::BuildCounter()
.Name(metric_ptr->name_)
.Register(*registry_);
auto &counter = counter_family.Add(prometheus_labels);
data = std::make_shared<PrometheusMetricData>(counter);
break;
}
default:
break;
// Return handle without data for unsupported types
return MetricHandle(metric_key, metric_type);
}
return true;

return MetricHandle(metric_key, metric_type, data);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for MetricHandle caching/dedup in MetricsRegistryImpl
rg -n "MetricHandle|MetricHash|MetricsRegistry|SetMetric|MetricsRegistryImpl" eloq_metrics/src/metrics_registry_impl.cpp eloq_metrics/include/metrics_registry_impl.h

Repository: eloqdata/tx_service

Length of output: 2032


🏁 Script executed:

#!/bin/bash
# Read the Register implementation in MetricsRegistryImpl
echo "=== MetricsRegistryImpl::Register implementation ==="
sed -n '59,80p' eloq_metrics/src/metrics_registry_impl.cpp

echo -e "\n=== Full metrics_registry_impl.cpp to understand structure ==="
wc -l eloq_metrics/src/metrics_registry_impl.cpp

echo -e "\n=== Check PrometheusCollector for SetMetric and any caching ==="
rg -n "cache|Cache|unordered_map|map" eloq_metrics/src/prometheus_collector.cc | head -20

Repository: eloqdata/tx_service

Length of output: 1103


🏁 Script executed:

#!/bin/bash
# Find MetricsMgr definition and MetricsRegistry method
echo "=== Search for MetricsMgr class ==="
rg -n "class MetricsMgr|MetricsRegistry\(" eloq_metrics/include/ eloq_metrics/src/ -A 3 | head -40

echo -e "\n=== Search for MetricsRegistry method implementation ==="
rg -n "MetricsRegistry\(" eloq_metrics/src/ -A 5 | head -40

echo -e "\n=== Check PrometheusCollector for any metric caching ==="
rg -n "metric_cache|metric_map|metrics_map|cache_|std::map|std::unordered_map" eloq_metrics/src/prometheus_collector.cc

Repository: eloqdata/tx_service

Length of output: 2853


🏁 Script executed:

#!/bin/bash
# Check PrometheusCollector class definition
echo "=== PrometheusCollector class and registry ==="
rg -n "class PrometheusCollector|registry_|std::map|std::unordered_map" eloq_metrics/include/prometheus_collector.h -A 2 | head -50

echo -e "\n=== Full SetMetric implementation from prometheus_collector.cc ==="
sed -n '130,176p' eloq_metrics/src/prometheus_collector.cc

echo -e "\n=== Check if there's any metric dedup logic ==="
rg -n "metric_hash|metric_key|MetricHash|duplicate|exists|find" eloq_metrics/src/prometheus_collector.cc

Repository: eloqdata/tx_service

Length of output: 2345


🏁 Script executed:

#!/bin/bash
# Check if PrometheusCollector stores metrics to prevent re-registration
echo "=== Check PrometheusCollector member variables ==="
sed -n '97,129p' eloq_metrics/include/prometheus_collector.h

echo -e "\n=== Check PrometheusCollector full implementation ==="
wc -l eloq_metrics/src/prometheus_collector.cc

echo -e "\n=== Search for any cache/map that tracks registered metrics ==="
rg -n "private:|protected:|member|metric_map|metric_cache|registered" eloq_metrics/include/prometheus_collector.h

echo -e "\n=== Check how SetMetric is called - any dedup at call site ==="
rg -n "SetMetric" eloq_metrics/src/ eloq_metrics/include/ -B 3 -A 3

Repository: eloqdata/tx_service

Length of output: 4107


🏁 Script executed:

#!/bin/bash
# Important: Check if there's any caller-level dedup before SetMetric
echo "=== Search for calls to SetMetric or Register in the codebase ==="
rg -n "\.Register\(|Register(" eloq_metrics/ --type cpp | grep -v "prometheus::" | head -20

echo -e "\n=== Check if metrics are registered once or multiple times ==="
rg -n "SetMetric|Register.*Metric" eloq_metrics/test/ -A 2 -B 2 | head -40

echo -e "\n=== Look for any error handling around Register ==="
rg -n "try|catch|exception|throw|error" eloq_metrics/src/prometheus_collector.cc

echo -e "\n=== Check the comment about PutMetricsIfAbsent being removed ==="
sed -n '250,256p' eloq_metrics/src/prometheus_collector.cc

Repository: eloqdata/tx_service

Length of output: 1177


🏁 Script executed:

#!/bin/bash
# Check if prometheus-cpp library is available in the repo to understand Register behavior
echo "=== Search for prometheus-cpp usage and documentation ==="
rg -n "prometheus::" eloq_metrics/include/prometheus_collector.h | head -10

echo -e "\n=== Check if there's any git history or comments about PutMetricsIfAbsent ==="
rg -n "PutMetricsIfAbsent|duplicate|dedup|cache.*metric" eloq_metrics/src/ eloq_metrics/include/ -B 2 -A 2

echo -e "\n=== Check for any integration tests or usage ==="
find eloq_metrics -name "*.cc" -o -name "*.h" | xargs grep -l "SetMetric\|MetricsRegistry" | head -5

echo -e "\n=== Look at how SetMetric is used in practice ==="
rg -n "SetMetric|MetricsRegistry" eloq_metrics/ -l | xargs -I {} sh -c "echo '=== {} ===' && cat {}" | head -100

Repository: eloqdata/tx_service

Length of output: 4541


🏁 Script executed:

#!/bin/bash
# Search for prometheus-cpp library behavior regarding duplicate registrations
echo "=== Search for test files or examples ==="
find . -type f -name "*.cc" -o -name "*.h" | xargs grep -l "test\|Test\|example" | grep -E "prometheus|metric" | head -10

echo -e "\n=== Check cmake for prometheus-cpp version or details ==="
rg -n "prometheus" CMakeLists.txt

echo -e "\n=== Check if there are any comments about Register behavior ==="
rg -n "Register.*duplicate|Register.*existing|Register.*error" eloq_metrics/

echo -e "\n=== Look for how metrics are actually used - example or app ==="
find eloq_metrics -type f -name "*example*" -o -name "*app*" | head -5

Repository: eloqdata/tx_service

Length of output: 372


🏁 Script executed:

#!/bin/bash
# Check the test file to see how metrics are registered and if duplicates are tested
echo "=== Check metrics_collector_test.cc ==="
cat eloq_metrics/tests/metrics_collector_test.cc

echo -e "\n=== Check metrics_app.cc to see usage patterns ==="
head -100 eloq_metrics/src/metrics_app.cc

Repository: eloqdata/tx_service

Length of output: 6958


🏁 Script executed:

#!/bin/bash
# Check for any prometheus-cpp documentation or error handling
echo "=== Search for any documentation about duplicate metrics ==="
rg -n "duplicate|already.*exist|Register.*error" eloq_metrics/ -i

echo -e "\n=== Check if there's any external documentation reference ==="
grep -r "prometheus-cpp\|prometheus::Registry" eloq_metrics/ | head -5

echo -e "\n=== Look for error handling in Open() method ==="
sed -n '95,120p' eloq_metrics/src/prometheus_collector.cc

echo -e "\n=== Check if SetMetric has any error handling or return value validation ==="
rg -n "throw|catch|error|Error|exception" eloq_metrics/src/prometheus_collector.cc | grep -A 2 -B 2 "SetMetric\|Register"

Repository: eloqdata/tx_service

Length of output: 1760


🏁 Script executed:

#!/bin/bash
# Check the benchmark files for usage patterns and if they test duplicate scenarios
echo "=== Check benchmark usage ==="
head -80 eloq_metrics/benchmark/mono_metrics_counter_bench.cpp

echo -e "\n=== Search for any documentation files ==="
find eloq_metrics -name "README*" -o -name "*.md" | xargs cat 2>/dev/null | grep -A 5 -B 5 "duplicate\|dedup\|cache" -i | head -30

echo -e "\n=== Search git log or comments about the refactoring ==="
rg -n "PutMetricsIfAbsent" eloq_metrics/ -B 5 -A 5

Repository: eloqdata/tx_service

Length of output: 3873


Restore deduplication for metric registration in SetMetric.

Lines 130-176 unconditionally register new metric families without checking for duplicates. The comment at line 253 indicates PutMetricsIfAbsent was removed, leaving no deduplication mechanism in PrometheusCollector. The metric_key computed from MetricHash is never used for caching or lookup. If prometheus::Registry::Register() fails or silently overwrites on duplicate registrations, this will cause silent failures or data loss.

Either reintroduce a local std::map or std::unordered_map to cache metrics by metric_key before calling Register(), or verify and document that prometheus-cpp's registry guarantees idempotent registration and returns the existing family on duplicate attempts.

🤖 Prompt for AI Agents
In `@eloq_metrics/src/prometheus_collector.cc` around lines 130 - 176,
PrometheusCollector::SetMetric currently always calls
prometheus::Build*().Name(...).Register(*registry_) and ignores the computed
MetricHash/metric_key (and the previously-removed PutMetricsIfAbsent), causing
duplicate registrations and potential silent failures; reintroduce a local cache
(e.g., std::unordered_map<MetricHashKey, std::shared_ptr<PrometheusMetricData>>
metrics_cache) keyed by metric_key inside PrometheusCollector, protect it with a
mutex, check metrics_cache for metric_key before calling Register(), and if
present return a MetricHandle with the existing PrometheusMetricData; only call
Register() and create a new PrometheusMetricData when the key is absent, insert
it into metrics_cache, and then return the new MetricHandle (or alternatively
document and assert idempotency of prometheus::Registry::Register if you choose
not to cache).

@lokax lokax merged commit 6d29087 into main Jan 16, 2026
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Chore] Remove eloq metrics registration-before-collection requirement [Feature]: Add kv metrics to eloqstore

3 participants