Skip to content

add jemalloc support#287

Merged
lokax merged 12 commits into
mainfrom
jemalloc
Dec 24, 2025
Merged

add jemalloc support#287
lokax merged 12 commits into
mainfrom
jemalloc

Conversation

@lokax

@lokax lokax commented Dec 12, 2025

Copy link
Copy Markdown
Collaborator

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • New Features

    • Optional jemalloc allocator support via a new build option.
    • Build flag to control host-manager forking.
    • External-transaction arena tracking and public arena-related accessors.
  • Refactor

    • Widespread per-thread/per-arena memory handling and context switches around memory-sensitive operations.
    • Per-arena memory reporting and statistics access.
    • Background epoch worker to refresh arena statistics and manage arena lifecycle.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings December 12, 2025 07:47
@coderabbitai

coderabbitai Bot commented Dec 12, 2025

Copy link
Copy Markdown

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds optional jemalloc support: build options and an INTERFACE target, jemalloc helpers and per-arena stats, per-shard arena creation/tracking, an epoch worker, and many guarded save/switch/restore calls so specific operations run using designated jemalloc arenas. Changes are gated by WITH_JEMALLOC.

Changes

Cohort / File(s) Summary
Build configuration
CMakeLists.txt, build_tx_service.cmake
Add WITH_JEMALLOC option and jemalloc_cfg INTERFACE target; propagate compile defs/includes/links via INTERFACE properties; make data substrate linkage PUBLIC and conditionally link jemalloc_cfg. Add FORK_HM_PROCESS option and small formatting tweaks.
Store handler callbacks
store_handler/data_store_service_client_closure.cpp
Surround FetchTableRangesCallback and FetchRangeSlicesCallback with jemalloc thread-arena save/set/restore (guarded by WITH_JEMALLOC).
Allocator utilities
tx_service/include/util.h
Conditionally include <jemalloc/jemalloc.h> and add GetJemallocArenaStat(uint32_t, size_t&, size_t&) to query per-arena resident/allocated stats (guarded by WITH_JEMALLOC).
CcShard API & impl
tx_service/include/cc/cc_shard.h, tx_service/src/cc/cc_shard.cpp
Add arena_id_, SetAsDefaultArena() (returns previous arena), ArenaId()/MemoryLimit() accessors; create per-shard arenas, switch/restore thread arena during init and ops, and use jemalloc stats in fullness checks (guarded).
Local shards arena infra
tx_service/include/cc/local_cc_shards.h, tx_service/src/cc/local_cc_shards.cpp
Add table_ranges_arena_id_, hash_partition_ckpt_arena_id_, GetTableRangesArenaId(); create arenas during init; wrap table-range/hash-partition operations with arena save/set/restore; add an epoch worker to refresh jemalloc epoch and report arena stats (guarded).
Catalog / CC maps / ranges
tx_service/include/cc/catalog_cc_map.h, tx_service/include/cc/range_cc_map.h, tx_service/include/cc/template_cc_map.h
Wrap range initialization, range-slice processing, and export paths with jemalloc arena save/set/restore (guarded by WITH_JEMALLOC).
CC request and internals
tx_service/include/cc/cc_request.h, tx_service/src/cc/cc_req_misc.cpp, tx_service/src/cc/range_slice.cpp
Add conditional jemalloc includes and arena save/set/restore around data-sync/archive release, InitRangeSlices(), and UpdateSlice() callsites (guarded).
TX coordination
tx_service/include/tx_service_common.h, tx_service/include/tx_service.h
Add uint32_t ext_tx_proc_arena_id_ to TxProcCoordinator and capture/restore per-thread arena in RunOneRound() and ForwardTx() external-processor paths (guarded).

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Worker as Worker / Thread
    participant CcShard as CcShard / CcShardHeap
    participant Jemalloc as jemalloc (mallctl)
    participant Work as Memory-sensitive work

    Note over Worker,CcShard: Enter memory-sensitive section (WITH_JEMALLOC)
    Worker->>CcShard: SetAsDefaultArena()
    CcShard->>Jemalloc: mallctl("thread.arena" / set) -> returns prev_arena
    Jemalloc-->>CcShard: prev_arena
    CcShard-->>Worker: shard_arena active

    Worker->>Work: perform allocations (init ranges / update slices / export)
    Work-->>Worker: work done

    Worker->>Jemalloc: mallctl -> restore prev_arena
    Jemalloc-->>Worker: arena restored
    Note right of Worker: Exit memory-sensitive section
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • liunyl

Poem

🐇 I hopped into a cozy thread-bound lair,

Swapped arenas gently with meticulous care,
I nudged ranges and slices while allocations stayed neat,
Restored every badge before leaving the suite.
Now memory hums softly — my burrow’s complete.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is a template checklist with no actual content describing the changes, objectives, or rationale for the jemalloc integration. Replace the template checklist with a substantive description explaining the jemalloc integration, its benefits, scope of changes, and any testing or documentation completed.
Docstring Coverage ⚠️ Warning Docstring coverage is 8.93% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'add jemalloc support' directly matches the main objective of the PR and clearly describes the primary change—jemalloc integration across the codebase.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch jemalloc

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (11)
tx_service/include/cc/cc_request.h (1)

6294-6323: Fix mallctl arena restoration to use correct type and check return code.

The call to mallctl("thread.arena", ...) should match the actual return type from SetAsDefaultArena(), which is uint32_t, and use sizeof(prev_arena) instead of the hardcoded size. Additionally, the mallctl return code should be checked since arena restoration is correctness-critical and a failure would silently corrupt the thread's allocator state.

Corrected diff:

 #if defined(WITH_JEMALLOC)
-                auto prev_arena = scan_heap->SetAsDefaultArena();
+                uint32_t prev_arena = scan_heap->SetAsDefaultArena();
 #endif
 ...
 #if defined(WITH_JEMALLOC)
-                mallctl(
-                    "thread.arena", NULL, NULL, &prev_arena, sizeof(uint32_t));
+                int rc = mallctl("thread.arena", nullptr, nullptr,
+                                 &prev_arena, sizeof(prev_arena));
+                assert(rc == 0);
 #endif

(Apply similarly to the archive_vec_ block.)

tx_service/src/cc/local_cc_shards.cpp (5)

2119-2164: Add error checking to mallctl("thread.arena") calls to prevent restoring uninitialized prev_arena.

The mallctl function returns a POSIX error code (non-zero indicates failure with codes like EINVAL, ENOENT, EPERM, EAGAIN, EFAULT). At lines 2127–2163, the initial read of prev_arena lacks error checking, so if that call fails, prev_arena remains uninitialized and is later restored to an undefined state. This silent corruption risks memory allocation failures or arena mismanagement during the entry->DropStoreRange() operation.

Apply error handling to the initial mallctl read (e.g., guard with an early return or skip arena override on failure), or use a RAII wrapper to ensure cleanup is skipped on read failure.


1016-1130: Two critical issues in InitTableRanges require fixes: (1) missing mallctl() error handling and uninitialized prev_arena, and (2) writer lock using wrong mutex type.

Issue 1: Unsafe arena restoration. The mallctl("thread.arena", &prev_arena, &sz, NULL, 0) read at line 1023 has no error checking. If it fails (EINVAL, EAGAIN, etc.), prev_arena remains uninitialized, and the restore at line 1128 uses garbage. Initialize prev_arena and check return codes:

 #if defined(WITH_JEMALLOC)
-    uint32_t prev_arena;
-    size_t sz = sizeof(uint32_t);
-    // read prev arena id
-    mallctl("thread.arena", &prev_arena, &sz, NULL, 0);  // read only
+    uint32_t prev_arena = 0;
+    size_t sz = sizeof(prev_arena);
+    const int rc_read = mallctl("thread.arena", &prev_arena, &sz, nullptr, 0);
+    if (rc_read != 0) {
+        LOG(ERROR) << "mallctl(thread.arena) read failed rc=" << rc_read;
+    }
     // override arena id
     auto table_range_arena_id = GetTableRangesArenaId();
-    mallctl(
-        "thread.arena", NULL, NULL, &table_range_arena_id, sizeof(uint32_t));
+    const int rc_set = mallctl("thread.arena",
+                               nullptr, nullptr,
+                               &table_range_arena_id, sizeof(table_range_arena_id));
+    if (rc_set != 0) {
+        LOG(ERROR) << "mallctl(thread.arena) set failed rc=" << rc_set;
+    }
 #endif
 ...
 #if defined(WITH_JEMALLOC)
-    mallctl("thread.arena", NULL, NULL, &prev_arena, sizeof(uint32_t));
+    if (rc_read == 0) {
+        const int rc_restore = mallctl("thread.arena",
+                                       nullptr, nullptr,
+                                       &prev_arena, sizeof(prev_arena));
+        if (rc_restore != 0) {
+            LOG(ERROR) << "mallctl(thread.arena) restore failed rc=" << rc_restore;
+        }
+    }
 #endif

Issue 2: Writer lock on wrong mutex. Line 1016 acquires std::unique_lock<FastMetaDataMutex> lk(fast_meta_data_mux_), which is a writer lock. Per earlier guidance, writer locks must use the original meta_data_mux_ (std::shared_mutex); only reader locks should use the FastMetaDataMutex wrapper. Change to std::unique_lock<std::shared_mutex> lk(meta_data_mux_);.


5003-5083: Add mallctl error handling for arena read/restore and fix comment.

The read at line 5009 and restore at line 5100 do not check mallctl return values. If the read fails, prev_arena remains uninitialized and the restore writes undefined state back to the thread arena. Per jemalloc documentation, mallctl returns 0 on success or an errno value (EINVAL, ENOENT, EPERM, EAGAIN, EFAULT) on failure. Check both operations and handle errors appropriately. Additionally, the comment at line 5100 incorrectly says "override arena id" when it should say "restore" to clarify the operation during debugging.


1293-1343: Add error checking to mallctl calls and use RAII for arena switching to prevent arena state corruption.

The jemalloc arena switch/restore pattern at lines 1293–1343 and 1371–1411 has two critical issues:

  1. No mallctl return value checking: All mallctl() calls ignore return values. Per jemalloc documentation, mallctl() returns 0 on success and non-zero errno values (EINVAL, EPERM, EAGAIN, etc.) on failure. Silent failures corrupt thread arena state for subsequent allocations.

  2. Exception safety gap: If DropStoreRange() or HasEnoughTableRangesMemory() throw an exception, the restoration block will not execute, leaving the thread in the wrong arena.

Implement a small RAII guard (e.g., JemallocArenaGuard) to (1) centralize and enforce mallctl error checking, (2) guarantee restoration via RAII destructor on both normal and exceptional paths, and (3) eliminate duplication across the two code blocks and InitTableRanges (which has the same unchecked pattern at lines 1026–1035).

Also applies to: InitTableRanges at lines 1026–1035.


6588-6619: Apply defensive casting for frag ratio calculation in both jemalloc and mimalloc paths.

In the jemalloc path (lines 6599–6601), the subtraction (committed - allocated) is computed as size_t. Although jemalloc documentation indicates committed ≥ allocated under normal operation, transient inconsistencies or edge cases could cause underflow. The identical calculation pattern also exists in the mimalloc path (lines 6608–6611) using int64_t.

Cast to floating-point before subtraction and clamp the result to avoid undefined behavior:

         LOG(INFO) << "ckpt hash partition heap memory usage report, committed "
                   << committed << ", allocated " << allocated << ", frag ratio "
                   << std::setprecision(2)
-                  << 100 * (static_cast<float>(committed - allocated) /
-                            static_cast<float>(committed));
+                  << (100.0 * std::max(0.0,
+                      (static_cast<double>(committed) - static_cast<double>(allocated)) /
+                       static_cast<double>(committed)));

Apply the same fix to both the #if defined(WITH_JEMALLOC) branch and the #else (mimalloc) branch for consistency.

tx_service/src/cc/cc_shard.cpp (3)

2497-2539: Remove/guard the LOG(INFO) spam in Full(); it’s likely a hot path and will tank performance + observability.
Also, prefer consistent types for comparisons/assignments (size_t ↔ int64_t) to avoid surprises.

-    LOG(INFO) << "yf: resident = " << total_resident << ", total resident "
-              << ", total allocated = " << total_allocated
-              << ", memory limit = " << memory_limit_
-              << ", arena id = " << arena_id_;
-
-    return total_resident >= static_cast<int64_t>(memory_limit_);
+    // If needed, use VLOG/DLOG with a low frequency sampler.
+    return total_resident >= memory_limit_;

2446-2474: Check return code from mallctl("arenas.create", ...) and implement arena destruction in destructor.

  • mallctl() returns 0 on success or errno on failure (EINVAL, EAGAIN, EFAULT, etc.); the unchecked return code can mask silent failures.
  • jemalloc provides arena.<id>.destroy for cleanup, but the destructor has an unresolved "TODO: destroy arena ?" with no implementation. This leaks arena resources at exit, potentially causing index recycling issues.

Suggested approach:

  1. Capture and check the return code from arenas.create:
 #if defined(WITH_JEMALLOC)
     size_t sz = sizeof(uint32_t);
-    mallctl("arenas.create", &arena_id_, &sz, NULL, 0);
+    int rc = mallctl("arenas.create", &arena_id_, &sz, NULL, 0);
+    assert(rc == 0);  // or handle error appropriately
 #endif
  1. Implement arena destruction in ~CcShardHeap() or document why leak-at-exit is acceptable:
 CcShardHeap::~CcShardHeap()
 {
-    // TODO: destroy arena ?
+    #if defined(WITH_JEMALLOC)
+    char mib[64];
+    snprintf(mib, sizeof(mib), "arena.%u.destroy", arena_id_);
+    mallctl(mib, NULL, NULL, NULL, 0);
+    #endif
 }

220-257: Add error handling (or hard asserts) around mallctl("thread.arena", ...) set/restore to avoid silent allocator misconfiguration.
Right now the arena switch/restore in CcShard::Init() assumes mallctl always succeeds. At minimum, capture return codes and assert(rc == 0) (or log+fallback).

 #if defined(WITH_JEMALLOC)
     auto prev_arena = shard_heap_->SetAsDefaultArena();
 #endif
 ...
 #if defined(WITH_JEMALLOC)
-    mallctl("thread.arena", NULL, NULL, &prev_arena, sizeof(uint32_t));
+    int rc = mallctl("thread.arena", nullptr, nullptr, &prev_arena, sizeof(prev_arena));
+    assert(rc == 0);
 #endif
jemalloc mallctl: confirm correct usage of "thread.arena" for get/set, expected argument sizes/types, and whether setting requires any additional config flags.
tx_service/include/cc/cc_shard.h (1)

191-232: Initialize arena_id_ to a safe default value to prevent returning uninitialized garbage in non-jemalloc builds.

The member variable arena_id_ (line 244) is only initialized within #if defined(WITH_JEMALLOC) blocks in the constructor. In non-jemalloc builds, it remains uninitialized. While ArenaId() currently has no call sites in non-jemalloc builds, the API provides no guard to prevent this. Add a default initializer like other members:

-    uint32_t arena_id_;
+    uint32_t arena_id_{0};

This ensures safe behavior across all build configurations.

tx_service/include/cc/local_cc_shards.h (1)

467-486: Verify error handling in GetJemallocArenaStat.

The jemalloc code path calls GetJemallocArenaStat() from util.h. That function calls mallctl four times (lines 457, 462, 464, 466) but does not check return values. If any of those calls fail, the committed and allocated values will remain 0 or stale, leading to incorrect memory-full detection. In TableRangesMemoryFull(), this would cause allocated >= range_slice_memory_limit_ to evaluate to false when it should be true, failing to enforce memory limits.

Consider adding error handling to GetJemallocArenaStat to detect and handle mallctl failures, or validate the returned statistics here before using them for memory-critical decisions.

♻️ Duplicate comments (1)
tx_service/src/cc/cc_req_misc.cpp (1)

411-423: Same arena switching issues as catalog_cc_map.h - needs error handling and RAII.

This code has identical issues to the arena switching in catalog_cc_map.h (lines 442-460):

  1. Missing error checks on mallctl calls
  2. No exception safety - arena won't restore if exception thrown before line 443
  3. Misleading comment on line 415: "// read only" suggests no side effects, but mallctl is reading the current arena value into prev_arena, not performing a read-only operation in the general sense

The presence of this identical pattern in multiple files confirms the need for a shared RAII helper class as suggested in the review of catalog_cc_map.h. Please apply the same refactoring consistently across all arena-switching locations mentioned in the AI summary.

🧹 Nitpick comments (12)
tx_service/include/tx_service_common.h (1)

82-87: Consider adding a documentation comment for consistency.

The ext_tx_proc_heap_ field has a documentation comment explaining its purpose (lines 82-85), but the new ext_tx_proc_arena_id_ field lacks similar documentation. For consistency and maintainability, consider adding a brief comment explaining its purpose (e.g., storing the previous jemalloc arena ID when switching arenas for external TX processing).

tx_service/src/cc/range_slice.cpp (1)

533-542: Consider checking mallctl return values for error handling.

The mallctl function returns an error code (0 on success, or an error like EINVAL/EPERM). While arena switching failures may be non-critical in practice, ignoring the return value silently could mask configuration issues during development or debugging.

Consider at minimum logging a warning on failure:

 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     size_t sz = sizeof(uint32_t);
     // read prev arena id
-    mallctl("thread.arena", &prev_arena, &sz, NULL, 0);  // read only
+    if (mallctl("thread.arena", &prev_arena, &sz, NULL, 0) != 0) {
+        LOG(WARNING) << "Failed to read jemalloc thread arena";
+    }
     // override arena id
     auto table_range_arena_id = local_cc_shards_.GetTableRangesArenaId();
-    mallctl(
-        "thread.arena", NULL, NULL, &table_range_arena_id, sizeof(uint32_t));
+    if (mallctl("thread.arena", NULL, NULL, &table_range_arena_id, sizeof(uint32_t)) != 0) {
+        LOG(WARNING) << "Failed to switch jemalloc thread arena";
+    }
 #endif
tx_service/include/cc/range_cc_map.h (1)

1047-1064: Consistent arena switching pattern, same error handling consideration applies.

The jemalloc arena switching implementation is consistent with the pattern in range_slice.cpp. The same optional suggestion about checking mallctl return values applies here for debugging purposes.

The placement around InitRangeSlices (line 1065) is correct, ensuring allocations during range slice initialization use the designated per-table-range arena.

tx_service/include/cc/cc_request.h (1)

84-86: Avoid including jemalloc headers in a widely-included public header (if possible).
Pulling <jemalloc/jemalloc.h> into cc_request.h risks pervasive transitive exposure (macros/types) + compile-time overhead. If the only need here is mallctl, consider wrapping jemalloc interactions behind a small .cc/utility abstraction and include that instead.

tx_service/src/cc/cc_shard.cpp (2)

53-55: jemalloc header include is fine, but ensure build consistently defines WITH_JEMALLOC wherever mallctl is referenced.


2548-2571: jemalloc disables defrag: OK, but consider surfacing this via metrics/counter to explain behavior changes.

tx_service/include/cc/template_cc_map.h (2)

5136-5139: Prefer RAII guard for arena/heap restoration to prevent future early-return leaks.
Right now control flow is linear, but this function is a frequent touch point; a future return/error path between the switches and the restores would leave the thread configured incorrectly.

     CcShardHeap *scan_heap = shard_->GetShardDataSyncScanHeap();
     mi_heap_t *prev_heap = scan_heap->SetAsDefaultHeap();
     assert(shard_->GetShardHeapThreadId() == mi_thread_id());
+
+    struct HeapRestoreGuard {
+        mi_heap_t* prev;
+        ~HeapRestoreGuard() { mi_heap_set_default(prev); }
+    } heap_guard{prev_heap};
 
 #if defined(WITH_JEMALLOC)
-    unsigned prev_arena = scan_heap->SetAsDefaultArena();
+    unsigned prev_arena = scan_heap->SetAsDefaultArena();
+    struct ArenaRestoreGuard {
+        unsigned prev;
+        ~ArenaRestoreGuard() {
+            (void)mallctl("thread.arena", nullptr, nullptr, &prev, sizeof(prev));
+        }
+    } arena_guard{prev_arena};
 #endif
 ...
-    mi_heap_set_default(prev_heap);
-
-#if defined(WITH_JEMALLOC)
-    int mallctl_rc =
-        mallctl("thread.arena", nullptr, nullptr, &prev_arena, sizeof(prev_arena));
-    DLOG_IF(WARNING, mallctl_rc != 0)
-        << "mallctl(thread.arena) failed, rc=" << mallctl_rc;
-#endif
     return export_size;

Also applies to: 5140-5182


5149-5160: Heap-full check looks good; keep alloc/commit logging behind a real debug flag.
Switching to scan_heap->Full(&alloc, &commit) is a clearer signal than the prior Full() heuristic. Consider replacing the commented LOG(INFO) with DLOG or a dedicated runtime flag to avoid churn.

store_handler/data_store_service_client_closure.cpp (2)

757-769: Consider checking mallctl return values for robustness.

The mallctl calls can fail (e.g., invalid arena ID, jemalloc not properly initialized). While failures are unlikely in practice, ignoring return values makes debugging harder if arena switching silently fails.

 #if defined(WITH_JEMALLOC)
         uint32_t prev_arena;
         size_t sz = sizeof(uint32_t);
         // read prev arena id
-        mallctl("thread.arena", &prev_arena, &sz, NULL, 0);  // read only
+        int ret = mallctl("thread.arena", &prev_arena, &sz, NULL, 0);
+        assert(ret == 0);
         // override arena id
         auto table_range_arena_id = shards->GetTableRangesArenaId();
-        mallctl("thread.arena",
-                NULL,
-                NULL,
-                &table_range_arena_id,
-                sizeof(uint32_t));
+        ret = mallctl("thread.arena",
+                      NULL,
+                      NULL,
+                      &table_range_arena_id,
+                      sizeof(uint32_t));
+        assert(ret == 0);
 #endif

1051-1063: Duplicate arena switching pattern in FetchRangeSlicesCallback.

This is a copy of the same pattern from FetchTableRangesCallback. Consider extracting a RAII helper or scoped guard to reduce code duplication and ensure arena restoration even on early returns.

Example helper (could be placed in util.h):

#if defined(WITH_JEMALLOC)
class ScopedArenaSwitch {
public:
    explicit ScopedArenaSwitch(uint32_t target_arena) {
        size_t sz = sizeof(uint32_t);
        mallctl("thread.arena", &prev_arena_, &sz, NULL, 0);
        mallctl("thread.arena", NULL, NULL, &target_arena, sizeof(uint32_t));
    }
    ~ScopedArenaSwitch() {
        mallctl("thread.arena", NULL, NULL, &prev_arena_, sizeof(uint32_t));
    }
private:
    uint32_t prev_arena_;
};
#endif

Also applies to: 1099-1101

tx_service/include/util.h (1)

456-467: Consider validating mallctl return values and adding bounds check for snprintf.

The mallctl calls can fail for invalid arena IDs. Additionally, while the 64-byte buffer is sufficient for current use, a bounds check would be defensive.

     uint64_t epoch = 1;
-    mallctl("epoch", NULL, NULL, &epoch, sizeof(epoch));
+    int ret = mallctl("epoch", NULL, NULL, &epoch, sizeof(epoch));
+    if (ret != 0) {
+        resident = 0;
+        allocated = 0;
+        return;
+    }
     // Resident memory pages actually held by jemalloc from OS
     char mib[64];
-    snprintf(mib, sizeof(mib), "stats.arenas.%u.resident", arena_id);
-    mallctl(mib, &total_resident, &sz, NULL, 0);
+    int n = snprintf(mib, sizeof(mib), "stats.arenas.%u.resident", arena_id);
+    if (n < 0 || static_cast<size_t>(n) >= sizeof(mib)) {
+        resident = 0;
+        allocated = 0;
+        return;
+    }
+    ret = mallctl(mib, &total_resident, &sz, NULL, 0);
+    // Continue with other stats...
tx_service/include/cc/local_cc_shards.h (1)

455-458: Consider documenting the return value semantics.

The accessor returns table_ranges_arena_id_, which will be 0 if arena creation failed or if jemalloc is not enabled. Consider adding a comment to clarify that 0 represents the default arena, so callers understand this behavior.

Example:

+    // Returns the jemalloc arena ID for table ranges.
+    // Returns 0 if arena creation failed or jemalloc is not enabled (uses default arena).
     uint32_t GetTableRangesArenaId() const
     {
         return table_ranges_arena_id_;
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3164008 and a20238d.

📒 Files selected for processing (16)
  • CMakeLists.txt (3 hunks)
  • build_tx_service.cmake (4 hunks)
  • store_handler/data_store_service_client_closure.cpp (4 hunks)
  • tx_service/include/cc/catalog_cc_map.h (2 hunks)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (4 hunks)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/include/cc/range_cc_map.h (2 hunks)
  • tx_service/include/cc/template_cc_map.h (2 hunks)
  • tx_service/include/tx_service.h (4 hunks)
  • tx_service/include/tx_service_common.h (1 hunks)
  • tx_service/include/util.h (2 hunks)
  • tx_service/src/cc/cc_req_misc.cpp (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
  • tx_service/src/cc/local_cc_shards.cpp (12 hunks)
  • tx_service/src/cc/range_slice.cpp (2 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/cc_shard.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/tx_service.h
  • tx_service/include/cc/range_cc_map.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/src/cc/range_slice.cpp
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/catalog_cc_map.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/src/cc/range_slice.cpp
🧬 Code graph analysis (4)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/cc/cc_shard.h (1)
  • memory_limit_ (236-236)
tx_service/include/cc/local_cc_shards.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/cc/cc_shard.h (1)
tx_service/src/cc/cc_shard.cpp (2)
  • SetAsDefaultArena (2481-2495)
  • SetAsDefaultArena (2481-2481)
tx_service/include/cc/cc_request.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🔇 Additional comments (12)
tx_service/src/cc/range_slice.cpp (1)

556-558: LGTM for the arena restoration.

The arena restoration is correctly placed after the main UpdateSlice operation and mimalloc heap restoration, maintaining proper ordering of cleanup operations.

tx_service/include/cc/range_cc_map.h (1)

1079-1087: LGTM for the arena restoration and lock release ordering.

The restoration sequence is correct: mimalloc heap/thread restoration → jemalloc arena restoration → mutex unlock. This ensures all memory allocation context is properly restored before releasing the lock.

tx_service/src/cc/cc_shard.cpp (1)

2481-2495: Check mallctl return codes and remove unsafe assert(false) fallback.

This method is unsafe in non-jemalloc builds with assert(false) in the else branch—even if guarded at call sites, public API methods should be defensive. The suggested fixes are sound:

  • Capture and validate return codes from mallctl() calls (both read and write operations can fail)
  • Remove assert(false) and return a benign value like 0 when jemalloc is unavailable
  • Consider moving the entire implementation behind #if defined(WITH_JEMALLOC) in the header if the method truly cannot operate without jemalloc

The diff properly addresses the main issues: return code validation and removing the unsafe assertion.

build_tx_service.cmake (2)

136-138: LGTM: Conditional jemalloc include directories.

The jemalloc include directory integration follows the established pattern for conditional compilation. The JEMALLOC_INCLUDE_DIRS variable is properly populated by pkg_check_modules in the parent CMakeLists.txt when WITH_JEMALLOC is enabled.


217-219: LGTM: Conditional jemalloc_cfg linking.

Linking jemalloc_cfg to txservice when WITH_JEMALLOC is enabled correctly propagates the compile definitions, include directories, and link libraries through the interface library.

store_handler/data_store_service_client_closure.cpp (1)

829-833: Arena restoration follows correct pattern.

The arena is properly restored after the mimalloc heap restoration completes, maintaining consistency between both allocators' thread-local state.

tx_service/include/util.h (1)

24-26: LGTM: Conditional jemalloc header inclusion.

The jemalloc header is correctly guarded by WITH_JEMALLOC preprocessor check.

tx_service/include/tx_service.h (3)

289-300: LGTM: Arena preservation in RunOneRound.

The arena save/restore pattern correctly handles external processor arena context. The previous arena is saved via SetAsDefaultArena() and stored in coordi_->ext_tx_proc_arena_id_ for later restoration.


425-435: LGTM: Arena restoration at end of RunOneRound.

The arena is correctly restored after the mimalloc heap is restored, and the stored arena ID is cleared for external processors.


696-715: Transitive inclusion of jemalloc/jemalloc.h is confirmed.

The header is properly included through the chain: tx_service.hcheckpointer.h (line 49) → util.h (line 37) → <jemalloc/jemalloc.h> (line 25). The mallctl function is available and will be properly declared at the point of use in lines 696-715.

CMakeLists.txt (1)

361-365: LGTM: Conditional jemalloc linking to data_substrate.

The jemalloc_cfg interface library is correctly linked PUBLIC to data_substrate, which will transitively propagate the jemalloc compile definitions, include paths, and link libraries to dependent targets.

tx_service/include/cc/local_cc_shards.h (1)

2010-2011: LGTM: Arena ID initialization.

The arena IDs are properly initialized to 0, which represents the default jemalloc arena. This provides a reasonable fallback if arena creation fails (though arena creation errors should still be handled, as noted in previous comments).

Comment thread CMakeLists.txt
Comment on lines +442 to +460
#if defined(WITH_JEMALLOC)
uint32_t prev_arena;
size_t sz = sizeof(uint32_t);
// read prev arena id
mallctl("thread.arena",
&prev_arena,
&sz,
NULL,
0);
// read only override arena id
auto table_range_arena_id =
shard_->local_shards_
.GetTableRangesArenaId();
mallctl("thread.arena",
NULL,
NULL,
&table_range_arena_id,
sizeof(uint32_t));
#endif

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add error handling for mallctl and consider RAII for arena switching.

The mallctl calls can fail, but there's no error checking. If the first call fails, prev_arena remains uninitialized; if the second fails, the arena switch doesn't occur but execution continues. Additionally, if an exception is thrown between the arena switch and restoration (line 480-488), the arena won't be restored, potentially corrupting memory management state.

Consider creating an RAII helper class to ensure arena restoration even in exceptional cases:

class JemallocArenaGuard {
#if defined(WITH_JEMALLOC)
    uint32_t prev_arena_;
    bool switched_ = false;
public:
    explicit JemallocArenaGuard(uint32_t target_arena) {
        size_t sz = sizeof(uint32_t);
        if (mallctl("thread.arena", &prev_arena_, &sz, NULL, 0) == 0) {
            if (mallctl("thread.arena", NULL, NULL, &target_arena, sizeof(uint32_t)) == 0) {
                switched_ = true;
            }
        }
    }
    ~JemallocArenaGuard() {
        if (switched_) {
            mallctl("thread.arena", NULL, NULL, &prev_arena_, sizeof(uint32_t));
        }
    }
    JemallocArenaGuard(const JemallocArenaGuard&) = delete;
    JemallocArenaGuard& operator=(const JemallocArenaGuard&) = delete;
#else
    explicit JemallocArenaGuard(uint32_t) {}
#endif
};

Then use it as:

JemallocArenaGuard arena_guard(shard_->local_shards_.GetTableRangesArenaId());
// Range slice initialization...
// Automatic restoration on scope exit

This pattern appears in multiple locations per the AI summary. Consider verifying and refactoring all occurrences consistently.

🤖 Prompt for AI Agents
In tx_service/include/cc/catalog_cc_map.h around lines 442 to 460, add error
handling for both mallctl calls and replace the manual arena switching with an
RAII guard so the previous arena is restored even on early returns or
exceptions; implement a small JemallocArenaGuard that calls
mallctl("thread.arena", &prev, &sz, NULL, 0) and, only on success, sets the new
arena via mallctl, records success, and restores the prev arena in its
destructor, and then instantiate this guard before the range-slice
initialization so failures to read or set the arena are detected (check mallctl
return values and handle/log errors) and the arena is always restored on scope
exit.

Comment thread tx_service/include/cc/cc_shard.h Outdated
Comment thread tx_service/include/cc/local_cc_shards.h
Comment thread tx_service/include/cc/local_cc_shards.h
Comment thread tx_service/include/cc/local_cc_shards.h
Comment on lines +956 to +959
#if defined(WITH_JEMALLOC)
mallctl(
"thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
#endif

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add error handling for arena restoration.

The arena restoration calls at lines 956-959 and 1005-1007 lack error checking. If mallctl("thread.arena", ...) fails to restore the previous arena, subsequent memory allocations on this thread will use the wrong arena, potentially causing memory accounting issues and violating isolation guarantees.

At minimum, log errors to aid debugging.

Example for line 958:

 #if defined(WITH_JEMALLOC)
-            mallctl(
-                "thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+            if (mallctl("thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t)) != 0) {
+                LOG(ERROR) << "Failed to restore thread arena to " << prev_arena_id;
+            }
 #endif

Apply similar changes at line 1006.

Also applies to: 1005-1007

🤖 Prompt for AI Agents
In tx_service/include/cc/local_cc_shards.h around lines 956-959 and 1005-1007,
the calls to mallctl("thread.arena", ...) restore the previous arena without
checking the return code; add error handling to detect failure, log a clear
error message with the returned error code and arena id/context (use the
existing logger), and take an appropriate fallback action (e.g., abort, assert,
or return an error) to avoid continuing with the wrong arena; ensure both
occurrences mirror each other in behavior and messaging.

Comment thread tx_service/include/cc/template_cc_map.h
Comment thread tx_service/include/util.h

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds jemalloc support as an alternative memory allocator alongside the existing mimalloc implementation. The changes introduce arena-based memory management for different components (CC shards, table ranges, hash partition checkpoints) to enable better memory isolation and tracking when jemalloc is enabled via the WITH_JEMALLOC build flag.

Key Changes:

  • Added CMake option WITH_JEMALLOC to enable jemalloc support during build configuration
  • Implemented arena save/restore pattern throughout the codebase to ensure allocations happen in the correct jemalloc arena
  • Modified memory statistics reporting to use jemalloc-specific APIs when enabled

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
CMakeLists.txt Adds WITH_JEMALLOC option, configures jemalloc package, and creates jemalloc_cfg interface library
build_tx_service.cmake Integrates jemalloc include directories and links jemalloc_cfg to txservice
tx_service/include/util.h Adds GetJemallocArenaStat helper function for querying jemalloc arena statistics
tx_service/include/tx_service_common.h Adds ext_tx_proc_arena_id_ field to TxProcCoordinator for external transaction processor arena tracking
tx_service/include/tx_service.h Adds arena save/restore logic around external transaction processing
tx_service/include/cc/cc_shard.h Adds arena_id_ member, SetAsDefaultArena(), ArenaId(), and MemoryLimit() methods to CcShardHeap
tx_service/include/cc/local_cc_shards.h Adds arena IDs for table ranges and hash partition checkpoint heaps, static assertion for uint32_t/unsigned compatibility, and jemalloc-specific memory checking logic
tx_service/include/cc/template_cc_map.h Adds arena save/restore around scan heap operations during checkpoint export
tx_service/include/cc/range_cc_map.h Adds arena management when initializing range slices
tx_service/include/cc/catalog_cc_map.h Adds arena management during catalog range initialization
tx_service/include/cc/cc_request.h Adds arena save/restore in ReleaseDataSyncScanHeapCc and CollectMemStatsCc request handlers
tx_service/src/cc/cc_shard.cpp Implements arena creation in CcShardHeap constructor, SetAsDefaultArena(), and jemalloc-specific Full() and NeedDefragment() logic
tx_service/src/cc/local_cc_shards.cpp Adds arena save/restore in InitTableRanges, KickoutRangeSlices, DropStoreRangesInBucket, DataSyncForHashPartition, and jemalloc-specific memory reporting
tx_service/src/cc/range_slice.cpp Adds arena management during UpdateSliceSpec operations
tx_service/src/cc/cc_req_misc.cpp Adds arena save/restore in FetchRangeSlicesReq::SetFinish
store_handler/data_store_service_client_closure.cpp Adds arena management in FetchTableRangesCallback and FetchRangeSlicesCallback

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tx_service/include/cc/cc_shard.h Outdated
Comment thread tx_service/include/util.h Outdated
Comment thread CMakeLists.txt
Comment thread tx_service/src/cc/cc_shard.cpp
Comment thread tx_service/include/util.h Outdated
Comment thread tx_service/include/cc/cc_shard.h
Comment thread tx_service/include/cc/cc_request.h
Comment thread tx_service/src/cc/range_slice.cpp Outdated
Comment on lines +537 to +541
mallctl("thread.arena", &prev_arena, &sz, NULL, 0); // read only
// override arena id
auto table_range_arena_id = local_cc_shards_.GetTableRangesArenaId();
mallctl(
"thread.arena", NULL, NULL, &table_range_arena_id, sizeof(uint32_t));

Copilot AI Dec 12, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mallctl calls do not check return values for errors. If mallctl fails (e.g., invalid arena id or configuration), the code will continue silently with potentially incorrect arena settings. Consider checking return values and logging or handling errors appropriately.

Copilot uses AI. Check for mistakes.
Comment on lines +1023 to +1032
#if defined(WITH_JEMALLOC)
uint32_t prev_arena;
size_t sz = sizeof(uint32_t);
// read prev arena id
mallctl("thread.arena", &prev_arena, &sz, NULL, 0); // read only
// override arena id
auto table_range_arena_id = GetTableRangesArenaId();
mallctl(
"thread.arena", NULL, NULL, &table_range_arena_id, sizeof(uint32_t));
#endif

Copilot AI Dec 12, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern of saving/restoring arena id is duplicated many times throughout the codebase (over 10 occurrences). Consider extracting this into a RAII helper class that automatically restores the previous arena id in its destructor, similar to how lock guards work. This would reduce code duplication and make the code less error-prone.

Copilot uses AI. Check for mistakes.
Comment thread tx_service/include/cc/catalog_cc_map.h Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tx_service/src/cc/cc_shard.cpp (1)

2550-2573: Initialize output parameters even when jemalloc support is disabled in NeedDefragment().

Under jemalloc, NeedDefragment() returns false without initializing the *alloc and *commit output parameters. Callers at lines 9719 and 10028 of template_cc_map.h pass uninitialized local variables expecting them to be populated, then use these values in LOG statements. While the short-circuit AND operator prevents immediate use when the function returns false, this represents undefined behavior and is fragile.

Set the output parameters using GetJemallocArenaStat(), which is already available and used elsewhere in the file:

 bool CcShardHeap::NeedDefragment(int64_t *alloc, int64_t *commit) const
 {
 #if defined(WITH_JEMALLOC)
     // defragmentation is not supported with jemalloc
+    size_t total_resident = 0;
+    size_t total_allocated = 0;
+    GetJemallocArenaStat(arena_id_, total_resident, total_allocated);
+    if (alloc != nullptr)  { *alloc  = static_cast<int64_t>(total_allocated); }
+    if (commit != nullptr) { *commit = static_cast<int64_t>(total_resident); }
     return false;
 #else
     int64_t allocated, committed;
     ...
 #endif
 }
♻️ Duplicate comments (2)
tx_service/src/cc/cc_shard.cpp (2)

2464-2474: Resolve the arena-destruction TODO (or document why it’s not possible/needed).
This was already raised in prior review comments; please address before merge to avoid lingering uncertainty about lifecycle/resource behavior.

 CcShardHeap::~CcShardHeap()
 {
     ...
-    // TODO: destroy arena ?
+    // jemalloc arenas do not have a supported public destroy API; they are
+    // reclaimed at process shutdown.
 }

2481-2495: SetAsDefaultArena() should not assert(false) when jemalloc is disabled.
Even if current call sites are #if defined(WITH_JEMALLOC), the public method is now a footgun for future callers and can crash non-jemalloc builds. This was already raised in prior review comments. Also, mallctl return codes should be checked.

 uint32_t CcShardHeap::SetAsDefaultArena()
 {
 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     size_t sz = sizeof(uint32_t);
     // read prev arena id
-    mallctl("thread.arena", &prev_arena, &sz, NULL, 0);  // read only
+    const int rc1 = mallctl("thread.arena", &prev_arena, &sz, nullptr, 0);  // read only
     // override arena id
-    mallctl("thread.arena", NULL, NULL, &arena_id_, sizeof(uint32_t));
+    const int rc2 = mallctl("thread.arena", nullptr, nullptr, &arena_id_, sizeof(uint32_t));
+    (void)rc1; (void)rc2;
     return prev_arena;
 #else
-    assert(false);
-    return 0;
+    // jemalloc not enabled; no-op.
+    return 0;
 #endif
 }
🧹 Nitpick comments (2)
tx_service/src/cc/cc_shard.cpp (2)

220-257: Thread-local arena/heap switching should be exception-safe (RAII) and should check mallctl return codes.
Right now the restore happens at the end of CcShard::Init(). If anything throws/early-exits between switching and restore (e.g., allocation failure), the thread’s default heap/arena may remain altered. Also, mallctl(...) return values are ignored.

 void CcShard::Init()
 {
     InitializeShardHeap();
     mi_heap_t *prev_heap = shard_heap_->SetAsDefaultHeap();
 #if defined(WITH_JEMALLOC)
-    auto prev_arena = shard_heap_->SetAsDefaultArena();
+    const uint32_t prev_arena = shard_heap_->SetAsDefaultArena();
+    struct ThreadArenaGuard
+    {
+        uint32_t prev;
+        ~ThreadArenaGuard()
+        {
+            const int rc = mallctl("thread.arena", nullptr, nullptr,
+                                   &prev, sizeof(prev));
+            (void)rc;
+            // Consider logging/handling rc != 0.
+        }
+    } arena_guard{prev_arena};
 #endif
     lock_vec_.resize(LOCK_ARRAY_INIT_SIZE);
     ...
     mi_heap_set_default(prev_heap);
-
-#if defined(WITH_JEMALLOC)
-    mallctl("thread.arena", NULL, NULL, &prev_arena, sizeof(uint32_t));
-#endif
 }

2497-2541: jemalloc “Full()” semantics and type conversions need a quick sanity check.
You’re comparing resident vs memory_limit_ under jemalloc, while non-jemalloc uses allocated (and sometimes committed). If memory_limit_ is intended as “live bytes”, resident may trip “full” earlier. Also, size_t -> int64_t assignments can overflow on very large values.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a20238d and 7341664.

📒 Files selected for processing (4)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (4 hunks)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/cc_shard.h
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
🔇 Additional comments (2)
tx_service/src/cc/cc_shard.cpp (2)

53-55: jemalloc include guard looks fine; please verify header/ABI expectations for your jemalloc version.
Since you’re calling mallctl, this is sensitive to jemalloc build/runtime configuration (e.g., stats/epoch availability).


2446-2462: Arena ID is already defensively initialized; no changes needed.

arena_id_ is initialized to 0 at member declaration (cc_shard.h:244), so it cannot be uninitialized. If mallctl("arenas.create") fails, arena_id_ safely remains 0 (the default arena). The entire codebase consistently does not check mallctl return codes for arena operations, including similar calls in local_cc_shards.h. The premise that arena_id_ may be uninitialized is incorrect.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tx_service/include/cc/cc_request.h (1)

6296-6323: Address mallctl error checking inconsistency: if adding return value checks, apply to all mallctl calls in the codebase

The original concern about sizeof(uint32_t) is not valid here. SetAsDefaultArena() explicitly returns uint32_t (see cc_shard.h), so prev_arena is guaranteed to be uint32_t, and sizeof(uint32_t) is correct.

However, the mallctl error checking suggestion is worth considering, but it should be applied consistently. The codebase has 14+ other mallctl calls across range_cc_map.h, template_cc_map.h, catalog_cc_map.h, local_cc_shards.h, and others—none of which currently check the return value. If error handling is needed here at lines 6296-6323 and 6340-6367, the same pattern should be applied codebase-wide to avoid inconsistency.

If error checking is desired, consider:

  • Making prev_arena type explicit (const uint32_t instead of auto) for clarity
  • Applying the mallctl return value check to all occurrences, not just these two locations
  • Documenting whether mallctl failures are expected to be recoverable or should always fail fast
♻️ Duplicate comments (8)
tx_service/include/cc/cc_request.h (1)

7428-7453: CollectMemStatsCc: fix stats type/meaning mismatch (and remove commented-out mallctl block)

  • The commented-out mallctl("thread.arena"... block should be deleted (already noted previously).
  • GetJemallocArenaStat(...) returns size_t; assigning into HeapMemStats::{allocated_,committed_} (int64_t) can overflow on large processes. At minimum, cast explicitly and clamp/validate.
  • The semantic mismatch remains: mimalloc path sets committed_ from mi_thread_stats (committed memory), jemalloc path sets committed_ to “resident”. If this is intended, document it at HeapMemStats (or rename the field(s) / split into resident_ vs committed_) so dashboards/alerts remain comparable across builds.

Suggested local hardening:

-        stats_->allocated_ = allocated;
-        stats_->committed_ = committed;
+        stats_->allocated_ =
+            allocated > static_cast<size_t>(std::numeric_limits<int64_t>::max())
+                ? std::numeric_limits<int64_t>::max()
+                : static_cast<int64_t>(allocated);
+        stats_->committed_ =
+            committed > static_cast<size_t>(std::numeric_limits<int64_t>::max())
+                ? std::numeric_limits<int64_t>::max()
+                : static_cast<int64_t>(committed);

(Requires <limits> include or reuse an existing one if already present transitively.)

tx_service/src/cc/cc_shard.cpp (2)

2472-2473: Clarify the TODO comment about arena destruction.

Jemalloc does not provide a public API to destroy arenas; they are cleaned up at process shutdown. Update the comment to reflect this rather than leaving an unresolved TODO.

-
-    // TODO: destroy arena ?
+    // Note: Jemalloc arenas are not explicitly destroyed. They are cleaned up
+    // automatically when the process exits. There is no mallctl for arena destruction.

2481-2495: Replace assert(false) with a proper fallback in non-jemalloc builds.

If SetAsDefaultArena() is called when WITH_JEMALLOC is not defined, the assert(false) will crash in debug builds and return 0 silently in release builds. This function should either be guarded at the call site or provide a no-op implementation.

 uint32_t CcShardHeap::SetAsDefaultArena()
 {
 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     size_t sz = sizeof(uint32_t);
     // read prev arena id
-    mallctl("thread.arena", &prev_arena, &sz, NULL, 0);  // read only
+    if (mallctl("thread.arena", &prev_arena, &sz, NULL, 0) != 0)
+    {
+        prev_arena = 0;  // fallback to default arena
+    }
     // override arena id
-    mallctl("thread.arena", NULL, NULL, &arena_id_, sizeof(uint32_t));
+    if (mallctl("thread.arena", NULL, NULL, &arena_id_, sizeof(uint32_t)) != 0)
+    {
+        LOG(WARNING) << "Failed to switch to shard arena " << arena_id_;
+    }
     return prev_arena;
 #else
-    assert(false);
+    // jemalloc is not enabled; SetAsDefaultArena() is a no-op.
+    // Callers should guard calls with WITH_JEMALLOC or this returns 0 safely.
     return 0;
 #endif
 }
tx_service/include/cc/local_cc_shards.h (5)

427-431: Add error handling for jemalloc arena creation.

The mallctl("arenas.create", ...) call can fail, but the error is not checked. If arena creation fails, table_ranges_arena_id_ remains 0 (the default arena), silently breaking the intended memory isolation.

 #if defined(WITH_JEMALLOC)
         // create table ranges arena
         size_t sz = sizeof(uint32_t);
-        mallctl("arenas.create", &table_ranges_arena_id_, &sz, NULL, 0);
+        if (mallctl("arenas.create", &table_ranges_arena_id_, &sz, NULL, 0) != 0)
+        {
+            LOG(ERROR) << "Failed to create jemalloc arena for table ranges";
+        }
 #endif

438-442: Add error handling for hash partition checkpoint arena creation.

Same issue as table ranges arena: lacking error checking for mallctl("arenas.create", ...).

 #if defined(WITH_JEMALLOC)
         // create hash partition ckpt arena
         size_t sz = sizeof(uint32_t);
-        mallctl("arenas.create", &hash_partition_ckpt_arena_id_, &sz, NULL, 0);
+        if (mallctl("arenas.create", &hash_partition_ckpt_arena_id_, &sz, NULL, 0) != 0)
+        {
+            LOG(ERROR) << "Failed to create jemalloc arena for hash partition checkpoint";
+        }
 #endif

880-890: Add error handling for thread arena switching.

The code switches the current thread to the table ranges arena without checking for errors. If mallctl("thread.arena", ...) fails:

  1. Memory allocations may occur in the wrong arena
  2. prev_arena_id may be undefined, causing the restoration at lines 956-959 and 1005-1007 to set the arena to a garbage value
 #if defined(WITH_JEMALLOC)
         uint32_t prev_arena_id;
         auto table_range_arena_id = GetTableRangesArenaId();
         size_t sz = sizeof(uint32_t);
-        mallctl("thread.arena", &prev_arena_id, &sz, NULL, 0);
-        mallctl("thread.arena",
-                NULL,
-                NULL,
-                &table_range_arena_id,
-                sizeof(uint32_t));
+        if (mallctl("thread.arena", &prev_arena_id, &sz, NULL, 0) != 0)
+        {
+            prev_arena_id = 0;  // fallback to default arena
+        }
+        if (mallctl("thread.arena", NULL, NULL, &table_range_arena_id, sizeof(uint32_t)) != 0)
+        {
+            LOG(WARNING) << "Failed to switch to table ranges arena";
+        }
 #endif

956-959: Add error handling for arena restoration.

The arena restoration lacks error checking. If mallctl("thread.arena", ...) fails, subsequent allocations will use the wrong arena.

 #if defined(WITH_JEMALLOC)
-            mallctl(
-                "thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+            if (mallctl("thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t)) != 0)
+            {
+                LOG(WARNING) << "Failed to restore thread arena to " << prev_arena_id;
+            }
 #endif

1005-1007: Add error handling for arena restoration (second occurrence).

Same issue as line 956-959: missing error handling for arena restoration.

 #if defined(WITH_JEMALLOC)
-        mallctl("thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+        if (mallctl("thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t)) != 0)
+        {
+            LOG(WARNING) << "Failed to restore thread arena to " << prev_arena_id;
+        }
 #endif
🧹 Nitpick comments (1)
tx_service/include/cc/cc_request.h (1)

6290-6317: Consider shrinking capacity if the goal is to actually return memory to the allocator

pop_back() clears size but typically retains capacity; for large vectors this may not release memory back (allocator-dependent). If the intention is “free memory now”, consider swapping with an empty vector (or clear(); shrink_to_fit();) once you hit the end (accepting potential latency spikes), possibly behind a threshold.

Also applies to: 6344-6360

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7341664 and 6059480.

📒 Files selected for processing (5)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (4 hunks)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/include/util.h (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tx_service/include/cc/cc_shard.h
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • tx_service/include/cc/cc_request.h
🧬 Code graph analysis (3)
tx_service/include/cc/cc_request.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (2)
  • arena_id_ (244-245)
  • memory_limit_ (236-236)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/cc/local_cc_shards.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
🔇 Additional comments (14)
tx_service/include/cc/cc_request.h (2)

84-86: Guarded jemalloc include looks fine, but confirm header path portability

#include <jemalloc/jemalloc.h> is distro-dependent (some environments expose jemalloc/jemalloc.h, others just jemalloc.h). If you intend portability across distros/containers, ensure your CMake “find jemalloc” logic matches this include path (or centralize the include behind a config header).


7450-7453: Good change switching this log to DLOG(INFO)

tx_service/include/util.h (1)

24-26: LGTM!

The conditional include for jemalloc is properly guarded.

tx_service/include/cc/local_cc_shards.h (7)

41-41: LGTM!

The <type_traits> include is needed for std::is_same_v used in the static_assert below.


66-71: LGTM!

The static_assert ensures platform compatibility for jemalloc arena IDs. The typo from the previous review has been fixed.


455-458: LGTM!

Simple accessor for the table ranges arena ID.


467-486: LGTM!

The jemalloc branch properly queries arena stats and compares against the memory limit. The mimalloc fallback is preserved for non-jemalloc builds.


496-517: LGTM!

Consistent implementation with TableRangesMemoryFull() for checking memory availability.


522-554: LGTM!

The jemalloc branch provides arena-specific memory reporting. Note that it doesn't need the heap override logic since it operates on arena-level stats.


2010-2012: LGTM!

Arena ID members are properly zero-initialized, which corresponds to jemalloc's default arena.

tx_service/src/cc/cc_shard.cpp (4)

53-55: LGTM!

Conditional jemalloc include is properly guarded.


224-226: LGTM!

Arena context is saved before initialization. This pairs with the restoration at lines 253-256.


2499-2540: LGTM!

The Full() method correctly uses jemalloc arena stats when enabled and falls back to mimalloc otherwise. The comparison against memory_limit_ is appropriate.


2552-2572: LGTM!

Disabling defragmentation for jemalloc builds is reasonable since jemalloc has its own internal memory management strategies. The mimalloc defragmentation logic remains for non-jemalloc builds.

Comment thread tx_service/include/util.h
Comment on lines +445 to +471
#if defined(WITH_JEMALLOC)
static inline void GetJemallocArenaStat(uint32_t arena_id,
size_t &resident,
size_t &allocated)
{
// estimate thread memory usage from total process memory
size_t total_resident = 0;
size_t small_allocated = 0;
size_t large_allocated = 0;
size_t sz = sizeof(total_resident);

uint64_t epoch = 1;
mallctl("epoch", NULL, NULL, &epoch, sizeof(epoch));
// Resident memory pages actually held by jemalloc from OS
// mallctl("stats.resident", &total_resident, &sz, NULL, 0);
// mallctl("stats.allocated", &total_allocated, &sz, NULL, 0);
char mib[64];
snprintf(mib, sizeof(mib), "stats.arenas.%u.resident", arena_id);
mallctl(mib, &total_resident, &sz, NULL, 0);
snprintf(mib, sizeof(mib), "stats.arenas.%u.small.allocated", arena_id);
mallctl(mib, &small_allocated, &sz, NULL, 0);
snprintf(mib, sizeof(mib), "stats.arenas.%u.large.allocated", arena_id);
mallctl(mib, &large_allocated, &sz, NULL, 0);

resident = total_resident;
allocated = small_allocated + large_allocated;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add error handling for mallctl calls.

The mallctl calls can fail (e.g., invalid arena ID, stats not available), leaving output variables with stale/uninitialized values. At minimum, initialize outputs to zero before the calls and consider logging failures for debugging.

 static inline void GetJemallocArenaStat(uint32_t arena_id,
                                         size_t &resident,
                                         size_t &allocated)
 {
     // estimate thread memory usage from total process memory
     size_t total_resident = 0;
     size_t small_allocated = 0;
     size_t large_allocated = 0;
     size_t sz = sizeof(total_resident);

     uint64_t epoch = 1;
-    mallctl("epoch", NULL, NULL, &epoch, sizeof(epoch));
+    if (mallctl("epoch", NULL, NULL, &epoch, sizeof(epoch)) != 0)
+    {
+        resident = 0;
+        allocated = 0;
+        return;
+    }
     // Resident memory pages actually held by jemalloc from OS
     // mallctl("stats.resident", &total_resident, &sz, NULL, 0);
     // mallctl("stats.allocated", &total_allocated, &sz, NULL, 0);
     char mib[64];
     snprintf(mib, sizeof(mib), "stats.arenas.%u.resident", arena_id);
-    mallctl(mib, &total_resident, &sz, NULL, 0);
+    if (mallctl(mib, &total_resident, &sz, NULL, 0) != 0)
+    {
+        total_resident = 0;
+    }
     snprintf(mib, sizeof(mib), "stats.arenas.%u.small.allocated", arena_id);
-    mallctl(mib, &small_allocated, &sz, NULL, 0);
+    if (mallctl(mib, &small_allocated, &sz, NULL, 0) != 0)
+    {
+        small_allocated = 0;
+    }
     snprintf(mib, sizeof(mib), "stats.arenas.%u.large.allocated", arena_id);
-    mallctl(mib, &large_allocated, &sz, NULL, 0);
+    if (mallctl(mib, &large_allocated, &sz, NULL, 0) != 0)
+    {
+        large_allocated = 0;
+    }

     resident = total_resident;
     allocated = small_allocated + large_allocated;
 }
🤖 Prompt for AI Agents
In tx_service/include/util.h around lines 445 to 471, the mallctl calls are
unchecked and output variables may remain uninitialized on error; initialize
total_resident, small_allocated, large_allocated to 0 before calling mallctl,
capture and check each mallctl return value, and on non-zero return set the
corresponding local value to 0 and log or record the error (including arena_id
and the mallctl string) so the function never returns stale data; finally assign
resident = total_resident and allocated = small_allocated + large_allocated as
before.

Comment thread tx_service/src/cc/cc_shard.cpp
Comment thread tx_service/src/cc/cc_shard.cpp
@lokax lokax linked an issue Dec 12, 2025 that may be closed by this pull request

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
tx_service/include/cc/cc_request.h (1)

6295-6330: Check mallctl return value + use sizeof(prev_arena) to avoid silent arena-restore failures

If mallctl("thread.arena", ...) fails, the thread may continue on the scan arena and skew memory accounting / contention across unrelated work. At least DLOG/LOG (or assert) the return code; also prefer sizeof(prev_arena) over sizeof(uint32_t) to avoid type/size drift.

 #if defined(WITH_JEMALLOC)
-                mallctl(
-                    "thread.arena", NULL, NULL, &prev_arena, sizeof(uint32_t));
+                if (int rc = mallctl("thread.arena",
+                                     nullptr,
+                                     nullptr,
+                                     &prev_arena,
+                                     sizeof(prev_arena));
+                    rc != 0)
+                {
+                    LOG(ERROR) << "mallctl(thread.arena) restore failed rc="
+                               << rc;
+                }
 #endif
tx_service/src/cc/cc_shard.cpp (1)

2497-2541: Fix type safety in Full() (size_t vs int64_t) and add error handling.

The code assigns size_t values (total_resident, total_allocated) directly to int64_t* outputs without overflow checking, and compares size_t against static_cast<int64_t>(memory_limit_). On systems where values exceed INT64_MAX, this causes silent integer overflow and sign corruption. Additionally, GetJemallocArenaStat() return value is not checked—if the function fails, stale initialized zeros are used, returning "not full" incorrectly.

Implement overflow protection by clamping to INT64_MAX before assignment, fix the return comparison to avoid unnecessary casts, and verify that GetJemallocArenaStat() succeeds before using its output (or document expected behavior on failure).

tx_service/include/cc/local_cc_shards.h (1)

418-443: Move mallctl("arenas.create", ...) inside the if block and add error checking for both arena creations.

In InitializeTableRangesHeap(), the mallctl call is outside the if (!table_ranges_heap_) guard, so repeated calls will create new arenas and overwrite table_ranges_arena_id_, leaking resources. In InitializeHashPartitionCkptHeap(), there are no guards at all—both mi_heap_new() and mallctl are unconditionally called every invocation.

Additionally, both functions ignore the return value of mallctl. Per jemalloc documentation, mallctl returns 0 on success or an errno (EINVAL, ENOENT, EPERM, EAGAIN, EFAULT) on failure. Silent failures will leave arena IDs at 0, causing fallback to the default arena and potential correctness issues.

Suggested fix:

 void InitializeTableRangesHeap()
 {
     std::unique_lock<std::mutex> lk(table_ranges_heap_mux_);
     if (!table_ranges_heap_)
     {
         table_ranges_thread_id_ = mi_thread_id();
         table_ranges_heap_ = mi_heap_new();
+    #if defined(WITH_JEMALLOC)
+        size_t sz = sizeof(uint32_t);
+        if (mallctl("arenas.create", &table_ranges_arena_id_, &sz, nullptr, 0) != 0)
+        {
+            LOG(ERROR) << "Failed to create jemalloc arena for table ranges";
+            table_ranges_arena_id_ = 0;
+        }
+    #endif
     }
-
-#if defined(WITH_JEMALLOC)
-    // create table ranges arena
-    size_t sz = sizeof(uint32_t);
-    mallctl("arenas.create", &table_ranges_arena_id_, &sz, NULL, 0);
-#endif
 }

Apply the same fix to InitializeHashPartitionCkptHeap(), adding a guard to prevent repeated heap allocation.

♻️ Duplicate comments (3)
tx_service/include/cc/cc_request.h (1)

7428-7453: CollectMemStatsCc: stats semantics + remove dead commented-out code

  • stats_->committed_ is “mimalloc committed” in the non-jemalloc path but appears to be “jemalloc resident” in the jemalloc path via GetJemallocArenaStat(...). If callers assume comparable semantics across builds, this will confuse alerting/diagnostics—please document/rename/adjust to keep meaning consistent.
  • The commented-out prev_arena block should be removed (or turned into a real debug check behind a flag).
tx_service/src/cc/cc_shard.cpp (2)

220-257: Restore thread.arena with error handling (and only if SetAsDefaultArena succeeded).
CcShard::Init() restores via mallctl("thread.arena", ...) but ignores return code; also there’s no guard ensuring prev_arena is valid if switching failed upstream.

Recommend checking rc and logging failures (and ideally make SetAsDefaultArena() return an optional/success flag).


2456-2462: Check mallctl("arenas.create", ...) return value (arena_id_ can silently stay 0).
If arena creation fails, per-shard isolation and accounting break silently.

Please capture/handle rc (log + decide fallback/abort).

🧹 Nitpick comments (5)
tx_service/include/cc/cc_request.h (2)

84-86: Prefer minimizing jemalloc exposure from a widely-included header

Including <jemalloc/jemalloc.h> from cc_request.h can significantly widen the dependency surface/compile impact. If the only need here is mallctl, consider forward-declaring mallctl under WITH_JEMALLOC (or moving jemalloc-only code into a .cc / a small jemalloc-specific header) to keep this header lighter.

 #include "util.h"
 #if defined(WITH_JEMALLOC)
-#include <jemalloc/jemalloc.h>
+extern "C" int mallctl(const char *name,
+                       void *oldp,
+                       size_t *oldlenp,
+                       void *newp,
+                       size_t newlen);
 #endif

6338-6373: Deduplicate the heap/arena switching with a small RAII guard

The “set default heap + set default arena + restore both” logic is duplicated for data_sync_vec_ and archive_vec_. A tiny scoped guard (e.g., ScopedDefaultShardHeapAndArena) would reduce drift risk (future edits updating one block but not the other).

tx_service/include/cc/local_cc_shards.h (2)

41-71: Static-assert is reasonable; consider using jemalloc’s arena type directly where possible.
The unsigned == uint32_t guard is fine, but if the repo ever targets platforms where this differs, you’ll likely want a dedicated using JemallocArenaId = unsigned; (and keep it out of unrelated APIs).


465-487: jemalloc “committed” variable naming is misleading; also handle stats failures.
GetJemallocArenaStat() returns (resident, allocated), but the call sites use committed to hold resident pages. Rename locals to resident to avoid confusion (especially in logs).

Also, if stats refresh fails, treat memory as “unknown” and avoid flipping behavior based on garbage/partial data.

Also applies to: 494-518, 520-555

tx_service/include/cc/cc_shard.h (1)

191-192: Header exposes SetAsDefaultArena() unconditionally, but implementation asserts when WITH_JEMALLOC is off. All current callers properly guard this method with #if defined(WITH_JEMALLOC), so immediate risk is mitigated. However, the unconditional declaration is misleading and creates a maintenance hazard for future codepaths. Consider guarding the declaration with #if defined(WITH_JEMALLOC) to make the API contract explicit and prevent accidental misuse.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6059480 and 9bdad31.

📒 Files selected for processing (6)
  • CMakeLists.txt (3 hunks)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (4 hunks)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/include/util.h (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CMakeLists.txt
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • tx_service/include/cc/cc_request.h
🧬 Code graph analysis (3)
tx_service/include/cc/cc_shard.h (1)
tx_service/src/cc/cc_shard.cpp (2)
  • SetAsDefaultArena (2481-2495)
  • SetAsDefaultArena (2481-2481)
tx_service/include/cc/local_cc_shards.h (2)
tx_service/include/cc/ccm_scanner.h (1)
  • uint32_t (467-477)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (2)
  • arena_id_ (244-245)
  • memory_limit_ (236-236)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
🔇 Additional comments (3)
tx_service/src/cc/cc_shard.cpp (2)

2552-2573: Disabling defrag under jemalloc: OK, but ensure callers don’t assume it can happen.
Returning false unconditionally is fine; consider documenting at call sites that “defragmentation is mimalloc-only” to prevent future surprise.


2481-2495: CcShardHeap::SetAsDefaultArena() should not assert(false) in non-jemalloc builds.
This crashes at runtime if called in a non-jemalloc build; prefer compile-time removal (wrap the entire function declaration in #if defined(WITH_JEMALLOC)) or a documented no-op/fail-fast log instead of debug assert. Also add return code checks for both mallctl() calls.

tx_service/include/cc/cc_shard.h (1)

205-209: Document arena_id_ initialization and ArenaId() return value semantics.

The arena_id_ member is initialized to 0 and the ArenaId() accessor is defined for all builds, but all actual callers invoke it within #if defined(WITH_JEMALLOC) guards where arena_id_ is properly set by SetAsDefaultArena() during initialization. In non-jemalloc builds, ArenaId() is never called because the code paths use mi_thread_stats() instead. While the current implementation is safe due to ifdef protection, adding a comment explaining that arena_id_{0} is overwritten during init in jemalloc builds would improve code clarity.

Comment thread tx_service/include/cc/local_cc_shards.h
Comment thread tx_service/include/util.h
Comment on lines +24 to +26
#if defined(WITH_JEMALLOC)
#include <jemalloc/jemalloc.h>
#endif

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "util.h" -path "*/tx_service/include/*"

Repository: eloqdata/tx_service

Length of output: 90


🏁 Script executed:

wc -l ./tx_service/include/util.h

Repository: eloqdata/tx_service

Length of output: 94


🏁 Script executed:

cat -n ./tx_service/include/util.h

Repository: eloqdata/tx_service

Length of output: 18159


🏁 Script executed:

rg "snprintf" --type cpp --type h -c

Repository: eloqdata/tx_service

Length of output: 120


🏁 Script executed:

head -50 ./tx_service/src/tx_trace.cpp

Repository: eloqdata/tx_service

Length of output: 1681


🏁 Script executed:

rg "snprintf" ./tx_service/src/tx_trace.cpp -A 3 -B 3

Repository: eloqdata/tx_service

Length of output: 487


🏁 Script executed:

grep -l "cstdio\|stdio.h" tx_service/include/*.h

Repository: eloqdata/tx_service

Length of output: 92


🏁 Script executed:

head -50 ./tx_service/include/sharder.h

Repository: eloqdata/tx_service

Length of output: 1567


🏁 Script executed:

head -100 ./tx_service/include/type.h | grep -E "cstdio|stdio|include"

Repository: eloqdata/tx_service

Length of output: 393


🏁 Script executed:

rg "tx_trace.h" tx_service/include/

Repository: eloqdata/tx_service

Length of output: 124


🏁 Script executed:

rg "template_cc_map|tx_trace" tx_service/include/sharder.h tx_service/include/type.h

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

web_search
Does jemalloc/jemalloc.h include stdio.h or cstdio?

Repository: eloqdata/tx_service

Length of output: 156


🌐 Web query:

jemalloc/jemalloc.h header includes stdio cstdio

💡 Result:

Short answer: jemalloc's public header jemalloc/jemalloc.h does not itself require or directly include <stdio.h> / ; platform/internal headers used inside the implementation (e.g. jemalloc_internal_decls.h / jemalloc_internal.h) do include <stdio.h>. See the patch showing internal headers including <stdio.h>. [1]

References

  • Patch / mail archive showing internal headers include <stdio.h> (jemalloc_internal_decls.h). [1]

Fix header self-sufficiency + add mallctl error handling (outputs used for "memory full" decisions).

  1. snprintf() is used but <cstdio> isn't included in util.h (fragile transitive include that may break if dependencies change).
  2. All mallctl() return codes are ignored (lines 457, 463, 465, 467); on failure you return stale/uninitialized stats that downstream code may use for memory-full checks.

Suggested changes:

  • Add #include <cstdio> after #include <jemalloc/jemalloc.h>
  • Initialize output parameters resident = 0; allocated = 0; at function entry
  • Check the first mallctl("epoch", ...) return code and return early on failure
  • Cast remaining mallctl() calls to (void) to explicitly suppress the return value, or check each one with logging for debugging
🤖 Prompt for AI Agents
In tx_service/include/util.h around lines 24 to 26, add a direct include of
<cstdio> immediately after the jemalloc header to make snprintf available and
avoid fragile transitive includes; in the function that queries mallctl stats
initialize output parameters resident = 0 and allocated = 0 at the start to
avoid returning stale values; check the return value of the first
mallctl("epoch", ...) and return early (or propagate an error) on failure
instead of continuing; and for the remaining mallctl() calls either explicitly
cast their return values to (void) to indicate intentional ignore or add proper
error logging/handling for each call so failures are not silently ignored.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (6)
tx_service/src/cc/cc_shard.cpp (3)

253-256: Consider adding error handling for arena restoration.

The mallctl("thread.arena", ...) call ignores the return value. While failure here is unlikely to cause immediate issues (initialization is complete), logging a warning would aid debugging.


2475-2476: Clarify the TODO comment.

Jemalloc doesn't provide a public API to destroy arenas; they are cleaned up at process shutdown. The comment should clarify this rather than leaving an open TODO.


2501-2504: The assert(false) provides no protection in release builds.

In release builds, the assert(false) becomes a no-op, causing undefined behavior when returning an uninitialized-looking 0. Since the calling code at line 225 is already guarded by #if defined(WITH_JEMALLOC), this branch should never be reached. Consider using a compile-time error or __builtin_unreachable() to make this explicit.

tx_service/include/cc/local_cc_shards.h (3)

891-901: Thread arena switching lacks error handling and initialization.

The arena switching code has critical issues that were flagged in previous reviews but remain unaddressed:

  1. prev_arena_id (line 892) is uninitialized; if the read at line 895 fails, it contains garbage
  2. Neither mallctl call checks return values; failures silently leave the thread in the wrong arena
  3. Restoration at lines 968-969 and 1017 will write uninitialized data if the read failed

Initialize prev_arena_id and check all mallctl return values:

 #if defined(WITH_JEMALLOC)
-        uint32_t prev_arena_id;
+        uint32_t prev_arena_id = 0;
+        bool have_prev_arena = false;
         auto table_range_arena_id = GetTableRangesArenaId();
         size_t sz = sizeof(uint32_t);
-        mallctl("thread.arena", &prev_arena_id, &sz, NULL, 0);
-        mallctl("thread.arena",
-                NULL,
-                NULL,
-                &table_range_arena_id,
-                sizeof(uint32_t));
+        if (mallctl("thread.arena", &prev_arena_id, &sz, nullptr, 0) == 0) {
+            have_prev_arena = true;
+        }
+        if (mallctl("thread.arena", nullptr, nullptr, &table_range_arena_id, sizeof(uint32_t)) != 0) {
+            LOG(ERROR) << "Failed to set thread.arena=" << table_range_arena_id;
+        }
 #endif

Then guard restoration with have_prev_arena at lines 968-969 and 1017.


967-970: Arena restoration lacks error handling.

This restoration call doesn't check the mallctl return value. If restoration fails, the thread continues using the wrong arena, violating memory isolation guarantees.

Additionally, prev_arena_id may be uninitialized if the read at line 895 failed.

After applying the fix at lines 891-901, update restoration:

 #if defined(WITH_JEMALLOC)
-            mallctl(
-                "thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+            if (have_prev_arena &&
+                mallctl("thread.arena", nullptr, nullptr, &prev_arena_id, sizeof(uint32_t)) != 0) {
+                LOG(ERROR) << "Failed to restore thread.arena=" << prev_arena_id;
+            }
 #endif

1016-1018: Arena restoration lacks error handling.

Same issue as lines 967-970: missing error check and potentially uninitialized prev_arena_id.

Apply the same fix pattern as suggested for lines 967-970:

 #if defined(WITH_JEMALLOC)
-        mallctl("thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+        if (have_prev_arena &&
+            mallctl("thread.arena", nullptr, nullptr, &prev_arena_id, sizeof(uint32_t)) != 0) {
+            LOG(ERROR) << "Failed to restore thread.arena=" << prev_arena_id;
+        }
 #endif
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9bdad31 and 8e9c3c7.

📒 Files selected for processing (3)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/include/cc/template_cc_map.h (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tx_service/include/cc/template_cc_map.h
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
🧬 Code graph analysis (2)
tx_service/include/cc/local_cc_shards.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (2)
  • arena_id_ (244-245)
  • memory_limit_ (236-236)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
🔇 Additional comments (14)
tx_service/src/cc/cc_shard.cpp (5)

53-55: LGTM!

Conditional jemalloc header inclusion follows the standard pattern for optional feature compilation.


224-226: LGTM!

Arena save/restore pattern correctly mirrors the existing mimalloc heap management on line 223.


2456-2464: LGTM!

Arena creation now includes proper error handling with LOG(FATAL) on failure, which is appropriate since a shard cannot function without its arena.


2525-2525: Verify: jemalloc uses resident memory while mimalloc uses allocated.

The jemalloc path compares total_resident (committed pages) against memory_limit_, while mimalloc compares allocated. This means jemalloc will trigger "full" earlier due to fragmentation. Confirm this behavioral difference is intentional.


2555-2558: LGTM!

Disabling defragmentation for jemalloc is correct since jemalloc handles memory fragmentation differently than mimalloc's heap-based approach.

tx_service/include/cc/local_cc_shards.h (9)

41-41: LGTM!

The <type_traits> inclusion is required for the std::is_same_v used in the static assertion below.


66-71: LGTM!

The static assertion ensures that jemalloc's unsigned arena ID type matches uint32_t on this platform, catching potential portability issues at compile time.


427-435: LGTM!

The arena creation now properly checks for errors and fails fast with LOG(FATAL), preventing silent fallback to the default arena.


442-453: LGTM!

Arena creation for hash partition checkpoint properly handles errors consistently with the table ranges arena creation.


466-469: LGTM!

Simple accessor method for retrieving the table ranges arena ID.


478-497: LGTM!

The jemalloc arena stats integration is properly guarded with WITH_JEMALLOC and falls back to mimalloc stats otherwise.


507-528: LGTM!

Consistent implementation of jemalloc arena stats for memory checks.


533-565: LGTM!

Memory reporting correctly uses jemalloc arena stats when available.


2021-2022: LGTM!

Arena ID members are properly declared as uint32_t with sensible default initialization to 0 (the default jemalloc arena).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
tx_service/src/cc/local_cc_shards.cpp (1)

1016-1126: Arena save/switch/restore blocks look correct; consider RAII to de‑duplicate pattern

Across InitTableRanges, KickoutRangeSlices, DropStoreRangesInBucket, and DataSyncForHashPartition the new WITH_JEMALLOC blocks correctly:

  • read and store the current arena,
  • switch to the relevant table‑range/hash‑ckpt arena,
  • perform heap‑bound work under the existing mimalloc mi_override_thread/mi_heap_set_default overrides, and
  • restore mimalloc state and then the previous arena, with no early returns between switch and restore.

From a correctness and concurrency perspective this matches the existing mimalloc usage and appears safe.

The same manual save/restore sequence is now repeated in multiple functions (and in other files in this PR). A small RAII helper (e.g., an ArenaGuard that switches in its ctor and restores in its dtor) would reduce duplication and make future changes less error‑prone, especially if more call sites are added.

Also applies to: 1261-1392, 2079-2139, 4969-5040

tx_service/include/cc/catalog_cc_map.h (1)

442-451: Prefer an RAII guard for jemalloc arena switching around InitRangeSlices.

The new JemallocArenaSwitcher::ReadCurrentArena / SwitchToArena calls correctly save and restore prev_arena, but the pattern is still manual and split across two blocks. Any future early return or exception between the two blocks would leave the thread in the wrong arena. A small RAII guard (e.g. JemallocArenaGuard that reads the current arena in its ctor and restores it in the dtor) would make this safer and reduce duplication with other arena-switch sites.

Also applies to: 472-476

tx_service/include/cc/local_cc_shards.h (1)

962-965: Inconsistent error handling: use JemallocArenaSwitcher::SwitchToArena for arena restoration.

Lines 962-965 use raw mallctl without error checking, while line 1012 correctly uses JemallocArenaSwitcher::SwitchToArena. This inconsistency could silently fail to restore the arena, causing subsequent allocations to use the wrong arena.

Apply this diff to use the helper with error handling:

 #if defined(WITH_JEMALLOC)
-            mallctl(
-                "thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+            JemallocArenaSwitcher::SwitchToArena(prev_arena_id);
 #endif
🧹 Nitpick comments (4)
tx_service/include/tx_service.h (2)

289-301: Consider an RAII helper for heap+arena switching in RunOneRound.

The new jemalloc integration mirrors the existing mimalloc pattern: SetAsDefaultHeap() / SetAsDefaultArena() at the top of the EXT path and mi_heap_set_default(prev_heap) / JemallocArenaSwitcher::SwitchToArena(prev_arena) at the bottom, with extra bookkeeping for ext_tx_proc_heap_ / ext_tx_proc_arena_id_ when is_ext_proc is true. This is logically correct but easy to get wrong if more early returns or branches are added later.

A small RAII scope guard that captures and restores both heap and arena (and possibly the overridden thread id) would make this path more robust and reduce duplication between RunOneRound and ForwardTx.

Also applies to: 425-435


697-711: ForwardTx heap+arena switching is correct but could also benefit from RAII.

ForwardTx now mirrors RunOneRound: it switches both the mimalloc default heap and the jemalloc arena via SetAsDefaultHeap() / SetAsDefaultArena(), then restores them with mi_heap_set_default(coordi_->ext_tx_proc_heap_) and JemallocArenaSwitcher::SwitchToArena(coordi_->ext_tx_proc_arena_id_), finally clearing the stored state. The sequence is balanced and guarded by WITH_JEMALLOC.

To reduce the chance of future imbalances (e.g., new early returns) and avoid repeating this pattern across sites, consider introducing a shared RAII guard (e.g., in a small helper class) that encapsulates save/switch/restore for both heap and arena.

tx_service/src/cc/cc_shard.cpp (2)

2475-2476: Resolve or clarify the TODO comment.

Jemalloc supports arena.<i>.destroy to "Destroy the arena" and "discard all of the arena's extant allocations", with "Future calls to arenas.create may recycle the arena index." However, "Destruction will fail if any threads are currently associated with the arena."

If arena destruction is intentionally skipped because arenas are cleaned up at process shutdown, update the comment to explain this:

-    // TODO: destroy arena ?
+    // Note: jemalloc arenas are intentionally not destroyed here.
+    // arena.<i>.destroy requires no threads to be associated with the arena,
+    // which is difficult to guarantee at shutdown. Arenas are cleaned up
+    // automatically when the process exits.

2484-2495: Improve error handling in non-jemalloc fallback path.

The assert(false) on line 2492 provides no diagnostic information if triggered. While callers should guard calls with #if defined(WITH_JEMALLOC), a more informative error would help debugging if this is ever reached unexpectedly.

 #else
-    assert(false);
+    LOG(FATAL) << "SetAsDefaultArena() called but WITH_JEMALLOC is not defined";
     return 0;
 #endif
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e9c3c7 and 29cba32.

📒 Files selected for processing (11)
  • store_handler/data_store_service_client_closure.cpp (4 hunks)
  • tx_service/include/cc/catalog_cc_map.h (2 hunks)
  • tx_service/include/cc/cc_shard.h (6 hunks)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/include/cc/range_cc_map.h (2 hunks)
  • tx_service/include/cc/template_cc_map.h (2 hunks)
  • tx_service/include/tx_service.h (4 hunks)
  • tx_service/src/cc/cc_req_misc.cpp (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
  • tx_service/src/cc/local_cc_shards.cpp (12 hunks)
  • tx_service/src/cc/range_slice.cpp (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • tx_service/src/cc/cc_req_misc.cpp
  • tx_service/include/cc/range_cc_map.h
  • tx_service/include/cc/template_cc_map.h
  • store_handler/data_store_service_client_closure.cpp
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/range_slice.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/range_slice.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/tx_service.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/catalog_cc_map.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/catalog_cc_map.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/tx_service.h
  • tx_service/include/cc/local_cc_shards.h
🧬 Code graph analysis (7)
tx_service/src/cc/range_slice.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/src/cc/local_cc_shards.cpp (2)
tx_service/include/cc/local_cc_shards.h (1)
  • hash_partition_ckpt_arena_id_ (2017-2017)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/cc/cc_shard.h (1)
tx_service/src/cc/cc_shard.cpp (2)
  • SetAsDefaultArena (2484-2495)
  • SetAsDefaultArena (2484-2484)
tx_service/include/tx_service.h (1)
tx_service/include/cc/cc_shard.h (1)
  • SwitchToArena (199-208)
tx_service/include/cc/local_cc_shards.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (4)
  • SwitchToArena (199-208)
  • arena_id_ (278-279)
  • ReadCurrentArena (187-197)
  • memory_limit_ (270-270)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/cc/catalog_cc_map.h (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
🔇 Additional comments (17)
tx_service/src/cc/range_slice.cpp (1)

525-555: Jemalloc arena switch around UpdateSlice is correctly scoped and symmetric

The new WITH_JEMALLOC block cleanly saves the current arena, switches to the table‑ranges arena for UpdateSlice, and restores both mimalloc state and the previous jemalloc arena afterward, with no early returns in between. No functional issues spotted.

tx_service/src/cc/local_cc_shards.cpp (1)

6548-6577: Jemalloc-based hash-partition ckpt heap reporting is consistent and side-effect free

The new WITH_JEMALLOC branch in ReportHashPartitionCkptHeapUsage pulls resident and allocated via GetJemallocArenaStat(hash_partition_ckpt_arena_id_, ...) and logs the same frag ratio as the mimalloc path, without touching the custom mimalloc heap or mutex. This is an appropriate, low‑risk way to surface jemalloc arena stats and doesn’t affect existing behavior when jemalloc is disabled.

tx_service/include/cc/local_cc_shards.h (9)

66-71: Good defensive programming for platform portability.

The static_assert ensures that uint32_t is the same as unsigned when jemalloc is enabled, which is essential since jemalloc's arena IDs are of type unsigned. This catches potential issues on exotic platforms.


427-435: LGTM!

Error handling for arena creation is properly implemented with LOG(FATAL) to abort on failure, which is appropriate since the arena is essential for memory isolation.


442-453: LGTM!

Consistent error handling pattern with InitializeTableRangesHeap().


466-469: LGTM!

Simple accessor for the arena ID.


478-497: LGTM!

The jemalloc path correctly uses arena-specific statistics via GetJemallocArenaStat() and compares allocated memory to the limit, consistent with the mimalloc fallback path.


507-528: LGTM!

Consistent implementation pattern with TableRangesMemoryFull().


533-565: LGTM!

The jemalloc path correctly simplifies the reporting logic since arena statistics don't require the mimalloc heap/thread context management.


891-896: LGTM on arena switching setup.

Using JemallocArenaSwitcher::ReadCurrentArena and SwitchToArena provides proper error handling with LOG(FATAL) on failure.


2016-2018: LGTM!

Arena ID members are correctly initialized to 0, which is jemalloc's default arena.

tx_service/src/cc/cc_shard.cpp (6)

53-55: LGTM!

Conditional jemalloc header inclusion is correctly guarded.


224-226: LGTM!

Arena switching in Init() correctly saves the previous arena and uses the shard's dedicated arena for initialization.


253-256: LGTM!

Arena restoration correctly uses JemallocArenaSwitcher::SwitchToArena which includes proper error handling.


2456-2464: LGTM!

Arena creation with proper error handling via LOG(FATAL).


2499-2533: Verify the memory comparison semantic difference between jemalloc and mimalloc paths.

The jemalloc path compares total_resident (RSS-like metric) against memory_limit_, while the mimalloc path compares allocated (active allocations). This is a semantic difference:

  • mimalloc: allocated >= memory_limit_ OR (defragment enabled AND committed > memory_limit_)
  • jemalloc: total_resident >= memory_limit_

This may be intentional since jemalloc's resident is closer to mimalloc's committed, but it means the jemalloc path may trigger memory pressure earlier/later depending on fragmentation. Please verify this is the intended behavior.


2545-2565: LGTM - defragmentation disabled for jemalloc.

The comment explains that defragmentation is not supported with jemalloc. This is reasonable since jemalloc handles memory management differently (via decay-based purging).

Comment thread tx_service/include/cc/cc_shard.h Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (8)
tx_service/src/cc/range_slice.cpp (1)

533-555: Jemalloc arena switching pattern is consistent with broader PR changes.

The save/switch/restore pattern correctly brackets the UpdateSlice call to ensure allocations occur in the designated table-range arena. The arena restoration happens before unlocking, which is correct.

However, the return values from ReadCurrentArena and SwitchToArena are not checked. While these helpers log FATAL on failure (which typically aborts), ignoring the return value leaves a small window where execution could continue with incorrect arena settings.

store_handler/data_store_service_client_closure.cpp (2)

757-763: Arena switching correctly brackets range key allocation.

The jemalloc arena management properly wraps the CreateTxKey calls and range vector population, ensuring these allocations use the designated table-range arena. The restoration before heap_lk.unlock() maintains correct ordering.

Also applies to: 823-827


1045-1052: LGTM: Arena management for range slice processing.

The arena switching correctly wraps the CreateTxKey allocations within the slice parsing loop. The pattern is consistent with FetchTableRangesCallback and other jemalloc-integrated paths in this PR.

Also applies to: 1088-1091

tx_service/src/cc/cc_req_misc.cpp (1)

411-438: Arena switching correctly scoped around range slice initialization.

The jemalloc arena management properly brackets InitRangeSlices, ensuring that range slice allocations use the designated table-range arena. The ordering with respect to the heap lock and thread override handling is correct.

tx_service/src/cc/local_cc_shards.cpp (1)

1023-1029: Consider a small RAII guard for repeated save/switch/restore arena pattern

The { ReadCurrentArena(prev); SwitchToArena(target); … SwitchToArena(prev); } pattern now appears in multiple places (InitTableRanges, KickoutRangeSlices, DropStoreRangesInBucket, DataSyncForHashPartition, and other files in this PR). A simple RAII helper (e.g., JemallocArenaGuard) would centralize the logic and make future additions less error‑prone.

Also applies to: 1298-1305, 1360-1366, 2105-2112, 4974-4979

tx_service/include/cc/cc_shard.h (1)

24-30: Guard jemalloc header include with WITH_JEMALLOC

This header is compiled even when jemalloc support is disabled, but #include <jemalloc/jemalloc.h> is unconditional. On systems without jemalloc installed or when WITH_JEMALLOC=OFF (per CMake option), this will break the build despite all actual usage being conditionally compiled.

Recommend wrapping the include:

-#include <jemalloc/jemalloc.h>
+#if defined(WITH_JEMALLOC)
+#include <jemalloc/jemalloc.h>
+#endif

Please confirm your intended build matrix (jemalloc optional vs required) and that the CMake configuration does not always inject jemalloc’s headers even when WITH_JEMALLOC is off.

tx_service/src/cc/cc_shard.cpp (2)

2476-2477: Clarify or remove // TODO: destroy arena ? in destructor.

Jemalloc arenas typically live for the process lifetime; explicit destruction is non‑trivial and often avoided. Leaving this as a bare TODO is misleading—either document why arenas are intentionally not destroyed, or implement safe destruction if you really need it.


2484-2495: Avoid assert(false) for non‑jemalloc SetAsDefaultArena() path.

Right now the #else branch asserts unconditionally:

#else
    assert(false);
    return 0;
#endif

That will hard‑crash in any non‑jemalloc build if this function is ever called (e.g., by future refactors that miss the #if defined(WITH_JEMALLOC) guard). Safer options:

  • Make the function a no‑op in non‑jemalloc builds, or
  • Compile it only when WITH_JEMALLOC is defined (and keep calls under the same guard).

For example, minimal change:

 uint32_t CcShardHeap::SetAsDefaultArena()
 {
 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     JemallocArenaSwitcher::ReadCurrentArena(prev_arena);
     JemallocArenaSwitcher::SwitchToArena(arena_id_);
     return prev_arena;
-#else
-    assert(false);
-    return 0;
-#endif
+#else
+    // Jemalloc is not enabled; arena switching is a no‑op.
+    return 0;
+#endif
 }
🧹 Nitpick comments (5)
tx_service/include/cc/cc_request.h (2)

6289-6375: Consider RAII for heap/arena switching around data-sync heap releases

The new SetAsDefaultHeap + SetAsDefaultArena and corresponding mi_heap_set_default + mallctl("thread.arena", ...) restore logic looks correct and is executed on all paths, but it’s easy to regress if future early returns are added in this block. A small RAII helper (constructor does the switch, destructor restores heap+arena) would make this safer and reduce duplication between the data_sync_vec_ and archive_vec_ branches.


7428-7453: Jemalloc stats path is consistent; tiny logging/text nit

The split between mimalloc (mi_thread_stats) and jemalloc (GetJemallocArenaStat on the shard heap’s arena) keeps HeapMemStats::allocated_/committed_ semantics aligned, and switching to DLOG(INFO) avoids noisy logs. Minor nit: the log message still says "commited" instead of "committed" if you care about polishing debug output.

tx_service/include/cc/local_cc_shards.h (2)

418-455: Arena creation in heap initializers is fine; consider guarding against repeated creation

InitializeTableRangesHeap and InitializeHashPartitionCkptHeap now create dedicated jemalloc arenas with fatal logging on failure, and the new table_ranges_arena_id_ / hash_partition_ckpt_arena_id_ fields expose those ids cleanly.

If these initializers can be called more than once, you may want to move the mallctl("arenas.create", ...) calls inside the existing if (!table_ranges_heap_) / a similar guard for the hash-part heap to avoid leaking unused arenas and accidentally changing the arena id after first initialization. If they are strictly one-shot, the current code is OK but that assumption might be worth documenting.

It’d be good to confirm (from call sites or docs) that these init functions are only invoked once per process lifetime; if not, tightening the guard around arenas.create would remove any ambiguity.

Also applies to: 2012-2018


886-1013: Prefer JemallocArenaSwitcher over raw mallctl in CreateTableRange new-range branch

In CreateTableRange you now:

  • Read and switch the thread arena via JemallocArenaSwitcher::ReadCurrentArena / SwitchToArena(table_range_arena_id).
  • In the “existing range, newer version” path, restore via JemallocArenaSwitcher::SwitchToArena(prev_arena_id) at the end of the function.
  • In the “new range” path, restore with a raw mallctl("thread.arena", ...) just before the early return.

That single raw mallctl is now the only direct arena-restore call in this function and bypasses the centralized error handling in JemallocArenaSwitcher (LOG(FATAL) on failure).

You can make this consistent and safer by reusing the helper:

-#if defined(WITH_JEMALLOC)
-            mallctl(
-                "thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
-#endif
+#if defined(WITH_JEMALLOC)
+            JemallocArenaSwitcher::SwitchToArena(prev_arena_id);
+#endif

This keeps both branches on the same error-handling path and avoids silently continuing if arena restoration were ever to fail.

Please confirm that JemallocArenaSwitcher::SwitchToArena already covers your desired logging/abort behavior so that unifying on it here doesn’t change the intended failure semantics.

tx_service/src/cc/cc_shard.cpp (1)

2545-2548: Jemalloc NeedDefragment() stub may leave outputs uninitialized; align semantics.

With jemalloc you unconditionally return false and never touch *alloc/*commit. If any caller expects those out‑params to be initialized even when defrag is not needed, they will now see indeterminate values.

To keep behavior predictable, either:

  • Ensure callers ignore alloc/commit when NeedDefragment() is false, or
  • Initialize them in the jemalloc path too, e.g.:
 bool CcShardHeap::NeedDefragment(int64_t *alloc, int64_t *commit) const
 {
 #if defined(WITH_JEMALLOC)
-    // defragmentation is not supported with jemalloc
-    return false;
+    // Defragmentation is not supported with jemalloc; always disabled.
+    if (alloc != nullptr)
+    {
+        *alloc = 0;
+    }
+    if (commit != nullptr)
+    {
+        *commit = 0;
+    }
+    return false;
 #else
     int64_t allocated, committed;
     mi_thread_stats(&allocated, &committed);
     ...

Please double‑check call sites to confirm which behavior is relied on.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29cba32 and 834641d.

📒 Files selected for processing (16)
  • CMakeLists.txt (3 hunks)
  • build_tx_service.cmake (4 hunks)
  • store_handler/data_store_service_client_closure.cpp (4 hunks)
  • tx_service/include/cc/catalog_cc_map.h (2 hunks)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (6 hunks)
  • tx_service/include/cc/local_cc_shards.h (12 hunks)
  • tx_service/include/cc/range_cc_map.h (2 hunks)
  • tx_service/include/cc/template_cc_map.h (2 hunks)
  • tx_service/include/tx_service.h (4 hunks)
  • tx_service/include/tx_service_common.h (1 hunks)
  • tx_service/include/util.h (2 hunks)
  • tx_service/src/cc/cc_req_misc.cpp (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
  • tx_service/src/cc/local_cc_shards.cpp (12 hunks)
  • tx_service/src/cc/range_slice.cpp (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • build_tx_service.cmake
🚧 Files skipped from review as they are similar to previous changes (5)
  • CMakeLists.txt
  • tx_service/include/util.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/cc/catalog_cc_map.h
  • tx_service/include/tx_service_common.h
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/range_slice.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/cc_request.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/range_slice.cpp
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/tx_service.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/range_cc_map.h
  • tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/cc_request.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/tx_service.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_request.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • tx_service/include/cc/cc_request.h
🧬 Code graph analysis (8)
store_handler/data_store_service_client_closure.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/src/cc/range_slice.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/include/cc/local_cc_shards.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/tx_service.h (1)
tx_service/include/cc/cc_shard.h (1)
  • SwitchToArena (199-208)
tx_service/src/cc/local_cc_shards.cpp (3)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/include/cc/local_cc_shards.h (1)
  • hash_partition_ckpt_arena_id_ (2017-2017)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/src/cc/cc_req_misc.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (4)
  • SwitchToArena (199-208)
  • arena_id_ (278-279)
  • ReadCurrentArena (187-197)
  • memory_limit_ (270-270)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
tx_service/include/cc/cc_request.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
🔇 Additional comments (17)
tx_service/include/cc/cc_request.h (1)

84-86: Guarded jemalloc include looks correct

Including <jemalloc/jemalloc.h> only under #if defined(WITH_JEMALLOC) cleanly isolates the new allocator‑specific code and avoids build breakage when jemalloc is absent.

tx_service/src/cc/local_cc_shards.cpp (7)

1023-1029: Jemalloc arena switch around InitTableRanges is balanced and scoped correctly

Saving prev_arena, switching to GetTableRangesArenaId(), and restoring prev_arena after all work (and after the mimalloc heap/thread restore) ensures table‑range allocations are isolated without leaving the thread in a wrong arena. No early returns bypass the restore, so this looks correct.

Also applies to: 1124-1126


1298-1305: KickoutRangeSlices: jemalloc arena override is correctly paired even on early return

The jemalloc block around DropStoreRange() saves the current arena, switches to the table‑range arena, then restores prev_arena before checking has_enough_mem and possibly returning. This keeps arena state consistent even when the function returns early after freeing enough memory.

Also applies to: 1320-1322


1360-1366: Second KickoutRangeSlices path also restores jemalloc arena on all exits

The second eviction path (sorted scanned_ranges) mirrors the first: arena is switched only inside the critical section and restored before any potential return. The symmetry with the earlier block reduces risk of state leakage across iterations.

Also applies to: 1381-1383


2105-2112: DropStoreRangesInBucket jemalloc usage matches existing mimalloc pattern

The jemalloc arena is switched only when IsStoreRangeFree(true) succeeds and is always restored immediately after DropStoreRange() and the mimalloc heap/thread reset. The return false path does not touch the arena, so there’s no missing restore here.

Also applies to: 2126-2128


3027-3028: Pin/unpin node group terms caching is correct and aligns with prior guidance

Caching TryPinNodeGroupData results per node_group_id_ and unpinning them once at the end guarded by if (term >= 0) avoids repeated pin/unpin churn and ensures you never unpin a failed pin attempt. This is exactly the pattern we wanted around TryPinNodeGroupData usage. Based on learnings, this looks good.

Also applies to: 3247-3254


4974-4979: Hash‑partition ckpt: jemalloc arena switch is tightly scoped to allocation window

In DataSyncForHashPartition, the jemalloc arena is switched to hash_partition_ckpt_arena_id_ only while building data_sync_vec / archive_vec / mv_base_vec under the ckpt heap mutex, then restored before releasing the mimalloc heap/thread overrides and the lock. There are no early returns inside this window, so the arena state should always be restored correctly.

Also applies to: 5036-5039


6548-6560: Hash‑partition ckpt heap usage reporting correctly splits jemalloc vs mimalloc paths

Using GetJemallocArenaStat(hash_partition_ckpt_arena_id_, …) under WITH_JEMALLOC and keeping the previous mimalloc‑based mi_thread_stats path in the #else cleanly separates the two allocators. Both branches restore their respective heap/thread or arena state before returning, so reporters won’t perturb allocator configuration.

Also applies to: 6576-6577

tx_service/include/cc/range_cc_map.h (1)

1039-1072: Jemalloc arena switch around InitRangeSlices looks balanced and safe

The read–switch–restore sequence is scoped tightly around InitRangeSlices, mirrors the existing mimalloc heap/thread override, and has no early returns between switch and restore. This should correctly confine range-slice allocations to the table-ranges arena without leaking the arena change beyond this block.

tx_service/include/tx_service.h (2)

280-304: External Tx processing now restores both heap and jemalloc arena per round

The added SetAsDefaultArena/SwitchToArena pairing in RunOneRound correctly mirrors the existing mimalloc SetAsDefaultHeap usage and ensures the thread’s arena is restored even when not in external processing mode.


685-712: ForwardTx jemalloc arena handling is symmetric with heap switching

ForwardTx now saves the previous arena via SetAsDefaultArena and restores it with JemallocArenaSwitcher::SwitchToArena(ext_tx_proc_arena_id_), alongside the existing heap/thread-id restoration. This keeps external forwarding confined to the shard’s arena context and resets coordinator state to 0 afterward.

tx_service/include/cc/cc_shard.h (1)

179-210: JemallocArenaSwitcher and shard-heap arena tracking look consistent

JemallocArenaSwitcher centralizes mallctl("thread.arena", ...) with fatal logging on failure, and CcShardHeap’s SetAsDefaultArena/ArenaId plus arena_id_{0} provide clean per-shard arena tracking. This design aligns with callers in TxProcessor and LocalCcShards that save/restore arenas around heap-sensitive work.

If you haven’t already, it’s worth double-checking that all SetAsDefaultArena() call sites are compiled only when jemalloc is present, and that the JemallocArenaSwitcher API matches the jemalloc version you target.

Also applies to: 225-279

tx_service/include/cc/local_cc_shards.h (2)

41-71: Type-trait guard for jemalloc arena ids is reasonable

Including <type_traits> and asserting unsigned is uint32_t under WITH_JEMALLOC makes the uint32_t-based arena ids safe against platforms where unsigned might differ. This is a low-cost, clear guard around jemalloc’s unsigned arena API.

Please verify against your target jemalloc headers that arena-related mallctl interfaces indeed use unsigned for arena ids across supported platforms.


466-541: Jemalloc-based table-range memory accounting matches the existing limit semantics

The new jemalloc paths in GetTableRangesArenaId(), TableRangesMemoryFull(), HasEnoughTableRangesMemory(), and TableRangeHeapUsageReport() correctly:

  • Query stats for the dedicated table-ranges arena via GetJemallocArenaStat.
  • Compare allocated against range_slice_memory_limit_ and derive a 90% “enough memory” threshold.
  • Log a consistent usage report under WITH_JEMALLOC, while retaining the prior mimalloc-based behavior when jemalloc is disabled.

This keeps the memory limit semantics intact while switching to per-arena stats.

Please double-check that GetJemallocArenaStat is available via the includes in this header on all build targets and that its units (bytes vs pages) line up with range_slice_memory_limit_.

tx_service/src/cc/cc_shard.cpp (3)

53-55: Jemalloc include is correctly feature‑gated.

Including <jemalloc/jemalloc.h> only under WITH_JEMALLOC matches the new mallctl usage in this file; no issues here.


224-226: Arena save/restore in Init() looks symmetric.

Saving the current arena via SetAsDefaultArena() and restoring it with JemallocArenaSwitcher::SwitchToArena(prev_arena) mirrors the heap default save/restore pattern and keeps the call site jemalloc‑only via #if defined(WITH_JEMALLOC). This is a reasonable integration point.

Also applies to: 254-256


2457-2464: Per‑heap jemalloc arena creation and error handling look good.

Creating a dedicated arena per CcShardHeap via mallctl("arenas.create", &arena_id_, &sz, NULL, 0) and treating failures as LOG(FATAL) aligns with the JemallocArenaSwitcher helpers and previous feedback about checking mallctl return codes.

Comment thread tx_service/src/cc/cc_shard.cpp

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
tx_service/src/cc/cc_shard.cpp (3)

2478-2479: Clarify the TODO comment.

The TODO should be resolved with a clear explanation. Jemalloc arenas cannot be explicitly destroyed; they are cleaned up at process shutdown.


2494-2497: The assert(false) in non-jemalloc path is fragile.

While callers are currently guarded by #if defined(WITH_JEMALLOC), the assert(false) will crash in debug builds if accidentally called. Consider either:

  1. Making this a compile-time error by not defining the method at all when jemalloc is disabled, or
  2. Documenting that this is intentionally unreachable

2522-2522: Remove unnecessary int64_t cast.

Both total_resident and memory_limit_ are size_t, so the cast is redundant and introduces a signed/unsigned comparison.

-    return total_resident >= static_cast<int64_t>(memory_limit_);
+    return total_resident >= memory_limit_;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 834641d and c9fa127.

📒 Files selected for processing (2)
  • tx_service/include/cc/template_cc_map.h (3 hunks)
  • tx_service/src/cc/cc_shard.cpp (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tx_service/include/cc/template_cc_map.h
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
🧬 Code graph analysis (1)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (4)
  • SwitchToArena (199-208)
  • arena_id_ (278-279)
  • ReadCurrentArena (187-197)
  • memory_limit_ (270-270)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-471)
🔇 Additional comments (3)
tx_service/src/cc/cc_shard.cpp (3)

224-226: LGTM!

The arena save/restore pattern correctly mirrors the mimalloc heap pattern on lines 223 and 252.

Also applies to: 253-256


2456-2467: LGTM!

Arena creation with proper error handling. The fatal log will terminate the process on failure, which is appropriate for initialization.


2552-2556: LGTM!

Disabling defragmentation for jemalloc is appropriate since jemalloc handles fragmentation internally with its own mechanisms.

Comment thread tx_service/src/cc/cc_shard.cpp Outdated
@lokax lokax force-pushed the jemalloc branch 2 times, most recently from 834641d to 8f740bd Compare December 17, 2025 07:00

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (5)
tx_service/include/cc/catalog_cc_map.h (1)

442-476: RAII pattern for arena switching remains unimplemented.

The jemalloc arena switching still lacks exception safety. While the helper functions now include error handling via LOG(FATAL), if InitRangeSlices (lines 453-460) throws an exception, the arena restoration at lines 473-474 won't execute, leaving the thread in the wrong arena for subsequent allocations. The return values from ReadCurrentArena and SwitchToArena are also ignored.

Based on learnings, the RAII guard pattern suggested in the previous review would ensure automatic arena restoration even when exceptions occur.

tx_service/src/cc/cc_shard.cpp (3)

2492-2492: Clarify arena destruction behavior.

The TODO comment should be resolved. Based on learnings, jemalloc does not provide a public API to destroy arenas, and they are cleaned up automatically at process shutdown.

Apply this diff:

-    // TODO: destroy arena ?
+    // jemalloc does not provide a public API to destroy arenas.
+    // Arenas are cleaned up automatically at process shutdown.

2546-2546: Remove unnecessary cast.

The comparison at line 2546 casts memory_limit_ to int64_t, but both total_resident and memory_limit_ are size_t, so the cast is unnecessary and creates a signed/unsigned mismatch.

Apply this diff:

-    return total_resident >= static_cast<int64_t>(memory_limit_);
+    return total_resident >= memory_limit_;

2500-2511: Remove assert(false) in the non-jemalloc case.

If SetAsDefaultArena() is called when WITH_JEMALLOC is not defined, the code will crash. The function should either be removed in non-jemalloc builds or return a safe default without asserting.

Apply this diff:

 uint32_t CcShardHeap::SetAsDefaultArena()
 {
 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     JemallocArenaSwitcher::ReadCurrentArena(prev_arena);
     JemallocArenaSwitcher::SwitchToArena(arena_id_);
     return prev_arena;
 #else
-    assert(false);
+    // jemalloc is not enabled; SetAsDefaultArena() is a no-op.
     return 0;
 #endif
 }
tx_service/src/cc/local_cc_shards.cpp (1)

1030-1041: Consider a small RAII helper to de‑duplicate jemalloc/mimalloc arena switching

The new jemalloc integration repeats essentially the same pattern many times:

  • Capture current jemalloc arena (ReadCurrentArena), switch to a specific arena, do work, then switch back.
  • In the same scopes we already juggle mimalloc’s mi_override_thread and mi_heap_set_default.

This works, but it’s easy to get wrong in future edits (e.g., missed restore on a new early‑return path) and is now scattered across:

  • InitTableRanges
  • KickoutRangeSlices (both clean‑up passes)
  • DropStoreRangesInBucket
  • DataSyncForHashPartition heap section
  • ReportHashPartitionCkptHeapUsage (WITH_JEMALLOC branch)

A tiny RAII guard like:

class ArenaGuard {
public:
    explicit ArenaGuard(uint32_t target) {
        JemallocArenaSwitcher::ReadCurrentArena(prev_);
        JemallocArenaSwitcher::SwitchToArena(target);
    }
    ~ArenaGuard() {
        JemallocArenaSwitcher::SwitchToArena(prev_);
    }
private:
    uint32_t prev_;
};

would let these blocks become ArenaGuard guard(GetTableRangesArenaId()); or ArenaGuard guard(hash_partition_ckpt_arena_id_);, removing duplicated boilerplate and making it harder to forget the restore. Similar RAII for the mimalloc heap override could further simplify the nesting.

Also applies to: 1126-1138, 1310-1317, 1332-1334, 1373-1378, 1393-1395, 2117-2124, 2138-2140, 5010-5015, 5072-5075, 6584-6595

🧹 Nitpick comments (11)
store_handler/data_store_service_client_closure.cpp (1)

1088-1090: Arena restoration is correct, but consider RAII for future improvements.

The arena restoration is properly placed before lock release. However, the manual save/restore pattern could be improved with an RAII wrapper to ensure arena restoration even if exceptions occur.

Consider introducing an RAII wrapper class for arena switching:

class JemallocArenaGuard {
    uint32_t prev_arena_;
public:
    explicit JemallocArenaGuard(uint32_t target_arena) {
        JemallocArenaSwitcher::ReadCurrentArena(prev_arena_);
        JemallocArenaSwitcher::SwitchToArena(target_arena);
    }
    ~JemallocArenaGuard() {
        JemallocArenaSwitcher::SwitchToArena(prev_arena_);
    }
    JemallocArenaGuard(const JemallocArenaGuard&) = delete;
    JemallocArenaGuard& operator=(const JemallocArenaGuard&) = delete;
};

This would simplify usage and improve exception safety:

#if defined(WITH_JEMALLOC)
    JemallocArenaGuard arena_guard(shards->GetTableRangesArenaId());
#endif

Note: This pattern would also benefit the mimalloc thread/heap switching code.

tx_service/include/cc/local_cc_shards.h (1)

962-965: Inconsistent arena restoration: use JemallocArenaSwitcher::SwitchToArena for consistency.

Lines 891-896 and 1011-1013 use the JemallocArenaSwitcher helper methods which include error handling, but this restoration call uses raw mallctl without checking the return value. For consistency and to ensure errors are properly logged, use the helper method.

Apply this diff:

 #if defined(WITH_JEMALLOC)
-            mallctl(
-                "thread.arena", NULL, NULL, &prev_arena_id, sizeof(uint32_t));
+            JemallocArenaSwitcher::SwitchToArena(prev_arena_id);
 #endif
tx_service/include/tx_service_common.h (1)

86-86: Consider guarding with WITH_JEMALLOC for clarity.

The ext_tx_proc_arena_id_ field is added under EXT_TX_PROC_ENABLED but will only be meaningful when WITH_JEMALLOC is also defined. While the current approach works (the field is simply unused when jemalloc is disabled), you may want to nest it under WITH_JEMALLOC for clarity and to avoid the unused member warning in non-jemalloc builds.

     mi_heap_t *ext_tx_proc_heap_{nullptr};
+#ifdef WITH_JEMALLOC
     uint32_t ext_tx_proc_arena_id_{0};
+#endif
 #endif

Alternatively, leave as-is if the intent is to always have the field available for future use regardless of jemalloc.

tx_service/include/cc/cc_request.h (2)

6283-6323: Arena save/restore around data-sync heap release is correct; tighten prev_arena typing

The new SetAsDefaultArena/mallctl("thread.arena", …) blocks correctly mirror the mimalloc mi_heap_set_default save/restore and ensure jemalloc arena state is restored after releasing data_sync_vec_ and archive_vec_. One small robustness tweak:

  • Make prev_arena explicitly uint32_t (or whatever ArenaId() returns) and use sizeof(prev_arena) instead of the hard-coded sizeof(uint32_t), in both branches, to avoid silent type/size drift if SetAsDefaultArena()’s return type ever changes.

For example:

-#if defined(WITH_JEMALLOC)
-                auto prev_arena = scan_heap->SetAsDefaultArena();
-#endif
+ #if defined(WITH_JEMALLOC)
+                uint32_t prev_arena = scan_heap->SetAsDefaultArena();
+ #endif
...
-#if defined(WITH_JEMALLOC)
-                mallctl(
-                    "thread.arena", NULL, NULL, &prev_arena, sizeof(uint32_t));
-#endif
+ #if defined(WITH_JEMALLOC)
+                mallctl("thread.arena",
+                        nullptr,
+                        nullptr,
+                        &prev_arena,
+                        sizeof(prev_arena));
+ #endif

(and mirror the same change in the archive_vec_ block).

Also applies to: 6333-6367


7428-7453: Jemalloc stats path in CollectMemStatsCc is consistent; confirm GetJemallocArenaStat freshness

The split between mimalloc (mi_thread_stats) and jemalloc (GetJemallocArenaStat(ArenaId(), committed, allocated)) looks good, and mapping allocated_ / committed_ keeps semantics aligned. Using DLOG instead of LOG(INFO) also avoids noisy per-core logging in production.

Given CollectMemStatsCc now relies on GetJemallocArenaStat (see tx_service/include/util.h), please double-check that helper is refreshing jemalloc stats appropriately (e.g., via mallctl("epoch", …) or some centralized mechanism) and that it succeeds on your jemalloc build (stats-enabled, stats.arenas.* available). If epoch advancement is handled elsewhere or eventual consistency is acceptable, current code is fine; otherwise consider re-enabling the epoch update in GetJemallocArenaStat or documenting that these metrics are approximate.

tx_service/include/tx_service.h (3)

289-295: Remove commented debug logging.

The commented LOG(INFO) statements with debug variable names should be removed before merge.

Apply this diff:

 #if defined(WITH_JEMALLOC)
         auto prev_arena = shard_heap->SetAsDefaultArena();
-        uint32_t debug_current_arena;
-        JemallocArenaSwitcher::ReadCurrentArena(debug_current_arena);
-        // LOG(INFO) << "yf: runoneround, prev arena = " << prev_arena
-        //          << ", current arena = " << debug_current_arena;
 #endif

700-707: Remove commented debug logging.

The commented LOG(INFO) statements with debug variable names should be removed before merge.

Apply this diff:

 #if defined(WITH_JEMALLOC)
         coordi_->ext_tx_proc_arena_id_ = shard_heap->SetAsDefaultArena();
-        uint32_t debug_current_arena;
-        JemallocArenaSwitcher::ReadCurrentArena(debug_current_arena);
-        // LOG(INFO) << "yf: runoneround, prev arena = "
-        //          << coordi_->ext_tx_proc_arena_id_
-        //          << ", current arena = " << debug_current_arena;
 #endif

717-721: Remove trailing blank line.

Line 721 is an unnecessary blank line within the conditional block.

Apply this diff:

 #if defined(WITH_JEMALLOC)
         JemallocArenaSwitcher::SwitchToArena(coordi_->ext_tx_proc_arena_id_);
         coordi_->ext_tx_proc_arena_id_ = 0;
 #endif
-
tx_service/src/cc/cc_shard.cpp (3)

774-783: Remove commented debug logging.

The commented LOG(INFO) statement at lines 778-783 should be removed to keep the code clean.

Apply this diff:

     if (is_new_tx)
     {
         tx_it->second = GetTxLockInfo(tx_term);
-        /*
-        LOG(INFO) << "== UpsertLockHoldingTx: txn = " << txn
-                  << ", is key write lock"
-                  << ", lkinfo addr = " << tx_it->second.get()
-                  << ", table type = " << (int) (table_type);
-        */
     }

821-826: Remove commented debug logging.

The commented LOG(INFO) statement at lines 823-825 should be removed to keep the code clean.

Apply this diff:

     if (lk_info->cce_list_.empty())
     {
-        if (lk_info->wlock_ts_ != 0)
-        {
-            // LOG(INFO) << "== DeleteLockHoldingTx: txn = " << txn
-            //          << ", lkinfo addr = " << lk_info.get();
-        }
-
         RecycleTxLockInfo(std::move(lk_info));

2528-2534: Remove commented debug logging.

The commented LOG(INFO) statement at lines 2528-2534 should be removed before merge.

Apply this diff:

     mallctl("stats.resident", &debug_total_resident, &sz, NULL, 0);
     mallctl("stats.allocated", &debug_total_allocated, &sz, NULL, 0);
 
-    /*
-    LOG(INFO) << "==yf: GetJemallocArenaStat: total_resident " << total_resident
-              << ", allocated = " << total_allocated
-              << ", arena id = " << arena_id_
-              << ", stats.resident = " << debug_total_resident
-              << ", stats.allocated = " << debug_total_allocated;
-    */
-
     if (alloc != nullptr)
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9fa127 and 8f740bd.

📒 Files selected for processing (16)
  • CMakeLists.txt (3 hunks)
  • build_tx_service.cmake (4 hunks)
  • store_handler/data_store_service_client_closure.cpp (4 hunks)
  • tx_service/include/cc/catalog_cc_map.h (2 hunks)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (6 hunks)
  • tx_service/include/cc/local_cc_shards.h (14 hunks)
  • tx_service/include/cc/range_cc_map.h (2 hunks)
  • tx_service/include/cc/template_cc_map.h (3 hunks)
  • tx_service/include/tx_service.h (4 hunks)
  • tx_service/include/tx_service_common.h (1 hunks)
  • tx_service/include/util.h (2 hunks)
  • tx_service/src/cc/cc_req_misc.cpp (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (10 hunks)
  • tx_service/src/cc/local_cc_shards.cpp (15 hunks)
  • tx_service/src/cc/range_slice.cpp (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • tx_service/include/cc/range_cc_map.h
  • tx_service/src/cc/cc_req_misc.cpp
  • tx_service/include/util.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/cc/cc_shard.h
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/catalog_cc_map.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/catalog_cc_map.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/src/cc/range_slice.cpp
  • tx_service/include/tx_service.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_request.h
  • tx_service/include/tx_service.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/src/cc/range_slice.cpp
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • tx_service/include/cc/cc_request.h
🧬 Code graph analysis (6)
tx_service/include/cc/local_cc_shards.h (3)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/src/cc/local_cc_shards.cpp (2)
  • EpochWorker (2922-2938)
  • EpochWorker (2922-2922)
store_handler/data_store_service_client_closure.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/src/cc/local_cc_shards.cpp (2)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/cc/cc_request.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (3)
  • SwitchToArena (199-208)
  • arena_id_ (278-279)
  • ReadCurrentArena (187-197)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/tx_service.h (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-208)
🔇 Additional comments (18)
store_handler/data_store_service_client_closure.cpp (4)

757-763: Arena switching implementation is correct.

The jemalloc arena context switching is properly implemented:

  • Current arena is saved before switching
  • Switching happens after lock acquisition and mimalloc setup
  • The critical section (catalog factory access and range processing) runs in the designated table_range_arena_id

824-826: Arena restoration is correctly placed.

The arena is restored to the previous value before releasing the heap lock, ensuring thread-safe arena context management. The restoration sequence mirrors the entry sequence symmetrically.


1045-1052: Consistent arena switching pattern across callbacks.

This callback applies the same arena switching pattern as FetchTableRangesCallback. Both use GetTableRangesArenaId(), which is appropriate since both callbacks process table range data.


761-761: Arena initialization is properly handled with explicit FATAL error logging.

The code creates jemalloc arenas via mallctl("arenas.create", ...) during heap initialization and immediately crashes with a clear FATAL message if creation fails. This guarantees arenas are initialized before callbacks invoke GetTableRangesArenaId(), making the verification request unnecessary.

tx_service/include/cc/local_cc_shards.h (5)

66-71: LGTM: Static assertion ensures type compatibility.

The static_assert correctly validates that unsigned and uint32_t are the same type on the target platform, which is required for jemalloc's arena ID type. The typo from the previous review has been fixed.


427-435: LGTM: Arena creation with proper error handling.

The mallctl("arenas.create", ...) call now checks the return value and logs a fatal error if arena creation fails. This addresses the previous review concern about silent failures.


478-497: LGTM: Jemalloc arena stats integration.

The jemalloc path correctly uses GetJemallocArenaStat to query arena-specific memory statistics. The logic mirrors the mimalloc path's intent while using jemalloc's arena-based memory tracking.


2016-2018: LGTM: Arena ID members properly initialized.

The new table_ranges_arena_id_ and hash_partition_ckpt_arena_id_ members are default-initialized to 0, which is the default jemalloc arena. They will be set to valid arena IDs during initialization when WITH_JEMALLOC is defined.


2363-2365: LGTM: EpochWorker enables accurate jemalloc stats.

The EpochWorker declaration under WITH_JEMALLOC is essential because jemalloc caches statistics and requires periodic epoch advancement (via mallctl("epoch", ...)) to refresh them. The implementation in local_cc_shards.cpp correctly advances the epoch every second.

tx_service/src/cc/range_slice.cpp (1)

533-555: LGTM: Arena switching for memory isolation during slice updates.

The arena save/switch/restore pattern correctly isolates memory allocations during UpdateSlice to the designated table ranges arena. Using JemallocArenaSwitcher helper methods ensures consistent error handling with LOG(FATAL) on failures, matching the pattern established in cc_shard.h.

tx_service/include/cc/cc_request.h (1)

84-86: Guarded jemalloc include alongside mimalloc

Including <jemalloc/jemalloc.h> under WITH_JEMALLOC is appropriate for the new arena/stat helpers and keeps non-jemalloc builds untouched. Because this header is pulled into a TU that already includes <mimalloc-2.1/mimalloc.h>, please verify both allocators’ headers/macros coexist cleanly in all build configs (no conflicting malloc overrides or ABI assumptions).

tx_service/include/tx_service.h (1)

266-442: LGTM - Arena save/restore logic is correctly implemented.

The jemalloc arena management around external TX processing paths correctly saves the previous arena before switching and restores it afterward. The pattern is consistent across both RunOneRound and ForwardTx.

tx_service/src/cc/cc_shard.cpp (2)

220-257: LGTM - Arena save/restore in Init() is correctly implemented.

The jemalloc arena management in Init() properly saves the previous arena before initialization and restores it afterward, matching the pattern used elsewhere.


2469-2480: LGTM - Arena creation with proper error handling.

The arena creation in the CcShardHeap constructor includes appropriate error handling per the earlier review feedback (commit 8e9c3c7).

CMakeLists.txt (1)

28-34: No action needed.

The code correctly uses JEMALLOC_LINK_LIBRARIES from pkg_check_modules. This is a documented CMake variable that provides full library paths and is the appropriate choice for modern CMake projects using pkg-config.

tx_service/src/cc/local_cc_shards.cpp (1)

126-130: Epoch worker wiring for jemalloc looks correct; just confirm lifecycle and error behavior

  • Construction (epoch_worker_ctx_(1)), startup, and termination paths for EpochWorker are consistent with other worker contexts and gated by WITH_JEMALLOC.
  • The loop cleanly exits when epoch_worker_ctx_.status_ changes, and Terminate() will wake the worker via the shared cv_/mux_.

Two things to double‑check:

  • mallctl("epoch", ...) return value is ignored; if the underlying jemalloc build can ever return non‑zero without aborting, you may want to log at least once or gate future calls after repeated failures.
  • Ensure epoch_worker_ctx_.Terminate() is always called on all shutdown paths where other workers are terminated so the background epoch thread can’t outlive its owning LocalCcShards.

Also applies to: 296-301, 2916-2939

build_tx_service.cmake (2)

136-139: Jemalloc integration looks reasonable; verify availability of jemalloc_cfg and JEMALLOC_INCLUDE_DIRS

  • WITH_JEMALLOC now appends ${JEMALLOC_INCLUDE_DIRS} to INCLUDE_DIR and links txservice against the jemalloc_cfg INTERFACE target. That’s consistent with the pattern described in the top‑level CMake (jemalloc_cfg providing compile defs and link flags).
  • As long as the root CMakeLists.txt defines WITH_JEMALLOC and creates jemalloc_cfg plus JEMALLOC_INCLUDE_DIRS (e.g., via a find_package or custom logic), this wiring should be fine.

Please double‑check:

  • That jemalloc_cfg is always defined before this file is processed when WITH_JEMALLOC=ON; otherwise CMake will error at target_link_libraries.
  • That JEMALLOC_INCLUDE_DIRS is populated (or explicitly empty) in the same condition, so the INCLUDE_DIR list doesn’t accidentally rely on an unset variable in some configurations.

Also applies to: 217-219


355-356: Host manager link visibility change is fine

Switching to target_link_libraries(host_manager PRIVATE ...) is appropriate for an executable and doesn’t change behavior; it just clarifies link interface semantics.

Comment thread CMakeLists.txt
Comment on lines +24 to +27
message(NOTICE "WITH_JEMALLOC: ${WITH_JEMALLOC}")

add_library(jemalloc_cfg INTERFACE)
option(WITH_JEMALLOC "Use jemalloc" OFF)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Move option declaration before message.

Line 24 references ${WITH_JEMALLOC} before the option is declared at line 27. This will print an empty or cached value. Move the option() declaration before the message() for clarity.

Apply this diff:

-message(NOTICE "WITH_JEMALLOC: ${WITH_JEMALLOC}")
-
 add_library(jemalloc_cfg INTERFACE)
 option(WITH_JEMALLOC "Use jemalloc" OFF)
+message(NOTICE "WITH_JEMALLOC: ${WITH_JEMALLOC}")
+
 if(WITH_JEMALLOC)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
message(NOTICE "WITH_JEMALLOC: ${WITH_JEMALLOC}")
add_library(jemalloc_cfg INTERFACE)
option(WITH_JEMALLOC "Use jemalloc" OFF)
add_library(jemalloc_cfg INTERFACE)
option(WITH_JEMALLOC "Use jemalloc" OFF)
message(NOTICE "WITH_JEMALLOC: ${WITH_JEMALLOC}")
🤖 Prompt for AI Agents
CMakeLists.txt around lines 24-27: the CMake option WITH_JEMALLOC is referenced
in the message() before it is declared, so move the option(WITH_JEMALLOC "Use
jemalloc" OFF) line above the message(NOTICE ...) line; ensure option() appears
before any use of ${WITH_JEMALLOC} so the message prints the actual value (or
cached value) rather than an empty/undefined one.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tx_service/src/cc/cc_shard.cpp (2)

485-496: Consider compile-time guard for the entire function.

The assert(false) in the non-jemalloc case is a runtime check, but since all call sites are already guarded by #if defined(WITH_JEMALLOC), this function should never be invoked when jemalloc is disabled. Consider guarding the entire function definition with #if defined(WITH_JEMALLOC) or making it return a safe default value instead of asserting.

Alternative approaches:

Option 1: Guard the function definition

+#if defined(WITH_JEMALLOC)
 uint32_t CcShardHeap::SetAsDefaultArena()
 {
-#if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     JemallocArenaSwitcher::ReadCurrentArena(prev_arena);
     JemallocArenaSwitcher::SwitchToArena(arena_id_);
     return prev_arena;
-#else
-    assert(false);
-    return 0;
-#endif
 }
+#endif

Option 2: Return a sentinel value

 #else
-    assert(false);
+    // jemalloc is not enabled; SetAsDefaultArena() is a no-op.
+    LOG(WARNING) << "SetAsDefaultArena() called without jemalloc support";
     return 0;
 #endif

2500-2517: Remove unnecessary cast in size_t comparison.

The cast to int64_t at line 2516 is unnecessary because both total_resident and memory_limit_ are size_t. This causes a signed/unsigned comparison warning.

Apply this diff:

-    return total_resident >= static_cast<int64_t>(memory_limit_);
+    return total_resident >= memory_limit_;

Note: The else branch (line 2531) correctly keeps the cast since allocated is int64_t in the mimalloc case.

🧹 Nitpick comments (1)
tx_service/src/cc/cc_shard.cpp (1)

477-477: Clarify or resolve the TODO comment.

The TODO comment about destroying the arena should be resolved. Based on jemalloc documentation, there is no public API to destroy arenas—they are automatically cleaned up at process shutdown. Consider updating the comment to reflect this, or remove it if arena cleanup is intentionally omitted.

Apply this diff to clarify the comment:

-    // TODO: destroy arena ?
+    // Note: jemalloc does not provide a public API to destroy arenas.
+    // Arenas are automatically cleaned up at process shutdown.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f740bd and 4598c4d.

📒 Files selected for processing (3)
  • tx_service/include/cc/template_cc_map.h (2 hunks)
  • tx_service/include/tx_service.h (4 hunks)
  • tx_service/src/cc/cc_shard.cpp (9 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tx_service/include/cc/template_cc_map.h
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/tx_service.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/tx_service.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/cc_shard.cpp
🧬 Code graph analysis (2)
tx_service/include/tx_service.h (1)
tx_service/include/cc/cc_shard.h (1)
  • SwitchToArena (199-208)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (4)
  • SwitchToArena (199-208)
  • arena_id_ (278-279)
  • ReadCurrentArena (187-197)
  • memory_limit_ (270-270)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
🔇 Additional comments (11)
tx_service/include/tx_service.h (6)

289-291: LGTM!

The arena capture correctly mirrors the heap switching pattern. The previous arena is saved for later restoration, maintaining proper arena context for external processing.


298-300: LGTM!

Storing the previous arena ID in the coordinator correctly mirrors the heap storage pattern, enabling proper restoration after external processing.


425-427: LGTM!

Arena restoration correctly mirrors the heap restoration at line 424. Based on the code snippets, SwitchToArena includes proper error handling (LOG(FATAL) on failure).


433-435: LGTM!

Clearing the arena ID correctly mirrors the heap cleanup and properly resets the external processor state.


696-698: LGTM!

Arena switching in ForwardTx follows the same pattern as RunOneRound, correctly capturing the arena state before forwarding the transaction.


708-711: LGTM!

The arena restoration and cleanup correctly completes the external processing teardown, maintaining symmetry with the heap operations.

tx_service/src/cc/cc_shard.cpp (5)

53-55: LGTM!

The conditional inclusion of the jemalloc header is properly guarded and follows standard practices for optional dependencies.


224-226: LGTM!

Arena switching in CcShard::Init correctly mirrors the heap switching pattern, ensuring shard initialization occurs in the appropriate arena context.


254-256: LGTM!

Arena restoration correctly completes the arena switching bracket around shard initialization, with proper error handling in the underlying SwitchToArena implementation.


458-465: LGTM!

Arena creation includes proper error handling with a fatal log on failure, which is appropriate since arena creation is critical for shard heap initialization.


2546-2566: LGTM!

Defragmentation is appropriately disabled when jemalloc is enabled, as defragmentation is a mimalloc-specific feature. The comment clearly documents this design decision.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
tx_service/src/cc/cc_shard.cpp (2)

2477-2477: Resolve or clarify the TODO comment about arena destruction.

The TODO comment suggests uncertainty about whether arenas should be destroyed. According to jemalloc documentation, there is no public API to explicitly destroy arenas—they are automatically cleaned up when the process terminates. Update the comment to reflect this:

-    // TODO: destroy arena ?
+    // Note: jemalloc does not provide a public API to destroy arenas.
+    // Arenas are automatically cleaned up at process termination.

2485-2496: Remove assert(false) from non-jemalloc branch or guard the declaration.

The assert(false) on line 2493 will crash if SetAsDefaultArena() is called when jemalloc is disabled. Since the method is declared unconditionally in the header (line 225), this creates a dangerous API surface. Either:

  1. Guard the declaration in the header (recommended, see earlier comment), or
  2. Remove the assert(false) and document the no-op behavior:
 #else
-    assert(false);
+    // No-op when jemalloc is not enabled
     return 0;
 #endif

The first option (guarding the declaration) is preferred as it prevents accidental calls at compile time.

tx_service/include/cc/cc_shard.h (2)

26-26: Guard the jemalloc header include with WITH_JEMALLOC.

The unconditional include breaks builds when WITH_JEMALLOC=OFF. Wrap it in #if defined(WITH_JEMALLOC) ... #endif to match the pattern used in the implementation file (lines 53-55 in cc_shard.cpp) and to protect against builds without jemalloc.

+#if defined(WITH_JEMALLOC)
 #include <jemalloc/jemalloc.h>
+#endif

225-225: Consider guarding jemalloc-specific public API with WITH_JEMALLOC.

SetAsDefaultArena() and ArenaId() expose jemalloc arena management. The implementation of SetAsDefaultArena() asserts false when jemalloc is disabled (line 2493), making it unsafe to call. For API consistency, guard these declarations:

+#if defined(WITH_JEMALLOC)
     uint32_t SetAsDefaultArena();
+#endif

     // Check if this heap is full
     bool Full(int64_t *alloc = nullptr, int64_t *commit = nullptr) const;
     
     // ... other methods ...

+#if defined(WITH_JEMALLOC)
     uint32_t ArenaId() const
     {
         return arena_id_;
     }
+#endif

Alternatively, document that these methods should only be called when WITH_JEMALLOC is defined, but guarding is cleaner and prevents accidental misuse.

Also applies to: 239-242

🧹 Nitpick comments (1)
tx_service/include/cc/local_cc_shards.h (1)

891-896: Consider initializing prev_arena_id for defensive programming.

While the JemallocArenaSwitcher::ReadCurrentArena method uses LOG(FATAL) on failure (preventing uninitialized use in practice), initializing prev_arena_id to 0 would follow defensive programming practices.

🔎 Optional improvement:
 #if defined(WITH_JEMALLOC)
-        uint32_t prev_arena_id;
+        uint32_t prev_arena_id = 0;
         JemallocArenaSwitcher::ReadCurrentArena(prev_arena_id);
         auto table_range_arena_id = GetTableRangesArenaId();
         JemallocArenaSwitcher::SwitchToArena(table_range_arena_id);
 #endif
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4598c4d and ef9174a.

📒 Files selected for processing (4)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (6 hunks)
  • tx_service/include/cc/local_cc_shards.h (14 hunks)
  • tx_service/src/cc/cc_shard.cpp (9 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • tx_service/include/cc/cc_request.h
🧬 Code graph analysis (4)
tx_service/include/cc/cc_request.h (2)
tx_service/include/cc/cc_shard.h (1)
  • SwitchToArena (197-208)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (4)
  • SwitchToArena (197-208)
  • arena_id_ (278-279)
  • ReadCurrentArena (185-195)
  • memory_limit_ (270-270)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/cc/cc_shard.h (1)
tx_service/src/cc/cc_shard.cpp (2)
  • SetAsDefaultArena (2485-2496)
  • SetAsDefaultArena (2485-2485)
tx_service/include/cc/local_cc_shards.h (2)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/local_cc_shards.cpp (2)
  • EpochWorker (2922-2938)
  • EpochWorker (2922-2922)
🔇 Additional comments (14)
tx_service/include/cc/local_cc_shards.h (11)

41-41: LGTM - necessary include for jemalloc type checking.

The <type_traits> header is required for std::is_same_v used in the static assertion at line 68.


66-71: LGTM - good compile-time type safety check.

The static assertion ensures that unsigned equals uint32_t on this platform, which is necessary for jemalloc arena ID compatibility. This prevents subtle bugs on platforms where these types differ.


427-435: LGTM - proper error handling for arena creation.

The arena creation now includes error checking with LOG(FATAL), which appropriately terminates the process if the arena cannot be created during initialization. This was correctly addressed from previous review feedback.


442-453: LGTM - consistent error handling for arena creation.

The hash partition checkpoint arena creation follows the same pattern as the table ranges arena, with appropriate error handling via LOG(FATAL).


466-469: LGTM - straightforward accessor method.

The getter provides public access to the table ranges arena ID, which is needed by other components for arena-aware memory operations.


478-497: LGTM - correct arena-aware memory checking.

The function properly queries jemalloc arena-specific statistics when WITH_JEMALLOC is defined, and falls back to the existing mimalloc-based implementation otherwise. The memory limit comparison logic is correct.


507-528: LGTM - consistent arena-aware memory threshold checking.

The function correctly implements a 90% memory threshold check using arena-specific statistics when jemalloc is enabled, with appropriate fallback to mimalloc.


533-565: LGTM - proper memory reporting for both allocators.

The function correctly reports memory usage using arena-specific stats for jemalloc (simpler path) and maintains the existing thread-override logic for mimalloc. Both implementations provide consistent logging.


962-964: LGTM - proper arena restoration at both exit points.

The arena is correctly restored at both exit paths from CreateTableRange (after creating a new range and after updating an existing range). The error handling via JemallocArenaSwitcher::SwitchToArena is consistent with the save operation.

Also applies to: 1010-1012


2015-2016: LGTM - proper member variable declarations for arena IDs.

The arena ID members are appropriately initialized to 0 (jemalloc's default arena). They are set during initialization functions that use LOG(FATAL) on failure, ensuring they won't remain at 0 in a running system.


2179-2179: LGTM - proper epoch worker scaffolding for jemalloc statistics.

The epoch_worker_ctx_ member and EpochWorker() method declaration follow the established pattern for worker threads in this class. The epoch worker is necessary to periodically refresh jemalloc statistics (via mallctl("epoch", ...)) for accurate memory reporting, as shown in the implementation from relevant code snippets.

Also applies to: 2362-2364

tx_service/include/cc/cc_request.h (3)

6297-6323: Arena save/restore around scan‑heap releases is correct and symmetric

The SetAsDefaultArena() / JemallocArenaSwitcher::SwitchToArena(prev_arena) pairing mirrors the mimalloc heap pattern and correctly ensures FlushRecord deallocations run in the shard's arena. Both blocks properly restore heap and arena before any early return paths.

Consider a small RAII helper for both heap and arena to eliminate duplication across the two blocks (lines 6297–6323 and 6339–6365) and reduce future maintenance risk, though the current logic is sound and correct.


7426-7451: Memory stats collection correctly maintains semantic consistency across allocators

The implementation properly splits mimalloc and jemalloc paths, with each mapping allocated/committed memory to HeapMemStats fields consistently. Jemalloc's resident value aligns with the committed_ field, and allocated sums align with allocated_. Debug logging via DLOG remains scoped to non-production builds.


84-86: Guarded jemalloc include is correct

The #if defined(WITH_JEMALLOC) guard around #include <jemalloc/jemalloc.h> matches the pattern used throughout the codebase (util.h, tx_service.h, template_cc_map.h, and others), ensuring compatibility in non-jemalloc builds.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tx_service/src/cc/local_cc_shards.cpp (1)

1037-1043: Jemalloc arena save/switch/restore pattern is correct but duplicated; consider an RAII guard

The new WITH_JEMALLOC blocks around table‑range operations and hash‑partition ckpt heap usage all follow the same pattern:

  • Read current arena.
  • Switch to a specific arena (GetTableRangesArenaId(), hash_partition_ckpt_arena_id_, etc.).
  • Do the work (DropStoreRange, UpdateSliceSpec, hash‑partition vec allocations).
  • Switch back to the previous arena.

The control flow in each site looks correct (no returns before the restore), but this pattern is now repeated in several hot paths (InitTableRanges, KickoutRangeSlices in both branches, DropStoreRangesInBucket, DataSyncForHashPartition).

To keep this scalable and less error‑prone as more call sites are added, consider introducing a small RAII helper, e.g.:

#if defined(WITH_JEMALLOC)
class ArenaGuard {
public:
    explicit ArenaGuard(uint32_t target)
    {
        JemallocArenaSwitcher::ReadCurrentArena(prev_);
        JemallocArenaSwitcher::SwitchToArena(target);
    }
    ~ArenaGuard()
    {
        JemallocArenaSwitcher::SwitchToArena(prev_);
    }
private:
    uint32_t prev_{0};
};
#endif

and then use ArenaGuard guard(GetTableRangesArenaId()); / ArenaGuard guard(hash_partition_ckpt_arena_id_); at the top of each block instead of duplicating the save/restore logic.

Also applies to: 1138-1140, 1312-1319, 1334-1336, 1375-1380, 1395-1397, 2119-2126, 2140-2142, 5012-5017, 5074-5077

🧹 Nitpick comments (6)
tx_service/include/cc/cc_request.h (2)

6294-6322: Arena switch/save/restore in ReleaseDataSyncScanHeapCc appears sound; consider a small guard helper.

The pattern:

  • SetAsDefaultHeap() (and under jemalloc, SetAsDefaultArena()) before the batched pop_back() loop,
  • then mi_heap_set_default(prev_heap) and JemallocArenaSwitcher::SwitchToArena(prev_arena) before any early Enqueue(this) / return false

correctly restores both the mimalloc default heap and jemalloc thread arena even when the vector is only partially drained. The same pattern is repeated for archive_vec_, so arena state is consistently restored around both release paths.

If this pattern is going to be reused elsewhere, it may be worth factoring into a small RAII guard (e.g., a ShardHeapScope that owns both heap and optional arena switching) to reduce duplication and make it harder to forget a restore on future edits, but it’s not required for correctness.


6337-6365: Second jemalloc arena switch block mirrors the first; duplication is OK but could share a helper.

The jemalloc-specific block around archive_vec_ release mirrors the data_sync_vec_ case:

  • SetAsDefaultHeap() + SetAsDefaultArena() before erase-batch loop.
  • mi_heap_set_default(prev_heap) + JemallocArenaSwitcher::SwitchToArena(prev_arena) afterwards, before possible re-enqueue.

That symmetry is good for avoiding subtle arena-state bugs. Same comment as above: you could optionally centralize this into a small guard or helper to avoid repeating the four calls in multiple places, but behavior here looks correct.

tx_service/src/cc/local_cc_shards.cpp (2)

126-132: Epoch worker integration for jemalloc looks sound; consider minor hardening

  • Constructor wiring (epoch_worker_ctx_(1) under WITH_JEMALLOC, 0 otherwise), thread startup in StartBackgroudWorkers, and shutdown in Terminate() are consistent with the other worker contexts; lifecycle looks correct.
  • EpochWorker()’s wait_until loop correctly exits when status_ changes and only calls mallctl("epoch", ...) on timeout, so it won’t busy‑spin.

Two small improvements you might want to consider:

  • Check the return value of mallctl("epoch", ...) and log at ERROR (or FATAL if you want to treat it as non‑recoverable) if it fails; that would make misconfigured jemalloc deployments much easier to diagnose.
  • For operability, optionally set a thread name for the epoch worker similar to other workers (e.g., "jemalloc_epoch"), which helps when debugging with tools like top/gdb/profilers.

Also applies to: 298-303, 2918-2941


6587-6598: Jemalloc-based heap usage reporting is fine; optional error handling for stats

The new WITH_JEMALLOC branch in ReportHashPartitionCkptHeapUsage() that calls GetJemallocArenaStat(hash_partition_ckpt_arena_id_, ...) cleanly replaces the previous mimalloc‑specific mi_thread_stats path when jemalloc is enabled, and keeping the old path under #else preserves behavior for non‑jemalloc builds.

If you want to make this more robust:

  • Consider adding basic error checking inside GetJemallocArenaStat (or here) on the mallctl("stats.arenas...") calls and logging a warning if they fail for the configured arena ID, so misconfigurations or missing stats don’t silently yield garbage numbers.
  • It might also be worth documenting that, with jemalloc, the reported numbers depend on the epoch refresher thread and are per‑arena, not per‑thread, to set operator expectations.

Also applies to: 6614-6614

tx_service/src/cc/cc_shard.cpp (2)

2476-2478: Clarify or remove the TODO comment.

The TODO about destroying the arena is misleading. jemalloc does not provide a public API to destroy arenas—they are automatically reclaimed at process shutdown. Consider updating the comment to clarify this behavior:

-    // TODO: destroy arena ?
+    // Note: jemalloc does not provide a public API to destroy arenas.
+    // Arenas are automatically cleaned up at process shutdown.

2485-2496: Replace assert(false) with a compile-time error or graceful no-op.

The assert(false) in the non-jemalloc branch is problematic:

  1. In release builds, assert may be compiled out, causing undefined behavior (returning uninitialized value).
  2. It doesn't provide a clear compile-time error for misuse.

Since callers are already guarded by WITH_JEMALLOC, consider making this a compile-time error or a safe no-op:

Option 1: Compile-time error (preferred if callers are always guarded)
 uint32_t CcShardHeap::SetAsDefaultArena()
 {
 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     JemallocArenaSwitcher::ReadCurrentArena(prev_arena);
     JemallocArenaSwitcher::SwitchToArena(arena_id_);
     return prev_arena;
 #else
-    assert(false);
-    return 0;
+    static_assert(false, "SetAsDefaultArena() requires WITH_JEMALLOC");
 #endif
 }
Option 2: Safe no-op with documentation
 uint32_t CcShardHeap::SetAsDefaultArena()
 {
 #if defined(WITH_JEMALLOC)
     uint32_t prev_arena;
     JemallocArenaSwitcher::ReadCurrentArena(prev_arena);
     JemallocArenaSwitcher::SwitchToArena(arena_id_);
     return prev_arena;
 #else
-    assert(false);
+    // No-op when jemalloc is not enabled; callers should guard with WITH_JEMALLOC
     return 0;
 #endif
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef9174a and 4f11cba.

📒 Files selected for processing (16)
  • CMakeLists.txt (3 hunks)
  • build_tx_service.cmake (4 hunks)
  • store_handler/data_store_service_client_closure.cpp (4 hunks)
  • tx_service/include/cc/catalog_cc_map.h (2 hunks)
  • tx_service/include/cc/cc_request.h (6 hunks)
  • tx_service/include/cc/cc_shard.h (6 hunks)
  • tx_service/include/cc/local_cc_shards.h (14 hunks)
  • tx_service/include/cc/range_cc_map.h (2 hunks)
  • tx_service/include/cc/template_cc_map.h (2 hunks)
  • tx_service/include/tx_service.h (4 hunks)
  • tx_service/include/tx_service_common.h (1 hunks)
  • tx_service/include/util.h (2 hunks)
  • tx_service/src/cc/cc_req_misc.cpp (2 hunks)
  • tx_service/src/cc/cc_shard.cpp (9 hunks)
  • tx_service/src/cc/local_cc_shards.cpp (15 hunks)
  • tx_service/src/cc/range_slice.cpp (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • tx_service/include/cc/range_cc_map.h
  • tx_service/src/cc/cc_req_misc.cpp
  • CMakeLists.txt
  • store_handler/data_store_service_client_closure.cpp
  • tx_service/include/tx_service_common.h
  • tx_service/include/cc/catalog_cc_map.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/util.h
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/tx_service.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/range_slice.cpp
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/cc_shard.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/range_slice.cpp
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/include/tx_service.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/include/cc/cc_request.h
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/cc_shard.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • tx_service/include/cc/cc_request.h
🧬 Code graph analysis (7)
tx_service/include/cc/cc_request.h (2)
tx_service/include/cc/cc_shard.h (1)
  • SwitchToArena (199-210)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/tx_service.h (1)
tx_service/include/cc/cc_shard.h (1)
  • SwitchToArena (199-210)
tx_service/src/cc/local_cc_shards.cpp (2)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-210)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/range_slice.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-210)
tx_service/include/cc/local_cc_shards.h (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/cc_shard.cpp (2)
tx_service/include/cc/cc_shard.h (4)
  • SwitchToArena (199-210)
  • arena_id_ (280-281)
  • ReadCurrentArena (187-197)
  • memory_limit_ (272-272)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/cc/cc_shard.h (1)
tx_service/src/cc/cc_shard.cpp (2)
  • SetAsDefaultArena (2485-2496)
  • SetAsDefaultArena (2485-2485)
🔇 Additional comments (31)
build_tx_service.cmake (2)

217-219: Public linking of jemalloc_cfg propagates to consumers.

Using PUBLIC linkage ensures that any target linking against txservice will also receive the jemalloc configuration. This is appropriate since the header files expose jemalloc-related APIs (e.g., JemallocArenaSwitcher).


136-139: LGTM! Conditional jemalloc include and link integration.

The jemalloc include directory and library linking are properly guarded by WITH_JEMALLOC, which is defined in the main CMakeLists.txt. The jemalloc_cfg target is also correctly defined as an INTERFACE library. This follows the same pattern used for other optional dependencies in the codebase.

tx_service/src/cc/range_slice.cpp (1)

533-555: Arena switching pattern is correct.

The save-switch-restore pattern around UpdateSlice properly ensures allocations occur in the designated table-range arena. The restoration on line 554 happens before the heap mutex is unlocked, maintaining consistency.

Note: The JemallocArenaSwitcher::ReadCurrentArena and SwitchToArena methods already log fatal on failure (as seen in the relevant code snippets), so additional error handling here is not strictly necessary—though the code path terminates on failure rather than attempting recovery.

tx_service/include/cc/cc_shard.h (4)

27-29: Jemalloc header properly guarded.

The #include <jemalloc/jemalloc.h> is now correctly wrapped with #if defined(WITH_JEMALLOC), allowing builds without jemalloc to succeed. This addresses the previous review feedback.


181-212: JemallocArenaSwitcher implementation looks correct.

The static helper methods properly wrap mallctl calls for thread arena management. Using LOG(FATAL) on failure is appropriate here since arena mismanagement could lead to memory corruption or undefined behavior.


264-267: LGTM! New accessor for memory limit.

The MemoryLimit() getter provides read access to the existing memory_limit_ member, following the pattern of other accessors in the class.


241-244: All ArenaId() call sites are properly guarded by WITH_JEMALLOC.

Verification confirms the current approach is safe: the unguarded ArenaId() getter is only invoked within WITH_JEMALLOC-guarded code blocks. When jemalloc is disabled, these methods are never called, so the default-initialized return value of 0 presents no risk.

No action needed.

tx_service/include/cc/cc_request.h (1)

84-86: Guarded jemalloc include looks correct and localized.

Conditionally including <jemalloc/jemalloc.h> under #if defined(WITH_JEMALLOC) here is fine and keeps the dependency constrained to jemalloc builds; no issues from a compilation or ODR perspective.

tx_service/include/tx_service.h (3)

289-300: LGTM - Arena capture and storage follows heap management pattern.

The jemalloc arena management mirrors the existing mimalloc heap management correctly:

  • prev_arena captures the previous arena state via SetAsDefaultArena()
  • For external processing, the arena id is stored in coordi_->ext_tx_proc_arena_id_

425-435: LGTM - Arena restoration and cleanup is correct.

The teardown logic properly:

  1. Restores the arena using the local prev_arena variable
  2. Clears ext_tx_proc_arena_id_ to 0 for external processing cleanup

This matches the pattern used for mimalloc heap restoration.


696-711: LGTM - ForwardTx arena management is correct and consistent with codebase patterns.

The arena save/restore in ForwardTx correctly:

  1. Stores the previous arena in coordi_->ext_tx_proc_arena_id_ via shard_heap->SetAsDefaultArena() (line 697)
  2. Restores using the stored id before clearing it (lines 709-710)

JemallocArenaSwitcher and SetAsDefaultArena() are accessible through the include chain: tx_service.hlocal_cc_shards.hcc_shard.h, and this pattern is consistently used throughout the codebase.

tx_service/include/cc/local_cc_shards.h (13)

66-71: LGTM! Good defensive platform compatibility check.

The static_assert ensures type consistency with jemalloc's arena ID type across platforms. This prevents subtle bugs on platforms where unsigned might differ from uint32_t.


427-435: LGTM! Arena creation with proper error handling.

The error handling uses LOG(FATAL) which is appropriate for arena creation failure since this is an unrecoverable initialization error.


442-453: LGTM! Consistent arena creation pattern.

Error handling mirrors the table ranges arena creation appropriately.


466-469: LGTM!

Simple const accessor for arena ID.


478-497: LGTM! Proper jemalloc arena memory check.

The implementation correctly uses arena-specific allocated memory for the limit check, providing accurate per-arena memory tracking.


507-528: LGTM!

Consistent with TableRangesMemoryFull() pattern, correctly checking arena memory against the 90% threshold.


533-565: LGTM! Clean jemalloc-based reporting.

The jemalloc branch correctly simplifies the reporting by querying arena stats directly without requiring thread context switching.


891-896: LGTM! Arena switching uses proper helper methods.

The JemallocArenaSwitcher helper methods include error handling with LOG(FATAL), addressing the previous review concern about uninitialized arena IDs.


962-964: LGTM!

Arena correctly restored before the early return path.


1010-1012: LGTM!

Arena correctly restored on the fallthrough return path.


2015-2017: LGTM!

Arena ID members correctly zero-initialized, which defaults to the standard arena until explicitly created.


2179-2179: LGTM!

Worker context for the jemalloc epoch advancement thread.


2362-2364: LGTM!

Epoch worker declaration is properly guarded. The implementation (per relevant snippets) advances jemalloc's epoch periodically, which is necessary for accurate memory statistics.

tx_service/src/cc/cc_shard.cpp (7)

53-55: LGTM!

Conditional jemalloc header include properly guarded.


224-226: LGTM!

Arena switching during initialization ensures shard-specific memory allocation tracking.


253-256: LGTM!

Arena restoration uses the helper method with built-in error handling.


774-774: LGTM!

Minor formatting change.


2457-2465: LGTM!

Arena creation with proper error handling using LOG(FATAL) for unrecoverable failures.


2500-2534: LGTM!

The jemalloc branch correctly compares size_t values without unnecessary casting. Memory limit checking logic is consistent across both branches.


2546-2566: LGTM!

Correctly disables defragmentation for jemalloc since the mimalloc-specific defragmentation logic doesn't apply.

Comment thread tx_service/include/cc/cc_request.h
Comment thread tx_service/include/cc/local_cc_shards.h Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tx_service/src/cc/local_cc_shards.cpp (1)

1037-1043: Extract RAII helper for jemalloc arena switching

The save/switch/restore pattern for jemalloc arenas is now repeated in several places (table range init, slice kickout, bucket drop, hash-partition ckpt heap). This is easy to get wrong on future edits (early returns, added branches) and was already called out in earlier reviews.

A small RAII guard would centralize this and make all sites safer:

Example RAII wrapper and usage
// In a common header, under WITH_JEMALLOC
struct ScopedJemallocArena {
    uint32_t prev_{0};
    bool active_{false};

    explicit ScopedJemallocArena(uint32_t target) {
#if defined(WITH_JEMALLOC)
        if (JemallocArenaSwitcher::ReadCurrentArena(prev_)) {
            active_ = JemallocArenaSwitcher::SwitchToArena(target);
        }
#endif
    }

    ~ScopedJemallocArena() {
#if defined(WITH_JEMALLOC)
        if (active_) {
            JemallocArenaSwitcher::SwitchToArena(prev_);
        }
#endif
    }
};

Then e.g. in InitTableRanges / KickoutRangeSlices / DropStoreRangesInBucket / hash-partition ckpt paths:

#if defined(WITH_JEMALLOC)
    ScopedJemallocArena scoped_arena(GetTableRangesArenaId());
#endif
// ... body without explicit prev_arena handling ...

Also applies to: 1138-1140, 1312-1320, 1334-1336, 1374-1380, 1395-1397, 2119-2125, 2141-2142, 5013-5018, 5075-5078

CMakeLists.txt (1)

24-35: Define WITH_JEMALLOC option before using it and re‑use existing review guidance

Right now ${WITH_JEMALLOC} is logged (Line 24) and tested in if(WITH_JEMALLOC) (Line 28) before the option(WITH_JEMALLOC ...) call (Line 27) establishes the default. On a fresh configure where the user hasn’t set the cache entry, the first message() will print an empty value rather than the intended default and the first if() will still work but read a “pre‑option” variable.

Reordering keeps things clearer and matches prior guidance:

option(WITH_JEMALLOC "Use jemalloc" OFF)
message(NOTICE "WITH_JEMALLOC: ${WITH_JEMALLOC}")

add_library(jemalloc_cfg INTERFACE)
if(WITH_JEMALLOC)
    find_package(PkgConfig REQUIRED)
    pkg_check_modules(JEMALLOC REQUIRED jemalloc)
    target_compile_definitions(jemalloc_cfg INTERFACE WITH_JEMALLOC)
    target_include_directories(jemalloc_cfg INTERFACE ${JEMALLOC_INCLUDE_DIRS})
    target_link_libraries(jemalloc_cfg INTERFACE ${JEMALLOC_LINK_LIBRARIES})
endif()

Also, as in the earlier review, please confirm on your system that pkg_check_modules(JEMALLOC REQUIRED jemalloc) really defines JEMALLOC_LINK_LIBRARIES (full paths) and not only JEMALLOC_LIBRARIES, and adjust if needed.

🧹 Nitpick comments (4)
tx_service/src/cc/local_cc_shards.cpp (1)

126-131: Epoch worker wiring looks consistent with existing worker_ctx pattern

The new epoch_worker_ctx_ initialization, thread startup in StartBackgroudWorkers, and shutdown via epoch_worker_ctx_.Terminate() follow the same pattern as the other worker contexts, and EpochWorker()’s loop correctly exits when status_ changes. Consider optionally giving the epoch thread a name (like the other workers) for easier debugging, e.g. via pthread_setname_np when you create it.

Also applies to: 298-303, 2918-2941

CMakeLists.txt (1)

367-371: Confirm PUBLIC linkage for data_substrate and jemalloc_cfg is intentional

Changing

target_link_libraries(${DATA_SUBSTRATE_LIB} PUBLIC ${DATA_SUBSTRATE_LINK_LIBS})

and conditionally adding

if(WITH_JEMALLOC)
    target_link_libraries(${DATA_SUBSTRATE_LIB} PUBLIC jemalloc_cfg)
endif()

means txservice, ${METRICS_LIB}, RocksDB / AWS libs, and jemalloc now propagate transitively to any target that links against data_substrate. That’s often fine (and needed for shared libraries), but it does broaden consumers’ link surfaces.

If the intent is only to expose headers while keeping link dependencies private, consider keeping these PRIVATE (and adding necessary libs explicitly to final binaries instead). Otherwise, no functional issue—just worth double‑checking this is the desired API surface.

tx_service/src/cc/cc_shard.cpp (2)

2539-2539: Clarify arena cleanup strategy.

The TODO comment should be resolved before merge. Based on past review feedback, if jemalloc doesn't provide a public API for arena destruction and cleanup happens automatically at process shutdown, the comment should document this explicitly rather than leaving it as a TODO.

🔎 Suggested documentation
-    // TODO: destroy arena ?
+    // Jemalloc does not provide a public API to destroy arenas.
+    // Arenas are cleaned up automatically at process shutdown.

Based on coding guidelines and past review feedback.


2547-2558: Consider compile-time guard instead of runtime assert.

While the current implementation is safe (all call sites are guarded by WITH_JEMALLOC), the assert(false) in the non-jemalloc branch is a code smell. Consider making the entire function available only when jemalloc is enabled.

🔎 Suggested approach

In the header file, conditionally declare the function:

#if defined(WITH_JEMALLOC)
    uint32_t SetAsDefaultArena();
#endif

Then remove the #else branch entirely from the implementation, so the function only exists when jemalloc is enabled.

Based on past review feedback suggesting compile-time protection.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f11cba and 1ebb205.

📒 Files selected for processing (16)
  • CMakeLists.txt
  • build_tx_service.cmake
  • store_handler/data_store_service_client_closure.cpp
  • tx_service/include/cc/catalog_cc_map.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/range_cc_map.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/tx_service.h
  • tx_service/include/tx_service_common.h
  • tx_service/include/util.h
  • tx_service/src/cc/cc_req_misc.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/range_slice.cpp
🚧 Files skipped from review as they are similar to previous changes (6)
  • tx_service/include/tx_service.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/util.h
  • tx_service/include/cc/range_cc_map.h
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/range_slice.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/cc/cc_shard.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/range_slice.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/include/cc/catalog_cc_map.h
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/include/cc/catalog_cc_map.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
🧬 Code graph analysis (5)
store_handler/data_store_service_client_closure.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-210)
tx_service/src/cc/range_slice.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-210)
tx_service/src/cc/cc_req_misc.cpp (1)
tx_service/include/cc/cc_shard.h (2)
  • ReadCurrentArena (187-197)
  • SwitchToArena (199-210)
tx_service/src/cc/local_cc_shards.cpp (1)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/include/cc/local_cc_shards.h (2)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/local_cc_shards.cpp (2)
  • EpochWorker (2924-2940)
  • EpochWorker (2924-2924)
🔇 Additional comments (19)
store_handler/data_store_service_client_closure.cpp (2)

757-763: Jemalloc arena switch/restore in FetchTableRangesCallback looks consistent and correctly paired.

The arena is read, switched to the table-ranges arena for the catalog/range construction work, then restored before releasing the custom mimalloc heap and mutex, with no early returns in between. This matches the heap/thread override pattern in this function and elsewhere, so allocator state should remain balanced across all paths.

Also applies to: 824-826


1045-1052: Balanced arena switching around slice decoding in FetchRangeSlicesCallback.

Similarly here, the jemalloc arena is saved, switched for the slice_info_ construction loop, then restored before unlocking the heap mutex and continuing segment iteration. Control flow doesn’t bypass the restore, so allocator context is preserved.

Also applies to: 1088-1090

tx_service/include/tx_service_common.h (1)

77-87: New ext_tx_proc_arena_id_ field is reasonable and consistent with existing coordinator state.

Storing the external tx processor’s jemalloc arena ID alongside ext_tx_proc_heap_ fits the existing pattern and should allow correct save/restore of allocator context without introducing new races, assuming current synchronization around external processor occupation remains unchanged.

tx_service/src/cc/range_slice.cpp (1)

533-539: Allocator switching around UpdateSlice is correctly ordered and symmetric.

Within StoreRange::UpdateSliceSpec, you now route allocations through the table-ranges arena while the custom heap is active and restore the previous arena before dropping the heap lock and potentially calling KickoutRangeSlices(). The control flow between these points is straight-line, so allocator state should not leak.

Also applies to: 554-555

tx_service/src/cc/cc_req_misc.cpp (1)

411-417: Jemalloc arena scoping around InitRangeSlices is well-bounded.

The success path in FetchRangeSlicesReq::SetFinish now switches into the table-ranges arena only for InitRangeSlices, then restores the previous arena before releasing the heap lock. This keeps per-range slice allocations isolated without altering error handling or control flow.

Also applies to: 436-438

tx_service/include/cc/catalog_cc_map.h (1)

442-451: Arena switching for empty-index range initialization is consistent with other table-range sites.

For newly added secondary indexes, the code now binds InitRangeSlices to the table-ranges heap and jemalloc arena, then restores both the mimalloc heap/thread and the previous arena before releasing the heap mutex. There are no intervening returns, so allocator state remains coherent across this path.

Also applies to: 472-476

build_tx_service.cmake (2)

136-138: WITH_JEMALLOC include/link integration for txservice looks correct.

Conditionally adding ${JEMALLOC_INCLUDE_DIRS} to INCLUDE_DIR and linking txservice publicly against jemalloc_cfg under WITH_JEMALLOC should make the allocator headers and configuration available wherever txservice is used, without affecting non-jemalloc builds.

Also applies to: 217-219


355-356: host_manager linkage uses PRIVATE as expected.

Linking host_manager with ${HOST_MANAGER_LINK_LIB} via target_link_libraries(... PRIVATE ...) is appropriate for an executable and keeps its dependency set from leaking into other targets.

tx_service/src/cc/local_cc_shards.cpp (1)

6605-6617: Jemalloc-based ckpt heap reporting is reasonable; confirm arena id setup

Switching ReportHashPartitionCkptHeapUsage() to use GetJemallocArenaStat(hash_partition_ckpt_arena_id_, ...) under WITH_JEMALLOC is consistent with the rest of the jemalloc integration. Please just double‑check that hash_partition_ckpt_arena_id_ is always initialized to a valid jemalloc arena before any reporting calls, and that the deployed jemalloc version supports the stats.arenas.<id>.* MIBs used in GetJemallocArenaStat.

CMakeLists.txt (1)

400-403: Host_manager install guard change is purely cosmetic

The if(FORK_HM_PROCESS) / endif() block around the host_manager install target remains functionally the same; the added ~ marker here appears to be a non‑semantic change (e.g., whitespace or formatting). No action needed.

tx_service/include/cc/local_cc_shards.h (5)

474-477: LGTM!

The getter implementation is clean and properly const-qualified.


486-505: LGTM!

The jemalloc integration for memory fullness checks is properly implemented with appropriate fallback to the existing mimalloc-based implementation.


515-536: LGTM!

The memory availability check correctly uses jemalloc stats when available and maintains consistency with the mimalloc fallback.


541-573: LGTM!

The jemalloc integration simplifies the reporting path by eliminating the need for heap switching, while maintaining the same logging output format.


2370-2372: LGTM!

The EpochWorker declaration is properly guarded and follows the established pattern for background worker threads in this class.

tx_service/src/cc/cc_shard.cpp (4)

53-55: LGTM!

The conditional jemalloc header inclusion is properly guarded.


232-234: LGTM!

The arena switching around shard initialization ensures that lock vectors and other data structures are allocated in the shard-specific arena, with proper restoration afterwards.

Also applies to: 262-264


2562-2596: LGTM!

The Full() method correctly uses jemalloc arena statistics when available, with proper fallback to mimalloc stats. The comparison logic is consistent across both paths.


2608-2628: LGTM!

Correctly disables defragmentation when jemalloc is enabled, as jemalloc manages its own memory allocation strategy. The mimalloc-specific defragmentation logic remains unchanged in the fallback path.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tx_service/src/cc/local_cc_shards.cpp (1)

1034-1041: Repeated manual save/switch/restore of jemalloc arena would benefit from a small RAII helper.

Across multiple call sites you now have the same pattern:

  • ReadCurrentArena(prev_arena);
  • SwitchToArena(target_arena_id);
  • work under that arena
  • SwitchToArena(prev_arena);

This appears in InitTableRanges, both KickoutRangeSlices drop paths, DropStoreRangesInBucket, and the hash-partition checkpoint allocation path.

Functionally this looks correct, but the duplication makes it easier to accidentally miss a restore or introduce early-return paths in future edits. A tiny RAII guard (e.g. JemallocArenaGuard guard(target_arena_id);) inside JemallocArenaSwitcher would centralize this logic and reduce that risk.

Also applies to: 1135-1137, 1309-1317, 1331-1333, 1371-1377, 1392-1395, 2116-2123, 2137-2139, 5010-5015, 5072-5076

🧹 Nitpick comments (2)
tx_service/src/cc/local_cc_shards.cpp (1)

137-139: Epoch worker integration for jemalloc looks sound; consider minor polish.

The epoch worker context is cleanly gated by WITH_JEMALLOC, constructed with a single worker, started in StartBackgroudWorkers, and terminated via epoch_worker_ctx_.Terminate(), matching the pattern of other worker contexts. The 1-second mallctl("epoch", ...) loop is straightforward and contained under the worker mutex.

Non-blocking suggestions:

  • Optionally set a thread name for the epoch worker (similar to other workers) for easier debugging.
  • If you expect different workloads, consider making the epoch interval configurable instead of hardcoded to 1s.

Also applies to: 295-300, 2915-2917, 2920-2938

tx_service/include/cc/local_cc_shards.h (1)

2368-2370: LGTM! EpochWorker properly integrated for jemalloc stat updates.

The EpochWorker and its context are correctly declared and guarded by WITH_JEMALLOC. The worker periodically advances the jemalloc epoch (via mallctl("epoch", ...)) to ensure that arena statistics remain current for the memory tracking functions.

📝 Optional: Add explanatory comment

Consider adding a brief comment explaining the EpochWorker's purpose:

 #if defined(WITH_JEMALLOC)
+    // Periodically advances jemalloc epoch to refresh arena statistics
     void EpochWorker();
 #endif

This would help future maintainers understand why this worker exists when reviewing jemalloc-related code.

Also applies to: 2483-2485

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ebb205 and 846e9d5.

📒 Files selected for processing (2)
  • tx_service/include/cc/local_cc_shards.h
  • tx_service/src/cc/local_cc_shards.cpp
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/include/cc/local_cc_shards.h
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
🧬 Code graph analysis (1)
tx_service/include/cc/local_cc_shards.h (2)
tx_service/include/util.h (1)
  • GetJemallocArenaStat (446-469)
tx_service/src/cc/local_cc_shards.cpp (2)
  • EpochWorker (2921-2937)
  • EpochWorker (2921-2921)
🔇 Additional comments (8)
tx_service/src/cc/local_cc_shards.cpp (1)

6602-6614: Jemalloc-based hash-partition checkpoint heap reporting is correctly gated and mirrors prior behavior.

Switching ReportHashPartitionCkptHeapUsage to use GetJemallocArenaStat(hash_partition_ckpt_arena_id_, ...) under WITH_JEMALLOC while retaining the existing mimalloc-based path under #else keeps behavior consistent across allocators and uses the right source for arena-level stats. The logged fields and fragmentation calculation remain unchanged.

Also applies to: 6630-6631

tx_service/include/cc/local_cc_shards.h (7)

66-71: LGTM! Platform compatibility check is appropriate.

The static_assert ensures that jemalloc's unsigned arena ID type matches the codebase's uint32_t type, preventing subtle bugs on platforms where they differ. The error message clearly explains the issue.


435-443: LGTM! Arena creation error handling is appropriate.

The error handling with LOG(FATAL) for arena creation failure is correct. Since this is called during initialization, terminating the process on failure is the right approach—the service cannot function properly without the dedicated arena.


450-461: LGTM! Consistent error handling.

Arena creation for hash partition checkpoint follows the same correct pattern as table ranges arena creation, with appropriate LOG(FATAL) error handling.


474-477: LGTM! Clean accessor implementation.

The GetTableRangesArenaId() accessor is straightforward and correctly exposes the arena ID for use in memory tracking and arena switching operations.


486-505: LGTM! Memory tracking correctly uses jemalloc stats.

The memory checking functions (TableRangesMemoryFull, HasEnoughTableRangesMemory, and TableRangeHeapUsageReport) properly query jemalloc arena statistics via GetJemallocArenaStat when WITH_JEMALLOC is defined, while maintaining backward compatibility with the mimalloc-only path.

Also applies to: 515-536, 541-573


899-904: LGTM! Arena switching is correctly implemented with proper cleanup.

The arena save/switch/restore pattern is correct:

  1. Save current arena before operations (line 901)
  2. Switch to dedicated table ranges arena (line 903)
  3. Restore on both exit paths (lines 971 and 1019)

The JemallocArenaSwitcher helper functions handle errors with LOG(FATAL), which is appropriate for arena switching failures. Both code paths (new range creation and existing range update) correctly restore the original arena.

Also applies to: 970-972, 1018-1020


2023-2024: LGTM! Arena ID members correctly initialized.

The arena ID members are initialized to 0 (jemalloc's default arena), which is appropriate. During initialization via InitializeTableRangesHeap() and InitializeHashPartitionCkptHeap(), these are set to dedicated arena IDs, or the process terminates on failure.

@lokax lokax merged commit 8dce439 into main Dec 24, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add jemalloc to support memory profiling

3 participants