Skip to content

[Bug] storeEventListenStatus leaks shared store listener when a non-owner CachedGraphTransaction closes first #3033

@dpol1

Description

@dpol1

Bug Type (问题类型)

logic (逻辑设计问题)

Before submit

  • 我已经确认现有的 IssuesFAQ 中没有相同 / 重复问题 (I have confirmed and searched that there are no similar problems in the historical issue and documents)

Environment (环境信息)

  • Server Version: 1.7.0 (Apache Release Version)
  • Backend: RocksDB x nodes, HDD or SSD
  • OS: xx CPUs, xx G RAM, Ubuntu 2x.x / CentOS 7.x
  • Data Size: xx vertices, xx edges

Expected & Actual behavior (期望与实际表现)

Expected

When several CachedGraphTransaction instances share the same graph, closing a non‑owner transaction must not touch the shared store listener registered by the first (owner) transaction. The store listener should only be unregistered when the owner closes.

Actual

unlistenChanges() calls storeEventListenStatus.remove(graphName) unconditionally. The first transaction to close — owner or not — pops the tracking entry. If it is a non‑owner, it then calls provider().unlisten(this.storeEventListener) on its own per‑instance listener, which was never registered with the provider, so the unlisten is a no‑op. The owner's real store listener stays registered but is now untracked. When the owner closes later, remove(graphName) returns null, the real unlisten(...) call is skipped, and the listener leaks for the rest of the JVM lifetime.

The same pattern is present in CachedSchemaTransaction.java.

Reproduction

  1. Open two CachedGraphTransaction instances T1 (owner) and T2 (non‑owner) for the same spaceGraphName.
  2. T1.listenChanges() registers T1.storeEventListener via store().provider().listen(...) and inserts the tracking entry into storeEventListenStatus.
  3. T2.listenChanges() sees the entry already there and skips registration.
  4. Close T2 first. storeEventListenStatus.remove(graphName) returns true. provider().unlisten(T2.storeEventListener) runs as a no‑op because that instance was never registered.
  5. Close T1. storeEventListenStatus.remove(graphName) returns null. The real provider().unlisten(T1.storeEventListener) is skipped.
  6. T1.storeEventListener stays registered with the provider indefinitely.

Root cause

Same owner‑first‑close shape that PR #3017 fixed for graphCacheEventListeners via the CacheListenerHolder ref‑count. storeEventListenStatus was left as‑is, with the gap flagged in the in‑code TODO:

The unit test CachedGraphTransactionTest papers over it with storeListeners.putIfAbsent(graphName, true) after closing the secondary transaction, so the test teardown can still unlisten the primary listener.

Suggested fix

Two options surfaced in the PR #3017 review:

  1. Apply the same CacheListenerHolder ref‑counted holder pattern used for the cache listener to the store listener.
  2. Stop registering and unregistering on transaction lifecycle. The cache is already a CacheManager singleton per spaceGraphName, so the store listener lifetime is the graph, not the transaction. Move register/unregister to graph open/dispose.

Option 2 is preferred: it removes the owner concept and the whole race class. Option 1 is the smaller mechanical change.

Scope

  • Apply the chosen fix to CachedGraphTransaction and CachedSchemaTransaction.
  • Remove the putIfAbsent workaround from CachedGraphTransactionTest.
  • Add a regression test that exercises non‑owner‑first‑close ordering and asserts the owner's listener is still unregistered when the owner closes.

References

Vertex/Edge example (问题点 / 边数据举例)

Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions