Clarify relationship between "LongMemEval" results and MemoryAgentBench's reconstructed longmemeval_s subset #2

Hi, thanks for releasing this benchmark suite. It's a really useful unification of the memory systems.

I'm trying to reproduce the "LongMemEval" numbers in the paper, but the only data path I can find in the code is benchmark/memoryagentbench/hf_datasets.py, which pulls ai-hyz/MemoryAgentBench. There is no standalone LongMemEval loader, and the context chunking, prompts, and gold answers all come from that MemoryAgentBench split rather than from the original LongMemEval release.
Could you clarify whether the reported numbers come entirely from this MemoryAgentBench-reconstructed subset and, if so, whether the workload should instead be labeled "MemoryAgentBench (longmemeval_s*)"?
