Clarify relationship between "LongMemEval" results and MemoryAgentBench's reconstructed longmemeval_s subset #2

Description

@junkuanliu

Hi, thanks for releasing this benchmark suite — really useful unification of the memory systems.

I'm trying to reproduce the "LongMemEval" numbers in the paper, but the only data path I can find in the code is benchmark/memoryagentbench/hf_datasets.py pulling ai-hyz/MemoryAgentBench — there's no standalone LongMemEval loader, and the context chunking, prompts, and gold answers all come from that MemoryAgentBench split rather than the original LongMemEval release.
Could you clarify whether the reported numbers come entirely from this MemoryAgentBench-reconstructed subset and, if so, whether the workload should instead be labeled "MemoryAgentBench (longmemeval_s*)"?
