Found 31 repositories (showing 30)
xiaowu0162
Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)
ZenSystemAI
A Multi-Agent Memory MCP That Connects Agents Across Systems and Machines
JordanMcCann
Memory system for AI agents. #1 on LongMemEval — 96.2% (481/500). Beats every published system including Chronos, Mastra, Supermemory, and Emergence. Built solo in 16 days for $1,000.
mastra-ai
Memory examples for the Mastra Memory workshop on Jul 24, 2025
rawwerks
No description available
dddabtc
Atlas Memory — self-hosted long-term memory for AI agents. LongMemEval (90.18%)
Backboard-io
No description available
B-Divyesh
High-accuracy long-term memory for AI agents. 86.6% on LongMemEval with 10-step Hyper Search RAG pipeline.
recallrai
Official Python SDK for RecallrAI – a revolutionary contextual memory system that enables AI assistants to form meaningful connections between conversations, just like human memory.
nicoloboschi
Visual inspector for LongMemEval dataset
abbudjoe
Benchmark suite for conversational memory systems (LongMemEval, ConvoMem)
No description available
marklubin
LENS - AI Memory Benchmark - Memory as Experience, Not Facts
HankieMcSpanky
Local-first AI memory that scores 100% on user fact recall. Open-source memory layer for LLM agents with hybrid search, middle-out compression, and local LLM support. Beats Mem0 (49%) and Zep (71%) on LongMemEval. Your data never leaves your machine.
pdx97
No description available
lugmanhussainkhan
No description available
josancamon19
No description available
Neutrally-app
Neutrally's LongMemEval-S hypothesis file and reproduction instructions — 89.4% (447/500)
hellen9527
Evaluation on the LongMemEval evaluation set
Yummytanmo
This project provides a complete adaptation of MemoryOS for evaluating long-term memory capabilities on the LongMemEval benchmark, with all core MemoryOS functionality preserved and optimized for evaluation scenarios.
Yummytanmo
An evaluation adapter for benchmarking the A-mem memory system on LongMemEval, supporting retrieval metrics (Recall@k, NDCG@k) and QA evaluation with multiple LLM backends (OpenAI, SGLang, Ollama).
AlpenglowAgents
NOUSai LongMemEval benchmark evidence: 73% with Ollama 3B (local inference)
juaneliascabrera
RAG implementation on Gemma3:4b. Tested with LongMemEval
Koushik1161
Next-generation agent memory system with 70.4% QA accuracy on LongMemEval
omega-memory
OMEGA persistent memory plugin for OpenClaw — graph-based, local-first, #1 on LongMemEval
rivercrab26
Standardized benchmark framework for AI memory systems. Test Mem0, Graphiti, Letta, and more against LongMemEval, LoCoMo, HaluMem.
Jinstronda
Open source memory benchmark + RAG system. 82.8% on LongMemEval. Ships as MCP server for Claude Code.
sayedRaheel
Scalable conversational memory via recursive sub-agent delegation — 46% EM vs 5% truncation on LongMemEval-S, zero training
VihAMBR
What works for LLM long-term memory - tested on 2,040 questions across LoCoMo and LongMemEval. CoT prompting > fancy encoding.
A lightweight web app for humans to explore LongMemEval’s long‑context questions and see how the dataset is structured.