Self-organising long-term memory substrate for agentic LLM workflows, grounded in Event Segmentation Theory (EST) and Predictive Processing (PP). Ingests multi-turn conversations, segments them into topically coherent episodes via LLM-powered boundary detection, distils durable semantic knowledge from each episode, and exposes a unified search surface for downstream reasoning. Designed as a minimalist production-ready core: PostgreSQL for structured metadata, Qdrant for vector similarity.
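The segmentation step described above can be sketched roughly as follows. This is a minimal illustration, not Nemori's actual API: `Message`, `Episode`, `segment`, and `keyword_shift_detector` are hypothetical names, and the toy keyword detector merely stands in for the LLM-powered boundary check.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


@dataclass
class Episode:
    messages: List[Message] = field(default_factory=list)


# Hypothetical detector signature: given the messages of the episode so far
# and the next message, return True if the next message starts a new episode.
BoundaryDetector = Callable[[List[Message], Message], bool]


def segment(messages: List[Message], is_boundary: BoundaryDetector) -> List[Episode]:
    """Split a conversation into episodes at detected topic boundaries."""
    episodes: List[Episode] = []
    current: List[Message] = []
    for msg in messages:
        # Close the running episode before adding a message that opens a new topic.
        if current and is_boundary(current, msg):
            episodes.append(Episode(current))
            current = []
        current.append(msg)
    if current:
        episodes.append(Episode(current))
    return episodes


def keyword_shift_detector(current: List[Message], nxt: Message) -> bool:
    """Toy stand-in for the LLM call (assumption): report a boundary when the
    next message shares no word longer than three characters with the episode."""
    seen = {w for m in current for w in m.text.lower().split() if len(w) > 3}
    nxt_words = {w for w in nxt.text.lower().split() if len(w) > 3}
    return bool(nxt_words) and seen.isdisjoint(nxt_words)
```

Passing the detector in as a callable keeps the loop independent of how boundaries are decided, so an LLM-backed check could replace the toy heuristic without touching `segment`.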
- Storage
- Two complementary stores: PostgreSQL holds episode metadata, temporal anchors, and structured knowledge records; Qdrant stores dense vector embeddings of episodes and knowledge facts for semantic search. Episodes are generated by an LLM-powered segmentation pipeline that applies transitional masking heuristics (from EST) to detect coherent topic boundaries before writing.
- Retrieval
- Unified search surface over both stores: semantic vector similarity via Qdrant for fuzzy recall, and structured PostgreSQL queries for exact metadata filters (time, source, entity). A Predict–Calibrate cycle extracts high-value facts from gaps between existing knowledge and new episodes, progressively refining the knowledge base. Async concurrency and caching keep latency low under load.
- Self-host
- Moderate
- License
- MIT
- Pricing
- Open source under the MIT license and free to self-host. Requires running PostgreSQL 16 and Qdrant instances (a Docker Compose setup is provided), plus an LLM and embedding API (OpenRouter or any OpenAI-compatible endpoint).
- GitHub stars
- 204
- Last release
- —
- Last commit
- 2026-04-16
- First catalogued
- 2026-06-28
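The Predict–Calibrate cycle mentioned under Retrieval can be illustrated with a small sketch. It assumes facts are plain strings and uses exact matching after normalisation where the real system would rely on an LLM to predict and compare; `normalise`, `predict_calibrate`, and the `predict` callable are hypothetical names, not part of Nemori's documented interface.

```python
from typing import Callable, Iterable, List, Set


def normalise(fact: str) -> str:
    """Lower-case and collapse whitespace so trivially different strings match."""
    return " ".join(fact.lower().split())


def predict_calibrate(
    knowledge: Set[str],
    episode_facts: Iterable[str],
    predict: Callable[[Set[str]], Set[str]],
) -> List[str]:
    """Add to `knowledge` only the episode facts that were not predicted.

    `predict` is a stand-in (assumption) for the model that guesses what a new
    episode will contain, given what is already known.
    """
    predicted = {normalise(f) for f in predict(knowledge)}
    gaps: List[str] = []
    for fact in episode_facts:
        key = normalise(fact)
        # A gap is a fact the prediction missed and the store does not hold yet.
        if key not in predicted and key not in knowledge:
            knowledge.add(key)
            gaps.append(key)
    return gaps
```

Under these assumptions, only the surprising part of each episode is written back, which is the sense in which the knowledge base is refined progressively rather than re-extracted in full.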
Strengths
- Cognitive-science grounding (Event Segmentation Theory + Predictive Processing) produces semantically coherent episode boundaries — not arbitrary chunk splits
- Predict–Calibrate cycle actively surfaces memory gaps and fills them, making the knowledge base self-correcting over time
- Production-ready concurrency design (async, caching) and clean separation of metadata (Postgres) from vectors (Qdrant)
- MIT-licensed with a research paper (arXiv:2508.03341) providing theoretical foundations
Watch out
- Requires two external services (PostgreSQL + Qdrant) — heavier infrastructure footprint than single-binary memory solutions
- 204 stars and no formal release tag: the project is actively developed but has no stable versioned release yet, so the API may change
- LLM used for episode boundary detection adds cost and latency at write time; offline/low-cost paths are not documented
Best for
- Agentic LLM workflows needing structured long-term memory with semantically coherent episodes and a unified search surface across episodic and semantic stores
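To make the idea of a unified search surface concrete, here is a minimal in-memory sketch that combines exact metadata filters with similarity ranking. It does not call PostgreSQL or Qdrant; `Record`, `cosine`, and `search` are hypothetical names, and the two-dimensional vectors are placeholders for real embeddings.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Sequence


@dataclass
class Record:
    id: str
    source: str
    day: date
    vector: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(
    records: List[Record],
    query_vec: Sequence[float],
    source: Optional[str] = None,
    since: Optional[date] = None,
    top_k: int = 3,
) -> List[str]:
    """Apply exact metadata filters first, then rank the survivors by similarity."""
    candidates = [
        r for r in records
        if (source is None or r.source == source)
        and (since is None or r.day >= since)
    ]
    ranked = sorted(candidates, key=lambda r: cosine(query_vec, r.vector), reverse=True)
    return [r.id for r in ranked[:top_k]]
```

In a deployment matching the description above, the filter step would presumably correspond to a structured PostgreSQL query and the ranking step to a Qdrant similarity search; the sketch only shows how the two kinds of constraint compose.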
Benchmark results
No sourced results yet.
Sources
- Nemori README (vendor)
- Nemori paper (arXiv) (paper)
- GitHub API repo metadata (204 stars, MIT, no formal release) (third-party)
Last verified 2026-06-28 · updated by discover-frameworks