Long-term agent memory built on a temporal knowledge graph (Graphiti). Zep Community Edition is Apache-2.0 licensed; Zep Cloud is the managed offering. A strong fit for agents that need time-aware facts and entity resolution across user sessions.
Zep is a long-term agent memory framework built on a temporal knowledge graph (Graphiti), preserving time-aware facts and entity resolution across user sessions in a way that pure vector retrieval cannot. Available as an Apache-2.0 Community Edition and a managed Zep Cloud, it is the leading choice when an agent needs to know not just what is true, but when it became true. The key tradeoff: best-in-class semantic memory and temporal reasoning versus a younger operational footprint and a steeper concept curve than dropping in a vanilla vector store.
For Layer 4 agent memory, trust means an agent retrieves the right facts about the right user with the right freshness — and never asserts something as current when it was actually superseded. Zep's temporal knowledge graph is the rare memory substrate that makes recency a first-class property: facts have valid-from and valid-until timestamps, and superseded facts are explicitly tombstoned rather than silently overwritten. The failure modes that hurt trust are mostly operational — graph drift when ingestion lags, or entity-resolution mistakes that merge two distinct users — both of which are detectable with the right monitoring discipline.
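The valid-from/valid-until mechanics above can be sketched in a few lines. This is a toy model of the temporal-fact pattern, not Zep's or Graphiti's actual schema; all class and field names here are illustrative assumptions. The key behavior it demonstrates is that asserting a new fact closes out the old one rather than overwriting it, so both "current" and "as-of" queries stay answerable.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None means still current

class TemporalFactStore:
    """Toy store: superseded facts are closed out (tombstoned), never deleted."""

    def __init__(self):
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, predicate: str, obj: str, at: datetime):
        # Close out any currently-valid fact with the same subject/predicate.
        for f in self.facts:
            if f.subject == subject and f.predicate == predicate and f.valid_until is None:
                f.valid_until = at  # tombstone, not a silent overwrite
        self.facts.append(Fact(subject, predicate, obj, valid_from=at))

    def current(self, subject: str, predicate: str) -> Optional[Fact]:
        for f in self.facts:
            if f.subject == subject and f.predicate == predicate and f.valid_until is None:
                return f
        return None

    def as_of(self, subject: str, predicate: str, when: datetime) -> Optional[Fact]:
        for f in self.facts:
            if (f.subject == subject and f.predicate == predicate
                    and f.valid_from <= when
                    and (f.valid_until is None or when < f.valid_until)):
                return f
        return None

store = TemporalFactStore()
t1 = datetime(2026, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2026, 3, 1, tzinfo=timezone.utc)
store.assert_fact("alice", "plan", "free", at=t1)
store.assert_fact("alice", "plan", "pro", at=t2)  # supersedes "free"

print(store.current("alice", "plan").obj)  # pro
print(store.as_of("alice", "plan", datetime(2026, 2, 1, tzinfo=timezone.utc)).obj)  # free
```

An agent querying this store can never mistake the superseded "free" plan for current state, which is exactly the trust property the section describes.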
Memory retrieval on Zep Cloud typically returns in 200-500 ms; self-hosted Community Edition latency depends on the backing store. Cold-path temporal graph traversals can hit 1-2 s. Latency stays under 5 s, so no scoring cap applies, but it is not in the sub-100 ms tier that would push the I score to 5-6.
Pythonic SDK with `memory.add` and `memory.search`; the REST API mirrors the SDK. Graphiti graph queries surface through high-level helpers. Some advanced features require reading the docs.
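The add/search surface can be sketched with a local stub. This is not the real Zep client and no signatures below are taken from Zep's SDK reference; it only illustrates the session-scoped shape of `add` and `search` that the line above names (the real client does semantic and graph search, where this stub only substring-matches).

```python
# Local stub mirroring the add/search surface described above; NOT the real SDK.
class MemoryStub:
    def __init__(self):
        self._messages: dict[str, list[dict]] = {}  # session_id -> messages

    def add(self, session_id: str, messages: list[dict]) -> None:
        """Append messages to a session's history."""
        self._messages.setdefault(session_id, []).extend(messages)

    def search(self, session_id: str, query: str, limit: int = 5) -> list[dict]:
        """Naive substring match; a real memory layer does semantic retrieval."""
        hits = [m for m in self._messages.get(session_id, [])
                if query.lower() in m["content"].lower()]
        return hits[:limit]

memory = MemoryStub()
memory.add("session-1", [
    {"role": "user", "content": "I upgraded to the pro plan last Tuesday."},
    {"role": "assistant", "content": "Noted, your plan is now pro."},
])
results = memory.search("session-1", "pro plan")
print(len(results))  # 1
```

Check Zep's own SDK reference for the actual client constructor, session handling, and search parameters before writing integration code.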
User-scoped memory and session isolation; Zep Cloud adds workspace RBAC. No native ABAC at the framework level. Scoring cap applied: RBAC-only access control without ABAC caps the score at 3.
CE self-hostable on AWS, Azure, GCP, on-prem. Zep Cloud as the managed option. Apache-2.0 license keeps the OSS path durable; no cloud lock-in.
Temporal knowledge graph (Graphiti) preserves time-aware context. Custom entity types act as a semantic vocabulary. Far richer than pure vector memory.
OSS code, structured logs, message-level metadata. Zep Cloud adds usage analytics. OSS edition lacks built-in per-query cost attribution.
Audit log on Zep Cloud plus structured logs in CE. Graph versioning via temporal facts counts toward G4 (model versioning). No native HITL, threat modeling, or compliance mapping.
Zep Cloud dashboard plus OpenTelemetry hooks cover APM. Temporal facts make retrieval rationale visible (O6). Lacks distributed tracing and LLM cost attribution.
Sub-second retrieval p95 on tuned deployments; real-time message ingestion is supported. Cache hit rate and 10x load scenarios not first-class concepts in the framework itself.
Top-of-category Lexicon support: entity resolution via Graphiti node merging, custom entity types as glossary, confidence scores enabling disambiguation, entity aliasing supported. Aligned with Mem0 and Letta peers at L=5.
Graph constraints, required fields on entity types, single source of truth in the graph, entity schema validation. No first-class quality gates or ML-based anomaly detection.
Best suited for
Compliance certifications
Zep Cloud SOC 2 Type II per https://www.getzep.com/security. No HIPAA BAA, ISO 27001, or FedRAMP as of 2026-05. CE compliance posture inherits entirely from the deployment environment and backing stores.
Use with caution for
Both are OSS agent memory frameworks. Mem0 is lighter, faster to bolt on, with broader storage-backend support. Zep wins when temporal reasoning matters, i.e. what happened when and what superseded what, which Mem0 does not model as first-class.
Letta (MemGPT lineage) is a more opinionated agent runtime with memory as one part. Zep is a memory layer that bolts onto any agent runtime. Pick Letta if you want a complete persona runtime; Zep if you want memory to layer onto LangGraph or AutoGen.
Role: Sits at Layer 4 as the temporal memory substrate — turns raw conversation history into a queryable knowledge graph of typed entities and time-aware facts.
Upstream: Reads from LLM-generated turns and user messages; pulls embeddings from OpenAI / Anthropic / local models. Backed by graph stores (Neo4j, FalkorDB) and vector stores (pgvector, Qdrant, Pinecone).
Downstream: Feeds retrieved memories back into agent prompts at L4; integrates with LangChain, LangGraph, AutoGen as a memory provider. Logfire / OpenTelemetry export for L6 observability.
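The downstream step of feeding retrieved memories back into agent prompts can be sketched as plain prompt assembly. The fact shape (`content`, `valid_from`) is a hypothetical rendering of retrieved temporal facts, not Zep's actual response schema; the point is that surfacing the as-of date lets the model reason about recency.

```python
def build_prompt(question: str, facts: list[dict]) -> str:
    """Render retrieved temporal facts into a prompt's memory section.

    Each fact dict carries 'content' and 'valid_from' (illustrative names,
    not Zep's schema) so the model sees when each fact became true.
    """
    memory_lines = "\n".join(
        f"- {f['content']} (as of {f['valid_from']})" for f in facts
    )
    return (
        "You are a support agent. Known facts about this user:\n"
        f"{memory_lines}\n\n"
        f"User question: {question}"
    )

facts = [
    {"content": "User is on the pro plan", "valid_from": "2026-03-01"},
    {"content": "User prefers email contact", "valid_from": "2025-11-12"},
]
prompt = build_prompt("Can I add a second seat?", facts)
print(prompt)
```

In a LangGraph or AutoGen integration, this assembly step is where the memory provider's retrieval results meet the agent's system prompt.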
Mitigation: Configure aggressive `user_id` scoping; require explicit identity tokens on every memory write; audit merged-entity logs weekly; integration tests asserting no cross-user retrieval
Mitigation: Monitor ingestion lag; surface 'as-of-date' in retrieval results; treat older facts with explicit recency-aware prompting
Mitigation: Run scheduled backups of the backing graph store; test restore quarterly; document RTO/RPO in the deployment runbook
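The first mitigation above, integration tests asserting no cross-user retrieval, can be sketched against a trivial in-memory store. Nothing here is Zep code; `ScopedStore` and its methods are hypothetical stand-ins showing the two invariants worth asserting: writes require an explicit user identity, and retrieval is filtered by `user_id` at the store layer rather than in the prompt.

```python
# Sketch of a cross-user isolation test against a toy store, not Zep itself.
class ScopedStore:
    def __init__(self):
        self._facts: list[tuple[str, str]] = []  # (user_id, fact) pairs

    def write(self, user_id: str, fact: str) -> None:
        if not user_id:
            raise ValueError("explicit user_id required on every memory write")
        self._facts.append((user_id, fact))

    def retrieve(self, user_id: str) -> list[str]:
        # Filtering happens here, at the store layer, never downstream.
        return [f for uid, f in self._facts if uid == user_id]

def test_no_cross_user_retrieval():
    store = ScopedStore()
    store.write("user-a", "lives in Berlin")
    store.write("user-b", "lives in Lisbon")
    assert store.retrieve("user-a") == ["lives in Berlin"]
    assert all("Lisbon" not in f for f in store.retrieve("user-a"))

test_no_cross_user_retrieval()
print("isolation test passed")
```

Run a test like this in CI against the real deployment so an entity-resolution change that merges two distinct users fails loudly before it ships.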
Temporal facts and Graphiti's entity resolution are exactly the right shape; the agent can reason about 'plan upgraded last Tuesday' as a first-class fact.
Memory model fits, but no BAA on Cloud means the team must self-host on a BAA-signing backing store and accept higher operational burden.
Overkill — a simple Redis-backed conversation buffer is the right shape and avoids the Graphiti operational footprint.
This analysis is AI-generated using the INPACT and GOALS frameworks from "Trust Before Intelligence." Scores and assessments are algorithmic and may not reflect the vendor's complete capabilities. Always validate with your own evaluation.