OSS framework for stateful agents with persistent memory hierarchies (formerly MemGPT). Apache-2.0. Manages core/recall/archival memory across conversations. Backed by Postgres and vector stores.
Letta is an OSS framework for stateful agents with persistent memory hierarchies — the production evolution of MemGPT (the academic paper from UC Berkeley that introduced operating-system-style memory management for LLM context). Apache-2.0 license, with Letta Cloud as the managed offering. Pick Letta when your agent needs memory beyond the LLM's context window — long-running conversations, user preferences over months, knowledge accumulation across sessions. Distinct from RAG: RAG retrieves from external corpora at query time; Letta manages the agent's own state evolution. The category-defining tool for L4 Agent Memory.
Letta encodes a specific architectural decision about agent state: it's not just retrieval, it's first-class memory that the agent can read AND write. From a Trust Before Intelligence lens, that's a meaningful trust expansion — the agent now has persistent identity-shaping state, not just transient session state. This raises questions traditional RAG architectures don't: how do you audit what the agent has learned? How do you delete a user's memory on GDPR request? How do you prevent prompt-injection from corrupting the persistent memory? Letta provides the primitives (memory hierarchy with explicit read/write boundaries, archival memory in Postgres or vector DB) but the operator must use them deliberately. Treating Letta's memory layer as 'just a database' misses the trust implications of agent-controlled writes.
Memory retrieval latency depends on the storage backend (typically Postgres + vector DB). Hot memory and recall queries are sub-100ms; archival lookups are bound by vector DB latency. Cap rule N/A.
Programming framework with explicit memory operations as first-class primitives. core_memory_replace, archival_memory_insert, recall_memory_search are the agent's tools for managing its own state. Cap rule N/A.
Per-agent state isolation via agent_id; deployment-driven authentication via API key or SSO in Letta Cloud. RBAC at the API level. Cap rule applied: P-low for agent frameworks without engine-level ABAC.
Pluggable storage backends (Postgres, SQLite, vector DBs). Multi-cloud via Letta Cloud or self-host. Strong portability.
Memory hierarchy is explicit context: core memory (always in prompt), recall memory (recent message history), archival memory (long-term searchable). Plus tool-call traces, state changes. Strong C dimension — context IS what Letta manages.
Memory operation logs, agent state inspection, message history with full trace. Per-session cost not native (depends on LLM provider). Cap rule N/A.
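The memory operations and three-tier hierarchy described above can be sketched with a toy in-process model. This is an illustrative assumption, not Letta's implementation: the class name ToyAgentMemory, the method signatures, the build_context helper, and the keyword-overlap ranking are all hypothetical. Only the three operation names and the tier roles come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgentMemory:
    """Toy stand-in for the three tiers; NOT Letta's implementation or API."""
    core: dict[str, str] = field(default_factory=dict)   # always in the prompt
    recall: list[str] = field(default_factory=list)      # recent message history
    archival: list[str] = field(default_factory=list)    # long-term, searchable

    def core_memory_replace(self, label: str, old: str, new: str) -> None:
        """Rewrite part of one core block; fail loudly if the old text is absent."""
        block = self.core[label]
        if old not in block:
            raise ValueError(f"{old!r} not found in core block {label!r}")
        self.core[label] = block.replace(old, new)

    def archival_memory_insert(self, content: str) -> None:
        """Append a passage to long-term storage."""
        self.archival.append(content)

    def recall_memory_search(self, query: str) -> list[str]:
        """Case-insensitive substring search over message history."""
        q = query.lower()
        return [m for m in self.recall if q in m.lower()]

    def build_context(self, query: str, recall_window: int = 3, top_k: int = 1) -> str:
        """Assemble a prompt: all core blocks, recent messages, best archival hits."""
        core_part = [f"[{label}] {text}" for label, text in self.core.items()]
        recent = self.recall[-recall_window:]
        words = set(query.lower().split())
        # Naive keyword overlap stands in for a vector similarity search.
        scored = [(len(words & set(p.lower().split())), p) for p in self.archival]
        ranked = sorted(scored, key=lambda sp: sp[0], reverse=True)
        hits = [p for score, p in ranked if score > 0][:top_k]
        return "\n".join(core_part + recent + hits)


memory = ToyAgentMemory(core={"human": "Name: unknown"})
memory.recall.extend([
    "user: hi",
    "user: call me Sam",
    "assistant: noted",
    "user: what is my billing cycle",
])
memory.core_memory_replace("human", "unknown", "Sam")
memory.archival_memory_insert("billing cycle is monthly")
memory.archival_memory_insert("favorite color is green")
context = memory.build_context("what is my billing cycle")
```

The point of the sketch is the asymmetry between tiers: core blocks are injected unconditionally, recall is windowed, and archival content only enters the prompt when a search selects it.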
G1=N (no engine-level ABAC), G2=Y (memory operation logs), G3=N, G4=Y (memory versioning via history), G5=N, G6=N. 2/6 -> 2; bumped to 3 for memory-as-audit-trail.
O1=Y (Letta Cloud has dashboards; OSS exposes basic metrics), O2=N, O3=N (no per-operation cost — depends on LLM provider), O4=Y (memory health monitorable), O5=N, O6=N. 2/6 -> 3 lenient.
A1=Y (sub-100ms hot memory), A2=Y (memory writes immediately available), A3=N (no integral cache beyond Postgres page cache), A4=N (single-instance OSS; multi-instance requires careful deployment), A5=N (memory backends scale but framework newer), A6=N (sequential memory ops typical). 2/6 -> 3.
L1=Y (memory entities have stable identity), L2=N, L4=Y (continuous learning IS what memory does), L5=Y (memory section names, agent persona, tool registry), L6=N. 3/6 -> 5 lenient (memory-management is a specialized lexicon discipline; this is Letta's strongest dimension).
S1=Y (memory writes are deterministic given inputs), S2=Y (typed memory blocks), S3=Y (single-source-of-truth per agent), S4=Y (typed schemas for memory blocks), S5=N (no built-in content quality validation on memory writes — this is a real risk), S6=Y (memory operation logs flag anomalies). 5/6 -> 4.
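The "n/6" tallies in the scoring notes above can be checked mechanically. The sketch below is a hypothetical helper (tally and its return shape are assumptions, not part of the INPACT or GOALS frameworks) that counts Y flags in one scoring line and reports any criterion number that is not listed. It deliberately stops at the count, because the notes do not define how a count maps to the final score.

```python
import re

# Matches flags such as "G2=Y" or "L4=N": one dimension letter, one digit, a verdict.
FLAG = re.compile(r"\b([A-Z])(\d)=([YN])\b")


def tally(line: str, expected: int = 6) -> dict:
    """Count Y verdicts in a scoring line and list criteria that are absent."""
    flags = {f"{dim}{num}": verdict == "Y" for dim, num, verdict in FLAG.findall(line)}
    if not flags:
        raise ValueError("no criterion flags found")
    dim = next(iter(flags))[0]  # dimension letter, taken from the first flag
    missing = [f"{dim}{i}" for i in range(1, expected + 1) if f"{dim}{i}" not in flags]
    return {"yes": sum(flags.values()), "listed": len(flags), "missing": missing}


governance = tally(
    "G1=N (no engine-level ABAC), G2=Y (memory operation logs), G3=N, "
    "G4=Y (memory versioning via history), G5=N, G6=N."
)
lexicon = tally(
    "L1=Y (memory entities have stable identity), L2=N, "
    "L4=Y (continuous learning IS what memory does), L5=Y (memory section names), L6=N."
)
```

Run against the lexicon line, the helper reports three Y verdicts across five listed criteria, with L3 absent from the note.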
Best suited for
Compliance certifications
Letta (OSS) holds no compliance certifications. Letta Cloud (managed) advertises BAA availability and a SOC 2 path; verify current attestation status with sales. Self-hosted Letta inherits only the compliance posture of its substrate. GDPR right-to-be-forgotten is the operator's responsibility: ensure memory deletion APIs are wired and tested.
Use with caution for
Mem0 is the most direct alternative: Apache-2.0 OSS with a similar memory-layer-for-agents framing. Letta wins on its first-class memory hierarchy (core/recall/archival); Mem0 wins on framework integration breadth (its LangChain/LlamaIndex/CrewAI plugins are smoother). Both are emerging; pick on integration ergonomics.
LangChain provides ConversationBufferMemory + ConversationSummaryMemory primitives but they're simpler than Letta's hierarchy. Use Letta when memory-as-OS is the architecture; use LangChain memory primitives for simpler conversational state.
Redis is a primitive (in-memory KV); Letta is a framework that uses Redis-like backends. Use Redis directly for transient session state; use Letta for persistent agent memory with hierarchy + write-side semantics.
Role: L4 Agent Memory framework. Manages persistent agent state (core/recall/archival hierarchy) backed by Postgres + vector DB. Used inside L7 agent runtimes that need cross-session continuity.
Upstream: Receives memory operations from L7 agent runtime (read/write API). Receives raw model completions from L4 LLM providers (OpenAI, Anthropic, Mistral, vLLM). Memory writes come from agent tool calls.
Downstream: Returns memory contents to the agent's prompt context. Persists state changes to L1 storage (Postgres + vector DB). Memory operation logs feed L6 observability + L5 audit.
Mitigation: Validate all memory writes against safety guardrails (NeMo Guardrails, Promptfoo policies). Implement write-time review for sensitive memory blocks. Periodic audits of archival memory for anomalies. Don't let untrusted users influence memory writes that affect other users.
Mitigation: Tag every memory write with user_id (or agent_owner). Implement explicit deletion API that traverses all storage backends. Test the deletion flow regularly (automated GDPR-fire-drill).
Mitigation: Define explicit policies for what the agent can write to which memory tier. Core memory (always in prompt) needs strict gating. Archival memory (searchable) needs review for sensitive data (PII, credentials) before insertion.
Mitigation: Strict per-agent (per-user) isolation. Validate with multi-tenant tests. Use storage-backend filters that enforce agent_id at query time.
Mitigation: Implement memory-decay or summarization strategies. Cap archival memory size per agent. Monitor memory growth + retrieval latency.
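Several of the mitigations above (write gating, owner tagging, query-time agent_id filters, deletion, and size caps) can be sketched against a single SQLite table. Everything here is an assumption for illustration: the table layout, the function names, the SENSITIVE pattern, and the tiny cap are hypothetical, and the one table stands in for all storage backends. A real deployment would repeat the delete against the vector store as well and would use a proper PII and secrets scanner.

```python
import re
import sqlite3

SENSITIVE = re.compile(r"(?i)\b(password|api[_-]?key|ssn)\b")  # illustrative pattern only
MAX_ROWS_PER_AGENT = 3  # hypothetical cap, kept tiny for the demonstration


def open_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE archival ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "agent_id TEXT NOT NULL, user_id TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return db


def insert(db: sqlite3.Connection, agent_id: str, user_id: str, content: str) -> bool:
    """Gate, tag, and cap: reject sensitive text, record the owner, evict oldest rows."""
    if SENSITIVE.search(content):
        return False
    db.execute(
        "INSERT INTO archival (agent_id, user_id, content) VALUES (?, ?, ?)",
        (agent_id, user_id, content),
    )
    # Keep only the newest MAX_ROWS_PER_AGENT rows for this agent.
    db.execute(
        "DELETE FROM archival WHERE agent_id = ? AND id NOT IN ("
        "SELECT id FROM archival WHERE agent_id = ? ORDER BY id DESC LIMIT ?)",
        (agent_id, agent_id, MAX_ROWS_PER_AGENT),
    )
    return True


def search(db: sqlite3.Connection, agent_id: str, needle: str) -> list[str]:
    """Every read is filtered by agent_id in the query itself, not in caller code."""
    rows = db.execute(
        "SELECT content FROM archival WHERE agent_id = ? AND content LIKE ? ORDER BY id",
        (agent_id, f"%{needle}%"),
    )
    return [r[0] for r in rows]


def delete_user(db: sqlite3.Connection, user_id: str) -> int:
    """Erase everything written on behalf of one user; returns rows removed."""
    return db.execute("DELETE FROM archival WHERE user_id = ?", (user_id,)).rowcount


db = open_store()
insert(db, "agent-a", "user-1", "likes tea")
insert(db, "agent-b", "user-2", "likes coffee")
blocked = insert(db, "agent-a", "user-1", "my password is hunter2")
isolated = search(db, "agent-a", "likes")
for n in range(1, 5):
    insert(db, "agent-a", "user-1", f"note {n}")
capped = search(db, "agent-a", "note")
removed = delete_user(db, "user-1")
after_delete = search(db, "agent-a", "")
```

A deletion fire drill would run delete_user in a test environment and then assert that no search, on any backend, still returns the user's content.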
Letta Cloud signs the BAA. Per-patient agent_id isolates memory. Core memory holds the care plan summary; archival memory holds historical interactions. The HIPAA Right of Access (comparable to the GDPR right of access) is implementable via memory query + export.
Self-hosted Letta with Postgres + pgvector. Agent updates archival memory after each session; core memory summarizes project state. Cost minimal; sovereignty maximum.
Letta's persistent memory overhead isn't justified for stateless query/response. Use a simpler RAG-only setup. Letta shines when memory continuity actually matters.
This analysis is AI-generated using the INPACT and GOALS frameworks from "Trust Before Intelligence." Scores and assessments are algorithmic and may not reflect the vendor's complete capabilities. Always validate with your own evaluation.