Open-source in-memory data store and message broker. Linux Foundation fork of Redis 7.2.4, created in March 2024 when Redis adopted RSALv2/SSPL. BSD-3-Clause license inherited from pre-RSALv2 Redis. Drop-in replacement for Redis OSS; does not include the additional modules from Redis Stack (vector search, JSON, graph, time-series).
Valkey is the Linux Foundation OSS continuation of Redis, forked at Redis 7.2.4 in March 2024 immediately after Redis adopted RSALv2/SSPL licensing. It's the canonical OSI-approved path for Redis-compatible in-memory caching, and unusually for an OSS fork, the major hyperscalers committed quickly: AWS ElastiCache, Google Cloud Memorystore, and Oracle Cloud all run Valkey as their managed Redis-compatible offerings. Drop-in replacement for Redis OSS in cache, pub/sub, and streams use cases — but does NOT include the modules that make up Redis Stack (vector search, JSON, graph, time-series).
Valkey's defining trust property is **license predictability with hyperscaler commitment**. When Redis adopted RSALv2/SSPL in March 2024, the cloud providers had a problem: their managed Redis services depended on a now-source-available substrate. Within weeks, AWS, Google, and Oracle pivoted to Valkey, locking in OSS license continuity through hyperscaler-scale operational investment. For agent stacks, this means: choosing Valkey doesn't just give you OSS license posture, it gives you the same substrate the cloud providers themselves are running. The risk shifts from 'will this fork survive?' (it will, AWS depends on it) to 'will the OSS Redis ecosystem fragment between Valkey and Redis 8+ proper?' — a real concern as both projects evolve away from the 7.2.4 fork point.
Sub-millisecond in-memory key-value operations. No cold start in steady state — the cluster is always hot. Cap rule N/A. Same engine as Redis 7.2.4; performance is identical at the fork point and continues to track Redis closely as both projects optimize.
Redis commands (GET, SET, ZADD, MULTI, etc.) are precise and well-documented but not natural language. Cap rule N/A — commands are a standard, not a 'proprietary query language' in the methodology sense, but the score reflects the absence of natural-language semantic comprehension.
RBAC via ACLs since Redis 6 (multi-user with per-key/per-command permissions) but no ABAC. Cap rule applied: 'RBAC-only without ABAC -> cap at 3.' Authentication is password-based via the AUTH command; no entity-attribute-based decisions. To get ABAC for Valkey-cached data, push policy enforcement to the L5 governance layer.
Cluster mode with sharding, Sentinel for HA, multi-cloud via every major cloud's managed offering plus self-hosted. Cap rule N/A — not single-cloud lock-in. Drift detection requires external tooling but data replication is built-in.
Multiple data types (strings, hashes, lists, sets, sorted sets, streams, HyperLogLog, bitmaps), pub/sub messaging, Lua scripting for atomic multi-key operations. Less external system integration than databases with FDW concepts (Postgres) but rich within its in-memory domain. Cap rule N/A.
MONITOR command shows live operations, slowlog tracks slow queries, INFO provides operational stats, latency monitoring built-in. No query plans (it's not a query system), no per-query cost (it's a cache). Cap rule N/A — cost-per-query attribution is conceptually N/A for in-memory cache.
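The slowlog and latency hooks above are tuned in valkey.conf. A minimal sketch, with threshold values that are illustrative assumptions rather than recommendations:

```conf
# Slowlog: record commands slower than 10 ms (value is in microseconds)
slowlog-log-slower-than 10000
slowlog-max-len 256

# Latency monitor: record latency events above 100 ms
latency-monitor-threshold 100
```

Captured entries are then readable at runtime via SLOWLOG GET and LATENCY HISTORY.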
G1=N (no ABAC; ACL is RBAC-only with sub-10ms enforcement but cap rule applies), G2=N (slowlog captures slow operations only, not full access audit by default — needs external tooling), G3=N (cache primitive, not workflow tool — HITL belongs upstream), G4=N (no model versioning concept), G5=N (no AI threat modeling in cache scope), G6=N (LF project doesn't publish compliance mappings). 1/6 -> 2.
O1=Y (INFO/MONITOR integrate cleanly with Datadog/Prometheus/Grafana for APM), O2=N (no native distributed tracing — must come from app layer), O3=N (LLM cost tracking is N/A for a cache, so it scores No under the Yes/No rubric), O4=Y (MTTD <10min achievable with monitoring stack, slowlog for hot spots), O5=N (no drift detection — that's an external concern), O6=N (not an AI decisioning system). 2/6 -> 2.
A1=Y (sub-millisecond p95 — Valkey is faster than the 2s budget by 1000x), A2=Y (in-memory means data freshness <30s is trivial), A3=Y (cache hit rate IS the core metric — easy to monitor and tune), A4=Y (HA via cluster + Sentinel achieves 99.9%+ when properly deployed), A5=Y (Redis-grade scale — hyperscalers run Valkey at multi-million ops/sec), A6=Y (MGET, pipelining, multi-key transactions for parallel retrieval). 6/6 -> 5.
L1=N (no entity resolution — Valkey returns whatever was cached), L2=N (no glossary), L3=N (no disambiguation), L4=N (no continuous learning), L5=Y (a disciplined key-namespacing convention IS terminology alignment; lenient interpretation), L6=N (no human evaluation surface). 1/6 -> 2.
S1=Y (deterministic — cache returns exactly what was stored, every time), S2=Y (typed values, no NULL/missing fields by design), S3=Y (cluster replication keeps replicas eventually consistent; lenient interpretation, matching the Memcached peer at S=5), S4=Y (typed key-value model means schema mismatches are impossible — there's no schema to violate), S5=Y (replication + AOF persistence + RDB snapshots act as 3-stage quality gates), S6=Y (slowlog as anomaly detection signal, latency monitoring identifies outliers). 6/6 -> 5.
Best suited for
Compliance certifications
Valkey the project does not hold compliance certifications. Compliance comes from how you deploy: AWS ElastiCache for Valkey (HIPAA BAA, SOC 2, FedRAMP Moderate, PCI DSS, ISO 27001), Google Cloud Memorystore for Valkey (HIPAA BAA, SOC 2, ISO 27001), self-hosted on FedRAMP-authorized infrastructure (AWS GovCloud, Azure Gov). The LF project doesn't sign BAAs or hold third-party audit reports. For regulated workloads, pick the managed deployment that matches your compliance gate.
Use with caution for
Choose Redis Stack when you need the additional modules (RediSearch for vector/full-text, RedisJSON, RedisGraph, RedisTimeSeries) and accept RSALv2/SSPL licensing. Valkey wins on license predictability and hyperscaler commitment but is missing those modules entirely. If your use case is just cache/pub-sub/streams, Valkey is strictly better. If you need vector search inside Redis specifically, you need Redis Stack or an alternative like pgvector at L1.
Choose Memcached for the simplest possible distributed key-value cache with no clustering, no persistence, no rich types. Valkey wins on data structures (sorted sets, streams, pub/sub), HA via Sentinel/cluster, and persistence. Memcached wins on operational simplicity for stateless cache use cases — fewer features, less to tune, faster startup.
Choose AWS MemoryDB for AWS-native deployments wanting durable Redis-compatible storage with strong consistency (it's not just cache — it's a primary database). Valkey wins on cross-cloud flexibility and OSS license posture; MemoryDB wins on durability guarantees and managed operational burden inside AWS. Note AWS ElastiCache is the cache equivalent and runs on Valkey.
Choose Momento for serverless cache with zero ops and pay-per-request pricing. Valkey wins on portability and predictable cost; Momento wins on operational simplicity (no cluster to manage) and elasticity (scales automatically). Momento makes sense for variable traffic; Valkey makes sense for predictable workloads where you can right-size capacity.
Role: L1 in-memory cache and message broker substrate. Provides key-value lookups, pub/sub, streams, and atomic data structures (sets, sorted sets, hashes) for agent stacks.
Upstream: Receives writes from L2 streaming (CDC results), L4 retrieval (cached embeddings), L7 orchestration (task queues), and direct application caches. Configuration via valkey.conf or config commands.
Downstream: Serves cached reads to L4 retrieval (hot-data lookups), L7 inter-agent messaging (pub/sub fanout), L5 governance (ABAC decision cache, rate-limit counters), and direct application reads.
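With several layers fanning in and out of one cache, a disciplined key-namespacing convention (the terminology-alignment point from the L-dimension scoring) keeps ownership clear. A minimal sketch; the layer prefixes and separator are assumptions, not a Valkey requirement:

```python
# Illustrative layer-prefixed key convention for a shared agent-stack cache.
LAYERS = {"retrieval": "l4", "orchestration": "l7", "governance": "l5"}

def cache_key(layer: str, entity: str, ident: str) -> str:
    """Build a namespaced key like 'l4:embedding:abc123'."""
    prefix = LAYERS[layer]
    return f"{prefix}:{entity}:{ident}"

print(cache_key("retrieval", "embedding", "abc123"))  # l4:embedding:abc123
```

Prefixes like these also map directly onto ACL key patterns (`~l4:*`), so the naming convention doubles as the access-control boundary.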
Mitigation: Audit your Redis usage for module dependencies BEFORE picking Valkey. If you use FT.SEARCH, JSON.SET, GRAPH.QUERY, etc., you need either Redis Stack proper (RSALv2 trade-off) or a different stack. For vector search, consider pgvector at L1; for JSON, native JSONB in Postgres or document store; for time-series, TimescaleDB.
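One way to run that audit is to scan captured command names (e.g. from MONITOR output or a client-side command log) for module prefixes. A minimal sketch; the prefix list covers the common Stack modules but is an assumption, not exhaustive:

```python
# Flag Redis Stack module commands in a list of observed command names.
STACK_PREFIXES = ("FT.", "JSON.", "GRAPH.", "TS.", "BF.", "CF.")

def module_dependencies(commands):
    """Return the (deduplicated, uppercased) commands that need Redis Stack."""
    return sorted({c.upper() for c in commands
                   if c.upper().startswith(STACK_PREFIXES)})

seen = ["GET", "SET", "FT.SEARCH", "json.set", "ZADD", "TS.ADD"]
print(module_dependencies(seen))  # ['FT.SEARCH', 'JSON.SET', 'TS.ADD']
```

An empty result means the workload is plain Redis OSS surface and Valkey is a drop-in target.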
Mitigation: Use cluster mode (3+ nodes) for sharding OR Sentinel for HA. Test failover with a planned reboot and measure RTO. Don't run production agent stacks against a single Valkey instance — a restart loses all cached state and imposes a cold warm-up window.
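A minimal Sentinel sketch for a three-sentinel quorum; the service name, address, and timeouts are illustrative assumptions:

```conf
# sentinel.conf: monitor the primary, require 2 sentinels to agree it is down
sentinel monitor agent-cache 10.0.0.10 6379 2
sentinel down-after-milliseconds agent-cache 5000
sentinel failover-timeout agent-cache 60000
```

Run the same config on three sentinel processes on separate hosts; the quorum of 2 prevents a single isolated sentinel from triggering failover.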
Mitigation: Decide explicitly: cache (AOF off, accept data loss on restart) or persistent store (appendfsync always or everysec). Document the choice. Don't accidentally rely on RDB snapshots — they can lag the in-memory state by minutes.
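The two profiles reduce to a few valkey.conf lines; a sketch of each, with the choice between them being the documented decision above:

```conf
# Cache profile: no persistence, fast restarts, data loss accepted
appendonly no
save ""

# Persistent-store profile: AOF with per-second fsync (uncomment to use)
# appendonly yes
# appendfsync everysec
```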
Mitigation: Enable Redis ACLs (Redis 6+, inherited by Valkey). Create per-service users with command/key restrictions. Don't share the default user across all clients. Denied commands and failed authentications are inspectable via ACL LOG.
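A sketch of per-service users via ACL SETUSER; the usernames, passwords, key patterns, and command lists are illustrative assumptions:

```conf
ACL SETUSER svc-retrieval on >s3cret-a ~l4:* +get +set +mget +ttl
ACL SETUSER svc-orchestrator on >s3cret-b ~l7:* +lpush +brpop +publish
ACL SETUSER default off
```

Disabling the default user forces every client to authenticate as a named service identity, which is what makes the ACL log attributable.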
Mitigation: Reads from replicas may lag the primary by milliseconds. For agent state requiring strict reads, use the WAIT command or read from the primary. Document which agent operations tolerate staleness versus require consistency.
Both managed services run Valkey, so behavior, command semantics, and tuning patterns are identical across clouds. Avoids per-cloud cache divergence. License posture is consistent and OSS.
Use Valkey as a key-value lookup for query-hash → top-K embedding IDs. Misses fall through to pgvector or Pinecone at L1/L4. Hit rate above 60% delivers significant cost reduction on embedding API calls.
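The lookaside pattern above can be sketched as follows. An in-process dict stands in for a Valkey client (a real deployment would use a client library's get/setex), and the key format and TTL are assumptions:

```python
import hashlib
import time

_store = {}          # stand-in for Valkey; maps key -> (expiry_ts, value)
TTL_SECONDS = 3600   # assumed TTL, tune to embedding churn

def _key(query: str) -> str:
    """Hash the query so arbitrary text becomes a short, fixed-width key."""
    digest = hashlib.sha256(query.encode()).hexdigest()[:16]
    return f"emb:{digest}"

def cached_top_k(query: str, compute):
    """Cache-aside lookup: hit returns cached IDs, miss falls through to compute()."""
    key = _key(query)
    entry = _store.get(key)
    if entry is not None and entry[0] > time.time():
        return entry[1]                      # cache hit
    ids = compute(query)                     # miss: call pgvector/Pinecone/etc.
    _store[key] = (time.time() + TTL_SECONDS, ids)
    return ids

calls = []
def expensive(q):
    calls.append(q)
    return ["id1", "id2", "id3"]

cached_top_k("hello", expensive)   # miss: computes
cached_top_k("hello", expensive)   # hit: served from cache
print(len(calls))                  # 1
```

With a real client the only changes are replacing the dict get with a GET and the dict store with SETEX, which gives you TTL enforcement server-side instead of in application code.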
Valkey doesn't have RediSearch. If your existing Redis Stack deployment relies on FT.SEARCH or JSON-based vector storage, migrating to Valkey requires re-architecting. Move vector search to pgvector, Milvus, or Qdrant at L1; keep Valkey at L1 for actual caching.
Valkey is the obvious choice — same API as Redis, BSD-3 license, hyperscaler-managed services available immediately. No reason to start with Redis Stack for cache-only use cases.
This analysis is AI-generated using the INPACT and GOALS frameworks from "Trust Before Intelligence." Scores and assessments are algorithmic and may not reflect the vendor's complete capabilities. Always validate with your own evaluation.