Caching Strategies
A cache is a fast, temporary data store that sits in front of a slower data source and serves repeated reads without hitting the origin.
Caching is one of the highest-leverage performance optimizations available — a well-placed cache can reduce database queries by 90%+ and cut p99 latency from hundreds of milliseconds to single digits. It is also one of the most dangerous: a poorly implemented cache serves stale data, causes cache stampedes, and creates consistency bugs that are hard to reproduce.
“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
Where Caching Lives
Caches exist at multiple layers. Understanding which layer you’re working with determines what strategies apply.
| Layer | Technology | What It Caches | Typical Latency |
|---|---|---|---|
| Browser | HTTP cache headers | Static assets, API responses | 0ms (local disk) |
| CDN | CloudFront, Cloudflare, Fastly | Static assets, cacheable HTML/API | ~5–50ms (edge node) |
| Application / In-process | In-memory dict, Guava cache | Frequently accessed objects (per-instance) | <1ms |
| Distributed Cache | Redis, Memcached | Shared data across all app instances | 1–5ms |
| Database Query Cache | Postgres, MySQL (deprecated) | Query result sets | ~1ms (avoided in modern DBs) |
Cache-Aside (Lazy Loading)
The most common pattern. The application manages the cache explicitly: check the cache first, fetch from the DB on a miss, then populate the cache.
```python
def get_user(user_id: int) -> User:
    # 1. Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return deserialize(cached)

    # 2. Cache miss — fetch from DB
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)

    # 3. Populate cache with a 300-second TTL
    redis.setex(f"user:{user_id}", 300, serialize(user))

    return user
```

Advantages:
- Only data that’s actually requested gets cached (no wasted memory on cold data)
- Cache fails gracefully — if Redis is down, the app still works (just slower)
Disadvantages:
- First request after a cache miss (or cold start) hits the DB — potential latency spike
- Cache stampede: if a popular key expires, hundreds of requests simultaneously miss and all hit the DB at once
Stampede mitigation:
- Mutex/lock: Only one request fetches from DB; others wait for the cache to populate
- Probabilistic Early Expiration: Randomly refresh the cache slightly before it expires, based on estimated recomputation cost
- Stale-while-revalidate: Return the stale value immediately and refresh in the background
Write Strategies
How you write data determines consistency guarantees and performance tradeoffs.
Write-Through
Write to the cache and the database simultaneously, in the same operation.
```
Write ──► [Cache] ──► [Database]   (both or neither)
```

- Consistency: Cache is always up-to-date
- Latency: Write latency = DB write latency (cache doesn’t speed up writes)
- Risk: Cache fills with data that may never be read (write heavy, low read ratio = wasted cache)
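A minimal write-through sketch, using plain dicts to stand in for the database and the cache (the helper name is hypothetical):

```python
def write_through(key, value, db: dict, cache: dict) -> None:
    # Write to the database and the cache as one logical operation;
    # if the DB write raises, the cache line below never runs,
    # so the cache cannot get ahead of the database.
    db[key] = value
    cache[key] = value
```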
Write-Behind (Write-Back)
Write to the cache immediately; asynchronously flush to the database later.
```
Write ──► [Cache]
             └──► [async flush] ──► [Database]
```

- Consistency: Risk of data loss if cache node fails before flushing
- Latency: Very fast writes (return as soon as cache is updated)
- Use for: High-write, low-criticality data (analytics events, view counts, activity feeds)
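A write-behind sketch with a dict standing in for the database. The `WriteBehindCache` class and `flush` method are hypothetical names; a production version would flush from a background worker on a timer and handle flush failures:

```python
class WriteBehindCache:
    """Writes land in the cache immediately; dirty keys are
    flushed to the database later in a batch."""
    def __init__(self, db: dict):
        self.db = db
        self.cache = {}
        self.dirty = set()  # keys written but not yet persisted

    def put(self, key, value):
        self.cache[key] = value  # fast path: only the cache is touched
        self.dirty.add(key)

    def flush(self):
        # In production this runs on a timer or background worker.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```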
Write-Around
Skip the cache on writes — write directly to the database. Cache is populated only on subsequent reads (cache-aside pattern).
- Use for: Data written once but read infrequently — you don’t want to pollute the cache with data that may never be re-read
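Sketched with dict stand-ins for the database and cache (helper names are hypothetical):

```python
def write_around(key, value, db: dict, cache: dict) -> None:
    db[key] = value  # cache deliberately skipped on the write path

def read_through(key, db: dict, cache: dict):
    if key in cache:
        return cache[key]
    value = db[key]      # miss: fall back to the database
    cache[key] = value   # populate on read (cache-aside)
    return value
```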
Cache Invalidation Strategies
The hardest problem in caching: when and how to evict stale data.
TTL (Time-To-Live)
Simplest approach: every cache entry expires after a fixed duration.
```python
redis.setex("product:42", 3600, serialize(product))  # 1-hour TTL
```

- Simple: No explicit invalidation code needed
- Trade-off: Data can be stale for up to `TTL` seconds; too low a TTL defeats the purpose of caching
TTL guidelines by data type:
| Data Type | Recommended TTL |
|---|---|
| User profile | 5–15 minutes |
| Product catalog | 1–24 hours |
| Configuration / feature flags | 1–30 minutes |
| Session data | Equal to session expiry |
| Rate limit counters | Equals the rate limit window |
Event-Driven Invalidation
When data changes, explicitly delete or update the cache entry.
```python
def update_user(user_id: int, data: dict):
    db.update("users", user_id, data)
    redis.delete(f"user:{user_id}")  # Force re-fetch on next read
```

- More consistent than TTL alone — cache is invalidated the moment data changes
- More complex — every write path must remember to invalidate the relevant cache keys
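One way to keep write paths honest is a single helper that knows every key derived from an entity, so no caller can forget one. The key names below are hypothetical examples:

```python
def invalidate_user(cache: dict, user_id: int) -> None:
    # Every key derived from this user must be dropped together;
    # forgetting one of them is the classic source of stale reads.
    derived_keys = (
        f"user:{user_id}",           # the profile itself
        f"user:{user_id}:settings",  # a per-user sub-object
        "users:recent",              # a list that may include this user
    )
    for key in derived_keys:
        cache.pop(key, None)  # like Redis DEL: absent keys are ignored
```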
Eviction Policies
When a cache reaches its memory limit, it must evict old entries. Choose the policy based on your access patterns.
| Policy | Behavior | Best For |
|---|---|---|
| LRU (Least Recently Used) | Evict the item that hasn’t been accessed the longest | General-purpose (Redis default) |
| LFU (Least Frequently Used) | Evict the item accessed least often | Workloads with clear hot vs. cold items |
| FIFO | Evict the oldest item regardless of access | Simple, fair eviction |
| Random | Evict a random item | Very fast, used in some CPU caches |
| TTL-based | Evict expired items first | Always-on background eviction |
Redis allows configuring eviction policy globally: `maxmemory-policy allkeys-lru` is a safe default for most use cases.
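For intuition, LRU itself fits in a few lines of Python on top of `collections.OrderedDict` (a simplified in-process sketch, not how Redis implements eviction internally):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least-recently-used entry once capacity is exceeded."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```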
CDN Caching
A CDN (Content Delivery Network) caches content at edge nodes geographically close to users. Requests are served by the nearest edge node rather than your origin server.
Cache-Control Headers
CDN behavior is controlled via HTTP headers:
```
Cache-Control: public, max-age=86400       # Cache for 24 hours, anywhere
Cache-Control: private, max-age=300        # Only browser cache, not CDN
Cache-Control: no-cache                    # Must revalidate before serving
Cache-Control: no-store                    # Never cache (auth pages, etc.)
Cache-Control: s-maxage=3600, max-age=60   # CDN: 1hr, browser: 1min
```

Cache-Control: stale-while-revalidate
A powerful directive: serve the stale cached response immediately while refreshing in the background.
```
Cache-Control: max-age=60, stale-while-revalidate=300
```

The user gets a fast response; the CDN updates the cached content asynchronously. Users see content up to 5 minutes old, but never experience a slow cache miss.
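The same idea can be applied inside an application cache. A simplified synchronous sketch follows; the `SWRCache` name is illustrative, and a real implementation would run the refresh in a background task rather than inline:

```python
import time

class SWRCache:
    """Entries carry a freshness deadline. Stale entries are still
    served immediately while a refresh repopulates the cache."""
    def __init__(self, max_age: float, loader):
        self.max_age = max_age
        self.loader = loader          # callable: key -> fresh value
        self.entries = {}             # key -> (value, fresh_until)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.entries.get(key)
        if entry and now < entry[1]:
            return entry[0]           # fresh hit
        if entry:
            stale_value = entry[0]
            self._refresh(key, now)   # in production: background task
            return stale_value        # serve stale, never block the caller
        self._refresh(key, now)       # cold miss: must load synchronously
        return self.entries[key][0]

    def _refresh(self, key, now):
        self.entries[key] = (self.loader(key), now + self.max_age)
```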
Cache Busting
When you deploy new static assets, you want users to get the new version immediately — but their browser has the old version cached. The solution: include a content hash in the filename.
```html
<!-- Before: browser caches app.js — old version served indefinitely -->
<script src="/app.js"></script>

<!-- After: filename changes on every deploy — cache busted automatically -->
<script src="/app.d4f3a82.js"></script>
```

This way, the cache max-age can be very long (years) for hashed files, since the URL itself changes when content changes.
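Build pipelines typically derive that hash from the file contents. A minimal sketch using a truncated SHA-256 digest (the `hashed_filename` helper is hypothetical; real bundlers bake this in):

```python
import hashlib
from pathlib import Path

def hashed_filename(path: str, content: bytes) -> str:
    # A short content digest in the name: any change to the bytes
    # changes the URL, so browsers fetch the new version.
    digest = hashlib.sha256(content).hexdigest()[:7]
    p = Path(path)
    return f"{p.stem}.{digest}{p.suffix}"
```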
Redis Patterns
Redis is the de facto standard distributed cache. Beyond simple key-value storage, it supports data structures that enable sophisticated patterns.
Common Data Structures
| Structure | Commands | Use Case |
|---|---|---|
| String | GET, SET, INCR | Simple cache entries, counters, rate limiting |
| Hash | HGET, HSET, HGETALL | Object fields (user profile fields individually) |
| List | LPUSH, RPOP | Queues, activity feeds, recent items |
| Set | SADD, SMEMBERS, SINTER | Unique visitors, tag membership, deduplication |
| Sorted Set | ZADD, ZRANGE | Leaderboards, ranked results, sliding window rate limits |
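To make the sorted-set pattern concrete, here is a pure-Python sketch of the leaderboard operations that Redis `ZADD`, `ZINCRBY`, and `ZREVRANGE` provide. The `Leaderboard` class is illustrative, not a Redis client:

```python
class Leaderboard:
    """Pure-Python stand-in for a Redis sorted set used as a leaderboard."""
    def __init__(self):
        self.scores = {}  # member -> score

    def zadd(self, member: str, score: float) -> None:
        self.scores[member] = score

    def zincrby(self, member: str, delta: float) -> None:
        self.scores[member] = self.scores.get(member, 0) + delta

    def top(self, n: int):
        # Equivalent to ZREVRANGE 0 n-1 WITHSCORES: highest scores first.
        return sorted(self.scores.items(), key=lambda kv: kv[1], reverse=True)[:n]
```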
Rate Limiting with Redis
```python
import time

def is_rate_limited(user_id: str, limit: int = 100, window_seconds: int = 60) -> bool:
    key = f"rate:{user_id}:{int(time.time() / window_seconds)}"
    count = redis.incr(key)
    if count == 1:
        redis.expire(key, window_seconds)  # Set TTL on first increment
    return count > limit
```

When NOT to Cache
Caching is not always the answer. Avoid caching when:
- Data changes frequently: If the cache hit rate is low, you’re adding complexity for no benefit
- Stale data is unacceptable: Financial balances, inventory counts, anything where users must see exact current state
- The operation is not idempotent: Caching the result of a write operation or side-effectful function is dangerous
- The source is already fast: Caching an in-memory lookup or a well-indexed simple DB query adds overhead without meaningful benefit