
Caching Strategies

A cache is a fast, temporary data store that sits in front of a slower data source and serves repeated reads without hitting the origin.

Caching is one of the highest-leverage performance optimizations available — a well-placed cache can reduce database queries by 90%+ and cut p99 latency from hundreds of milliseconds to single digits. It is also one of the most dangerous: a poorly implemented cache serves stale data, causes cache stampedes, and creates consistency bugs that are hard to reproduce.

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton


Cache Layers

Caches exist at multiple layers. Understanding which layer you’re working with determines which strategies apply.

| Layer | Technology | What It Caches | Typical Latency |
|---|---|---|---|
| Browser | HTTP cache headers | Static assets, API responses | ~0 ms (local disk) |
| CDN | CloudFront, Cloudflare, Fastly | Static assets, cacheable HTML/API | ~5–50 ms (edge node) |
| Application / in-process | In-memory dict, Guava cache | Frequently accessed objects (per instance) | <1 ms |
| Distributed cache | Redis, Memcached | Shared data across all app instances | 1–5 ms |
| Database query cache | Postgres, MySQL (deprecated) | Query result sets | ~1 ms (avoided in modern DBs) |

Cache-Aside (Lazy Loading)

The most common pattern. The application manages the cache explicitly: check the cache first, fetch from the DB on a miss, then populate the cache.

```python
def get_user(user_id: int) -> User:
    # 1. Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return deserialize(cached)
    # 2. Cache miss: fetch from DB
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)
    # 3. Populate cache with a TTL (setex takes key, seconds, value)
    redis.setex(f"user:{user_id}", 300, serialize(user))
    return user
```

Advantages:

  • Only data that’s actually requested gets cached (no wasted memory on cold data)
  • Cache fails gracefully — if Redis is down, the app still works (just slower)

Disadvantages:

  • First request after a cache miss (or cold start) hits the DB — potential latency spike
  • Cache stampede: if a popular key expires, hundreds of requests simultaneously miss and all hit the DB at once

Stampede mitigation:

  • Mutex/lock: Only one request fetches from DB; others wait for the cache to populate
  • Probabilistic Early Expiration: Randomly refresh the cache slightly before it expires, based on estimated recomputation cost
  • Stale-while-revalidate: Return the stale value immediately and refresh in the background
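The mutex approach can be sketched in-process with a `threading.Lock` and a plain dict standing in for Redis (the `fetch_from_db` helper and key names are illustrative; a distributed cache would use `SET NX` with a TTL as the lock instead):

```python
import threading

cache: dict[str, str] = {}
cache_lock = threading.Lock()

def fetch_from_db(key: str) -> str:
    # Stand-in for the real (slow) database query
    return f"value-for-{key}"

def get_with_lock(key: str) -> str:
    value = cache.get(key)
    if value is not None:
        return value
    # Cache miss: only one thread fetches; the others block briefly,
    # then find the freshly populated entry on the re-check.
    with cache_lock:
        value = cache.get(key)  # re-check after acquiring the lock
        if value is None:
            value = fetch_from_db(key)
            cache[key] = value
    return value
```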

Write Strategies

How you write data determines your consistency guarantees and performance trade-offs.

Write-Through

Write to the cache and the database simultaneously, in the same operation.

```
Write ──► [Cache] ──► [Database]   (both or neither)
```

  • Consistency: the cache is always up to date
  • Latency: write latency equals DB write latency (the cache doesn’t speed up writes)
  • Risk: the cache fills with data that may never be read (write-heavy, low-read workloads waste cache memory)
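A minimal in-process sketch of write-through, with plain dicts standing in for the database and cache (all names here are illustrative):

```python
db: dict[int, dict] = {}
cache: dict[str, dict] = {}

def write_through(user_id: int, user: dict) -> None:
    # Both writes happen synchronously on the request path,
    # so the cache never lags behind the database.
    db[user_id] = user
    cache[f"user:{user_id}"] = user
```

In a real system the two writes should succeed or fail together, e.g. by invalidating the cache entry if the DB write raises.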

Write-Behind (Write-Back)

Write to the cache immediately; asynchronously flush to the database later.

```
Write ──► [Cache]
             └──► [async flush] ──► [Database]
```

  • Consistency: risk of data loss if the cache node fails before flushing
  • Latency: very fast writes (return as soon as the cache is updated)
  • Use for: high-write, low-criticality data (analytics events, view counts, activity feeds)
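A minimal sketch of the write-behind flow, using a `queue.Queue` and a background thread as the async flusher (dicts stand in for the cache and database; a production version would batch writes and handle retries):

```python
import queue
import threading

cache: dict[str, int] = {}
db: dict[str, int] = {}
flush_queue: queue.Queue = queue.Queue()

def write_behind(key: str, value: int) -> None:
    cache[key] = value             # fast path: done as soon as the cache is updated
    flush_queue.put((key, value))  # the database write is deferred

def flush_worker() -> None:
    while True:
        key, value = flush_queue.get()
        db[key] = value            # asynchronous flush to the slower store
        flush_queue.task_done()

threading.Thread(target=flush_worker, daemon=True).start()
```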

Write-Around

Skip the cache on writes and write directly to the database. The cache is populated only on subsequent reads (the cache-aside pattern).

  • Use for: data written once but read infrequently, where you don’t want to pollute the cache with entries that may never be re-read

Cache Invalidation

The hardest problem in caching: deciding when and how to evict stale data.

Time-to-Live (TTL)

The simplest approach: every cache entry expires after a fixed duration.

```python
redis.setex("product:42", 3600, serialize(product))  # 1-hour TTL (key, seconds, value)
```

  • Simple: no explicit invalidation code needed
  • Trade-off: data can be stale for up to TTL seconds; too low a TTL defeats the purpose of caching

TTL guidelines by data type:

| Data Type | Recommended TTL |
|---|---|
| User profile | 5–15 minutes |
| Product catalog | 1–24 hours |
| Configuration / feature flags | 1–30 minutes |
| Session data | Equal to session expiry |
| Rate-limit counters | Equal to the rate-limit window |

Explicit Invalidation

When data changes, explicitly delete or update the cache entry.

```python
def update_user(user_id: int, data: dict):
    db.update("users", user_id, data)
    redis.delete(f"user:{user_id}")  # force a re-fetch on the next read
```

  • More consistent than TTL alone: the cache is invalidated the moment data changes
  • More complex: every write path must remember to invalidate the relevant cache keys

Eviction Policies

When a cache reaches its memory limit, it must evict entries to make room. Choose the policy based on your access patterns.

| Policy | Behavior | Best For |
|---|---|---|
| LRU (Least Recently Used) | Evict the item that hasn’t been accessed the longest | General-purpose workloads (a common Redis choice) |
| LFU (Least Frequently Used) | Evict the item accessed least often | Workloads with clear hot vs. cold items |
| FIFO | Evict the oldest item regardless of access | Simple, fair eviction |
| Random | Evict a random item | Very fast; used in some CPU caches |
| TTL-based | Evict expired items first | Always-on background eviction |
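As a reference point for how LRU behaves, here is a minimal sketch built on `collections.OrderedDict` (the class name and capacity are arbitrary):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used item
```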

Redis allows configuring the eviction policy globally: `maxmemory-policy allkeys-lru` is a safe default for most use cases.
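In `redis.conf`, that pairing looks like the following (the 2 GB limit is an arbitrary example; eviction only triggers once `maxmemory` is set):

```
maxmemory 2gb
maxmemory-policy allkeys-lru
```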


CDN Caching

A CDN (Content Delivery Network) caches content at edge nodes geographically close to users. Requests are served by the nearest edge node rather than by your origin server.

CDN behavior is controlled via HTTP headers:

```
Cache-Control: public, max-age=86400        # Cache for 24 hours, anywhere
Cache-Control: private, max-age=300         # Browser cache only, not CDN
Cache-Control: no-cache                     # Must revalidate before serving
Cache-Control: no-store                     # Never cache (auth pages, etc.)
Cache-Control: s-maxage=3600, max-age=60    # CDN: 1 hr; browser: 1 min
```

Stale-While-Revalidate

A powerful directive: serve the stale cached response immediately while refreshing it in the background.

```
Cache-Control: max-age=60, stale-while-revalidate=300
```

The user gets a fast response; the CDN updates the cached content asynchronously. Users may see content up to 5 minutes old, but never experience a slow cache miss.

Cache Busting

When you deploy new static assets, you want users to get the new version immediately, but their browsers have the old version cached. The solution: include a content hash in the filename.

```html
<!-- Before: the browser caches app.js; the old version is served indefinitely -->
<script src="/app.js"></script>

<!-- After: the filename changes on every deploy; the cache is busted automatically -->
<script src="/app.d4f3a82.js"></script>
```

This way, the cache max-age can be very long (years) for hashed files, since the URL itself changes when content changes.
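A build step can generate the hashed filename in a few lines of Python; this is a sketch (the 7-character truncation mirrors the example above and is an arbitrary choice):

```python
import hashlib
from pathlib import Path

def hashed_name(filename: str, content: bytes) -> str:
    # Insert a short content digest between the stem and the extension,
    # e.g. app.js -> app.d4f3a82.js
    digest = hashlib.sha256(content).hexdigest()[:7]
    p = Path(filename)
    return f"{p.stem}.{digest}{p.suffix}"
```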


Redis Patterns

Redis is the de facto standard distributed cache. Beyond simple key-value storage, it supports data structures that enable more sophisticated patterns.

| Structure | Commands | Use Case |
|---|---|---|
| String | GET, SET, INCR | Simple cache entries, counters, rate limiting |
| Hash | HGET, HSET, HGETALL | Object fields (user profile fields individually) |
| List | LPUSH, RPOP | Queues, activity feeds, recent items |
| Set | SADD, SMEMBERS, SINTER | Unique visitors, tag membership, deduplication |
| Sorted Set | ZADD, ZRANGE | Leaderboards, ranked results, sliding-window rate limits |
For example, a fixed-window rate limiter built on INCR and EXPIRE:

```python
import time

def is_rate_limited(user_id: str, limit: int = 100, window_seconds: int = 60) -> bool:
    # One counter per user per time window
    key = f"rate:{user_id}:{int(time.time() / window_seconds)}"
    count = redis.incr(key)
    if count == 1:
        redis.expire(key, window_seconds)  # set the TTL on the first increment
    return count > limit
```

When Not to Cache

Caching is not always the answer. Avoid caching when:

  • Data changes frequently: If the cache hit rate is low, you’re adding complexity for no benefit
  • Stale data is unacceptable: Financial balances, inventory counts, anything where users must see exact current state
  • The operation is not idempotent: Caching the result of a write operation or side-effectful function is dangerous
  • The source is already fast: Caching an in-memory lookup or a well-indexed simple DB query adds overhead without meaningful benefit