TL;DR
- Put CDN/edge in front of anything public. It offloads traffic and slashes tail latency.
- Use a distributed cache (Redis/Memcached) for data shared across instances.
- Add in-memory (per-process) caches for ultra-hot, tiny objects and computed results.
- Start with cache-aside + TTL, then add request coalescing and stale-while-revalidate to tame stampedes.
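The last bullet names two stampede-control techniques; here is a minimal sketch of request coalescing (single-flight) combined with stale-while-revalidate. The `cached` helper, `loadFn`, and the TTL numbers are illustrative assumptions, not from any particular library:

```ts
type Entry<T> = { value: T; freshUntil: number; staleUntil: number };

const entries = new Map<string, Entry<unknown>>();
const inFlight = new Map<string, Promise<unknown>>();

export async function cached<T>(
  key: string,
  loadFn: () => Promise<T>,
  ttlMs = 60_000,        // serve as "fresh" for 1 min
  staleMs = 5 * 60_000,  // then serve stale for up to 5 more min while revalidating
): Promise<T> {
  const now = Date.now();
  const entry = entries.get(key) as Entry<T> | undefined;

  // Fresh hit: serve straight from memory.
  if (entry && now < entry.freshUntil) return entry.value;

  // Request coalescing: every concurrent miss for this key shares one origin call.
  let pending = inFlight.get(key) as Promise<T> | undefined;
  if (!pending) {
    pending = loadFn()
      .then((value) => {
        const t = Date.now();
        entries.set(key, { value, freshUntil: t + ttlMs, staleUntil: t + ttlMs + staleMs });
        return value;
      })
      .finally(() => inFlight.delete(key));
    inFlight.set(key, pending);
  }

  // Stale-while-revalidate: return the old value now, let the refresh finish in the background.
  if (entry && now < entry.staleUntil) {
    pending.catch(() => {}); // a failed background refresh shouldn't crash this request
    return entry.value;
  }

  // Cold miss (or too stale): wait for the origin.
  return pending;
}
```

All concurrent misses for a key share one origin call, and once a value exists it keeps being served (slightly stale) while the refresh runs in the background.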
Why cache at all?
Caching trades freshness for speed & scale.
- Lower latency: serve from memory/edge (micro- to milliseconds).
- Higher throughput: fewer DB and origin hits.
- Resilience: withstand traffic spikes and partial outages.
- Lower cost: less egress, fewer DB reads, smaller fleets.
The layers (mental map)
- Browser cache: honors `Cache-Control`, `ETag`, and `Last-Modified` (see the sketch after this list).
- CDN/Edge (reverse proxy): Cloudflare/Fastly/Akamai; caches full responses at POPs.
- App-side caches:
  - In-memory (process-local): fastest, not shared across instances.
  - Distributed (Redis/Memcached): shared, network hop required.
- DB/query cache: results/materialized views; often downstream of app caches.
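To make the browser and CDN layers concrete, here is a minimal Node/TS origin handler that emits those headers and answers conditional requests. The route, max-age values, and ETag scheme are illustrative choices, not prescriptions:

```ts
import { createServer } from "node:http";
import { createHash } from "node:crypto";

// Origin handler: emits the caching headers that browsers and CDNs act on.
createServer((req, res) => {
  const body = JSON.stringify({ ok: true });
  const etag = `"${createHash("sha1").update(body).digest("hex")}"`; // strong ETag derived from the body

  // Conditional GET: the client already has this version, so answer 304 with no body.
  if (req.headers["if-none-match"] === etag) {
    res.writeHead(304);
    res.end();
    return;
  }

  res.writeHead(200, {
    "Content-Type": "application/json",
    // Browsers may reuse for 60s; shared caches (CDNs) may keep it for 300s.
    "Cache-Control": "public, max-age=60, s-maxage=300",
    ETag: etag,
  });
  res.end(body);
}).listen(3000);
```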
Core strategies
1) In-memory (process-local)
Use for: super-hot config, small computed results, feature flags, rate-limited calls.
Pros: nanosecond lookup, zero network.
Cons: not shared, so each instance can serve stale data until the entry's TTL expires (or the process restarts); the same data is duplicated in every instance's memory.
Example (LRU + TTL, Node/TS):
import { LRUCache } from "lru-cache"; // recent versions of lru-cache use a named export
const cache = new LRUCache<string, any>({ max: 5_000, ttl: 60_000 }); // 5k entries, 1 min TTL
export async function getUserFast(id: string) {
  const key = `user:${id}`;
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // a hit (even a falsy value) skips the DB read
  // cache-aside: on a miss, read the source of truth, then populate the cache
  const user = await db.users.findById(id);
  if (user != null) cache.set(key, user); // cache only found users; misses need their own policy
  return user;
}
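One deliberate choice in the example above: a lookup that finds nothing is not cached, so repeated requests for a missing id keep hitting the database. If that matters for your workload, cache a small "not found" sentinel with a short TTL (negative caching).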