TL;DR
- Concurrency = structure: multiple tasks are in progress at the same time (often interleaved on one CPU). Great for I/O-bound work.
- Parallelism = execution: multiple tasks are executing simultaneously on different cores. Great for CPU-bound work.
- You can have concurrency without parallelism (event loop) and parallelism without concurrency (single big vectorized task). Many systems use both.
- Choose async I/O / event loops for high-latency waits (HTTP/DB), and threads/processes/workers for heavy CPU. Limit concurrency to avoid overload.
One-minute mental model
```text
Time ─────────────────────────────────────────────────────────────

Concurrency (interleaving on one core)
  Task A: ███      ███      ███      ███
  Task B:     ███      ███      ███
  Task C:         ███      ███      ███

Parallelism (multi-core, truly simultaneous)
  Core 1: ███████████████████  Task A
  Core 2: ███████████████████  Task B
  Core 3: ███████████████████  Task C
```
- Concurrency is how you structure a program to make progress on multiple tasks by switching between them when one would otherwise wait.
- Parallelism is about using more hardware (cores/threads/GPUs) to run tasks at the same moment.
Where each shines
| Workload | Concurrency (async/evented) | Parallelism (multi-core/GPU) |
|---|---|---|
| Serve thousands of HTTP requests | ✅ Excellent | ⚠️ Unnecessary unless handlers are CPU-heavy |
| Fetch many APIs / DB queries | ✅ Excellent | ⚠️ Network-bound; parallel cores don’t help much |
| Image/video encoding, ML inference | ⚠️ Limited | ✅ Use threads, processes, SIMD, GPU |
| Data transforms over large arrays | ⚠️ Often limited | ✅ Parallel map/reduce, vectorization |
| Background jobs & pipelines | ✅ Orchestrate many waits | ✅ Parallel-execute CPU steps |
JavaScript examples
Concurrency (event loop, async I/O):
```js
async function fetchAll(ids) {
  // Start all requests first, then await them together.
  const tasks = ids.map(id => fetch(`/api/items/${id}`).then(r => r.json()));
  return Promise.all(tasks);
}
```
Parallelism (Web Workers for CPU-bound work):
```js
// main.js
const worker = new Worker("worker.js");
worker.postMessage({ n: 50_000_000 });
worker.onmessage = (e) => console.log("sum:", e.data);
```

```js
// worker.js
self.onmessage = (e) => {
  let s = 0;
  for (let i = 0; i < e.data.n; i++) s += i;
  postMessage(s);
};
```
JavaScript executes your code on a single main thread; use Web Workers in the browser (or `worker_threads`/`cluster` in Node.js) for CPU parallelism.
Python examples
Concurrency (asyncio for I/O):
```python
import asyncio
import aiohttp  # third-party: pip install aiohttp

async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()

async def main(urls):
    async with aiohttp.ClientSession() as s:
        tasks = [asyncio.create_task(fetch(s, u)) for u in urls]
        return await asyncio.gather(*tasks)

asyncio.run(main(["/a", "/b", "/c"]))
```
Parallelism for CPU (GIL-aware):
```python
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":  # required so worker processes can import this module safely
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(cpu_heavy, [10_000_000] * 4))  # runs on multiple cores
```
Python’s GIL limits CPU-bound threads; use processes (or native extensions/NumPy/Cython) for parallel CPU work. For blocking I/O in threads, `concurrent.futures.ThreadPoolExecutor` is fine, as sketched below.
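A minimal sketch of that thread-pool pattern (the URLs and the `fetch_blocking` helper are illustrative, not from the original):

```python
# Threads work for *blocking* I/O despite the GIL: the GIL is
# released while a thread waits on the network.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen  # stdlib blocking HTTP client

def fetch_blocking(url: str) -> int:
    # Illustrative helper: returns the response size in bytes.
    with urlopen(url, timeout=10) as r:
        return len(r.read())

urls = ["https://example.com"] * 8  # placeholder URLs
with ThreadPoolExecutor(max_workers=8) as pool:
    sizes = list(pool.map(fetch_blocking, urls))  # waits overlap across threads
```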
Go quick glance
Concurrency (goroutines + channels):
```go
package main

import "fmt"

func main() {
	ch := make(chan int)
	for i := 0; i < 3; i++ {
		go func(v int) { ch <- v * v }(i) // each goroutine sends one result
	}
	for i := 0; i < 3; i++ {
		fmt.Println(<-ch)
	}
}
```
Go’s runtime multiplexes goroutines over OS threads. With `GOMAXPROCS` > 1 (the default on multi-core machines), many goroutines also run in parallel.
Pitfalls & fixes
| Problem | Why it hurts | Fix |
|---|---|---|
| `await` in tight loops (JS/Python) | Serializes work | Start tasks first; await together (`Promise.all`, `gather`) |
| Blocking CPU on the event loop | Starvation, timeouts | Offload to threads/processes/workers |
| Too much parallelism | Thrashing, rate limits | Bound concurrency with semaphores/pools/queues (sketch below) |
| Data races (threads) | Corruption, heisenbugs | Use immutable data, message passing, or proper locks/atomics |
| Shared mutable global state | Hard to reason about and test | Prefer pure functions; inject dependencies |
| Forgetting timeouts/cancellation | Hangs & resource leaks | Timeouts, `AbortController`, `asyncio.timeout`, cancellation tokens |
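The “bound concurrency” and timeout fixes above, as a minimal asyncio sketch (the limit of 10, the 5 s timeout, and the `fetch` stand-in are illustrative assumptions):

```python
import asyncio

async def fetch(url: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for a real HTTP call
    return url

async def main():
    sem = asyncio.Semaphore(10)  # cap in-flight tasks; 10 is arbitrary

    async def bounded(url: str) -> str:
        async with sem:                     # at most 10 tasks run at once
            async with asyncio.timeout(5):  # give up on hangs (Python 3.11+)
                return await fetch(url)

    urls = [f"/item/{i}" for i in range(100)]
    return await asyncio.gather(*(bounded(u) for u in urls))

asyncio.run(main())
```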
How to choose (decision guide)
- Is the task mostly waiting on I/O? → Concurrency (async/evented) first.
- Is the task heavy CPU? → Parallelism (threads/processes/GPU).
- Is it a mix? → Orchestrate with concurrency; offload hot CPU parts to parallel workers (see the sketch after this list).
- Running in serverless? → Concurrency saves cold-start cost; parallelism limited by per-function CPU quota.
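For the mixed case, one possible shape (the CPU-heavy `transform` step is illustrative): keep waits on the event loop and push hot CPU work to a process pool via `run_in_executor`.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def transform(data: bytes) -> int:
    # Illustrative CPU-heavy step (e.g. encode/compress/score).
    return sum(data)

async def handle(item: bytes, pool: ProcessPoolExecutor) -> int:
    loop = asyncio.get_running_loop()
    # I/O waits stay on the event loop; CPU work runs on another core.
    return await loop.run_in_executor(pool, transform, item)

async def main():
    with ProcessPoolExecutor() as pool:
        items = [bytes(range(256))] * 8  # placeholder payloads
        return await asyncio.gather(*(handle(i, pool) for i in items))

if __name__ == "__main__":  # required for process pools on spawn platforms
    asyncio.run(main())
```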
Practical patterns
- JS: event loop for I/O + Web Workers/worker threads for CPU sections.
- Python: `asyncio` for I/O; `ProcessPoolExecutor` / `multiprocessing` (or vectorized libs) for CPU.
- Go: goroutines everywhere; tune `GOMAXPROCS` and use worker pools for CPU-heavy parts.
- Queues: use a job queue (SQS, RabbitMQ, Celery, BullMQ) to control fan-out and backpressure; the same idea in-process is sketched below.
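The queue idea, in-process: a bounded `asyncio.Queue` gives backpressure for free, because producers block when it fills (queue size and worker count below are arbitrary choices):

```python
import asyncio

async def producer(q: asyncio.Queue):
    for i in range(100):
        await q.put(i)  # blocks when the queue is full -> backpressure
    for _ in range(4):
        await q.put(None)  # one sentinel per worker signals shutdown

async def worker(q: asyncio.Queue):
    while (job := await q.get()) is not None:
        await asyncio.sleep(0.01)  # stand-in for real job handling

async def main():
    q = asyncio.Queue(maxsize=10)  # bounded: producer slows to consumer pace
    await asyncio.gather(producer(q), *(worker(q) for _ in range(4)))

asyncio.run(main())
```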
Quick checklist
- [ ] Classify tasks: I/O-bound vs CPU-bound.
- [ ] Use async I/O to raise throughput when waiting.
- [ ] Add worker pools for CPU; cap concurrency.
- [ ] Always set timeouts and support cancellation.
- [ ] Measure: event loop lag, p95 latency, CPU %, run queue, GC pressure (lag probe sketched after this list).
- [ ] Keep code race-safe: prefer message passing or immutable data structures.
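For the “measure” item, a minimal event-loop-lag probe (the 100 ms interval is an arbitrary choice): sleep for a known interval and report any extra delay, which is time the loop spent blocked.

```python
import asyncio, time

async def loop_lag_monitor(interval: float = 0.1):
    # A healthy loop wakes ~interval seconds after sleeping; any
    # extra delay is time the loop spent blocked by other work.
    while True:
        start = time.perf_counter()
        await asyncio.sleep(interval)
        lag = time.perf_counter() - start - interval
        print(f"event loop lag: {lag * 1000:.1f} ms")
```

Run it as a background task (`asyncio.create_task(loop_lag_monitor())`) alongside the real workload.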
One-minute adoption plan
- Identify top I/O waits and CPU hotspots (profiling).
- Convert I/O paths to async/await; group awaits.
- Introduce bounded worker pools for CPU (threads/processes/workers).
- Add timeouts, cancellation, and backpressure.
- Monitor & tune: throughput vs. latency vs. cost.