
The Difference Between Concurrency and Parallelism — clear & concise


Concurrency is about dealing with lots of things at once; parallelism is about doing lots of things at once. Here’s the mental model, practical examples, and when to use each.

TL;DR

  • Concurrency = structure: multiple tasks are in progress at the same time (often interleaved on one CPU). Great for I/O-bound work.
  • Parallelism = execution: multiple tasks are executing simultaneously on different cores. Great for CPU-bound work.
  • You can have concurrency without parallelism (event loop) and parallelism without concurrency (single big vectorized task). Many systems use both.
  • Choose async I/O / event loops for high-latency waits (HTTP/DB), and threads/processes/workers for heavy CPU. Limit concurrency to avoid overload.

One-minute mental model

Time ─────────────────────────────────────────────────────────────

Concurrency (interleaving on one core)
Task A: ███      ███      ███      ███
Task B:    ███      ███      ███
Task C:       ███      ███      ███

Parallelism (multi-core, truly simultaneous)
Core 1: ███████████████████  Task A
Core 2: ███████████████████  Task B
Core 3: ███████████████████  Task C

  • Concurrency is how you structure a program to make progress on multiple tasks by switching between them when one would otherwise wait.
  • Parallelism is about using more hardware (cores/threads/GPUs) to run tasks at the same moment.
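
You can watch the interleaving happen in a few lines of Python (a toy sketch: asyncio.sleep stands in for any I/O wait):

import asyncio, time

async def task(name):
    for _ in range(3):
        print(name, "working")
        await asyncio.sleep(0.1)   # yield to the loop; another task runs meanwhile

async def main():
    t0 = time.perf_counter()
    await asyncio.gather(task("A"), task("B"), task("C"))
    print(f"total: {time.perf_counter() - t0:.2f}s")  # ~0.3s, not 0.9s: the waits overlap

asyncio.run(main())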

Where each shines

| Workload | Concurrency (async/evented) | Parallelism (multi-core/GPU) |
|---|---|---|
| Serve thousands of HTTP requests | ✅ Excellent | ⚠️ Unnecessary unless handlers are CPU-heavy |
| Fetch many APIs / DB queries | ✅ Excellent | ⚠️ Network-bound; parallel cores don’t help much |
| Image/video encoding, ML inference | ⚠️ Limited | ✅ Use threads, processes, SIMD, GPU |
| Data transforms over large arrays | ⚠️ Limited | ✅ Parallel map/reduce, vectorization |
| Background jobs & pipelines | ✅ Orchestrate many waits | ✅ Parallel-execute CPU steps |


JavaScript examples

Concurrency (event loop, async I/O):

async function fetchAll(ids) {
  // Start all requests first, then await them together;
  // awaiting inside the loop would serialize the fetches.
  const tasks = ids.map(id => fetch(`/api/items/${id}`).then(r => r.json()));
  return Promise.all(tasks);
}

Parallelism (Web Workers for CPU-bound work):

// main.js
const worker = new Worker("worker.js");
worker.postMessage({ n: 50_000_000 });
worker.onmessage = (e) => console.log("sum:", e.data);

// worker.js
self.onmessage = (e) => {
  let s = 0;
  for (let i = 0; i < e.data.n; i++) s += i; // heavy loop runs off the main thread
  postMessage(s);
};

JavaScript runs your code on a single main thread; use Workers (or server-side worker threads/clusters) for CPU parallelism.


Python examples

Concurrency (asyncio for I/O):

import asyncio, aiohttp

async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()

async def main(urls):
    async with aiohttp.ClientSession() as s:
        tasks = [asyncio.create_task(fetch(s, u)) for u in urls]
        return await asyncio.gather(*tasks)

asyncio.run(main(["/a","/b","/c"]))

Parallelism for CPU (GIL-aware):

from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":  # guard is required: child processes re-import this module
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(cpu_heavy, [10_000_000] * 4))  # runs on multiple cores

Python’s GIL limits CPU-bound threads; use processes (or native extensions like NumPy/Cython) for parallel CPU work. For blocking I/O, threads via concurrent.futures.ThreadPoolExecutor are fine, as sketched below.
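
A minimal sketch of that thread-pool route, using urllib as a stand-in blocking client (the URLs are placeholders):

from concurrent.futures import ThreadPoolExecutor
import urllib.request

def fetch_blocking(url):
    # The thread blocks on the network and releases the GIL,
    # so the other threads keep making progress.
    with urllib.request.urlopen(url) as r:
        return r.read()

urls = ["https://example.com/a", "https://example.com/b"]
with ThreadPoolExecutor(max_workers=8) as pool:
    pages = list(pool.map(fetch_blocking, urls))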


Go quick glance

Concurrency (goroutines + channels):

package main

import "fmt"

func main() {
  ch := make(chan int)
  for i := 0; i < 3; i++ {
    go func(v int) { ch <- v * v }(i) // each goroutine computes and sends one square
  }
  for i := 0; i < 3; i++ { fmt.Println(<-ch) }
}

Go’s runtime multiplexes goroutines onto OS threads. With GOMAXPROCS > 1 (the default on multi-core machines since Go 1.5), many goroutines run in parallel too.


Pitfalls & fixes

| Problem | Why it hurts | Fix |
|---|---|---|
| await in tight loops (JS/Python) | Serializes work | Start tasks first; await together (Promise.all, gather) |
| Blocking CPU on event loop | Starvation, timeouts | Offload to threads/processes/workers |
| Too much parallelism | Thrashing, rate limits | Bound concurrency (semaphores/pools/queues; see sketch below) |
| Data races (threads) | Corruption, heisenbugs | Use immutable data, message passing, or proper locks/atomics |
| Shared mutable global state | Hard to reason about/test | Prefer pure functions; inject dependencies |
| Forgetting timeouts/cancellation | Hangs & resource leaks | Timeouts, AbortController, asyncio.timeout, cancellation tokens |
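
For the “bound concurrency” fix, an asyncio.Semaphore around each request is the usual pattern. A sketch building on the aiohttp example above (the limit of 10 is an arbitrary cap):

import asyncio, aiohttp

async def fetch_bounded(sem, session, url):
    async with sem:                    # at most `limit` requests in flight
        async with session.get(url) as r:
            return await r.text()

async def fetch_all(urls, limit=10):
    sem = asyncio.Semaphore(limit)
    async with aiohttp.ClientSession() as s:
        tasks = [fetch_bounded(sem, s, u) for u in urls]
        return await asyncio.gather(*tasks)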


How to choose (decision guide)

  • Is the task mostly waiting on I/O? → Concurrency (async/evented) first.
  • Is the task heavy CPU? → Parallelism (threads/processes/GPU).
  • Is it a mix? → Orchestrate with concurrency; offload hot CPU parts to parallel workers (sketch after this list).
  • Running in serverless? → Favor concurrency: one instance overlaps many waits, so you spin up fewer instances (and pay fewer cold starts); parallelism is capped by the per-function CPU quota.
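
Here is one way the “mix” case can look in Python: the event loop orchestrates, and a process pool takes the hot CPU step (a sketch reusing cpu_heavy from earlier):

import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    return sum(i * i for i in range(n))

async def handle(pool, n):
    loop = asyncio.get_running_loop()
    # The event loop keeps serving I/O while the math runs in another process.
    return await loop.run_in_executor(pool, cpu_heavy, n)

async def main():
    with ProcessPoolExecutor() as pool:
        return await asyncio.gather(*(handle(pool, 5_000_000) for _ in range(4)))

if __name__ == "__main__":
    print(asyncio.run(main()))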

Practical patterns

  • JS: event loop for I/O + Web Workers/worker threads for CPU sections.
  • Python: asyncio for I/O; ProcessPoolExecutor / multiprocessing (or vectorized libs) for CPU.
  • Go: goroutines everywhere; tune GOMAXPROCS and use worker pools for CPU-heavy parts.
  • Queues: use a job queue (SQS, RabbitMQ, Celery, BullMQ) to control fan-out and backpressure; an in-process sketch of the same idea follows.
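
The same fan-out/backpressure shape with asyncio.Queue (a toy sketch; the queue size and worker count are arbitrary):

import asyncio

async def worker(name, q):
    while True:
        job = await q.get()
        await asyncio.sleep(0.1)        # stand-in for real work
        print(name, "done", job)
        q.task_done()

async def main():
    q = asyncio.Queue(maxsize=100)      # bounded: putters wait when full (backpressure)
    workers = [asyncio.create_task(worker(f"w{i}", q)) for i in range(4)]
    for job in range(10):
        await q.put(job)
    await q.join()                      # block until every job is processed
    for w in workers:
        w.cancel()

asyncio.run(main())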

Quick checklist

  • [ ] Classify tasks: I/O-bound vs CPU-bound.
  • [ ] Use async I/O to raise throughput when waiting.
  • [ ] Add worker pools for CPU; cap concurrency.
  • [ ] Always set timeouts and support cancellation (see the sketch after this checklist).
  • [ ] Measure: event loop lag, p95 latency, CPU %, run queue, GC pressure.
  • [ ] Keep code race-safe: prefer message passing or immutable data structures.
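
For the timeout/cancellation item, a minimal asyncio sketch (asyncio.timeout needs Python 3.11+; older versions can use asyncio.wait_for):

import asyncio

async def slow():
    await asyncio.sleep(10)               # stand-in for a hung dependency

async def main():
    try:
        async with asyncio.timeout(1.0):  # cancels slow() when the deadline passes
            await slow()
    except TimeoutError:
        print("gave up after 1s")

asyncio.run(main())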

One-minute adoption plan

  1. Identify top I/O waits and CPU hotspots (profiling).
  2. Convert I/O paths to async/await; group awaits.
  3. Introduce bounded worker pools for CPU (threads/processes/workers).
  4. Add timeouts, cancellation, and backpressure.
  5. Monitor & tune: throughput vs. latency vs. cost.