
Pagination That Scales: Offset vs Cursor vs Keyset

Trade-offs, pitfalls, and production-ready code with SQL, Node, FastAPI, MongoDB and Elasticsearch examples.

TL;DR

  • Offset: simplest, supports “jump to page N”, but slows down as N grows and can miss/duplicate on hot tables.
  • Cursor (API) + Keyset (DB “seek”): fastest and most stable under write load, great for infinite scroll and “next/prev”. Harder to implement, no true random access without extra work.
  • Choose: Use offset for small/admin/analytics pages that need page numbers. Use cursor+keyset for feeds, inboxes, search results at scale.

The Data Model We’ll Use

-- PostgreSQL
CREATE TABLE posts (
  id           BIGSERIAL PRIMARY KEY,
  title        TEXT NOT NULL,
  body         TEXT NOT NULL,
  status       TEXT NOT NULL CHECK (status IN ('DRAFT','PUBLISHED')),
  published_at TIMESTAMPTZ NOT NULL DEFAULT now(),  -- immutable once published
  author_id    BIGINT NOT NULL
);
-- Useful composite index for our sort/filter:
CREATE INDEX idx_posts_pub_desc ON posts (published_at DESC, id DESC) WHERE status='PUBLISHED';

Use an immutable sort column (e.g., published_at when the post becomes visible). If the sort field changes after publication, pages will “reshuffle”.
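
If published_at can drift (rescheduling, re-publishing), one option is to freeze a separate column on first publish and sort on that instead. A minimal sketch, assuming publishing happens via an UPDATE and first_published_at is never overwritten afterwards:

-- Freeze the sort key the first time a post goes live.
ALTER TABLE posts ADD COLUMN first_published_at TIMESTAMPTZ;

UPDATE posts
SET status = 'PUBLISHED',
    first_published_at = COALESCE(first_published_at, now())  -- set once, never changed afterwards
WHERE id = $1;

-- The index and ORDER BY would then use (first_published_at DESC, id DESC).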


1) Offset Pagination

Pros

  • Dead simple (LIMIT ... OFFSET ...)
  • Supports page numbers (“go to page 7”)
  • Easy to add total counts

Cons

  • Slow for large offsets: the DB still has to read and discard every skipped row
  • Unstable under writes: inserts/deletes cause gaps/duplication across pages
  • Expensive with complex filters unless carefully indexed

SQL

-- Page params: page (1-based), page_size
-- Compute offset = (page-1)*page_size
SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
ORDER BY published_at DESC, id DESC
LIMIT $1 OFFSET $2;  -- $1=page_size, $2=offset

Node (Express + node-postgres)

import express from "express";
import { Pool } from "pg";
const app = express();
const pool = new Pool();

app.get("/posts", async (req, res) => {
  const pageSize = Math.min(Number(req.query.page_size) || 20, 100);
  const page = Math.max(Number(req.query.page) || 1, 1);
  const offset = (page - 1) * pageSize;

  const { rows } = await pool.query(
    `SELECT id, title, published_at
     FROM posts
     WHERE status='PUBLISHED'
     ORDER BY published_at DESC, id DESC
     LIMIT $1 OFFSET $2`,
    [pageSize, offset]
  );

  const count = await pool.query(
    `SELECT count(*)::int AS total FROM posts WHERE status='PUBLISHED'`
  );

  res.json({
    page, page_size: pageSize,
    total: count.rows[0].total,
    data: rows
  });
});

When it’s fine: small catalogs, admin panels, reports, search with “page 12 of 40”.


2) Cursor (API) + Keyset (DB “seek”) Pagination

Idea: Instead of skipping rows, start after the last row you saw using a cursor (opaque token) and a keyset (DB predicate like “(published_at,id) < (last_published_at,last_id)”).

Pros

  • Consistent under writes (no drift)
  • Fast for deep pages (no big skip)
  • Perfect for infinite scroll, “Load more”

Cons

  • More code (cursors, next/prev)
  • No native “jump to page 37”
  • Need a stable, unique ordering (use tie-breaker id)

Cursor shape

Make the cursor an opaque base64 of the last row’s sort keys (and direction).

// cursor.ts
export function encodeCursor(v: unknown) {
  return Buffer.from(JSON.stringify(v)).toString("base64url");
}
export function decodeCursor<T = any>(c?: string | null): T | null {
  if (!c) return null;
  try {
    return JSON.parse(Buffer.from(c, "base64url").toString("utf8"));
  } catch {
    return null; // malformed client-supplied cursor -> treat as "no cursor"
  }
}

SQL (PostgreSQL)

-- Forward page (newer -> older): ORDER BY published_at DESC, id DESC
-- Input: last_published_at, last_id (from cursor), page_size
-- First page: no cursor => no WHERE
SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
  AND (
    ($1 IS NULL AND $2 IS NULL)
    OR (published_at, id) < ($1::timestamptz, $2::bigint)
  )
ORDER BY published_at DESC, id DESC
LIMIT $3; -- page_size

Handle previous page by reversing the comparison and order, then reverse the result in app code:

-- Previous page: rows "after" the current first item (DESC timeline),
-- so we need > and ASC to walk backwards, then reverse in app.
SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
  AND (published_at, id) > ($1::timestamptz, $2::bigint)
ORDER BY published_at ASC, id ASC
LIMIT $3;

Node (Express) full route with next/prev

import express from "express";
import { Pool } from "pg";
import { encodeCursor, decodeCursor } from "./cursor";
const app = express();
const pool = new Pool();

type Cursor = { published_at: string; id: number; dir: "next"|"prev" };

app.get("/feed", async (req, res) => {
  const pageSize = Math.min(Number(req.query.page_size) || 20, 100);
  const cursorParam = String(req.query.cursor || "");
  const cursor = decodeCursor<Cursor>(cursorParam);
  const isPrev = cursor?.dir === "prev";

  if (!isPrev) {
    // forward
    const args = [cursor?.published_at ?? null, cursor?.id ?? null, pageSize + 1];
    const { rows } = await pool.query(
      `SELECT id, title, published_at
       FROM posts
       WHERE status='PUBLISHED'
         AND (($1 IS NULL AND $2 IS NULL) OR (published_at, id) < ($1::timestamptz, $2::bigint))
       ORDER BY published_at DESC, id DESC
       LIMIT $3`, args);

    const hasMore = rows.length > pageSize;
    const data = hasMore ? rows.slice(0, pageSize) : rows;

    const nextCursor = hasMore
      ? encodeCursor({ published_at: data[data.length-1].published_at, id: data[data.length-1].id, dir: "next" })
      : null;
    const prevCursor = data.length
      ? encodeCursor({ published_at: data[0].published_at, id: data[0].id, dir: "prev" })
      : null;

    res.json({ data, next_cursor: nextCursor, prev_cursor: prevCursor });
  } else {
    // previous: walk ASC then reverse
    const args = [cursor.published_at, cursor.id, pageSize + 1];
    const { rows } = await pool.query(
      `SELECT id, title, published_at
       FROM posts
       WHERE status='PUBLISHED'
         AND (published_at, id) > ($1::timestamptz, $2::bigint)
       ORDER BY published_at ASC, id ASC
       LIMIT $3`, args);

    const hasMorePrev = rows.length > pageSize;
    const page = hasMorePrev ? rows.slice(0, pageSize) : rows; // take the closest items
    const data = page.reverse();

    const nextCursor = data.length
      ? encodeCursor({ published_at: data[data.length-1].published_at, id: data[data.length-1].id, dir: "next" })
      : null;
    const prevCursor = hasMorePrev
      ? encodeCursor({ published_at: data[0].published_at, id: data[0].id, dir: "prev" })
      : null;

    res.json({ data, next_cursor: nextCursor, prev_cursor: prevCursor });
  }
});

Python (FastAPI + SQLAlchemy Core)

from fastapi import FastAPI, Query
from base64 import urlsafe_b64decode, urlsafe_b64encode
import json
from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine

app = FastAPI()
engine = create_async_engine("postgresql+asyncpg://...")

def enc(o): return urlsafe_b64encode(json.dumps(o).encode()).decode()
def dec(c): return json.loads(urlsafe_b64decode(c.encode())) if c else None

@app.get("/feed")
async def feed(page_size: int = Query(20, ge=1, le=100), cursor: str | None = None):
    cur = dec(cursor)
    async with engine.begin() as conn:
        if not cur or cur["dir"] == "next":
            q = text("""
            SELECT id, title, published_at
            FROM posts
            WHERE status='PUBLISHED'
              AND ( :ts IS NULL AND :id IS NULL
                    OR (published_at, id) < (:ts::timestamptz, :id::bigint) )
            ORDER BY published_at DESC, id DESC
            LIMIT :lim
            """);
            rows = (await conn.execute(q, {"ts": cur["published_at"] if cur else None,
                                           "id": cur["id"] if cur else None,
                                           "lim": page_size+1})).mappings().all()
            has_more = len(rows) > page_size
            data = rows[:page_size] if has_more else rows
            next_cursor = enc({"published_at": data[-1]["published_at"].isoformat(), "id": data[-1]["id"], "dir":"next"}) if data and has_more else None
            prev_cursor = enc({"published_at": data[0]["published_at"].isoformat(), "id": data[0]["id"], "dir":"prev"}) if data else None
            return {"data": data, "next_cursor": next_cursor, "prev_cursor": prev_cursor}
        else:
            q = text("""
            SELECT id, title, published_at
            FROM posts
            WHERE status='PUBLISHED'
              AND (published_at, id) > (:ts::timestamptz, :id::bigint)
            ORDER BY published_at ASC, id ASC
            LIMIT :lim
            """);
            rows = (await conn.execute(q, {"ts": cur["published_at"], "id": cur["id"], "lim": page_size+1})).mappings().all()
            has_more_prev = len(rows) > page_size
            page = rows[:page_size] if has_more_prev else rows
            data = list(reversed(page))
            next_cursor = enc({"published_at": data[-1]["published_at"].isoformat(), "id": data[-1]["id"], "dir": "next"}) if data else None
            prev_cursor = enc({"published_at": data[0]["published_at"].isoformat(), "id": data[0]["id"], "dir": "prev"}) if has_more_prev else None
            return {"data": data, "next_cursor": next_cursor, "prev_cursor": prev_cursor}

Key Implementation Notes (You’ll Thank Yourself Later)

A) Always use a unique tie-breaker

Sort by (published_at DESC, id DESC). Without a tie-breaker, two rows with equal timestamps can flip order between requests.

B) Make the cursor opaque

Encode JSON to base64url. You can include filter snapshots (e.g., status, author_id) so cursors can be validated against changed filters.

C) Direction matters

  • Forward (“next”): < with DESC ordering
  • Backward (“prev”): > with ASC ordering, then reverse in app

D) Index to your exact predicate

CREATE INDEX CONCURRENTLY IF NOT EXISTS
  idx_posts_pub_desc ON posts (published_at DESC, id DESC)
  WHERE status='PUBLISHED';

If you filter by author or tag, extend the index: (author_id, published_at DESC, id DESC) or use a covering index suitable for your most common filter.
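
For example, a per-author feed could get its own partial index plus the same seek predicate; a sketch (the index name is illustrative):

CREATE INDEX CONCURRENTLY IF NOT EXISTS
  idx_posts_author_pub_desc ON posts (author_id, published_at DESC, id DESC)
  WHERE status='PUBLISHED';

-- Same keyset seek, scoped to one author:
SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
  AND author_id = $1
  AND (published_at, id) < ($2::timestamptz, $3::bigint)
ORDER BY published_at DESC, id DESC
LIMIT $4;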

E) Hot tables & consistency

  • For perfect “snapshot consistency” across many pages, you need a listing version (a materialized view or a table of IDs fixed at query start; see the sketch after this list). Otherwise, expect slight drift as new rows arrive.
  • For feeds/inboxes, drift is often acceptable (users expect new stuff to appear).
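
A minimal sketch of the “table of IDs fixed at query start” approach (table and column names are illustrative):

CREATE TABLE listing_snapshots (
  snapshot_id BIGINT NOT NULL,
  position    BIGINT NOT NULL,
  post_id     BIGINT NOT NULL,
  PRIMARY KEY (snapshot_id, position)
);

-- Materialize the listing once, when the user starts paging:
INSERT INTO listing_snapshots (snapshot_id, position, post_id)
SELECT $1, row_number() OVER (ORDER BY published_at DESC, id DESC), id
FROM posts
WHERE status='PUBLISHED';

-- Every page of snapshot $1 is now stable, no matter what gets inserted later:
SELECT p.id, p.title, p.published_at
FROM listing_snapshots s
JOIN posts p ON p.id = s.post_id
WHERE s.snapshot_id = $1
  AND s.position BETWEEN ($2::int - 1) * $3::int + 1 AND $2::int * $3::int  -- $2 = page, $3 = page_size
ORDER BY s.position;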

3) “Jump to Page N” with Keyset?

True random access isn’t native to keyset. Options:

  • Jump to date: let users choose a date/time; you keyset-seek to the first item at/after that time (see the queries after this list).
  • Bookmark ID: “Open around post #12345”—seek to (published_at,id) of that row and show neighbors.
  • Approximate page mapping: store periodic anchors (every 1000th ID) to estimate offsets; then keyset from the anchor.
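
Sketches of the first two options against the newest-first feed used throughout this post (literal parameters are illustrative; “jump to date” is written here as “start at the chosen time and walk older”, matching the DESC ordering):

-- Jump to date: start the feed at the newest item published at or before the chosen time.
SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
  AND published_at <= $1::timestamptz   -- $1 = chosen date/time
ORDER BY published_at DESC, id DESC
LIMIT $2;

-- "Open around post #12345": look up the anchor's sort keys, then seek from that tuple.
-- For the items newer than the anchor, run the mirrored query (>= with ASC) and merge in app.
SELECT p.id, p.title, p.published_at
FROM posts p
JOIN posts anchor ON anchor.id = $1     -- $1 = bookmarked post id
WHERE p.status='PUBLISHED'
  AND (p.published_at, p.id) <= (anchor.published_at, anchor.id)
ORDER BY p.published_at DESC, p.id DESC
LIMIT $2;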

4) Counting, “Total Results”, and SEO Pages

Counts:

  • Exact count: SELECT count(*) (can be slow on huge tables + filters).
  • Fast approximate (Postgres):
    • reltuples estimate from pg_class (coarse; see the query after this list)
    • Incremental counters (denormalized) or materialized view for frequently used filters.
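
The planner’s estimate is one line of SQL and effectively free; it covers the whole table rather than a filtered subset, so treat it as a rough order-of-magnitude number:

-- Coarse row-count estimate straight from planner statistics (whole table, ignores filters).
SELECT reltuples::bigint AS approx_total
FROM pg_class
WHERE oid = 'posts'::regclass;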

SEO page lists (category pages with numbered pagination):

  • Offset is OK if counts are required and result sets are not massive.
  • Cache pages and counts aggressively (CDN + application cache).
  • If pages are huge, consider precomputing ID lists nightly.

5) Edge Cases & Gotchas

  • Edits to sort key: If published_at can change, pages will reshuffle. Prefer immutable sort keys (e.g., first_published_at).
  • Deletes: Offset pages shift; keyset unaffected.
  • Duplicates across pages: Ensure strict comparison on keyset and consistent inclusion/exclusion (don’t use <= in forward pagination).
  • Prev page overlap: When walking backwards, fetch page_size+1, take the closest page_size rows in ASC, then reverse.
  • Multi-column seek: For ASC order, change operators accordingly. For complex sorts, normalize to a tuple and apply lexicographic comparisons.
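
For a mixed-direction sort, a single row comparison no longer works, so the lexicographic comparison has to be spelled out by hand. A sketch with a hypothetical score column (not part of the schema above), ordered by score DESC, published_at ASC, id ASC:

SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
  AND (
        score < $1                                              -- hypothetical relevance column
        OR (score = $1 AND published_at > $2::timestamptz)
        OR (score = $1 AND published_at = $2::timestamptz AND id > $3::bigint)
      )
ORDER BY score DESC, published_at ASC, id ASC
LIMIT $4;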

6) Beyond SQL

MongoDB (seek by _id)

// Next page
db.posts.find({ status: "PUBLISHED", _id: { $lt: lastId } })
  .sort({ _id: -1 }).limit(21);

// Prev page: use $gt and reverse in app.

Elasticsearch / OpenSearch (search_after)

{
  "size": 20,
  "sort": [{ "published_at": "desc" }, { "id": "desc" }],
  "search_after": ["2025-10-30T12:00:00Z", 12345],
  "query": { "term": { "status": "PUBLISHED" } }
}

7) Decision Cheat-Sheet

| Requirement | Pick | Why |
|---|---|---|
| Infinite scroll, large dataset, high write rate | Cursor + Keyset | Fast, stable, no deep skips |
| Page numbers, export, SEO category pages | Offset (+ caching) | Easy counts, jump to N |
| “Open near item X” | Keyset from anchor | Deterministic and quick |
| “Jump to date” | Keyset by timestamp | Predictable UX, no big skip |
| Strict snapshot across many pages | Materialize IDs | Consistency over freshness |


8) Test Like It’s Production

  • Seed 5–50M rows, realistic skew (recent heavy).
  • Benchmark: page 1, 10, 1000 with and without competing inserts/deletes.
  • Verify no duplicates/misses when new rows are inserted between requests.
  • Validate indexes with EXPLAIN (ANALYZE, BUFFERS)—you should see Index Only Scan or Index Scan, not sequential scans.
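
For instance, a deep keyset page against the partial index (literal values are illustrative; the exact plan shape depends on your data and statistics):

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title, published_at
FROM posts
WHERE status='PUBLISHED'
  AND (published_at, id) < ('2025-10-30T12:00:00Z'::timestamptz, 12345)
ORDER BY published_at DESC, id DESC
LIMIT 20;
-- Expect an Index Scan on idx_posts_pub_desc feeding the LIMIT, not a Seq Scan + Sort.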

9) Quick Reference Queries

Forward (DESC)

-- First page
... WHERE status='PUBLISHED'
ORDER BY published_at DESC, id DESC
LIMIT :n;

-- Next page (after last tuple)
... WHERE status='PUBLISHED'
  AND (published_at, id) < (:last_ts, :last_id)
ORDER BY published_at DESC, id DESC
LIMIT :n;

Backward (DESC timeline)

-- Prev page (before first tuple)
... WHERE status='PUBLISHED'
  AND (published_at, id) > (:first_ts, :first_id)
ORDER BY published_at ASC, id ASC
LIMIT :n; -- then reverse in app

Index

CREATE INDEX CONCURRENTLY IF NOT EXISTS
  idx_posts_pub_desc ON posts (published_at DESC, id DESC)
  WHERE status='PUBLISHED';