Data Engineering & Databases · PostgreSQL

Postgres Power Moves: Indexes, JSONB, and Partitioning

Features that turn Postgres into a serious data platform.

Reading time: ~8–12 min
Level: All levels

Postgres can be “just a relational database”… or it can be your fastest, most reliable data platform. This guide walks through three power moves (indexes, JSONB, and partitioning) that make queries faster, keep semi-structured data sane, and scale big tables without turning operations into a weekly emergency.


Quickstart

Want immediate performance wins without reading every detail? Do these in order. Each step is designed to be safe, measurable, and reversible.

1) Measure one slow query (don’t guess)

Pick a query that’s slow in production or staging. Run it with real parameters, then inspect the plan.

  • Capture the query text + parameters (example user, date range, status)
  • Run EXPLAIN (ANALYZE, BUFFERS) on the same query
  • Note: row counts, sort steps, nested loops, and “rows removed by filter”
  • Change one thing at a time and re-measure
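As a sketch (table and filter values are placeholders, not from a real schema), the measurement step looks like this:

-- Hypothetical example: substitute your real slow query and real parameters
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, status, created_at
FROM orders
WHERE account_id = 42 AND status = 'open'
ORDER BY created_at DESC
LIMIT 50;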

2) Add one index that matches the query shape

Most “I need more CPU” problems are “I need the right index” problems. Prioritize filters and joins that appear in your slowest queries.

  • Index columns used in WHERE + JOIN
  • Match ORDER BY if the query sorts frequently
  • Consider partial indexes for hot subsets (status = 'open')
  • Use CONCURRENTLY on production to avoid long locks
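A minimal sketch (table and column names are illustrative; Step 2 below shows a fuller version):

-- CONCURRENTLY avoids blocking writes, but cannot run inside a transaction block
CREATE INDEX CONCURRENTLY orders_account_status_idx
ON orders (account_id, status);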

3) If you store JSON, index the access paths

JSONB is powerful—but only if you treat it like a real schema. Decide which keys matter for queries.

  • Use jsonb (not json) for indexing and efficient operators
  • Add a GIN index for containment queries (@>)
  • For frequent keys, consider generated columns + B-tree index
  • Keep JSONB for “flexible fields”; normalize core relational facts

4) Partition only when it solves a real pain

Partitioning shines for time-series + retention, very large tables, and predictable pruning. It’s not a default “performance button.”

  • Pick a partition key your queries already filter on (often a timestamp)
  • Start with monthly or weekly partitions (avoid thousands)
  • Automate: create future partitions, add indexes, vacuum/analyze
  • Use partitioning to make retention a DROP PARTITION operation

The “one query” rule

If you improve the slowest query by 10×, you feel it instantly. If you add ten random indexes, you mostly add write overhead. Start with one measurable target and let the plan tell you what to do.

Overview

Postgres has a reputation for being “boring” in the best way: stable, predictable, and extremely capable. But it’s also packed with features that let it behave like a modern data platform—especially when you combine: indexes (fast reads), JSONB (flexible shape), and partitioning (scale + retention).

What you’ll learn in this Postgres power moves guide

  • How to choose indexes that actually get used (and avoid write-bloat)
  • How JSONB queries work and how to index them without “full table scan surprises”
  • When partitioning helps performance, when it helps operations, and when it’s unnecessary
  • A repeatable tuning workflow: measure → change → verify → maintain

The theme is simple: Postgres will do what you ask, but it needs the right structures. When you see slow queries, you’re usually seeing one of three issues: missing access paths (indexes), unclear data modeling (JSONB used as a dumping ground), or tables that have grown beyond what’s comfortable for “single heap + single index set” operations.

  • Indexes. Best for: fast filtering, joins, and sorting; making queries predictable. Watch out for: write overhead, bloat, wrong column order, low selectivity.
  • JSONB. Best for: semi-structured fields, metadata, evolving schemas, event payloads. Watch out for: unindexed keys, inconsistent types, too many ad-hoc query paths.
  • Partitioning. Best for: time-series tables, retention policies, pruning large historical data. Watch out for: too many partitions, missing indexes on new partitions, pruning not triggered.

A practical promise

You don’t need to memorize every index type or planner detail. You need a small set of rules, a measurement habit, and the discipline to keep your data structures aligned with your query patterns.

Core concepts

Before you change anything, it helps to have a few mental models. These are the concepts that explain 90% of real-world Postgres performance outcomes.

1) The planner doesn’t “prefer indexes” — it prefers cheaper plans

Postgres chooses a plan based on estimated cost: I/O, CPU, and how many rows it expects to touch. If it estimates it will read a large portion of the table anyway, it may choose a sequential scan even if an index exists. That’s not “Postgres being dumb” — it’s often a sign your index doesn’t match the query, your stats are stale, or your filter is not selective.

Plan signs that usually mean “index opportunity”

  • Rows removed by filter is huge (filtering after reading lots of data)
  • A big Sort step appears on every request (ORDER BY without support)
  • A join becomes a Nested Loop over many rows (missing join index)
  • Bitmap Heap Scan with many heap fetches (could benefit from covering index or clustering)
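For example, a plan excerpt like this one (numbers invented for illustration) is a classic index opportunity:

Seq Scan on orders  (cost=0.00..35811.00 rows=12 width=72) (actual time=0.41..182.33 rows=10 loops=1)
  Filter: ((account_id = 42) AND (status = 'open'))
  Rows Removed by Filter: 999990

Reading roughly a million rows to return ten is the signature of a missing access path.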

2) Indexes are access paths (and they have a cost)

Think of an index as a shortcut that trades extra writes for faster reads. Every INSERT/UPDATE/DELETE must maintain indexes. If you add indexes “just in case,” you slow down writes and vacuum, and you still might not speed up the queries that matter.

  • B-tree (default). Use when: equality/range filters, joins, ORDER BY. Pattern: =, <, >, BETWEEN, sorting.
  • GIN. Use when: containment/membership on arrays and JSONB. Pattern: @>, ?, && and other membership operators.
  • BRIN. Use when: very large, mostly append-only tables with physical correlation (often time). Pattern: coarse range filters over huge tables.
  • GiST. Use when: geospatial, similarity, certain custom types. Pattern: PostGIS, nearest-neighbor searches.

3) Composite vs partial vs covering: three “power moves” inside indexing

Composite index

Multiple columns in one index. Great when queries filter by a set of columns, or filter + sort. Column order matters: lead with the columns you always filter on with equality, and put range and sort columns last.

  • Best for: WHERE a = … AND b = … ORDER BY c
  • Tip: match left-to-right usage (the “leftmost prefix” idea)
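A quick illustration of the leftmost-prefix idea (hypothetical orders index):

-- Given: CREATE INDEX ON orders (account_id, status, created_at);
-- These queries can use the index efficiently:
--   WHERE account_id = 42
--   WHERE account_id = 42 AND status = 'open'
-- These generally cannot (the leading column is missing):
--   WHERE status = 'open'
--   WHERE created_at > now() - interval '1 day'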

Partial index

Index only a subset of rows (e.g., active users, open tickets). This is how you make “hot path” queries fast without indexing everything.

  • Best for: skewed data + frequent filters like status, tenant, is_deleted
  • Tip: keep the predicate identical to real queries

Covering index (INCLUDE) and index-only scans

A covering index stores extra columns so Postgres can satisfy the query from the index without visiting the table heap. This can be a big win on read-heavy endpoints—when visibility maps and vacuum keep things healthy.

  • Use INCLUDE for columns you select but don’t filter on
  • Great for “top N” pages and dashboards (LIMIT queries)
  • Still needs maintenance: vacuum/analyze affects index-only scans

4) JSONB is schema-flexibility, not schema-absence

JSONB lets you store evolving fields without migrations for every tiny change. The trade-off: query performance depends on having a clear set of “important keys” and indexing them properly. Without that, JSONB becomes “opaque text” and your queries drift into scans.

JSONB anti-pattern

If core relational facts live in JSONB (user_id, status, created_at), you’re opting out of the strongest parts of SQL: constraints, types, join optimization, and predictable indexing. Keep JSONB for metadata and evolving attributes.

5) Partitioning is as much about operations as it is about speed

Partitioning can speed up queries via partition pruning (skipping partitions that can’t match a filter), but it’s also a powerful operational tool: fast retention (drop partitions), faster vacuum on active partitions, and smaller index sets per partition.

Partitioning is a good fit when…

  • The table is huge and mostly time-ordered (events, logs, metrics, transactions)
  • You frequently query “recent data” and rarely touch the full history
  • You need retention: keep 90 days, delete the rest efficiently
  • You can keep partition counts reasonable (e.g., weekly/monthly)

Step-by-step

This is a repeatable workflow you can use on any Postgres database: start with evidence, change the minimum, and verify with the planner. The goal is not “more indexes” — it’s faster queries with controlled overhead.

Step 1 — Pick a target query and capture its real shape

The same SQL can behave very differently depending on parameters. Always benchmark the query variant that users actually run.

  • Record filter values (tenant/user/status/date range)
  • Include the ORDER BY and LIMIT (these often drive index choice)
  • Run EXPLAIN (ANALYZE, BUFFERS) and save the output
  • Look for: scans, sorts, join type, and misestimates (estimated vs actual rows)

Step 2 — Design an index that matches WHERE + JOIN + ORDER BY

A “good index” is one that matches your query’s access pattern. Start with the most common pattern, not the edge case. In practice, many OLTP apps have these shapes:

Typical hot path

  • Filter by tenant/user + status
  • Sort by created_at desc
  • Return small pages (LIMIT 20–100)

Index strategy

  • Composite B-tree matching filter + sort
  • Use INCLUDE to avoid extra heap reads
  • Use a partial index if only a subset is queried often

Here’s a practical example you can adapt. It includes a composite covering index and a partial index for a hot subset. Use CONCURRENTLY on production (it’s slower to build but avoids blocking writes).

-- Example: speeding up a "recent orders" endpoint
-- Query shape:
--   WHERE account_id = $1 AND status = 'open'
--   ORDER BY created_at DESC
--   LIMIT 50

-- 1) Composite index matching filter + sort (created_at DESC)
-- INCLUDE adds columns needed for the response so Postgres can often do an index-only scan.
CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_account_status_created_idx
ON orders (account_id, status, created_at DESC)
INCLUDE (total_cents, currency, customer_id);

-- 2) Partial index for the hottest subset (e.g., open orders)
-- This is excellent when status = 'open' is a small % of the table but a large % of traffic.
CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_open_account_created_idx
ON orders (account_id, created_at DESC)
WHERE status = 'open';

-- 3) After building indexes, update stats so the planner can choose correctly.
ANALYZE orders;

-- 4) Verify with a real plan (in psql):
-- EXPLAIN (ANALYZE, BUFFERS)
-- SELECT id, total_cents, currency, customer_id
-- FROM orders
-- WHERE account_id = 42 AND status = 'open'
-- ORDER BY created_at DESC
-- LIMIT 50;

Quick “is this index used?” check

If the plan still sorts or scans, it’s usually one of these: the index columns are in the wrong order, the query uses an expression that doesn’t match the index, or statistics are stale. Fix alignment first, then look for advanced tuning.
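One common mismatch, sketched with a hypothetical users table: the query wraps the column in an expression, so a plain index on the column cannot be used.

-- Plain index: CREATE INDEX ON users (email);
-- This query cannot use it, because the predicate is an expression:
--   SELECT * FROM users WHERE lower(email) = 'a@example.com';
-- Fix with an expression index that matches the query exactly:
CREATE INDEX CONCURRENTLY IF NOT EXISTS users_lower_email_idx
ON users (lower(email));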

Step 3 — Make JSONB fast by indexing the question you ask

JSONB is great when you have flexible attributes: “utm_campaign”, “device”, “feature_flags”, “payload metadata”. The key is to identify which JSON keys you query often and pick an indexing approach:

  • Containment filters: GIN on JSONB (fast for @> and membership operators)
  • One key used constantly: generated column + B-tree (cheaper than GIN for simple equality/range)
  • Expression comparisons: expression index (matches (props->>'key') exactly)

Below is a practical JSONB pattern: keep the full payload in JSONB, index general containment with GIN, and “promote” frequently-used keys into generated columns for predictable performance.

-- Example: event table with JSONB metadata you need to filter by
CREATE TABLE IF NOT EXISTS app_events (
  id          bigserial PRIMARY KEY,
  occurred_at timestamptz NOT NULL,
  event_name  text        NOT NULL,
  props       jsonb       NOT NULL DEFAULT '{}'::jsonb
);

-- 1) GIN index for containment queries like props @> '{"plan":"pro"}'
CREATE INDEX CONCURRENTLY IF NOT EXISTS app_events_props_gin
ON app_events USING gin (props);

-- 2) Promote a frequently-used key into a generated column (cheap, typed access path).
-- Generated columns require Postgres 12+. This helps when you filter by one key
-- constantly and want B-tree behavior.
ALTER TABLE app_events
  ADD COLUMN IF NOT EXISTS plan text
  GENERATED ALWAYS AS (props->>'plan') STORED;

CREATE INDEX CONCURRENTLY IF NOT EXISTS app_events_plan_time_idx
ON app_events (plan, occurred_at DESC);

-- 3) Example queries
-- Containment (GIN):
--   Find events where plan == 'pro' and country == 'SK'
-- SELECT *
-- FROM app_events
-- WHERE props @> '{"plan":"pro","country":"SK"}'
--   AND occurred_at > now() - interval '7 days';

-- Promoted key (B-tree):
--   Fast "recent events for plan" endpoints
-- SELECT id, occurred_at, event_name
-- FROM app_events
-- WHERE plan = 'pro'
-- ORDER BY occurred_at DESC
-- LIMIT 100;

Keep JSON types consistent

If one row stores {"age":"12"} and another stores {"age":12}, you’ll fight casts and indexes. Decide “string vs number vs boolean” per key and enforce it in your app or ingestion pipeline.
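If you want the database to help enforce this, a CHECK constraint on jsonb_typeof is one option (a sketch against the app_events table above; the "age" key is illustrative):

-- Reject rows where "age" is present but not a JSON number
ALTER TABLE app_events
  ADD CONSTRAINT props_age_is_number
  CHECK (props->'age' IS NULL OR jsonb_typeof(props->'age') = 'number');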

Step 4 — Add partitioning for big tables + retention (and get pruning)

Partitioning is most valuable when it allows Postgres to skip irrelevant data. That requires your queries to filter on the partition key (directly or via constraints the planner can understand). Time-based partitioning is the workhorse: you almost always have a timestamp, and you almost always query ranges.

When partitioning helps query speed

  • Queries filter by time range (last 7 days, last month)
  • Most traffic reads “recent” data
  • Indexes per partition stay smaller and cheaper to scan

When partitioning helps operations

  • Retention becomes dropping partitions (fast)
  • Vacuum/analyze impacts only active partitions
  • Bulk loads are easier to manage and isolate

This example creates a range-partitioned table by month, adds a default partition for “out of range” safety, and shows the key maintenance trick: create indexes on every new partition.

-- Example: time-partitioned table for logs/events
CREATE TABLE IF NOT EXISTS event_log (
  id          bigserial NOT NULL,
  occurred_at timestamptz NOT NULL,
  account_id  bigint NOT NULL,
  event_type  text NOT NULL,
  payload     jsonb NOT NULL DEFAULT '{}'::jsonb,
  PRIMARY KEY (id, occurred_at) -- include partition key in PK/unique constraints
) PARTITION BY RANGE (occurred_at);

-- Default partition catches unexpected timestamps (bad clients, backfills, clock issues).
-- Caveat: creating a new partition scans the default for conflicting rows, and any
-- matches make the creation fail, so keep the default empty in practice.
CREATE TABLE IF NOT EXISTS event_log_default
PARTITION OF event_log DEFAULT;

-- Create monthly partitions (example: Jan and Feb 2026)
CREATE TABLE IF NOT EXISTS event_log_2026_01
PARTITION OF event_log
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE IF NOT EXISTS event_log_2026_02
PARTITION OF event_log
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Indexes: since Postgres 11, an index created on the partitioned parent is
-- created on all existing partitions and added automatically to future ones.
-- Put your "hot path" filters first (account_id + time is common).
CREATE INDEX IF NOT EXISTS event_log_account_time_idx
ON event_log (account_id, occurred_at DESC);

-- Note: CREATE INDEX CONCURRENTLY is not supported on a partitioned parent.
-- For non-blocking builds, create the index CONCURRENTLY on each partition,
-- then attach it to a matching parent index with ALTER INDEX ... ATTACH PARTITION.

-- Retention becomes fast: drop whole partitions instead of deleting millions of rows.
-- DROP TABLE event_log_2025_10;

-- Verify pruning:
-- EXPLAIN
-- SELECT count(*)
-- FROM event_log
-- WHERE occurred_at >= '2026-02-10' AND occurred_at < '2026-02-11'
--   AND account_id = 42;

Partitioning rule that saves projects

If your query does not filter by the partition key, partitioning usually makes it slower (it may scan many partitions). Partitioning is not a substitute for indexes; it’s a structure that makes certain time-range queries and retention simpler.

Step 5 — Keep performance gains durable (maintenance matters)

The last step is the one people skip: keeping indexes and stats healthy. Performance is not just schema design; it’s ongoing upkeep. The good news: you can keep it lightweight if you focus on a few basics.

Healthy database basics

  • Autovacuum tuned enough to prevent bloat in hot tables
  • Regular ANALYZE so estimates match reality
  • Watch for sudden plan changes after major data growth
  • Keep “create new partitions + indexes” automated
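The last item can be a small scheduled job. Here is a sketch that pre-creates next month's partition, assuming the event_log table from Step 4 (schedule it monthly via cron or pg_cron):

-- Sketch: create next month's partition ahead of time
DO $$
DECLARE
  start_month date := (date_trunc('month', now() + interval '1 month'))::date;
  end_month   date := (start_month + interval '1 month')::date;
  part_name   text := format('event_log_%s', to_char(start_month, 'YYYY_MM'));
BEGIN
  EXECUTE format(
    'CREATE TABLE IF NOT EXISTS %I PARTITION OF event_log
       FOR VALUES FROM (%L) TO (%L)',
    part_name, start_month, end_month
  );
END $$;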

Signs you need attention

  • Index size grows far faster than table size (bloat)
  • Queries flip from index scans to seq scans unexpectedly
  • Dead tuples stay high for long periods
  • Vacuum can’t keep up on the busiest tables

Common mistakes

These are the classic “we added indexes but it’s still slow” patterns. Fixing them usually beats adding more hardware.

Mistake 1 — Creating indexes without verifying plan usage

An unused index is not harmless: it increases write cost and maintenance.

  • Fix: check EXPLAIN (ANALYZE, BUFFERS) before/after.
  • Fix: confirm the query predicates match the index definition (expressions, casts, order).
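To find indexes that are never used, pg_stat_user_indexes is the standard starting point:

-- Indexes with zero scans since stats were last reset (review before dropping;
-- unique/PK indexes enforce constraints even if never scanned)
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;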

Mistake 2 — Wrong column order in a composite index

Composite indexes are not “bags of columns.” Order determines what the index can accelerate.

  • Fix: lead with the columns you always filter on (often tenant/account/user).
  • Fix: match ORDER BY (e.g., created_at DESC) for pagination queries.

Mistake 3 — Indexing low-selectivity columns

Indexing a column with only a few values (e.g., boolean) often won’t help unless paired with another filter.

  • Fix: use a partial index for a hot subset (WHERE is_active), or combine with tenant/user.
  • Fix: prefer composite indexes that include a selective leading column.

Mistake 4 — JSONB as a dumping ground (no consistent keys/types)

If keys and types vary per row, you end up with slow scans, messy casts, and unpredictable results.

  • Fix: define a key dictionary (“these 20 keys are allowed”) and enforce types.
  • Fix: promote frequent keys into generated columns for speed and clarity.

Mistake 5 — Partitioning too early (or too much)

Partitioning adds complexity. If you don’t prune partitions, you may scan more structures than before.

  • Fix: partition for a clear reason (retention + time-range queries on huge tables).
  • Fix: keep partition counts sane (monthly/weekly, not per day unless necessary).

Mistake 6 — Forgetting indexes on new partitions

On Postgres 11+, indexes defined on the partitioned parent are created automatically on new partitions. You can still end up with unindexed partitions if you manage per-partition indexes by hand (for example, for CONCURRENTLY builds) and your automation skips a step.

  • Fix: define shared indexes on the parent where possible; automate any per-partition extras.
  • Fix: verify pruning and index usage per partition with explain plans.

The fastest fixes are usually boring

Most performance wins come from aligning query shape ↔ data structure (indexes), and keeping stats/maintenance healthy. That’s why “power moves” are often simple—but applied deliberately.

FAQ

How do I know if I actually need a new index?

Check the plan. If you see large scans, repeated sorts, or joins that read far more rows than they return, an index may help—but only if it matches the predicates you use. Always verify with EXPLAIN (ANALYZE, BUFFERS) before and after.

What’s the difference between JSON and JSONB in Postgres?

Use JSONB for most cases. JSONB stores a binary representation that supports efficient operators and indexing. Plain JSON preserves text formatting but is usually slower to query and index.

When should I use a GIN index on JSONB?

When you filter with containment or membership operators (especially @> and key existence operators). GIN is built for “does this document contain X?” queries. For a single frequently-used key, consider a generated column + B-tree instead.

How many columns should I put in a composite index?

Usually 2–3 is enough. Include the columns you always filter on and (optionally) the column you sort by. If you’re adding columns “just in case,” prefer INCLUDE for selected columns rather than expanding the key.

Does partitioning automatically make queries faster?

No. Partitioning helps when the planner can prune partitions (skip them) based on your WHERE clause. If queries don’t filter on the partition key, partitioning can be slower due to extra planning and scanning across partitions.

Why does Postgres sometimes ignore an index and do a sequential scan?

Because it estimates the seq scan is cheaper. This can be correct (low selectivity) or a sign of stale stats, mismatched predicates (casts/expressions), or an index that doesn’t align with the query. Run ANALYZE and confirm the query matches the index.

How do I avoid index bloat on busy tables?

Keep vacuum effective and avoid unnecessary indexes. Bloat often grows when dead tuples accumulate and indexes churn. Monitor hot tables, ensure autovacuum can keep up, and periodically reassess whether old indexes are still used.
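A quick health check via pg_stat_user_tables:

-- Tables with the most dead tuples, plus when autovacuum last ran
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;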

Cheatsheet

A scan-fast checklist for choosing the right “power move” and avoiding the most common traps.

Index cheatsheet

  • Start with EXPLAIN (ANALYZE, BUFFERS) on a real query
  • B-tree for =, ranges, joins, ORDER BY
  • Match query shape: WHERE + JOIN + ORDER BY
  • Composite index: put “always filtered” columns first
  • Use INCLUDE to cover selected columns
  • Use partial indexes for hot subsets (status, active, not_deleted)
  • Build CONCURRENTLY in production to avoid blocking writes

JSONB cheatsheet

  • Use jsonb unless you have a strong reason not to
  • Add GIN when you do containment (@>)
  • Promote frequent keys to generated columns + B-tree
  • Keep types consistent per key (string vs number vs boolean)
  • Normalize core relational facts (ids, timestamps, status)
  • Make “important keys” explicit (don’t query 200 random keys ad-hoc)

Partitioning cheatsheet

  • Partition for retention + very large time-series tables
  • Choose a partition key your queries filter on (often time)
  • Keep partition count reasonable (weekly/monthly)
  • Automate: create future partitions + indexes + analyze
  • Add a default partition to catch out-of-range inserts
  • Verify partition pruning with EXPLAIN

Decision mini-map

  • Slow filters, joins, or sorts: composite/covering/partial B-tree indexes
  • Slow JSONB lookups: GIN + promoted keys (generated columns)
  • Huge time-series table + retention pain: range partitioning by time
  • Plan changes after data growth: ANALYZE + revisit index alignment

Keep a “schema for queries” mindset

Schema design is not only about storing data. It’s about storing data in a way your real queries can access cheaply. Indexes, JSONB strategy, and partitioning are the tools that make that happen.

Wrap-up

Postgres performance isn’t magic—it’s alignment. When your indexes match your queries, when your JSONB fields have clear access paths, and when partitioning is applied for pruning and retention (not vibes), Postgres becomes dramatically more predictable and scalable.

What to do next (15-minute plan)

  • Pick one slow query and capture EXPLAIN (ANALYZE, BUFFERS)
  • Add one targeted index (composite/partial/covering) and verify improvement
  • If JSONB is involved, list the top 5 queried keys and index them intentionally
  • If a table is “retention painful,” evaluate time partitioning and pruning readiness

If you want to go deeper, pair this guide with further reading on explain plans and index design.

Quiz

Quick self-check (demo).

1) What’s the best first step before adding indexes to speed up a query?
2) Which index type commonly accelerates JSONB containment queries (e.g., props @> '{"key":"value"}')?
3) What is “partition pruning” in Postgres?
4) Which situation is a strong candidate for a partial index?