Warehouse Cost Control: The Queries That Burn Money

Spot expensive patterns and keep usage predictable.

Reading time: ~8–12 min
Level: All levels

This guide shows the query shapes that quietly burn budget (full table scans, join explosions, runaway BI refreshes), how to find them fast, and how to put guardrails in place so cost stays boring.


Quickstart

If you only have an hour, do these in order. The goal is to turn “we think costs are high” into a short list of specific queries and specific fixes—then prevent repeats.

1) Confirm your billing unit

Warehouses charge in different ways (bytes scanned, compute time/credits, cluster seconds). Your “expensive query” detector should rank by the unit that drives your bill.

  • Locate the system table / query history view your warehouse provides
  • Identify the metric to sort by (credits, bytes, slot-ms, runtime)
  • Decide a simple threshold: “anything above X is review-worthy”

2) Build a “Top 20 cost queries” view

Don’t start by optimizing random queries. Start with the top offenders and fix the few that drive the majority of spend.

  • Group by query signature (same SQL shape, different parameters)
  • Include user, tool (BI/dbt), and schedule (hourly/daily)
  • Store results in a small table so you can trend over time

3) Kill the common “money burners” first

A few patterns show up everywhere. Fixing them is repeatable and usually doesn’t require a full redesign.

  • Replace SELECT * with a minimal projection
  • Add partition filters (date / shard / tenant) to enable pruning
  • Remove accidental many-to-many joins (join keys not unique)
  • Cap unbounded windows and huge DISTINCT operations

4) Add two guardrails (today)

Guardrails make cost predictable even when humans do human things (copy/paste, “quick” explorations, broken dashboards).

  • Set statement timeouts / max bytes scanned / resource monitors (vendor-specific)
  • Introduce a “heavy queries” role/warehouse and keep it separated
  • Add a daily alert when spend exceeds a baseline by N%

The 80/20 rule for warehouse cost control

Most teams can cut a meaningful chunk of spend by fixing 10–20 recurring query signatures (usually from dashboards, scheduled jobs, and “analysis notebooks”). Start there before you debate a new architecture.

Overview

“Warehouse cost control” isn’t about shaving milliseconds off every SQL statement. It’s about removing the patterns that make usage unpredictable: a dashboard refresh that scans terabytes, a join that multiplies rows by accident, or a model build that reprocesses history every run.

What this post covers

  • How to find the queries that burn money (and how to group them so you fix the root cause)
  • A mental model for cost: scan, compute, and concurrency
  • A practical optimization playbook for the most expensive SQL shapes
  • Guardrails that prevent regressions (budgets, limits, and workload separation)
  • Checklists you can copy into your team’s runbook

Where cost spikes usually come from

  • Over-scanning: bytes read jumps; same result, bigger bill. Typical fix: projection + partition filters + pruning.
  • Join explosion: rows processed >> rows returned. Typical fix: fix join keys, pre-aggregate, de-duplicate.
  • Repeat compute: the same expensive transformation runs many times. Typical fix: materialize, incrementalize, cache strategically.
  • Concurrency: dozens of queries run at once (BI refresh, backfills). Typical fix: scheduling, workload isolation, quotas.

Measure before you optimize

Warehouse optimization is easy to get wrong if you guess. Start with query history, rank by cost, and fix the top offenders. Most teams are surprised by what’s actually expensive.

Core concepts

To control warehouse spend, you need a simple mental model: cost is usually driven by some combination of data scanned, compute time, and concurrency. Almost every “money-burning query” is failing at least one of these.

1) The cost triangle: scan, compute, concurrency

Scan

How much data you read from storage (tables, partitions, micro-partitions, files).

  • Drives cost heavily in scan-priced systems
  • Highly sensitive to missing filters and SELECT *
  • Often improves with partitioning/clustering and good predicates

Compute

How much CPU/memory work the engine performs (joins, sorts, shuffles, UDFs).

  • Explodes with big joins, DISTINCT, and unbounded windows
  • Improves with better join keys, pre-aggregation, and materialization
  • Can be reduced by avoiding repeated transformation work

Concurrency (the silent multiplier)

Even efficient queries become expensive when they run too often or all at once. BI tools, scheduled backfills, and retries are common sources of accidental concurrency.

  • Schedule heavy jobs outside peak dashboard hours
  • Separate workloads (interactive vs batch)
  • Set quotas/limits so one team can’t drain the pool
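One cheap way to tame accidental concurrency is to stagger scheduled refreshes deterministically. A minimal sketch (the helper name and interval are illustrative, not any tool's API): hash the job name into a start-minute offset so identical schedules stop piling up at the top of the hour.

```python
import hashlib

def refresh_offset_minutes(job_name: str, interval_minutes: int = 60) -> int:
    """Deterministically spread job start times across the interval.

    Hypothetical helper: hashing the job name gives each job a stable
    offset, so schedules stop clustering at minute :00.
    """
    digest = hashlib.sha256(job_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % interval_minutes

# Two hourly dashboards that would both fire at :00 now start apart
# (with high probability), and each job always gets the same minute.
print(refresh_offset_minutes("revenue_dashboard"))
print(refresh_offset_minutes("ops_dashboard"))
```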

2) Query signature beats “top queries”

A single expensive query is interesting. A query that runs 1,000 times a day is a budget line item. That’s why you should group by query signature: the same SQL shape with different parameters.

How to think about “signature”

  • Normalize whitespace and literals (dates/IDs) into placeholders
  • Group by normalized SQL + tool (BI vs dbt vs notebook)
  • Track count, p95 runtime, and total cost over a window
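As a rough sketch, signature normalization can be as small as a few regex passes. Real implementations handle dialect quirks, comments, and IN-lists, but the idea is the same:

```python
import re

def query_signature(sql: str) -> str:
    """Collapse whitespace and replace literals with placeholders so
    queries that differ only in parameters share one signature.
    Illustrative normalization; production rules vary by dialect."""
    s = sql.strip()
    s = re.sub(r"'[^']*'", "'?'", s)        # string/date literals
    s = re.sub(r"\b\d+(\.\d+)?\b", "?", s)  # numeric literals
    s = re.sub(r"\s+", " ", s)              # collapse whitespace
    return s.upper()                        # case-insensitive grouping

# Same shape, different parameters -> same signature.
q1 = "select * from fact_events where event_date = '2026-01-01' and tenant_id = 42"
q2 = "SELECT *  FROM fact_events WHERE event_date = '2026-02-15' AND tenant_id = 7"
print(query_signature(q1) == query_signature(q2))  # True
```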

3) Pruning: the warehouse’s superpower

Modern warehouses avoid reading data they can prove you don’t need. This is called pruning. It only works if your query gives the engine a strong clue (usually a filter on a partition key or clustering key).

Pruning-friendly predicates

  • WHERE event_date BETWEEN '2026-01-01' AND '2026-01-31'
  • WHERE tenant_id = ? (if data is clustered by tenant)
  • WHERE created_at >= CURRENT_DATE - INTERVAL '7' DAY

Pruning killers

  • Wrapping the partition column in a function (engine can’t prune)
  • Filtering after a big join instead of before (too late)
  • “Wildcard” date ranges (e.g., last 5 years) for dashboards

4) Materialization: stop paying for the same work twice

If a complex transformation is reused (dashboards, many downstream models, repeated ad-hoc analysis), you should consider materializing it: a table, incremental model, or materialized view. The trick is to materialize the stable expensive part, not everything.

Optimization isn’t only SQL

Many “expensive queries” are expensive because they run too often. Fixing refresh schedules, caching, and workload isolation can cut spend without touching a single JOIN.

Step-by-step

This section is a practical workflow you can repeat monthly. It turns cost control into a routine: observe → rank → fix → guardrail → verify.

Step 1 — Make cost visible (query history → a small mart)

Your warehouse already logs query execution. The fastest win is to capture those logs into a small table (a “cost mart”) that your team can query like any other dataset.

Starter SQL: top query signatures by spend (generic shape)

Adjust table/column names for your warehouse. The important part is: group by a normalized query text (signature), and rank by a cost metric you trust.

-- Create a daily rollup of expensive query signatures (starter pattern).
-- Replace warehouse_query_history with your system view/table.
-- Replace "cost_units" with credits/bytes/slot_ms/etc. depending on your platform.

WITH base AS (
  SELECT
    DATE_TRUNC('day', started_at) AS day,
    user_name,
    tool_name,
    -- Very simple normalization: trim + collapse whitespace.
    -- In practice, you may also replace literals (dates/ids) with placeholders.
    REGEXP_REPLACE(TRIM(query_text), '\\s+', ' ') AS query_signature,
    total_runtime_ms,
    cost_units
  FROM warehouse_query_history
  WHERE started_at >= CURRENT_DATE - INTERVAL '14' DAY
),
agg AS (
  SELECT
    day,
    tool_name,
    query_signature,
    COUNT(*) AS runs,
    SUM(cost_units) AS total_cost_units,
    AVG(cost_units) AS avg_cost_units,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY total_runtime_ms) AS p95_runtime_ms
  FROM base
  GROUP BY 1,2,3
)
SELECT *
FROM agg
ORDER BY total_cost_units DESC
LIMIT 20;

What to save in your “cost mart”

  • Signature (normalized query), tool, user, role/warehouse
  • Runs, total cost, avg cost, p95 runtime
  • Rows scanned/processed (if available), plus rows returned
  • Error/timeout counts (retries can be expensive)
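If you want to prototype the mart shape before wiring it to your warehouse, a throwaway sqlite sketch is enough; every name here is illustrative:

```python
import sqlite3

# Minimal "cost mart" schema sketch (column names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE cost_mart (
    day TEXT NOT NULL,
    query_signature TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    user_name TEXT,
    runs INTEGER NOT NULL,
    total_cost_units REAL NOT NULL,
    avg_cost_units REAL,
    p95_runtime_ms REAL,
    rows_scanned INTEGER,
    rows_returned INTEGER,
    error_count INTEGER DEFAULT 0,
    PRIMARY KEY (day, query_signature, tool_name)
)
""")
conn.execute(
    "INSERT INTO cost_mart VALUES (?,?,?,?,?,?,?,?,?,?,?)",
    ("2026-01-15", "SELECT ... FROM FACT_EVENTS ...", "bi", "alice",
     480, 1250.0, 2.6, 900.0, 10_000_000, 1200, 3),
)
top = conn.execute(
    "SELECT query_signature, total_cost_units FROM cost_mart "
    "ORDER BY total_cost_units DESC LIMIT 20"
).fetchall()
print(top)
```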

Step 2 — Triage: classify each offender by “why it’s expensive”

For each top signature, put it into a bucket. This avoids “random fixes” and points you at the highest-leverage change.

  • Over-scan: huge bytes read, low rows returned. First fix: project fewer columns; add a partition filter; prune early.
  • Join explosion: rows processed extremely high vs the base tables. First fix: check key uniqueness; pre-aggregate; de-duplicate.
  • Sort/shuffle heavy: big ORDER BY, DISTINCT, or wide grouping. First fix: reduce columns; aggregate earlier; avoid global sorts.
  • Repeat compute: the same query runs constantly (dashboards/jobs). First fix: materialize; cache; schedule; isolate the workload.
  • Concurrency spike: many runs start in the same minute. First fix: stagger refreshes; throttle; separate warehouses/queues.
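The triage buckets can even be encoded as a first-pass heuristic. The thresholds below are invented for illustration and should be calibrated against your own query history:

```python
def classify_offender(stats: dict) -> str:
    """Rough triage heuristic for a query signature's stats.
    Thresholds are illustrative, not a standard."""
    if stats["runs_per_day"] >= 100:
        return "repeat compute / concurrency"
    if stats["rows_processed"] > 100 * max(stats["rows_returned"], 1):
        return "join explosion"   # intermediate rows dwarf the output
    if stats["bytes_scanned"] > 50 * max(stats["bytes_returned"], 1):
        return "over-scan"        # reads far more than it returns
    return "sort/shuffle or other"

# A query that processes 5M intermediate rows to return 200 is a
# classic join-explosion suspect.
bucket = classify_offender({"runs_per_day": 5, "rows_processed": 5_000_000,
                            "rows_returned": 200, "bytes_scanned": 0,
                            "bytes_returned": 0})
print(bucket)  # join explosion
```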

Step 3 — Fix the expensive patterns (playbook)

Here are the repeatable fixes that show up across warehouses. Apply them to your top offenders first, then re-measure.

A) Stop scanning what you don’t use (projection + predicate pruning)

  • Replace SELECT * with only required columns (especially wide event tables)
  • Filter on partition keys (date, tenant, shard) as early as possible
  • Push predicates into CTEs/subqueries that feed joins (prune before join)
  • Avoid functions on partition columns in WHERE (pruning killer)

Example: refactor an over-scanning dashboard query

This is the most common cost burn: a “simple” dashboard query scanning a full fact table because it doesn’t filter on the partition key and pulls every column.

-- BEFORE: wide scan + late filtering
-- SELECT * plus no partition pruning makes this expensive on large tables.

SELECT
  *
FROM fact_events
WHERE DATE(created_at) >= CURRENT_DATE - INTERVAL '30' DAY
  AND event_name IN ('purchase', 'refund');

-- AFTER: prune first + project only needed columns + avoid function on partition key
WITH filtered AS (
  SELECT
    event_date,          -- assume this is a partition column
    user_id,
    order_id,
    event_name,
    revenue_usd
  FROM fact_events
  WHERE event_date >= CURRENT_DATE - INTERVAL '30' DAY
    AND event_name IN ('purchase', 'refund')
)
SELECT
  event_date,
  event_name,
  COUNT(*) AS events,
  COUNT(DISTINCT user_id) AS users,
  SUM(revenue_usd) AS revenue_usd
FROM filtered
GROUP BY 1,2
ORDER BY event_date DESC;

B) Prevent join explosions (cardinality control)

Join explosions happen when your join keys aren’t unique on at least one side (or when you accidentally join fact-to-fact). The result: the engine processes a huge intermediate table even if the final result is small.

Red flags

  • Rows processed is orders of magnitude larger than inputs
  • Adding a JOIN increases runtime/cost drastically
  • The query “works” but totals look too high (double-counting)
  • Someone “fixed” it with DISTINCT (usually a smell)

Fixes that actually stick

  • Enforce uniqueness in dimension tables (one row per key)
  • Pre-aggregate fact tables before joining
  • Use bridge tables intentionally (many-to-many is valid, but explicit)
  • Add constraints/tests in transformation layer (e.g., uniqueness tests)
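The mechanics are easy to see in miniature. In this toy sketch (pure Python standing in for SQL), a duplicate dimension key both multiplies rows and double-counts revenue, and enforcing one row per key fixes both:

```python
# Facts: one row per order.
facts = [{"order_id": 1, "cust": "A", "amount": 10.0},
         {"order_id": 2, "cust": "A", "amount": 20.0}]

# BROKEN dimension: customer "A" appears twice (e.g. two address versions).
dim_bad = [{"cust": "A", "region": "EU"}, {"cust": "A", "region": "US"}]

# Naive join: every fact matches both dimension rows (requires Python 3.9+ for |).
joined = [f | d for f in facts for d in dim_bad if f["cust"] == d["cust"]]
print(len(joined))                        # 4 rows from 2 facts: the explosion
print(sum(r["amount"] for r in joined))   # 60.0 -- revenue double-counted

# FIX: enforce one row per key before joining (here: last version wins).
dim_ok = {d["cust"]: d for d in dim_bad}
joined_ok = [f | dim_ok[f["cust"]] for f in facts if f["cust"] in dim_ok]
print(sum(r["amount"] for r in joined_ok))  # 30.0 -- totals correct again
```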

C) Reduce repeat compute (materialize and schedule)

If the same expensive query runs repeatedly (dashboards or downstream models), materializing can be cheaper than re-computing. Choose materialization when the query is stable, reused, and the output size is reasonable.

A simple decision rule

If a query is in your top 20 by total cost and runs multiple times per day, it’s a candidate for materialization or caching. If it runs once a month, focus on correctness and safety first.
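That decision rule is simple enough to encode directly; the exact thresholds here are illustrative, not prescriptive:

```python
def should_materialize(cost_rank: int, runs_per_day: float,
                       output_rows: int, input_rows: int) -> bool:
    """Encode the rule of thumb: top-20 by total cost, runs several
    times a day, and the output is much smaller than the input it scans.
    All cutoffs are illustrative."""
    return (cost_rank <= 20
            and runs_per_day >= 3
            and output_rows * 100 <= input_rows)

# A top-5 dashboard query refreshed every 30 minutes: strong candidate.
print(should_materialize(cost_rank=4, runs_per_day=48,
                         output_rows=5_000, input_rows=2_000_000))  # True

# Same query run once a month: fix correctness first, not cost.
print(should_materialize(cost_rank=4, runs_per_day=0.03,
                         output_rows=5_000, input_rows=2_000_000))  # False
```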

Step 4 — Add guardrails: predictable cost beats heroic debugging

Guardrails don’t replace optimization—they prevent regressions and protect you from “one bad query” incidents. The best guardrails are visible, simple, and tied to ownership.

Operational guardrails

  • Statement timeouts (interactive defaults)
  • Per-user / per-role quotas for heavy workloads
  • Separate compute pools: interactive vs batch vs backfills
  • Retry limits for scheduled jobs (avoid infinite failure loops)

Governance guardrails

  • Tag queries/jobs by team or cost center
  • Weekly “top offenders” review (30 minutes)
  • Budgets and alerts (daily/weekly spend vs baseline)
  • Require code review for scheduled queries over a threshold

A small daily spend alert (example script)

Use this pattern to notify a channel when spend deviates from normal. Replace the SQL with your warehouse’s system tables, and replace the webhook with your preferred notifier. Keep it simple: baseline + threshold + top offenders.

import os
import json
import datetime as dt
import urllib.request

# Environment variables you define in your scheduler:
# WAREHOUSE_DSN, ALERT_WEBHOOK_URL
#
# Pseudocode: replace run_query() with your warehouse client.
def run_query(sql: str):
  # return list[dict]
  raise NotImplementedError("Connect your warehouse client here.")

def post_webhook(url: str, payload: dict):
  data = json.dumps(payload).encode("utf-8")
  req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
  urllib.request.urlopen(req, timeout=10).read()

today = dt.date.today()
yday = today - dt.timedelta(days=1)
window_start = today - dt.timedelta(days=14)

# 1) Pull yesterday spend + baseline (simple rolling average)
spend_sql = f"""
WITH daily AS (
  SELECT
    DATE_TRUNC('day', started_at) AS day,
    SUM(cost_units) AS spend
  FROM warehouse_query_history
  WHERE started_at >= DATE '{window_start.isoformat()}'
    AND started_at <  DATE '{today.isoformat()}'
  GROUP BY 1
)
SELECT
  (SELECT spend FROM daily WHERE day = DATE '{yday.isoformat()}') AS spend_yday,
  (SELECT AVG(spend) FROM daily WHERE day < DATE '{yday.isoformat()}') AS baseline_avg
"""

rows = run_query(spend_sql)
spend_yday = float(rows[0]["spend_yday"] or 0.0)
baseline_avg = float(rows[0]["baseline_avg"] or 0.0)

# 2) Threshold check (tune to taste)
threshold_multiplier = float(os.getenv("ALERT_THRESHOLD_MULTIPLIER", "1.5"))
should_alert = baseline_avg > 0 and spend_yday > baseline_avg * threshold_multiplier

if should_alert:
  top_sql = f"""
  SELECT
    tool_name,
    REGEXP_REPLACE(TRIM(query_text), '\\s+', ' ') AS query_signature,
    COUNT(*) AS runs,
    SUM(cost_units) AS total_cost_units
  FROM warehouse_query_history
  WHERE started_at >= DATE '{yday.isoformat()}'
    AND started_at <  DATE '{today.isoformat()}'
  GROUP BY 1,2
  ORDER BY total_cost_units DESC
  LIMIT 5
  """
  top = run_query(top_sql)

  message_lines = [
    f"Warehouse spend alert for {yday.isoformat()}",
    f"Spend: {spend_yday:.2f} (baseline avg: {baseline_avg:.2f})",
    "",
    "Top offenders:",
  ]
  for i, r in enumerate(top, 1):
    message_lines.append(f"{i}) [{r['tool_name']}] {r['total_cost_units']:.2f} units · {r['runs']} runs")
    message_lines.append(f"   {r['query_signature'][:140]}")

  payload = {"text": "\n".join(message_lines)}
  post_webhook(os.environ["ALERT_WEBHOOK_URL"], payload)

Don’t alert on “total spend” only

Total spend is useful, but it’s not actionable by itself. Always include the top offenders list and the owner signal (tool/user/team) so the alert leads to a fix—not a blame session.

Step 5 — Verify and prevent regressions

  • Re-run the cost ranking after each fix (did the signature drop?)
  • Capture before/after: runtime, bytes, and total cost over 7 days
  • Add a test or review rule for the pattern you fixed (so it doesn’t return)
  • Document the decision: what changed, why, and who owns it
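The before/after comparison doesn't need tooling; a tiny helper over two 7-day windows of your cost mart is enough (the daily numbers below are invented):

```python
def cost_change(before: list[float], after: list[float]) -> float:
    """Percent change in average daily cost for one query signature,
    comparing a 7-day window before the fix with one after."""
    avg_before = sum(before) / len(before)
    avg_after = sum(after) / len(after)
    return (avg_after - avg_before) / avg_before * 100.0

# Hypothetical daily cost_units for one signature, 7 days each side.
before = [120, 115, 130, 118, 125, 122, 119]
after = [40, 38, 45, 41, 39, 42, 44]
print(f"{cost_change(before, after):+.1f}%")
```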

Common mistakes

These are the patterns behind “the warehouse bill jumped” and “the dashboard is slow”. The fixes are usually boring—and that’s a good thing.

Mistake 1 — Treating SELECT * as harmless

On wide tables, SELECT * pulls columns you don’t need, increases scan, and blocks some optimizations. It also makes schema changes riskier.

  • Fix: select only required columns; create curated “analytics views” with stable schemas.
  • Fix: add a lint rule or review checklist for production SQL.

Mistake 2 — Filtering on the wrong thing (no pruning)

Filtering on DATE(created_at) looks correct but often prevents pruning on a partition column.

  • Fix: filter directly on the partition column (e.g., event_date).
  • Fix: push filters into early CTEs before joins.

Mistake 3 — Accidental many-to-many joins

This is the fastest way to create huge intermediate datasets and wrong totals. It’s also hard to notice until cost spikes.

  • Fix: validate uniqueness of join keys (dimensions should be 1 row per key).
  • Fix: pre-aggregate facts before joining; avoid joining fact-to-fact unless intentional.

Mistake 4 — Using DISTINCT as a “band-aid”

DISTINCT can hide a modeling problem and often forces expensive sorts/shuffles.

  • Fix: fix the join duplication instead of de-duplicating late.
  • Fix: if you must de-duplicate, do it in a narrow, pre-filtered subquery.
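If de-duplication really is needed, keep it narrow and explicit. This toy sketch is the Python analogue of the common keep-the-newest-row-per-key pattern (ROW_NUMBER() OVER (PARTITION BY key ORDER BY ts DESC) = 1 in SQL):

```python
# De-duplicate a small, pre-filtered set: keep only the newest row per order.
rows = [
    {"order_id": 1, "ts": "2026-01-01", "status": "created"},
    {"order_id": 1, "ts": "2026-01-02", "status": "paid"},
    {"order_id": 2, "ts": "2026-01-01", "status": "created"},
]

latest: dict[int, dict] = {}
for r in sorted(rows, key=lambda r: r["ts"]):
    latest[r["order_id"]] = r  # later timestamps overwrite earlier ones

deduped = list(latest.values())
print(len(deduped))           # 2
print(latest[1]["status"])    # paid
```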

Mistake 5 — Unbounded windows and global sorts

Window functions are powerful, but unbounded frames and global ORDER BY over huge tables can be very costly.

  • Fix: limit the time range; pre-aggregate; restrict window frames.
  • Fix: only sort at the end, and only the rows you’ll actually display.

Mistake 6 — Letting dashboards refresh “as fast as possible”

A 5-minute refresh on a heavy model is rarely worth the cost. BI refresh schedules are a common hidden multiplier.

  • Fix: stagger refreshes; increase intervals; cache stable layers.
  • Fix: build a single “serving table” for dashboards instead of recomputing raw logic.

If you’re unsure where to start

Find the top query signatures by total cost over 14 days. Then ask two questions: (1) can we read less? (2) can we run this less often? Those two fixes cover most real-world savings.

FAQ

How do I find my most expensive queries?

Use your warehouse’s query history tables/views and rank by the metric your bill uses (credits, bytes scanned, slot-ms, runtime). Group by query signature (normalized SQL), not by raw query ID, so you can fix repeated patterns instead of one-offs.

What’s the fastest optimization that usually saves money?

Add pruning-friendly filters (partition keys) and remove SELECT *. These reduce data scanned and often reduce compute too. If a query is repeated (dashboards/jobs), the next fastest win is to run it less often or materialize the expensive layer.

Partitioning vs clustering: which one helps cost more?

Partitioning helps most when your queries reliably filter by the partition key (often date). Clustering helps when you filter by other columns (tenant, user, status) and want better pruning within partitions. In practice: partition by the most common “time slice”, and cluster by the most common selective filters you use in WHERE and joins.

Is caching a real cost strategy or just a performance trick?

Caching can reduce repeated compute, but it’s not a guaranteed control mechanism by itself because cache hit rates depend on query shape, parameters, and timing. Treat caching as a bonus. For predictable cost, prefer materialization + scheduling + workload isolation.

When should I materialize a query (table/view/materialized view)?

Materialize when the query is expensive, stable, reused, and the output is much smaller than the input. A good sign is a query that appears in your top offenders list and runs many times per day (dashboards, downstream models).

How do we set limits without blocking legitimate analysis?

Separate workloads: keep interactive exploration on a “default” pool with reasonable timeouts, and create an explicit “heavy” pool for big backfills/one-off analysis with approvals. Add clear error messages and a path forward (“run this in batch warehouse”), and you’ll reduce both cost and frustration.

What metrics should we monitor weekly for warehouse cost control?

Track total spend, spend by team/tool, top query signatures, p95 runtime for recurring queries, and concurrency peaks. Weekly review should result in a short list: “fix these 3 signatures” and “adjust these 2 schedules/guardrails.”

Cheatsheet

A scan-fast checklist for warehouse cost control and the queries that burn money.

Detect: find the money burners

  • Rank by cost metric your bill uses (credits/bytes/runtime)
  • Group by query signature (normalize SQL)
  • Capture tool + user + schedule (BI/dbt/notebook)
  • Look for repeated queries and concurrency spikes
  • Record before/after metrics for every fix

Diagnose: why is it expensive?

  • Over-scan: big bytes read; tiny output
  • Join explosion: intermediate rows explode
  • Sort/shuffle: DISTINCT, wide group-bys, global ORDER BY
  • Repeat compute: expensive query runs constantly
  • Concurrency: same query starts many times per minute

Fix: the highest-leverage moves

  • Project only needed columns (no SELECT *)
  • Filter on partition keys (enable pruning)
  • Push filters before joins (prune early)
  • Validate join key uniqueness; pre-aggregate facts
  • Materialize reused heavy layers; incrementalize backfills
  • Reduce refresh frequency; stagger schedules

Prevent: guardrails for predictable spend

  • Timeouts / quotas / resource monitors (vendor-specific)
  • Separate interactive vs batch compute pools
  • Daily baseline alert + top offenders list
  • Ownership: tags by team/cost center
  • Weekly “top signatures” review (30 minutes)

Print this mental model

If a query is expensive, it’s usually because it reads too much, computes too much, or runs too often. Fixing any one of those is a win. Fixing two is how budgets become predictable.

Wrap-up

Warehouse cost control is mostly about routine: measure what’s expensive, fix the patterns that repeat, and add guardrails so cost stays stable as the team and data grow. If you implement just one thing from this post, make it a weekly “top query signatures” review with clear ownership and a short optimization backlog.

A simple next-week plan

  • Build a 14-day “top signatures” report from query history
  • Pick the top 5 and label each: over-scan / join explosion / repeat compute / concurrency
  • Apply one fix per query (small, surgical changes)
  • Add one guardrail: timeout/quota or workload separation
  • Re-measure after 7 days and document the before/after

Keep optimization boring

The best outcome is not a heroic rewrite. It’s a warehouse where the bill makes sense, the dashboards are stable, and “runaway queries” are rare because the system nudges people toward safe defaults.

Quiz

Quick self-check.

1) In warehouse cost control, what most often causes a “surprise” bill spike from a single query?
2) Your warehouse spend jumped yesterday. What’s the best first move?
3) Which pattern is a classic sign of a join explosion?
4) For pruning to reduce scan cost, what usually must be true?