Programming · Data Structures

Hash Tables, Heaps, and Tries: When to Use What

Pick the right structure quickly with practical examples.

Reading time: ~8–12 min
Level: All levels

Hash tables, heaps, and tries solve different “shapes” of problems: exact lookup, best-next choice, and prefix search. This guide gives you a fast decision process, real patterns (top-K, frequency counting, autocomplete), and the pitfalls that show up in interviews and production code.


Quickstart

If you need to pick a structure fast (interview, code review, or you’re staring at a perf issue), use this: start from the operation you do most often, then choose the structure that makes that operation cheap.

Fast decision map

  • Need exact key → value lookup? Use a hash table (map/dict).
  • Need “always get the smallest/largest next”? Use a heap (priority queue).
  • Need prefix search / autocomplete? Use a trie (prefix tree).
  • Need order + range queries? Consider a balanced tree (not covered deeply here).

3 questions to ask before coding

  • What are the hot operations? (lookup, insert, delete, pop-min, prefix-find)
  • What’s “big” here? (#items vs key length vs update rate)
  • What are your constraints? (memory, ordering, worst-case guarantees, concurrency)

Mini cheat: complexity by intuition

  • Hash table: feels like a cabinet of labeled drawers; best at exact access by key; average ~O(1) lookup/insert.
  • Heap: “always give me the next best”; best at min/max or priority scheduling; O(log n) push/pop, O(1) peek.
  • Trie: walk letters/digits step by step; best at prefix queries; O(L), where L = key length.

Quick heuristic

If you can phrase your problem as “given a key, find the value,” it’s probably a hash table. If you can phrase it as “repeatedly take the next most important thing,” it’s a heap. If you can phrase it as “find everything that starts with …,” it’s a trie.

Overview

Choosing the wrong data structure is one of the fastest ways to end up with slow code, complicated workarounds, or both. The tricky part is that hash tables, heaps, and tries can all store “keyed” data—so beginners often treat them as interchangeable. They are not.

What this post covers

  • The mental model for each structure (what it “wants” to do)
  • When to choose it (and when not to)
  • Common patterns: frequency counts, top-K, autocomplete
  • Gotchas that matter in interviews and real systems

What this post intentionally skips

  • Deep math proofs and implementation details of resizing/rehashing
  • All tree variants (AVL/Red-Black/B-Trees) beyond brief comparison
  • Specialized variants (Fibonacci heaps, radix tries) unless helpful

A useful framing

Data structures are tradeoffs: you spend time, memory, or complexity to make a specific operation cheap. The “right” choice is the one that makes your hot path simple and fast.

Core concepts

Hash tables (maps/dicts): fast exact lookup

A hash table stores (key → value) pairs and uses a hash function to jump near the right location. In typical implementations, lookup/insert/delete are average O(1) because you do “a small constant amount of work” regardless of dataset size—until you collide or trigger a resize.

Use a hash table when

  • You need membership tests: “have we seen X?”
  • You’re counting/frequency mapping
  • You need de-duplication (often a set)
  • You want cache-like access by ID (userId → profile)
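
As a concrete illustration, here is a minimal membership-test sketch in Python (the function name `first_duplicate` is just for this example): the set answers “have we seen X?” in average O(1), so the whole scan stays linear.

```python
def first_duplicate(items):
    """Return the first item that appears twice, or None.

    Each `in` check against the set is average O(1),
    so scanning n items is average O(n) overall.
    """
    seen = set()
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None

print(first_duplicate([3, 1, 4, 1, 5]))  # 1
```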

Watch out for

  • Collisions: many keys landing in the same bucket slows things down
  • Resizes: occasionally expensive (rehashing)
  • No ordering guarantees in general (language-dependent)
  • Worst-case O(n) if many keys collide or adversarial input exists

Heaps: a priority queue, not a sorted list

A heap is a tree-like structure stored in an array that maintains a partial order: the minimum (or maximum) element is always at the top. It does not keep everything sorted. Its superpower is: “give me the next best element quickly”.
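
You can see the partial order directly with Python’s heapq, sketched here on a small list:

```python
import heapq

nums = [7, 2, 9, 4, 1]
heapq.heapify(nums)  # rearrange in place into heap order: O(n)

print(nums[0])  # 1 -- the minimum is always at index 0
# The rest of `nums` is only partially ordered, not sorted.

# Popping repeatedly *does* yield sorted order, one O(log n) pop at a time:
print([heapq.heappop(nums) for _ in range(len(nums))])  # [1, 2, 4, 7, 9]
```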

Use a heap when

  • You repeatedly need min/max: scheduling, simulation, Dijkstra/A*
  • You need top-K / bottom-K from a stream
  • You’re merging sorted lists efficiently
  • You need “next event time” in event loops

Watch out for

  • Searching arbitrary elements is slow (often O(n))
  • Decrease-key is not built into many standard libraries
  • Heap gives you the top element fast, but not “ranked order” for free
  • Stable ordering by insertion time is not guaranteed

Tries: prefix trees for strings (and similar keys)

A trie stores keys by walking character-by-character (or byte-by-byte). Instead of comparing whole strings, you traverse a path. That means operations scale with key length L, not with number of keys n: lookup/insert/prefix-find are typically O(L).
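
A minimal trie can be sketched with nested dicts; `make_trie` and `has_prefix` are illustrative names, and the `"$"` end-of-word sentinel is an arbitrary choice for this example:

```python
def make_trie(words):
    """Nested-dict trie: each node maps char -> child node.
    The sentinel key "$" marks end-of-word (any non-character key works)."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def has_prefix(trie, prefix):
    """O(L) in the prefix length, regardless of how many words are stored."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return False
        node = node[ch]
    return True

t = make_trie(["app", "apple", "apt"])
print(has_prefix(t, "appl"))  # True
print(has_prefix(t, "b"))     # False
```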

Use a trie when

  • You need prefix queries: autocomplete, search suggestions
  • You need “startsWith” checks constantly
  • You want lexicographic traversal (e.g., list words in order)
  • You need to store many keys with shared prefixes (can compress well with variants)

Watch out for

  • Memory: lots of nodes/edges (especially naive implementations)
  • Key normalization: casing, Unicode, punctuation decisions matter
  • For small key sets, a hash table can be simpler and faster
  • Concurrent writes can be tricky without careful design

The “shape of your query” decides the structure

  • “Find this exact key.” → hash table (constant-time average access).
  • “Repeatedly get next min/max.” → heap (fast push/pop of the extreme element).
  • “Find all keys with this prefix.” → trie (prefix traversal avoids scanning all keys).

Step-by-step

Here’s a practical way to choose between hash tables, heaps, and tries. You can use this as an interview “thinking aloud” framework or as a design checklist in a real system.

Step 1 — Write down your operations (the real requirements)

  • Exact lookup: get(key), contains(key)
  • Updates: put(key, value), delete(key), increment counters
  • Priority: push(item, priority), pop-min/pop-max, peek
  • Prefix: startsWith(prefix), listAll(prefix), longestPrefix

Step 2 — Pick the default structure for the hot path

If the hot path is exact access

Use a hash table (or set). It’s usually the simplest and fastest tool for membership tests and key-based retrieval.

  • IDs to objects (userId → user)
  • Seen/visited flags in graph/search
  • Frequency counts and grouping

If the hot path is “next best”

Use a heap (priority queue). It’s designed for repeated selection of the minimum/maximum (or highest priority).

  • Top-K elements
  • Task scheduling by priority/time
  • Shortest path algorithms

If the hot path is prefix search

Use a trie when you must do many prefix queries without scanning all keys. It shines in autocomplete and dictionary-like tasks.

  • Autocomplete suggestions
  • Prefix-based filtering
  • Longest prefix matching (routing, tokenization)

If you need ordering/ranges

This is where balanced trees (ordered maps/sets) often beat hash tables. Heaps don’t support range queries either.

  • “Give me keys between A and B”
  • “What’s the next key after X?”
  • “Iterate in sorted order cheaply”
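
Python’s standard library has no ordered map, but a sorted list plus `bisect` sketches the same range query (adequate when inserts are rare, since keeping the list sorted costs O(n) per insert; `keys_between` is a name made up for this example):

```python
import bisect

def keys_between(sorted_keys, lo, hi):
    """All keys in [lo, hi], found in O(log n + k) via binary search."""
    left = bisect.bisect_left(sorted_keys, lo)
    right = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[left:right]

keys = sorted([15, 3, 42, 8, 23])  # [3, 8, 15, 23, 42]
print(keys_between(keys, 8, 23))   # [8, 15, 23]
```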

Step 3 — Apply common patterns (with copy/pasteable code)

Pattern A: frequency counting + fast membership (hash table)

Use a dictionary/map when you need counts, grouping, or “have we seen this?” checks. This is also the backbone of many interview problems (anagrams, duplicates, sliding windows).

from collections import defaultdict

def top_repeated_words(words, min_count=2):
    # Hash table pattern: O(n) average time
    counts = defaultdict(int)
    for w in words:
        w = w.strip().lower()
        if not w:
            continue
        counts[w] += 1

    # Keep only items that repeat
    return {w: c for (w, c) in counts.items() if c >= min_count}

# Example
print(top_repeated_words(["Apple", "apple", "banana", "Banana", "kiwi", "banana"]))

  • Why it works: each update hits a key bucket directly (average O(1)).
  • Gotcha: normalize keys (case, whitespace, Unicode) or you’ll count “Apple” and “apple” separately.

Pattern B: top-K from a stream (heap)

When you can’t (or don’t want to) sort everything, keep a small heap of size K. This gives you O(n log K) time instead of O(n log n), and it works on streaming input.

import heapq

def top_k(nums, k):
    if k <= 0:
        return []

    heap = []  # min-heap of the current top K
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        else:
            # If x is bigger than the smallest in the top-K, replace it
            if x > heap[0]:
                heapq.heapreplace(heap, x)

    # heap is not sorted; sort at the end if you need ordering
    return sorted(heap, reverse=True)

print(top_k([5, 1, 9, 2, 9, 3, 10, 4], k=3))  # [10, 9, 9]

  • Why it works: heap top is the “weakest” of your top-K; easy to evict.
  • Gotcha: don’t assume the heap list is sorted—only heap[0] is guaranteed (min-heap).

Pattern C: autocomplete / prefix queries (trie)

A trie pays a memory cost to make prefix queries cheap. This example supports insertion and suggestions for a prefix. In production, you’d likely add frequency ranks, compress nodes, and cap suggestion counts.

class TrieNode {
  constructor() {
    this.children = new Map(); // char -> TrieNode
    this.isWord = false;
  }
}

class Trie {
  constructor() {
    this.root = new TrieNode();
  }

  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch);
    }
    node.isWord = true;
  }

  suggestions(prefix, limit = 10) {
    // Walk the prefix: O(L)
    let node = this.root;
    for (const ch of prefix) {
      if (!node.children.has(ch)) return [];
      node = node.children.get(ch);
    }

    // DFS from the prefix node
    const out = [];
    const stack = [[node, prefix]];
    while (stack.length && out.length < limit) {
      const [cur, path] = stack.pop();
      if (cur.isWord) out.push(path);

      // Push children in reverse lexicographic order so pops come out in lexicographic order
      const entries = Array.from(cur.children.entries()).sort((a, b) => b[0].localeCompare(a[0]));
      for (const [ch, child] of entries) stack.push([child, path + ch]);
    }
    return out;
  }
}

// Example
const t = new Trie();
["app", "apple", "apply", "apt", "banana"].forEach(w => t.insert(w));
console.log(t.suggestions("app")); // ["app", "apple", "apply"]

  • Why it works: you only traverse the prefix once, then enumerate matches.
  • Gotcha: decide on normalization (casefolding, Unicode) so “App” and “app” behave as intended.

Step 4 — Combine structures when the problem demands it

Real systems often need two properties at once. That’s where combinations show up: use one structure for fast access, another for ordering or “next best.”

Hash table + heap

Great for “top-K by score” where scores change, or for caching policies. The hash table finds items fast; the heap picks the next eviction/priority candidate.

  • Leaderboard-ish problems
  • Task scheduling with updates
  • Approximate LRU/LFU patterns (with careful invariants)
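
One hedged sketch of the combination, using lazy deletion (the `TopScores` class and its method names are inventions for this example): the dict holds current scores, while the heap holds possibly-stale entries that get skipped on read.

```python
import heapq

class TopScores:
    """Sketch: current scores in a dict, retrieval via a heap with
    lazy deletion -- stale heap entries are discarded on pop."""

    def __init__(self):
        self.scores = {}   # item -> current score (hash table: O(1) access)
        self.heap = []     # (-score, item) pairs, possibly stale

    def update(self, item, score):
        self.scores[item] = score
        heapq.heappush(self.heap, (-score, item))  # old entries stay behind

    def best(self):
        while self.heap:
            neg_score, item = self.heap[0]
            if self.scores.get(item) == -neg_score:   # entry still current?
                return item, -neg_score
            heapq.heappop(self.heap)                  # stale: discard
        return None

board = TopScores()
board.update("alice", 10)
board.update("bob", 30)
board.update("bob", 5)        # bob's old entry (30) is now stale
print(board.best())           # ('alice', 10)
```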

Trie + ranking

Autocomplete usually needs ranked suggestions (frequency, recency). Common approach: store counts on nodes and return top suggestions with a capped traversal.

  • Store popularity per word
  • Cap traversal limit to avoid huge prefix expansions
  • Consider compressed tries (radix) to reduce memory
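
A rough sketch of counts-on-nodes ranking (the `RankedTrie` name and the `"$"` count sentinel are inventions for this example; a production version would cache top suggestions per node instead of sorting on every query):

```python
class RankedTrie:
    """Sketch: a nested-dict trie whose terminal "$" key carries a
    popularity count; suggestions are ranked by count, capped at `limit`."""

    def __init__(self):
        self.root = {}

    def add(self, word, count=1):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = node.get("$", 0) + count   # "$" holds the word's count

    def suggest(self, prefix, limit=3):
        # Walk the prefix: O(L)
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        # Collect (word, count) pairs under the prefix node, then rank.
        found = []
        stack = [(node, prefix)]
        while stack:
            cur, path = stack.pop()
            for key, child in cur.items():
                if key == "$":
                    found.append((path, child))
                else:
                    stack.append((child, path + key))
        found.sort(key=lambda wc: -wc[1])
        return [w for w, _ in found[:limit]]

t = RankedTrie()
for word, n in [("app", 5), ("apple", 50), ("apply", 7), ("apt", 1)]:
    t.add(word, n)
print(t.suggest("ap"))   # ['apple', 'apply', 'app']
```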

Don’t over-engineer early

If you have a few thousand keys and only occasional prefix queries, a simple hash table plus filtering can be “good enough.” Reach for tries when prefix queries are frequent or the key space is large enough that scanning hurts.

Common mistakes

These are the mistakes that make code slow, buggy, or surprisingly complex. Fixing them usually simplifies your solution immediately.

Mistake 1 — Using a heap as if it were sorted

A heap only guarantees the top element (min or max). The underlying array is not sorted.

  • Symptom: “Why isn’t my heap list in order?”
  • Fix: pop repeatedly to get sorted order, or sort once at the end (like top-K).

Mistake 2 — Expecting fast removal/search in a heap

Removing an arbitrary item is typically O(n) unless you add extra indexing.

  • Symptom: you’re scanning the heap to find a task/user/item.
  • Fix: add a hash table index (item → position) or redesign the operation.

Mistake 3 — Forgetting key normalization (hash tables and tries)

“App”, “app”, and “app ” are different keys unless you decide otherwise.

  • Fix: normalize on the boundary (lowercase, trim, Unicode normalize where needed).
  • Fix: document it (future you will forget).
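
One way to sketch that normalization boundary in Python (NFKC is an assumption here; choose the normalization form that fits your data):

```python
import unicodedata

def normalize_key(s):
    """One normalization boundary for all keys: trim, Unicode-normalize,
    and casefold so visually identical strings compare equal."""
    return unicodedata.normalize("NFKC", s.strip()).casefold()

print(normalize_key("  App "))   # 'app'
print(normalize_key("Straße"))   # 'strasse' (casefold goes beyond lower())
```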

Mistake 4 — Choosing a trie for simple exact lookup

If you only need exact matches, a hash table is simpler and often faster (and uses less memory in typical runtimes).

  • Fix: use a hash set/map unless prefix queries are a core feature.
  • Fix: measure memory; naive tries can explode in node count.

Mistake 5 — Ignoring worst-case behavior

Hash tables are average O(1), but worst-case can degrade with collisions or adversarial input.

  • Fix: for security-sensitive cases, consider structures with guaranteed bounds.
  • Fix: avoid trusting unbounded user input as hash keys without constraints.

Mistake 6 — Rebuilding instead of updating

Sorting everything each time you need a top element is expensive. Heaps are designed to update incrementally.

  • Fix: keep a heap and update it as data arrives.
  • Fix: only sort when you need the final ranked output.

Interview signal

Saying “heap is great for repeated min/max selection, but not for arbitrary search” is often the line that separates “knows the name” from “knows when to use it.”

FAQ

When should I use a hash table vs a trie for strings?

Use a hash table for exact matches (is this word present? word → data). Use a trie when prefix queries are frequent (autocomplete, startsWith, listing all matches for a prefix). If you don’t need prefix search, a hash set/map is usually simpler and cheaper in memory.

Is a heap the same thing as a binary search tree?

No. A heap only guarantees the top element (min or max) is easy to access. A binary search tree maintains full ordering, enabling in-order traversal and range queries. Heaps are best for priority queues; BSTs are best for ordered maps/sets and ranges.

Why not always sort instead of using a heap?

Sorting is great when you need a full ordered list once. But if you repeatedly need “the next best” while data changes, heaps give you efficient incremental updates (push/pop in O(log n)) without re-sorting everything.

What does “average O(1)” for hash tables really mean?

It means that with a good hash function and controlled load factor, operations take constant time on average across many inputs. Occasionally you pay a bigger cost (resizing/rehashing), and in worst-case collision scenarios performance can degrade.

Tries seem memory-heavy—how do real systems handle that?

They compress. Common approaches include radix tries (compress chains of single-child nodes), storing edges as arrays only where dense, and limiting suggestion expansion. Many autocomplete systems also store a limited “top suggestions” cache at nodes to avoid deep traversals.

What if I need “top-K by score,” but scores change over time?

Use a combination: hash table for fast access (item → current score) and a heap for retrieval. You may also need a strategy for stale heap entries (lazy deletion) or an indexed heap if you require true updates.

Cheatsheet

Save this section. It’s meant to be scanned in 20 seconds before you write code (or answer an interview question).

Pick the structure (fast)

  • Exact lookup / membership: hash table (map/dict) or set
  • Top-K / repeated min/max: heap (priority queue)
  • Prefix / autocomplete / longest prefix: trie
  • Sorted iteration / ranges: ordered map / balanced tree

Time complexity (typical)

  • Lookup / contains: hash table avg O(1); heap O(n) for an arbitrary element; trie O(L).
  • Insert: hash table avg O(1); heap O(log n); trie O(L).
  • Delete: hash table avg O(1); heap O(n) unless indexed; trie O(L), plus node cleanup.
  • Get min/max: hash table not direct; heap O(1) peek, O(log n) pop; trie not direct.
  • Prefix query: hash table needs a scan/filter; heap not a match; trie O(L + output).

Notes: n = number of items; L = key length. Real-world performance depends on implementation details and constants.

If you’re unsure

Default to a hash table for exact lookups. Switch to a heap only when you truly need repeated min/max selection. Switch to a trie only when prefix queries are a core requirement and scanning becomes a bottleneck.

Wrap-up

Hash tables, heaps, and tries are not competing “ways to store data”—they’re solutions to different kinds of questions: exact key lookup, repeated best-next selection, and prefix search. If you start from the hot operation and choose the structure that makes that operation cheap, your code usually becomes both faster and simpler.

What to practice next

  • Hash table problems: anagrams, two-sum variants, sliding windows
  • Heap problems: top-K, merge k sorted lists, Dijkstra
  • Trie problems: autocomplete, word search, prefix replacement

A quick “design habit”

  • State operations + constraints first
  • Choose structure based on the hot path
  • Call out tradeoffs (memory, worst-case, ordering)
  • Only optimize further after measuring


Quiz

A quick self-check on the key ideas above.

1) You need to check if a userId exists and fetch its profile quickly (many lookups). What’s the best default structure?
2) You’re processing a stream of numbers and need the top 100 largest values without sorting everything. What do you use?
3) You need fast autocomplete: given a prefix, return possible words. What structure fits best?
4) True or false: the array backing a heap is always fully sorted.