Data Engineering & Databases · Streaming

Kafka Basics: Events, Partitions, and Consumer Groups

Build streaming pipelines without confusion.

Reading time: ~8–12 min
Level: All levels

Kafka looks intimidating until you get the mental model: it’s a distributed log. Producers append events to topics, topics are split into partitions (for scale), and consumer groups let you process those partitions in parallel without double-reading. This guide covers events, partitions, and consumer groups in practical terms—what each concept means, why it exists, and how to run a tiny local setup to see the behavior with your own eyes.


Quickstart: the fastest way to understand Kafka

If you only do one thing, make Kafka feel tangible: run it locally, create a multi-partition topic, then start two consumers in the same group and watch Kafka split the work. The goal isn’t a perfect setup—it’s a working mental model you can reuse.

Do this in 15–20 minutes

  • Start Kafka (single broker) locally
  • Create a topic with 3 partitions
  • Produce a few keyed events
  • Run two consumers with the same group.id
  • Inspect offsets + lag for the group

What you’ll learn (without theory overload)

  • Why Kafka ordering is per partition
  • Why partitions are the unit of parallelism
  • What a consumer group really coordinates
  • Why offsets are a cursor (not “deleting messages”)
  • How to spot “stuck” consumers quickly

Make it click

Keep a simple rule in your head: partitions are lanes and a consumer group is a team of workers. Kafka assigns lanes to workers. If you add workers, Kafka reassigns lanes (a rebalance).

Overview: what this post covers (and why it matters)

Kafka is often introduced as “a messaging system,” but that framing hides the two ideas that make it powerful: durability (events are persisted and replayable) and scalable consumption (consumer groups coordinate parallel processing).

The practical questions this post answers

  • What is an event in Kafka, and what should go into it?
  • How do partitions affect ordering, scaling, and performance?
  • What does a consumer group coordinate, exactly?
  • Why do rebalances happen, and how do you avoid “thrash”?
  • How do offsets work (and why they’re the key to reliability)?

You’ll also get a minimal local setup and a few operational habits (like checking consumer lag) that save hours when you move from “toy pipeline” to “production job.”

Kafka is not “just queues”

Queues are often “read once and disappear.” Kafka is different: it keeps events for a retention period, and consumers track their position via offsets. That makes replay and backfills a normal workflow, not an emergency.
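To make "offsets are a cursor" concrete, here is a toy model (not Kafka's implementation) of one partition as an append-only list, with two consumer groups keeping independent positions:

```python
# Toy model: a partition is an append-only list; each consumer group
# keeps its own cursor (offset). Reading never deletes events.
log = []  # one partition of a topic

def produce(event):
    log.append(event)
    return len(log) - 1  # offset of the appended event

offsets = {"billing": 0, "analytics": 0}  # independent group cursors

def consume(group, max_events=10):
    start = offsets[group]
    batch = log[start:start + max_events]
    offsets[group] = start + len(batch)  # "commit" the new position
    return batch

for e in ["payment_authorized", "payment_captured", "payment_refunded"]:
    produce(e)

print(consume("billing", 2))     # billing reads the first two events
print(consume("analytics", 10))  # analytics still sees all three
print(consume("billing", 10))    # billing resumes where it left off
```

Notice that "consuming" only moves a group's cursor forward; the log itself is untouched, which is exactly why replay is a normal workflow.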

Core concepts: the mental model that prevents confusion

Here are the terms you’ll see everywhere, explained in the “what it is / what it does / why you should care” style. If you internalize these, Kafka stops feeling magical.

Concept | Mental model | Why it matters in practice
Event | A fact that happened (immutable record) | Design it so downstream systems can process it reliably and replay it safely.
Topic | A named stream (like a table you append to) | It’s your “channel” of events; producers write to it, consumers read from it.
Partition | A shard / lane inside a topic | The unit of ordering and parallelism: ordering is guaranteed within one partition.
Offset | A cursor position in a partition | Offsets are how consumers remember “where I’m at” and how you do replays.
Producer | Appends events to a topic | Producer settings (acks, retries, batching) control durability and throughput.
Consumer | Reads events from a topic | Consumers must keep up, handle failures, and commit offsets correctly.
Consumer group | A team that shares work | Each partition is assigned to at most one consumer in a group at a time.
Broker | A Kafka server node | Brokers store partitions and serve reads/writes; clusters scale by adding brokers.

Events: what to include (and what to avoid)

A Kafka event is typically a key/value pair with metadata. Your “value” is often JSON/Avro/Protobuf, but the bigger question is: can someone process this later without guessing?

Good event design habits

  • Include an event type and schema version
  • Use stable IDs (user_id, order_id, device_id)
  • Include timestamps (event time and ingestion time if needed)
  • Keep it additive (new fields > breaking changes)
  • Prefer facts over derived conclusions (“paid” event > “good customer”)
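Applied to a payment event, those habits might look like the sketch below. The field names are illustrative, not a standard; pick names that fit your schemas:

```python
import json
import time
import uuid

def make_event(event_type, order_id, amount, currency):
    # Facts + stable IDs + a versioned schema + explicit timestamps.
    return {
        "event_type": event_type,        # what happened (a fact, not a judgment)
        "schema_version": 1,             # bump additively; never repurpose fields
        "order_id": order_id,            # stable entity ID (also a good Kafka key)
        "event_id": str(uuid.uuid4()),   # lets consumers deduplicate on replay
        "amount": amount,
        "currency": currency,
        "event_time": time.time(),       # when it actually happened
        "ingested_at": time.time(),      # when the pipeline first saw it
    }

evt = make_event("payment_captured", "1001", 42.50, "EUR")
print(json.dumps(evt, indent=2))
```

The `event_id` is worth the extra bytes: with at-least-once delivery, downstream systems need something stable to deduplicate on.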

Common foot-guns

  • Using the key randomly (destroys ordering guarantees)
  • Embedding huge blobs (hurts throughput and retention costs)
  • Changing meaning without versioning (“status” changes semantics)
  • Relying on consumer-side time as “truth”
  • Assuming Kafka guarantees “exactly once” by default

Partitions: ordering, scaling, and why keys matter

Partitions exist so Kafka can scale. But they also define what “in order” means. Kafka can preserve order for events that land in the same partition—so you typically use a key that groups related events (like order_id) to keep their sequence consistent.

Ordering isn’t global

Kafka does not guarantee a single total order across all partitions of a topic. If you need strict ordering for a stream of related events, ensure they share a key that maps them to the same partition.
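Clients pick the partition by hashing the key (the Java client's default partitioner uses murmur2). This sketch uses CRC32 purely to illustrate the property that matters: the same key always lands in the same partition.

```python
import zlib

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    # Illustrative only: real Kafka clients use their own hash
    # (murmur2 in the Java client), but any deterministic hash
    # demonstrates "same key -> same partition".
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

events = [
    ("order-1001", "payment_authorized"),
    ("order-1002", "payment_authorized"),
    ("order-1001", "payment_captured"),   # same key as the first event
]

for key, event in events:
    print(f"key={key} -> partition {partition_for(key)}: {event}")
```

Both `order-1001` events map to the same partition, so any consumer reading that partition sees them in the order they were produced.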

Consumer groups: “one partition, one consumer” (within a group)

A consumer group is Kafka’s scaling primitive for reading: Kafka assigns partitions to consumers so that a partition is processed by at most one consumer in the group at a time. Add consumers and Kafka may rebalance assignments. Add partitions and you increase maximum parallelism.
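The "lanes to workers" assignment can be sketched in a few lines. This is a simple round-robin sketch; real Kafka ships several assignors (range, round-robin, cooperative-sticky), but the bound is the same: a consumer with no partition sits idle.

```python
def assign(partitions, consumers):
    # Round-robin sketch of partition assignment within one group.
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [0, 1, 2]

print(assign(partitions, ["c1"]))                    # one consumer owns all lanes
print(assign(partitions, ["c1", "c2"]))              # after a "rebalance", work splits
print(assign(partitions, ["c1", "c2", "c3", "c4"]))  # c4 owns nothing: it idles
```

Adding or removing a consumer changes the output of `assign` — that recomputation is, conceptually, what a rebalance is.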

Step-by-step: run Kafka locally and see the behavior

This is a practical walk-through you can copy/paste. It uses a single-node setup (good for learning) and focuses on the three things that trip people up: partitions, keys, and consumer groups.

Step 1 — Start Kafka locally (single broker, KRaft)

For a local demo, we keep it simple: one broker, plaintext listener, and a small amount of storage. This is not “production secure,” and that’s okay—the goal is to observe behavior.

version: "3.8"

services:
  kafka:
    image: bitnami/kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
      - "29092:29092"
    environment:
      # KRaft (no ZooKeeper) single-node demo
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka:9093
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER

      # Listeners: one for inside Docker, one for your host machine
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,PLAINTEXT_HOST://:29092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      - ALLOW_PLAINTEXT_LISTENER=yes

      # Demo-friendly defaults
      - KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=false
      - KAFKA_CFG_NUM_PARTITIONS=3
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=1
      - KAFKA_CFG_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1
      - KAFKA_CFG_TRANSACTION_STATE_LOG_MIN_ISR=1

Run it

  • Save as docker-compose.yml
  • Start: docker compose up -d
  • Confirm the broker is up: docker logs -f kafka

This setup exposes Kafka on localhost:29092 for your host, while internal Docker traffic uses kafka:9092. That split avoids the classic “advertised.listeners” confusion on laptops.

Step 2 — Create a topic and produce a few keyed events

We’ll create a topic named payments with 3 partitions. Then we’ll publish a few events with a key so related events stick to the same partition (your future self will thank you when debugging ordering).

# Create a 3-partition topic (replication=1 for a single-broker demo)
docker exec -it kafka /opt/bitnami/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create --topic payments --partitions 3 --replication-factor 1

# Describe it (check partitions)
docker exec -it kafka /opt/bitnami/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --describe --topic payments

# Produce keyed events: KEY|VALUE
# (parse.key=true means split key/value using key.separator)
docker exec -it kafka /opt/bitnami/kafka/bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic payments \
  --property parse.key=true \
  --property key.separator="|"

While the producer is running, paste a few lines like these (same key means same partition):

  • order-1001|{"event":"payment_authorized","order_id":"1001","amount":42.50,"currency":"EUR"}
  • order-1002|{"event":"payment_authorized","order_id":"1002","amount":13.99,"currency":"EUR"}
  • order-1001|{"event":"payment_captured","order_id":"1001","amount":42.50,"currency":"EUR"}

Close the producer with Ctrl+C when you’re done.

Step 3 — Consume with a consumer group (and watch parallelism)

Now the payoff: start consumers with the same group id. Kafka will assign partitions across them. With 3 partitions, at most 3 consumers can actively read in parallel in a single group.

Terminal A (consumer 1)

  • Run a console consumer with group id demo-payments
  • Print partition + offset so you can see what’s happening
  • Read from the beginning for the first run

Terminal B (consumer 2)

  • Start the same command again (same group id)
  • Notice Kafka rebalances and reassigns partitions
  • Produce more events and watch which terminal prints them

Here is a Python consumer (using confluent-kafka) that does exactly that — run the same script in both terminals:

import json
from confluent_kafka import Consumer, KafkaException

BOOTSTRAP = "localhost:29092"
TOPIC = "payments"
GROUP_ID = "demo-payments"

conf = {
    "bootstrap.servers": BOOTSTRAP,
    "group.id": GROUP_ID,
    "auto.offset.reset": "earliest",      # first run only; after commits it will resume
    "enable.auto.commit": True,           # start simple; see notes below for manual commits
}

c = Consumer(conf)
c.subscribe([TOPIC])

try:
    while True:
        msg = c.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            raise KafkaException(msg.error())

        key = msg.key().decode("utf-8") if msg.key() else None
        value = msg.value().decode("utf-8") if msg.value() else None

        payload = None
        if value:
            try:
                payload = json.loads(value)
            except json.JSONDecodeError:
                payload = value

        print(
            f"topic={msg.topic()} partition={msg.partition()} offset={msg.offset()} key={key} value={payload}"
        )

except KeyboardInterrupt:
    pass
finally:
    c.close()

Auto-commit vs manual commit (practical guidance)

Auto-commit is fine for low-stakes pipelines and learning. In production, you often commit offsets after your processing succeeds (especially if you write to a database or call an external API). The key idea: committing an offset means “this group has processed up to here,” not “Kafka deleted the message.”
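A toy simulation (no Kafka involved; all names are illustrative) shows why committing only after processing, plus idempotent handlers, gives you safe at-least-once behavior:

```python
# Toy at-least-once loop: commit only after processing succeeds.
# A crash mid-batch means uncommitted events are re-read on restart,
# so processing must be idempotent (safe to apply twice).

events = ["e0", "e1", "e2", "e3"]
committed = 0          # the group's saved cursor
processed = set()      # idempotency guard (e.g. keyed by event_id)

def process(event):
    processed.add(event)  # a real handler would write to a DB idempotently

def run(crash_after=None):
    global committed
    offset = committed                 # resume from the last commit
    while offset < len(events):
        process(events[offset])
        if crash_after is not None and offset == crash_after:
            return                     # simulate a crash BEFORE committing
        committed = offset + 1         # commit only after success
        offset += 1

run(crash_after=1)   # crash after processing e1 but before committing it
run()                # restart: e1 is processed again (harmless, idempotent)
print(committed, sorted(processed))
```

If you flipped the order (commit first, then process), the crash would skip `e1` entirely — that is the auto-commit failure mode described above.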

Step 4 — Check consumer lag (the fastest health signal)

When a pipeline “feels slow,” consumer lag usually tells you whether you’re keeping up. Lag is the gap between the end of the log and the group’s current committed offset.
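The arithmetic is simple enough to sketch. The numbers below are made up, but the calculation mirrors what `kafka-consumer-groups.sh --describe` reports as LAG:

```python
# Lag per partition = log end offset - the group's committed offset.
log_end_offsets = {0: 120, 1: 95, 2: 300}            # latest offset per partition
committed = {"demo-payments": {0: 120, 1: 80, 2: 10}}

def lag(group):
    return {p: log_end_offsets[p] - committed[group].get(p, 0)
            for p in log_end_offsets}

report = lag("demo-payments")
print(report)                    # partition 2 is the one falling behind
total = sum(report.values())
print(f"total lag: {total}")
```

A steady nonzero lag is often fine; lag that grows over time means the consumer can't keep up.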

A minimal operations checklist

  • Is the consumer running and polling regularly?
  • Is lag growing (consumer is falling behind)?
  • Are you blocked on downstream I/O (DB, API, storage)?
  • Do you have enough partitions for your desired parallelism?
  • Are rebalances happening too often (thrashing)?

Step 5 — Choose partitions and consumer groups deliberately

This is where Kafka design becomes “engineering” instead of “copy a tutorial.” Use partitions for parallelism and ordering boundaries; use consumer groups for scaling a specific workload.

You want… | Use partitions like… | Use consumer groups like…
More throughput | Increase partitions (more lanes) | Add consumers to the same group (more workers)
Strict ordering for an entity | Key by entity id to keep it in one partition | Single group processes it; one consumer handles that partition at a time
Multiple independent readers | Same topic; partitions unchanged | Use different groups (each group has its own offsets)
Replay/backfill | Retention/compaction settings matter | Use a new group id or reset offsets intentionally

A realistic rule of thumb

Your max parallelism for a single consumer group is bounded by the number of partitions. If you run 10 consumers on a topic with 3 partitions, 7 consumers will sit idle (they have nothing to own).

Common mistakes (and how to fix them)

Most Kafka confusion comes from a handful of predictable assumptions. Here are the ones that cause the most production pain, with fixes that are straightforward once you know what to look for.

Mistake 1 — Expecting global ordering across a topic

Kafka guarantees ordering within a partition, not across all partitions. If related events land in different partitions, their observed order can differ between consumers.

  • Fix: key related events (order_id, user_id) so they map to the same partition.
  • Fix: if you truly need global ordering, use one partition (and accept the throughput limit).

Mistake 2 — Random or unstable keys

If the key changes per event, partitioning becomes effectively random and you lose locality and ordering.

  • Fix: choose a stable key that matches your processing boundary (entity id).
  • Fix: avoid “timestamp as key” unless you intentionally want distribution, not ordering.

Mistake 3 — More consumers than partitions (and wondering why it’s slow)

Consumers don’t magically split messages. They split partitions. Extra consumers idle.

  • Fix: increase partitions if you need more parallelism.
  • Fix: scale within reason; too many partitions can increase overhead.

Mistake 4 — Confusing offsets with message deletion

Committing offsets does not remove events. Kafka retention controls deletion (time/size/compaction).

  • Fix: explain offsets as “a cursor the group saves.”
  • Fix: configure retention/compaction based on replay needs.

Mistake 5 — Auto-commit hides failures

If offsets commit before your processing finishes, a crash can skip work (you “acknowledged” too early).

  • Fix: for critical pipelines, commit offsets after processing succeeds.
  • Fix: design idempotent processing so replays are safe.

Mistake 6 — Rebalance thrash (consumers constantly rejoining)

If a consumer doesn’t poll in time (long processing, GC pauses, slow I/O), Kafka may consider it dead and rebalance.

  • Fix: keep processing time predictable; batch work or offload heavy tasks.
  • Fix: tune timeouts and poll intervals to match your workload.

Mistake 7 — Using Kafka as a blob store

Large messages reduce throughput, increase latency, and make retention expensive.

  • Fix: store large payloads elsewhere (object storage) and send references in Kafka.
  • Fix: compress when appropriate and keep message sizes sane.

Mistake 8 — No “reset strategy” for replays

Teams either never replay (fear), or replay accidentally (panic). Both are avoidable.

  • Fix: treat replays as a normal operation: new group id for backfills, explicit offset resets for corrections.
  • Fix: log dataset/pipeline versions so you can reproduce outcomes.

FAQ

Is Kafka a queue or a log?

Kafka is a distributed log. It can behave like a queue when a single consumer group processes a topic, but the underlying model is append + retain + replay, with consumption tracked via offsets.

What’s the difference between a topic and a partition?

A topic is the named stream; partitions are shards inside it. Partitions are what Kafka uses to scale storage and throughput. Ordering guarantees apply within a partition, not across the entire topic.

How many partitions do I need?

Start with the parallelism you need and grow intentionally. A consumer group can process up to one partition per consumer at a time, so partitions set your max parallelism. Don’t over-partition early; too many partitions can increase overhead (more files, more coordination).

Why are my events “out of order”?

Because ordering is per partition. If related events use different keys (or no key), they may land in different partitions and be consumed in different orders. Fix it by choosing a stable key (like order_id) for sequences that must stay ordered.

Can two different consumer groups read the same topic?

Yes—and that’s a feature. Each consumer group maintains its own offsets, so multiple teams/systems can independently process the same event stream without interfering with each other.

What does committing an offset actually do?

It records the group’s progress. Committing an offset means “we have processed up to this point in this partition.” It does not delete messages. Deletion is controlled by retention and compaction settings.

Do I still need ZooKeeper?

Newer Kafka deployments use KRaft (no ZooKeeper) for metadata management. Some existing clusters still run with ZooKeeper, so you may see both in the wild, but the “modern default” is moving toward ZooKeeper-free setups.

Cheatsheet

Use this as a “quick recall” when building or reviewing a Kafka pipeline. It’s intentionally short and practical.

Design checklist

  • Pick an event key that matches your ordering boundary (user_id/order_id)
  • Version your event schema (and keep changes additive)
  • Decide the failure model: at-least-once + idempotent processing is a great default
  • Choose partitions based on needed parallelism (and future growth)
  • Decide retention/compaction based on replay requirements

Operations checklist

  • Check consumer lag first when “things are slow”
  • Watch for frequent rebalances (a stability smell)
  • Ensure consumers poll regularly (don’t block forever)
  • Monitor error rates and dead-letter strategies for poison messages
  • Test replay/backfill steps before you need them

Rule | Meaning | Typical implication
Ordering is per partition | Only events in the same partition preserve order | Use stable keys for sequences that must be ordered
Parallelism is per partition | One partition is owned by one consumer in a group at a time | More consumers than partitions won’t speed up processing
Offsets are the cursor | Consumers track progress by committing offsets | Replay is normal: change group id or reset offsets intentionally

If you remember only three words

Keys. Partitions. Offsets. Those three explain most Kafka “mysteries.”

Wrap-up

Kafka is easier than it looks once you stop thinking “messages” and start thinking “append-only logs with coordinated readers.” Events live in topics, topics scale via partitions, and consumer groups let you scale processing without duplicating work. If you can answer “what’s my key?” and “how many partitions do I need?” you’re already ahead of most first deployments.

Next actions (pick one)

  • Run the local demo again and try different keys to see partition behavior
  • Add a third consumer and observe how 3 partitions max out parallelism
  • Sketch your first real event: name, key, schema version, and what “done” processing means
  • Write down your replay plan: new group id vs explicit offset reset


Quiz

Quick self-check; every answer is covered above.

1) Where does Kafka guarantee message ordering?
2) A topic has 3 partitions. What is the maximum number of consumers in one consumer group that can actively read in parallel?
3) What most commonly determines which partition an event goes to?
4) What does committing an offset mean for a consumer group?