Cyber security · Logging & SIEM

Security Logging That Helps: What to Record and Why

Turn logs into answers instead of noise.

Reading time: ~8–12 min
Level: All levels

“We had logs” isn’t the same as “we had answers.” Helpful security logging is opinionated: it records who did what, to which thing, from where, when, and with what result—and it does it consistently enough that you can investigate incidents without guessing. This guide explains what to record (and what to avoid), how to structure logs for a SIEM, and how to keep volume under control.


Quickstart

If you only have 60–90 minutes, do these in order. Each step improves detection and incident response immediately, without requiring a full SIEM rollout.

1) Start with the “security spine” events

These are the events you’ll ask for first during an incident.

  • Authentication: login success/fail, MFA challenge, password reset, account lockout
  • Authorization: allow/deny decisions on sensitive actions (admin, billing, data export)
  • Session: session created/refreshed/terminated, suspicious refresh patterns
  • Admin actions: role changes, permission grants, API key creation, config changes

2) Make logs correlatable

If you can’t tie events together, you can’t tell a story.

  • Add a request_id / trace_id to every service hop
  • Log both user_id and session_id (not just a display name)
  • Capture source_ip, user_agent, and tenant/org_id where applicable
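The correlation fields above are easiest to get right if they are attached once per request and reused everywhere. A minimal sketch in Python, using `contextvars` so the id follows the request through your code (all names here are illustrative, not from a specific framework):

```python
import contextvars
import uuid

# Context variable carrying the per-request correlation id
REQUEST_ID = contextvars.ContextVar("request_id", default=None)

def new_request_id():
    """Generate one id per request (or accept one from an upstream gateway header)."""
    rid = f"req_{uuid.uuid4().hex[:12]}"
    REQUEST_ID.set(rid)
    return rid

def audit_event(action, user_id, session_id, source_ip, user_agent):
    """Build a correlatable event carrying every field from the checklist above."""
    return {
        "action": action,
        "user_id": user_id,
        "session_id": session_id,
        "source_ip": source_ip,
        "user_agent": user_agent,
        "request_id": REQUEST_ID.get(),  # same id on every service hop
    }
```

In a real service you would set the id in middleware at the edge and read it in your logging layer, so no handler has to pass it around explicitly.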

3) Centralize (even if it’s basic)

Centralization is what turns a pile of files into a security capability.

  • Ship logs off-host (don’t rely on local disk during an incident)
  • Normalize to structured JSON where you can
  • Set a retention policy: “hot” for search, “cold” for long-term

4) Add 5 high-signal alerts (not 50)

A few reliable detections beat a noisy dashboard nobody trusts.

  • Repeated auth failures followed by success (possible credential stuffing)
  • Privilege change (role/admin grant) by a non-admin path
  • API key created / rotated outside normal change window
  • Data export/download spike or unusual destination
  • Audit log gap (no events from a system that should be chatty)
Do not log secrets (even “temporarily”)

Avoid passwords, raw access tokens, refresh tokens, private keys, full credit card numbers, and full session cookies. If it can be used to authenticate, treat it as a secret and either exclude it or store a minimal fingerprint (hash/prefix) for debugging.

Overview

Security logging that helps is not “log everything.” It’s log the decisions and the context so you can answer questions like: Which account was compromised? How did they get in? What did they touch? How far did they get?

What this post covers

  • A simple mental model for what to log (and why those events matter)
  • The minimum field set that makes logs searchable, correlatable, and SIEM-friendly
  • A step-by-step way to implement logging across apps, infrastructure, and identity providers
  • Common mistakes that create noise (or hide attacks) and how to fix them
  • A scan-fast cheatsheet plus a quick quiz to check your understanding

Who this is for

  • Builders who want “good enough” security logging without months of SIEM work
  • Small teams creating audit logs for compliance and incident response
  • Anyone drowning in logs and trying to find signal

What “helpful” looks like

  • You can reconstruct a user session in minutes
  • You can explain why an access decision happened
  • You can detect common attacks with a few robust rules
  • You can retain and trust logs during and after incidents
A practical goal

Aim for investigation-grade logs first. Advanced analytics can come later. If your logs aren’t consistent and structured, the fanciest SIEM won’t save you.

Core concepts

Before tooling, get the concepts right. Security logging is about decisions, actors, and evidence. Once you define those, you can choose sources (app, OS, cloud) and storage (SIEM, data lake) without rewriting everything later.

1) Logs vs metrics vs traces

Security relevance

  • Logs: discrete events (who did what), best for investigations and detections
  • Metrics: counts/aggregates (how many), best for dashboards and capacity
  • Traces: request paths (where it went), best for debugging distributed systems

Why this matters

Most incidents start as logs (“failed login”), become traces (“which service issued the token?”), and end as metrics (“how widespread was it?”). You don’t need everything on day one—but your logs should be structured so you can connect them to metrics/traces later.

2) Audit logs vs security logs

People use these terms loosely. Here’s a practical distinction that helps you design your schema:

  • Audit log — purpose: accountability (“who changed what?”). Examples: role grant, password reset, API key created, settings changed. Must include: actor, action, target, timestamp, result, change details.
  • Security event — purpose: detection/response (“is this suspicious?”). Examples: brute-force pattern, impossible travel, privilege escalation attempt. Must include: actor + context (IP/device), decision/reason, correlation IDs.
  • Operational log — purpose: debugging (“why did it break?”). Examples: stack traces, timeouts, retries, dependency failures. Must include: request_id/trace_id, component, error category (avoid secrets).

3) The “six W’s” field set

Helpful security logging is surprisingly repetitive: the same small set of fields makes 80% of investigations possible. Use this as your minimum schema (adapt names to your stack, but keep the meaning consistent).

Minimum fields (baseline)

  • when: timestamp (UTC) + event_time vs ingest_time
  • who: user_id / service_account + actor_type
  • what: action (verb) + event_type/category
  • where: source_ip + geo/ASN if available + user_agent/device
  • which thing: target resource id/type (project, org, record, endpoint)
  • what happened: result (allow/deny/success/fail) + reason

Fields that save hours later

  • request_id / trace_id for cross-service correlation
  • session_id + auth_method (password, SSO, passkey, API key)
  • tenant/org_id in multi-tenant apps
  • resource_owner_id (who owns the data being accessed)
  • policy_version or rule id for access decisions
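Putting the baseline and the bonus fields together, a small event builder keeps the six W's consistent across services. A sketch with illustrative field names (adapt to your stack):

```python
from datetime import datetime, timezone

def build_security_event(action, actor_id, target_id, result, *, reason=None,
                         source_ip=None, session_id=None, request_id=None):
    """Emit a six-W event: when, who, what, where, which thing, what happened."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),  # when (UTC)
        "actor": {"id": actor_id, "type": "user", "session_id": session_id},  # who
        "action": action,                       # what (verb)
        "source_ip": source_ip,                 # where
        "target": {"id": target_id},            # which thing
        "result": result,                       # what happened
        "reason": reason,                       # why (policy/rule)
        "request_id": request_id,               # correlation
    }
```

Because every service calls the same builder, queries like `action=grant_role AND result=success` work everywhere without per-service parsing.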

4) Signal vs noise: design for “few good questions”

Logging becomes noise when you record everything at the same priority and without structure. A practical approach is to decide your top investigation questions and ensure you can answer them with certainty.

Five questions your logs should answer

  • Which identity was used (user/service), and how did it authenticate?
  • What sensitive actions happened (and were they allowed/denied)?
  • What changed (permissions, keys, configuration), and by whom?
  • What data moved (exports/downloads), to where, and how much?
  • Are there gaps or tampering signals (missing logs, altered timestamps)?
The “verb-first” naming trick

Name actions as verbs (e.g., login, grant_role, create_api_key, export_data). It makes queries and dashboards obvious and keeps taxonomy stable as features evolve.

Step-by-step

This is a pragmatic rollout you can do in a small team. It’s designed to produce usable security logging quickly, then strengthen it over time (structure, normalization, retention, and alerting).

Step 1 — Define what “good” means for your org

  • Assets: which systems and data are highest value (admin panel, payments, PII, production deploys)
  • Threats: credential stuffing, phishing, insider misuse, API abuse, supply-chain changes
  • Response needs: do you need user-level reconstruction, compliance audit trails, or both?
  • Constraints: privacy rules, budget, retention requirements, and who will actually review alerts

Step 2 — Pick your first log sources (don’t start everywhere)

Start where identity and change live. If you log these well, you can detect many attacks even without deep endpoint telemetry.

High-leverage sources

  • Identity provider: SSO, MFA events, device posture, admin changes
  • Application audit: role changes, exports, key creation, settings
  • Reverse proxy/WAF: request metadata, blocks, rate limiting, bot signals
  • Cloud control plane: IAM changes, storage access, network/security group changes

Add later (still useful)

  • OS / endpoint: process creation, privileged commands, persistence
  • DNS: suspicious domains, unexpected lookups from servers
  • Database: admin connections, schema changes, bulk reads
  • Email: forwarding rules, unusual login patterns

Step 3 — Standardize an audit event schema (your future SIEM loves you)

The goal is not a perfect standard. The goal is a stable schema across services so queries and detections are reusable. Here’s a compact audit event format that works well for most web apps.

{
  "ts": "2026-01-09T13:21:53Z",
  "event_type": "audit",
  "action": "grant_role",
  "result": "success",
  "reason": "admin_console",
  "actor": {
    "type": "user",
    "id": "usr_7f3c2a",
    "ip": "203.0.113.10",
    "user_agent": "Mozilla/5.0",
    "session_id": "ses_4a1d9c",
    "auth_method": "sso_mfa"
  },
  "target": {
    "type": "user",
    "id": "usr_91b2dd",
    "org_id": "org_3c0e"
  },
  "change": {
    "field": "role",
    "from": "member",
    "to": "admin"
  },
  "correlation": {
    "request_id": "req_01HRQ2F2X0K9",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"
  },
  "service": {
    "name": "api",
    "env": "prod",
    "version": "2026.01.09"
  }
}
Why structured JSON is worth it

With JSON you can query fields directly (actor.id, action, result) instead of brittle string parsing. It also makes it easier to normalize events from multiple services into one SIEM index.

Step 4 — Centralize and normalize (keep raw + parsed)

Centralization typically means “ship logs from apps/hosts to a collector, then to storage.” A reliable pattern is: collect → enrich → route → store raw + store parsed. Raw events are your evidence; parsed events are your query speed.

Normalization checklist

  • Convert timestamps to UTC and keep original if needed
  • Normalize field names (actor.id vs userId vs uid)
  • Enrich with env/service/version and deployment identifiers
  • Keep both event_time and ingest_time to spot delays
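The checklist above can be sketched as a small normalizer that maps each service's local field names onto the shared schema, rewrites timestamps to UTC (keeping the original), and stamps an ingest time. The alias table and field names are assumptions, not a standard:

```python
from datetime import datetime, timezone

# Map each service's local names onto the shared schema (illustrative aliases)
FIELD_ALIASES = {"userId": "actor_id", "uid": "actor_id", "ipAddress": "source_ip"}

def normalize(event, *, ingest_time=None):
    """Rename fields, convert ts to UTC (keeping the original), add ingest_time."""
    out = {FIELD_ALIASES.get(k, k): v for k, v in event.items()}
    if "ts" in out:
        parsed = datetime.fromisoformat(out["ts"])
        if parsed.tzinfo is not None:
            out["raw_ts"] = out["ts"]  # keep the original as evidence
            out["ts"] = parsed.astimezone(timezone.utc).isoformat()
    # event_time vs ingest_time lets you spot delivery delays and gaps
    out["ingest_time"] = (ingest_time or datetime.now(timezone.utc)).isoformat()
    return out
```

Running this in the collector (or just before shipping) means downstream queries never have to special-case one service's naming.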

Storage checklist

  • Hot storage for recent search (days/weeks)
  • Warm/cold storage for retention (months/years)
  • Immutable / append-only option for audit trails
  • Access controls: few admins, strong MFA, change logging on the logging system

Below is an example of a small collector pipeline that parses JSON logs, enriches them, and forwards to your backend. (Treat this as a pattern: the important thing is consistency and safe defaults.)

# Fluent Bit example (pattern): parse JSON, enrich, and forward
[SERVICE]
  Flush        1
  Log_Level    info

[INPUT]
  Name         tail
  Tag          app.audit
  Path         /var/log/app/audit.log
  Parser       json
  Mem_Buf_Limit 50MB
  Skip_Long_Lines On

[FILTER]
  Name         record_modifier
  Match        app.audit
  Record       env prod
  Record       service api

[FILTER]
  Name         nest
  Match        app.audit
  Operation    lift
  Nested_under actor
  Add_prefix   actor_

[OUTPUT]
  Name         http
  Match        app.audit
  Host         logs.example.internal
  Port         443
  URI          /ingest
  Format       json_lines
  tls          On
Collector safety rules
  • Backpressure and buffering: don’t drop logs silently under load
  • Separate pipelines for security/audit vs noisy debug logs
  • Lock down collector endpoints (mTLS if possible)
  • Monitor for “log gaps” (the absence of logs can be a signal)

Step 5 — Redact and minimize (privacy + safety)

Helpful logs are specific, but they’re not a data dump. Keep what you need for security, and redact what creates risk. A pragmatic approach is field allowlists (preferable) plus redaction as a safety net.

Good minimization defaults

  • Store user identifiers (user_id) instead of full profiles
  • Store IP/user-agent for security, but avoid unnecessary request bodies
  • Store resource ids instead of full content (e.g., record_id, not the record data)
  • Hash or partially mask sensitive values (last 4 digits, token prefix) if needed for debugging
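For the last point, a fingerprint keeps a secret searchable and debuggable without being reusable. A minimal sketch (the prefix length and format are arbitrary choices, not a standard):

```python
import hashlib

def fingerprint(secret, *, prefix_len=4):
    """Return a non-reversible stand-in for a secret: a short plaintext
    prefix (for human recognition) plus a SHA-256 digest prefix (for matching)."""
    digest = hashlib.sha256(secret.encode()).hexdigest()
    return f"{secret[:prefix_len]}…{digest[:12]}"
```

Logging `fingerprint(api_key)` lets you match the same key across events and recognize which key family it is, while the logged value cannot be used to authenticate.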

Common redaction targets

  • Authorization headers, cookies, session tokens
  • Passwords, reset tokens, one-time codes
  • Private keys and connection strings
  • Payment data and high-risk PII fields

Here’s a tiny Python “belt and suspenders” sanitizer for JSON-line events before shipping. It’s intentionally simple: the best approach is to avoid logging sensitive fields at the source, but this helps reduce accidental leaks.

import json
import re
import sys

SECRET_KEYS = {"password", "pass", "token", "access_token", "refresh_token", "authorization", "cookie", "set-cookie"}
TOKEN_LIKE = re.compile(r"(eyJ[a-zA-Z0-9_-]{10,}\.[a-zA-Z0-9_-]{10,}\.[a-zA-Z0-9_-]{10,})")  # JWT-ish

def sanitize(obj):
    if isinstance(obj, dict):
        out = {}
        for k, v in obj.items():
            lk = str(k).lower()
            if lk in SECRET_KEYS:
                out[k] = "[REDACTED]"
            else:
                out[k] = sanitize(v)
        return out
    if isinstance(obj, list):
        return [sanitize(x) for x in obj]
    if isinstance(obj, str):
        return TOKEN_LIKE.sub("[REDACTED_TOKEN]", obj)
    return obj

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        # Pass non-JSON lines through (after token redaction) instead of dropping evidence
        sys.stdout.write(TOKEN_LIKE.sub("[REDACTED_TOKEN]", line) + "\n")
        continue
    clean = sanitize(event)
    sys.stdout.write(json.dumps(clean, separators=(",", ":")) + "\n")

Step 6 — Build a small detection set (and tune it)

Detections work best when they are specific, explainable, and tied to response actions. Start with a handful you will actually respond to.

Starter detections (high signal)

  • Auth spray / stuffing — query for: many failed logins across many users from one IP (or many IPs to one user). Response: rate limit, enforce MFA, block IP/ASN if appropriate.
  • Privilege changes — query for: role/admin grants, permission scope increases, policy changes. Response: verify change ticket, revert if unexpected, review actor session.
  • New credential created — query for: API key created, OAuth client added, SSH key added. Response: notify owner, rotate if suspicious, check subsequent activity.
  • Suspicious data movement — query for: bulk export/download, unusual time/location, new destination. Response: lock account, revoke sessions, audit access trail, contact stakeholders.
  • Audit trail gap — query for: missing events from a system that normally logs regularly. Response: check collector health, investigate potential tampering or outage.
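The first starter detection ("many fails followed by a success") can be sketched as a sliding-window check over auth events. The threshold and window are tuning knobs, not recommendations:

```python
from collections import defaultdict, deque

THRESHOLD = 5   # failures before a success is suspicious (tune for your traffic)
WINDOW = 300    # seconds of failure history to keep

class AuthSprayDetector:
    """Flag a login success preceded by >= THRESHOLD failures within WINDOW seconds."""

    def __init__(self):
        self.failures = defaultdict(deque)  # key -> timestamps of recent failures

    def observe(self, key, ts, result):
        """key is the pivot (source_ip or user_id); returns True if suspicious."""
        q = self.failures[key]
        while q and ts - q[0] > WINDOW:
            q.popleft()  # expire failures outside the window
        if result == "fail":
            q.append(ts)
            return False
        hit = result == "success" and len(q) >= THRESHOLD
        q.clear()  # reset history after any success
        return hit
```

Run one detector keyed by source IP (spray) and another keyed by user id (targeted stuffing); the same class covers both pivots.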

Step 7 — Retention, integrity, and “trusting your evidence”

Security logs are evidence. Evidence should be hard to delete, easy to prove, and retained long enough to be useful. Even if you don’t have strict compliance requirements, choose retention that matches your realistic detection and response window.

Retention rule of thumb

  • Hot: 7–30 days (fast search for investigations)
  • Warm/cold: 90–365+ days (incident discovery lag is real)
  • Audit-critical: 1–7 years depending on your domain

Integrity basics

  • Separate write access from read access
  • Enable immutable storage / object lock when feasible
  • Log all changes to log pipelines, parsers, and retention policies
  • Alert on deletion attempts or unusual index retention changes
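One lightweight way to make tampering detectable (complementing, not replacing, immutable storage) is hash chaining: each event records a hash of the previous one, so any edit or deletion breaks every later link. A minimal sketch:

```python
import hashlib
import json

def _digest(data):
    """Canonical SHA-256 over a JSON-serializable value."""
    return hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()

def chain_events(events, seed="genesis"):
    """Attach a prev_hash to each event, linking it to its predecessor."""
    prev = hashlib.sha256(seed.encode()).hexdigest()
    chained = []
    for event in events:
        record = dict(event, prev_hash=prev)
        prev = _digest(record)
        chained.append(record)
    return chained

def verify_chain(chained, seed="genesis"):
    """Recompute the chain; any edited or deleted record breaks a later link."""
    prev = hashlib.sha256(seed.encode()).hexdigest()
    for record in chained:
        if record.get("prev_hash") != prev:
            return False
        prev = _digest(record)
    return True
```

Periodically anchoring the latest hash somewhere the log pipeline cannot write (a ticket, a separate account) makes the check meaningful even if the whole store is compromised.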
A “good enough” finishing move

Document your schema (field meanings), your top detections, and your retention policy in one place. If an on-call engineer can’t find it in 60 seconds, it won’t be used during an incident.

Common mistakes

Most logging failures aren’t about tools. They’re about missing context, inconsistent fields, or collecting so much that nobody trusts the signal. Here are the patterns that show up repeatedly—and fixes you can apply quickly.

Mistake 1 — Logging messages instead of events

String logs like “user did a thing” become unqueryable and inconsistent across services.

  • Fix: log structured events with stable fields (action, actor, target, result).
  • Fix: enforce a shared schema and validate it in CI for critical events.
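The second fix above can be as small as a required-fields check that runs in CI against sample events (field names assumed from the schema earlier in this guide):

```python
# Baseline fields every critical audit event must carry
REQUIRED_TOP = {"ts", "action", "result", "actor", "target"}

def validate_event(event):
    """Raise ValueError if a critical audit event is missing baseline fields."""
    missing = REQUIRED_TOP - set(event)
    if missing:
        raise ValueError(f"audit event missing fields: {sorted(missing)}")
    for side in ("actor", "target"):
        if "id" not in event[side]:
            raise ValueError(f"{side}.id is required")
    return True
```

Wiring this into your test suite (one fixture event per critical action) means schema drift fails a build instead of failing an investigation.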

Mistake 2 — Missing correlation IDs

Without request/session IDs, incident response becomes manual guesswork across systems.

  • Fix: generate a request_id at the edge (gateway) and propagate it everywhere.
  • Fix: log session_id for auth and admin actions, plus actor.id.

Mistake 3 — Logging secrets or high-risk data

Logs are often broadly accessible internally. A leaked token in logs becomes an internal breach.

  • Fix: adopt field allowlists and blocklist secret keys (authorization, token, cookie).
  • Fix: add redaction at the collector as a last line of defense.

Mistake 4 — Treating all logs as equal

If everything is “important,” nothing is. Noise masks real attacks.

  • Fix: separate streams: audit/security vs debug/verbose.
  • Fix: keep alerts tied to response playbooks (what will we do?).

Mistake 5 — No “deny” and no reason

Many systems log only successes. Denies are often the earliest signal of probing and abuse.

  • Fix: log allow/deny for sensitive actions with the policy rule or reason code.
  • Fix: log access checks at the enforcement point (not just at the UI).

Mistake 6 — Centralized storage without integrity controls

If an attacker can delete or edit logs, you lose evidence and detection capability.

  • Fix: restrict write access, enable immutable retention where possible.
  • Fix: alert on retention/index changes and deletion attempts.
The fastest “noise reduction” win

Pick one action category (auth + admin changes) and make it perfect first: structured, consistent, minimal, and centrally searchable. Then expand.

FAQ

What are the most important security logs to collect first?

Start with authentication, authorization decisions, and admin/config changes. These events form the “spine” of incident response: they show entry, privilege, and impact.

What fields should every audit log event include?

At minimum: timestamp (UTC), action, result, actor id/type, source IP, target resource id/type, and a correlation id (request_id/trace_id). Add reason/policy for access decisions and change details for modifications.

Should I log denied requests?

Yes—especially for sensitive actions. Denies are often the first evidence of probing, brute force, privilege escalation attempts, or misconfigured clients. Include a reason code so you can separate “expected” denies (missing permission) from suspicious patterns (policy bypass attempts).

How do I reduce log volume without losing security signal?

Keep security/audit events high fidelity, and reduce volume elsewhere by sampling/debug gating. Practical moves: use structured fields (so you can filter precisely), separate streams, and keep alerting focused on a small set of high-signal patterns (auth abuse, privilege change, key creation, data movement).

How long should I retain security logs?

A pragmatic baseline is 7–30 days hot (fast search) plus 90–365+ days cold for investigation lag. Audit-critical events may need longer retention depending on your industry and obligations. When in doubt, retain longer in cheaper storage, but keep access tightly controlled.

Can I rely on cloud provider logs alone?

Cloud control plane logs are necessary but not sufficient. They tell you about infrastructure and IAM changes. For real incident response you also need application audit logs (what users did) and identity provider events (how they authenticated). The best coverage comes from combining these layers with correlation IDs.

Is it OK to log request/response bodies for debugging?

Only in tightly controlled environments, and ideally not in production. Request bodies often contain secrets and PII. Prefer logging metadata (endpoint, size, status, actor/target ids) and storing sensitive payloads separately with explicit consent and access controls if you truly need them.

Cheatsheet

A scan-fast checklist for security logging that helps (printable mindset).

What to log (high signal)

  • Auth: success/fail, MFA, password reset, account lockouts
  • Access decisions: allow/deny + reason on sensitive actions
  • Admin changes: roles, permissions, API keys, security settings
  • Data movement: exports, bulk downloads, unusual spikes
  • Pipeline health: collector failures, missing logs, retention changes

Minimum fields (copy this)

  • ts (UTC) + ingest_time if you can
  • event_type (audit/security/ops) + action (verb)
  • result (success/fail/allow/deny) + reason (rule/policy)
  • actor: id, type, session_id, auth_method
  • where: source_ip, user_agent/device
  • target: resource type/id, org/tenant id
  • correlation: request_id/trace_id
  • service: name, env, version

What not to log

  • Passwords, reset tokens, MFA codes
  • Raw access/refresh tokens, session cookies
  • Private keys, full connection strings
  • Full request bodies by default (often contains PII/secrets)
  • Anything you wouldn’t want broadly searchable internally

Alert starter pack (5)

  • Brute force pattern: many fails → success
  • Privilege change: role/permission grant
  • Credential created: API key/OAuth client/SSH key
  • Data movement anomaly: export spike or new destination
  • Audit gap: missing logs or sudden volume drop
If you’re unsure what to do next

Pick one feature area (auth + admin actions), standardize its schema, centralize it, then add two detections. That single slice often improves security more than “logging everything everywhere.”

Wrap-up

Security logging that helps is less about volume and more about clarity: consistent events, stable fields, and enough context to reconstruct what happened. Start with identity and change events, standardize a schema, centralize reliably, and add a few alerts you will actually respond to.

Your next actions (pick one)

  • Add a request_id and session_id to your logs end-to-end
  • Implement structured audit events for role changes, key creation, and data export
  • Centralize logs with buffering/backpressure and separate security streams from debug noise
  • Create 5 high-signal detections and tune them for a week
  • Write a one-page “logging schema + retention” doc so on-call can use it
Pair this with threat modeling

The best way to decide what to log is to decide what you’re defending. A lightweight threat model will tell you which events and assets deserve the most reliable logging first.

Quiz

Quick self-check to test your understanding of the ideas above.

1) What single field most improves your ability to trace an incident across microservices?
2) What is the safest and most useful way to log authentication events?
3) Why should you log access denials (deny) with a reason code for sensitive actions?
4) Which retention approach best balances investigations and cost?