Incident Timeline

Every incident has an immutable timeline that logs all events, actions, and state changes. Use the timeline for forensic analysis, compliance, and postmortems.

What is the Timeline?

The incident timeline is an immutable, append-only log of all events related to an incident:

  • Immutable: Events cannot be edited or deleted after creation
  • Append-Only: New events are always added to the end
  • Timestamped: Every event has a precise timestamp with millisecond resolution
  • Attributed: Tracks who/what triggered each event

Event Types

Event Type       Description                      Triggered By
created          Incident created                 System (auto) or User (manual)
acknowledged     Engineer acknowledged incident   User
note_added       Investigation note added         User
status_changed   Component status updated         System or User
alert_sent       Alert notification sent          System
escalated        Alert escalated to next level    System
resolved         Incident resolved                System (auto) or User (manual)
reopened         Incident reopened                System or User
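The event types above can be mirrored in client code as an enumeration, so that unknown types fail loudly instead of being silently mishandled. This is a hypothetical client-side sketch, not part of the API itself:

```python
from enum import Enum

class TimelineEventType(str, Enum):
    """Client-side mirror of the timeline event types (hypothetical helper)."""
    CREATED = "created"
    ACKNOWLEDGED = "acknowledged"
    NOTE_ADDED = "note_added"
    STATUS_CHANGED = "status_changed"
    ALERT_SENT = "alert_sent"
    ESCALATED = "escalated"
    RESOLVED = "resolved"
    REOPENED = "reopened"

def is_user_event(event_type: str) -> bool:
    """True for event types that can be triggered by a user (per the table above)."""
    return TimelineEventType(event_type) not in {
        TimelineEventType.ALERT_SENT,
        TimelineEventType.ESCALATED,
    }
```

Constructing `TimelineEventType` from an unrecognized string raises `ValueError`, which is usually what you want when parsing timeline data.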

Timeline Example

Incident: Production API Down (inc_abc123)

┌─────────────────────────────────────────────────────────────┐
│ 10:30:00.245  [created]                                     │
│ Actor: System                                               │
│ Details: Incident created after 3 consecutive failures      │
│ Severity: critical                                          │
├─────────────────────────────────────────────────────────────┤
│ 10:30:00.567  [alert_sent]                                  │
│ Actor: System                                               │
│ Details: Alert sent to Slack (#incidents)                  │
│ Channel: chan_slack_123                                     │
├─────────────────────────────────────────────────────────────┤
│ 10:30:00.892  [alert_sent]                                  │
│ Actor: System                                               │
│ Details: Alert sent to Email (team@example.com)            │
│ Channel: chan_email_456                                     │
├─────────────────────────────────────────────────────────────┤
│ 10:32:15.123  [acknowledged]                                │
│ Actor: John Doe (usr_john_123)                              │
│ Details: Incident acknowledged                              │
│ Time to Acknowledge: 2m 14s                                 │
├─────────────────────────────────────────────────────────────┤
│ 10:33:00.456  [note_added]                                  │
│ Actor: John Doe                                             │
│ Note: "Checking API service logs, seeing 500 errors"       │
├─────────────────────────────────────────────────────────────┤
│ 10:34:30.789  [note_added]                                  │
│ Actor: John Doe                                             │
│ Note: "Database connection pool exhausted (50/50)"         │
├─────────────────────────────────────────────────────────────┤
│ 10:35:00.012  [note_added]                                  │
│ Actor: John Doe                                             │
│ Note: "Root cause: long-running queries blocking pool"     │
├─────────────────────────────────────────────────────────────┤
│ 10:36:00.345  [note_added]                                  │
│ Actor: Jane Smith (usr_jane_456)                            │
│ Note: "Restarting service to clear connection pool"        │
├─────────────────────────────────────────────────────────────┤
│ 10:38:42.678  [resolved]                                    │
│ Actor: System (auto-recovery)                               │
│ Details: Monitor recovered after 3 successful checks        │
│ Downtime: 8m 42s                                            │
│ MTTR: 6m 27s                                                │
└─────────────────────────────────────────────────────────────┘

Accessing the Timeline

Via Dashboard

  1. Navigate to Incidents
  2. Click on an incident
  3. Scroll to Timeline section
  4. View chronological event log

Via API

GET /v1/incidents/:id/timeline

# Response:
{
  "events": [
    {
      "type": "created",
      "timestamp": "2026-02-13T10:30:00.245Z",
      "actor": "System",
      "details": "Incident created after 3 consecutive failures",
      "metadata": {
        "severity": "critical",
        "monitorId": "mon_abc123"
      }
    },
    {
      "type": "acknowledged",
      "timestamp": "2026-02-13T10:32:15.123Z",
      "actor": "John Doe",
      "actorId": "usr_john_123",
      "details": "Incident acknowledged",
      "metadata": {
        "timeToAcknowledge": 134
      }
    }
  ]
}
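A response like the one above can be turned into a chronologically ordered event list with a few lines of client code. This sketch assumes only the response shape shown; the helper name is our own:

```python
import json
from datetime import datetime

def parse_timeline(response_body: str) -> list[dict]:
    """Parse a GET /v1/incidents/:id/timeline response body into events
    with real datetime objects, sorted chronologically."""
    events = json.loads(response_body)["events"]
    for event in events:
        # Convert the trailing "Z" to an explicit UTC offset so
        # datetime.fromisoformat accepts it on all Python versions.
        event["timestamp"] = datetime.fromisoformat(
            event["timestamp"].replace("Z", "+00:00")
        )
    return sorted(events, key=lambda e: e["timestamp"])
```

Sorting is defensive: the API returns events in append order, but client-side merging of multiple pages or incidents can scramble it.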

Use Cases

1. Postmortem Analysis

Review the timeline to understand what happened:

  • When was the incident first detected?
  • How long until someone acknowledged?
  • What investigation steps were taken?
  • How long to resolve?
  • Were alerts escalated?

2. Compliance & Audit

The timeline provides an immutable audit trail:

  • Who took each action on the incident?
  • What actions were taken?
  • When was each status change made?
  • Helps satisfy SOC 2 and ISO 27001 audit-trail requirements

3. Performance Metrics

Calculate SLA metrics from timeline:

  • MTTD (Mean Time to Detect): detection timestamp − first failure timestamp
  • MTTA (Mean Time to Acknowledge): acknowledgment timestamp − detection timestamp
  • MTTR (Mean Time to Resolve): resolution timestamp − acknowledgment timestamp
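The acknowledgment and resolution metrics follow directly from event timestamps, as a sketch (the function name is ours; MTTD is omitted because the first-failure timestamp comes from monitor check data, not the timeline):

```python
from datetime import datetime

def _ts(iso: str) -> datetime:
    return datetime.fromisoformat(iso.replace("Z", "+00:00"))

def incident_metrics(events: list[dict]) -> dict:
    """Compute per-incident timing metrics (in seconds) from timeline events:
    time to acknowledge = acknowledged - created,
    time to resolve     = resolved - acknowledged,
    downtime            = resolved - created."""
    by_type = {e["type"]: _ts(e["timestamp"]) for e in events}
    created = by_type["created"]
    acked = by_type["acknowledged"]
    resolved = by_type["resolved"]
    return {
        "time_to_acknowledge": (acked - created).total_seconds(),
        "time_to_resolve": (resolved - acked).total_seconds(),
        "downtime": (resolved - created).total_seconds(),
    }
```

Run against the example timeline above, this reproduces its figures: time to acknowledge ≈ 2m 14s, time to resolve ≈ 6m 27s, downtime ≈ 8m 42s.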

4. Forensic Investigation

Debug complex incidents:

  • Correlate events across multiple incidents
  • Identify patterns in failures
  • Understand cascading failures

Timeline Metadata

Each event includes rich metadata:

{
  "type": "alert_sent",
  "timestamp": "2026-02-13T10:30:00.567Z",
  "actor": "System",
  "details": "Alert sent to Slack",
  "metadata": {
    "channelId": "chan_slack_123",
    "channelName": "Engineering Team",
    "channelType": "slack",
    "webhookUrl": "https://hooks.slack.com/...",
    "deliveryStatus": "success",
    "responseTime": 234
  }
}
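The `deliveryStatus` field in alert metadata is useful for spotting notifications that never reached a channel. A minimal sketch, assuming the metadata shape shown above:

```python
def failed_alert_deliveries(events: list[dict]) -> list[str]:
    """Return channel IDs of alert_sent events whose metadata reports
    a non-success deliveryStatus."""
    return [
        e["metadata"]["channelId"]
        for e in events
        if e["type"] == "alert_sent"
        and e.get("metadata", {}).get("deliveryStatus") != "success"
    ]
```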

Event Filters

Filter timeline events by type:

# Show only notes
GET /v1/incidents/:id/timeline?type=note_added

# Show only status changes
GET /v1/incidents/:id/timeline?type=status_changed

# Show events by specific user
GET /v1/incidents/:id/timeline?actor=usr_john_123

# Time range filter
GET /v1/incidents/:id/timeline?from=2026-02-13T10:30:00Z&to=2026-02-13T11:00:00Z
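Building these filtered URLs by hand is error-prone once values need escaping. A small helper sketch (the function is ours, not part of any SDK; `from` is a Python keyword, so it is passed as `from_` and renamed in the query string):

```python
from urllib.parse import urlencode

def timeline_url(incident_id: str, **filters: str) -> str:
    """Build a timeline request path with the filter query parameters
    documented above (type, actor, from, to)."""
    base = f"/v1/incidents/{incident_id}/timeline"
    # Strip the trailing underscore used to dodge Python keywords (from_ -> from)
    params = {k.rstrip("_"): v for k, v in filters.items() if v}
    return f"{base}?{urlencode(params)}" if params else base
```

Note that `urlencode` percent-encodes the colons in ISO timestamps (`10%3A30%3A00Z`), which is valid in a query string.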

Exporting Timeline

JSON Export

GET /v1/incidents/:id/timeline?format=json

# Downloads timeline.json with full event log

CSV Export

GET /v1/incidents/:id/timeline?format=csv

# CSV format:
timestamp,type,actor,details
2026-02-13T10:30:00.245Z,created,System,"Incident created"
2026-02-13T10:32:15.123Z,acknowledged,John Doe,"Incident acknowledged"
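The CSV export parses cleanly with the standard library, since the header row names each column. A minimal sketch:

```python
import csv
import io

def parse_timeline_csv(csv_text: str) -> list[dict]:
    """Parse a ?format=csv timeline export into a list of event dicts
    keyed by the header row (timestamp, type, actor, details)."""
    return list(csv.DictReader(io.StringIO(csv_text)))
```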

Markdown Export

GET /v1/incidents/:id/timeline?format=markdown

# Markdown format for postmortems:
## Incident Timeline

**10:30:00** - Incident created (System)
- Severity: critical
- Monitor: Production API

**10:32:15** - Acknowledged (John Doe)
- Time to acknowledge: 2m 14s

**10:38:42** - Resolved (System)
- Downtime: 8m 42s
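On plans without Markdown export (see the retention table below), a similar postmortem skeleton can be rendered client-side from the JSON timeline. A rough sketch, assuming the event shape shown earlier; the exact output of the built-in export may differ:

```python
def timeline_to_markdown(events: list[dict]) -> str:
    """Render timeline events in a format similar to the Markdown export."""
    lines = ["## Incident Timeline", ""]
    for e in events:
        # Extract HH:MM:SS from the ISO timestamp (e.g. 2026-02-13T10:30:00.245Z)
        time = e["timestamp"].split("T")[1][:8]
        lines.append(f"**{time}** - {e['details']} ({e['actor']})")
        for key, value in e.get("metadata", {}).items():
            lines.append(f"- {key}: {value}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```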

Timeline Retention

Plan         Retention    Export
Free         30 days      JSON only
Developer    90 days      JSON, CSV
Pro          1 year       JSON, CSV, Markdown
Enterprise   Unlimited    All formats + API access

Best Practices

1. Add Context-Rich Notes

Include the commands you ran, log excerpts, and findings:

Good Note:
"Database connection pool at 50/50. Ran `SHOW PROCESSLIST`, found 12 queries >30s. 
Top offender: analytics dashboard query (45s avg). Killed PID 12345."

Bad Note:
"Checked database"

2. Link to External Resources

  • Datadog dashboard URLs
  • Sentry error IDs
  • CloudWatch log streams
  • GitHub PR/commit links

3. Export for Postmortems

Use Markdown export as postmortem starting point:

  1. Export timeline as Markdown
  2. Add root cause analysis
  3. Add impact metrics
  4. Add action items
  5. Publish to wiki/docs

4. Review Metrics Regularly

Analyze timeline data to improve:

  • Time to acknowledge trending up? → Improve on-call
  • Frequent escalations? → Adjust alert thresholds
  • Long MTTR? → Better runbooks needed

Next Steps