Incident Timeline
Every incident has an immutable timeline that logs all events, actions, and state changes. Use the timeline for forensic analysis, compliance, and postmortems.
What is the Timeline?
The incident timeline is an immutable, append-only log of all events related to an incident:
- Immutable: Events cannot be edited or deleted after creation
- Append-Only: New events are always added to the end
- Timestamped: Every event has a precise timestamp (millisecond resolution)
- Attributed: Tracks who/what triggered each event
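The four properties above can be illustrated with a minimal sketch. This is not the service's implementation; the `TimelineEvent` and `Timeline` names are hypothetical and exist only to show what "immutable" and "append-only" mean in practice.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class TimelineEvent:
    """One timeline entry (hypothetical); frozen so fields cannot be reassigned."""
    type: str
    timestamp: datetime
    actor: str
    details: str


class Timeline:
    """Hypothetical append-only container: it exposes no update or delete method."""

    def __init__(self) -> None:
        self._events: List[TimelineEvent] = []

    def append(self, type: str, actor: str, details: str) -> TimelineEvent:
        # The timestamp is assigned at append time, in UTC.
        event = TimelineEvent(type, datetime.now(timezone.utc), actor, details)
        self._events.append(event)  # new events always go at the end
        return event

    @property
    def events(self) -> Tuple[TimelineEvent, ...]:
        # Return a tuple so callers cannot mutate the underlying list.
        return tuple(self._events)
```

Assigning to a field of an existing event (for example `event.details = "edited"`) raises `dataclasses.FrozenInstanceError`, which is the behavior "immutable" describes.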
Event Types
| Event Type | Description | Triggered By |
|---|---|---|
| created | Incident created | System (auto) or User (manual) |
| acknowledged | Engineer acknowledged incident | User |
| note_added | Investigation note added | User |
| status_changed | Component status updated | System or User |
| alert_sent | Alert notification sent | System |
| escalated | Alert escalated to next level | System |
| resolved | Incident resolved | System (auto) or User (manual) |
| reopened | Incident reopened | System or User |
Timeline Example
Incident: Production API Down (inc_abc123)
┌─────────────────────────────────────────────────────────────┐
│ 10:30:00.245 [created] │
│ Actor: System │
│ Details: Incident created after 3 consecutive failures │
│ Severity: critical │
├─────────────────────────────────────────────────────────────┤
│ 10:30:00.567 [alert_sent] │
│ Actor: System │
│ Details: Alert sent to Slack (#incidents) │
│ Channel: chan_slack_123 │
├─────────────────────────────────────────────────────────────┤
│ 10:30:00.892 [alert_sent] │
│ Actor: System │
│ Details: Alert sent to Email (team@example.com) │
│ Channel: chan_email_456 │
├─────────────────────────────────────────────────────────────┤
│ 10:32:15.123 [acknowledged] │
│ Actor: John Doe (usr_john_123) │
│ Details: Incident acknowledged │
│ Time to Acknowledge: 2m 14s │
├─────────────────────────────────────────────────────────────┤
│ 10:33:00.456 [note_added] │
│ Actor: John Doe │
│ Note: "Checking API service logs, seeing 500 errors" │
├─────────────────────────────────────────────────────────────┤
│ 10:34:30.789 [note_added] │
│ Actor: John Doe │
│ Note: "Database connection pool exhausted (50/50)" │
├─────────────────────────────────────────────────────────────┤
│ 10:35:00.012 [note_added] │
│ Actor: John Doe │
│ Note: "Root cause: long-running queries blocking pool" │
├─────────────────────────────────────────────────────────────┤
│ 10:36:00.345 [note_added] │
│ Actor: Jane Smith (usr_jane_456) │
│ Note: "Restarting service to clear connection pool" │
├─────────────────────────────────────────────────────────────┤
│ 10:38:42.678 [resolved] │
│ Actor: System (auto-recovery) │
│ Details: Monitor recovered after 3 successful checks │
│ Downtime: 8m 42s │
│ MTTR: 6m 27s │
└─────────────────────────────────────────────────────────────┘
Accessing the Timeline
Via Dashboard
- Navigate to Incidents
- Click on an incident
- Scroll to Timeline section
- View chronological event log
Via API
GET /v1/incidents/:id/timeline
# Response:
{
"events": [
{
"type": "created",
"timestamp": "2026-02-13T10:30:00.245Z",
"actor": "System",
"details": "Incident created after 3 consecutive failures",
"metadata": {
"severity": "critical",
"monitorId": "mon_abc123"
}
},
{
"type": "acknowledged",
"timestamp": "2026-02-13T10:32:15.123Z",
"actor": "John Doe",
"actorId": "usr_john_123",
"details": "Incident acknowledged",
"metadata": {
"timeToAcknowledge": 134
}
}
]
}
Use Cases
1. Postmortem Analysis
Review the timeline to understand what happened:
- When was the incident first detected?
- How long until someone acknowledged?
- What investigation steps were taken?
- How long to resolve?
- Were alerts escalated?
2. Compliance & Audit
The timeline provides an immutable audit trail:
- Who accessed the incident?
- What actions were taken?
- When was each status change made?
- Supports the audit-trail requirements of SOC 2 and ISO 27001
3. Performance Metrics
Calculate SLA metrics from timeline:
- MTTD: Detection timestamp - first failure timestamp
- MTTA: Acknowledgment timestamp - detection timestamp
- MTTR: Resolution timestamp - acknowledgment timestamp
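As a rough sketch, the acknowledgment and resolution intervals can be computed from the `events` array returned by the timeline endpoint. The helper names below are hypothetical, and the code assumes the `created` event marks detection. MTTD is left out because the first failure timestamp does not appear in the sample response.

```python
from datetime import datetime
from typing import Dict, List, Optional


def parse_ts(value: str) -> datetime:
    # fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def first_timestamp(events: List[dict], event_type: str) -> Optional[datetime]:
    """Return the timestamp of the first event of the given type, if any."""
    for event in events:
        if event["type"] == event_type:
            return parse_ts(event["timestamp"])
    return None


def response_metrics(events: List[dict]) -> Dict[str, float]:
    """Compute per-incident intervals in seconds from timeline events."""
    created = first_timestamp(events, "created")    # assumed to be the detection time
    acked = first_timestamp(events, "acknowledged")
    resolved = first_timestamp(events, "resolved")

    metrics: Dict[str, float] = {}
    if created is not None and acked is not None:
        metrics["mtta_seconds"] = (acked - created).total_seconds()
    if acked is not None and resolved is not None:
        metrics["mttr_seconds"] = (resolved - acked).total_seconds()
    return metrics
```

For a single incident these values are time to acknowledge and time to resolve. Averaging them across incidents gives the "mean" in MTTA and MTTR.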
4. Forensic Investigation
Debug complex incidents:
- Correlate events across multiple incidents
- Identify patterns in failures
- Understand cascading failures
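One way to correlate events across incidents is to merge their timelines into a single chronological stream. The sketch below assumes each timeline is already in chronological order (it is append-only) and that all timestamps use the same UTC ISO 8601 format as the sample response, so they sort correctly as plain strings. `merge_timelines` is a hypothetical helper, not part of the API.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

MergedEvent = Tuple[str, str, str]  # (timestamp, incident_id, event_type)


def merge_timelines(timelines: Dict[str, List[dict]]) -> List[MergedEvent]:
    """Merge per-incident event lists into one chronologically ordered list."""

    def tagged(incident_id: str, events: List[dict]) -> Iterator[MergedEvent]:
        # Tag each event with its incident so the origin survives the merge.
        for event in events:
            yield (event["timestamp"], incident_id, event["type"])

    streams = [tagged(incident_id, events) for incident_id, events in timelines.items()]
    # heapq.merge performs a k-way merge of already-sorted inputs.
    return list(heapq.merge(*streams))
```

Reading the merged list top to bottom shows which incident's events preceded another's, which helps when tracing a cascading failure.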
Timeline Metadata
Each event includes a metadata object whose fields depend on the event type:
{
"type": "alert_sent",
"timestamp": "2026-02-13T10:30:00.567Z",
"actor": "System",
"details": "Alert sent to Slack",
"metadata": {
"channelId": "chan_slack_123",
"channelName": "Engineering Team",
"channelType": "slack",
"webhookUrl": "https://hooks.slack.com/...",
"deliveryStatus": "success",
"responseTime": 234
}
}
Event Filters
Filter timeline events by type, actor, or time range:
# Show only notes
GET /v1/incidents/:id/timeline?type=note_added
# Show only status changes
GET /v1/incidents/:id/timeline?type=status_changed
# Show events by specific user
GET /v1/incidents/:id/timeline?actor=usr_john_123
# Time range filter
GET /v1/incidents/:id/timeline?from=2026-02-13T10:30:00Z&to=2026-02-13T11:00:00Z
Exporting Timeline
JSON Export
GET /v1/incidents/:id/timeline?format=json
# Downloads timeline.json with full event log
CSV Export
GET /v1/incidents/:id/timeline?format=csv
# CSV format:
timestamp,type,actor,details
2026-02-13T10:30:00.245Z,created,System,"Incident created"
2026-02-13T10:32:15.123Z,acknowledged,John Doe,"Incident acknowledged"
Markdown Export
GET /v1/incidents/:id/timeline?format=markdown
# Markdown format for postmortems:
## Incident Timeline
**10:30:00** - Incident created (System)
- Severity: critical
- Monitor: Production API
**10:32:15** - Acknowledged (John Doe)
- Time to acknowledge: 2m 14s
**10:38:42** - Resolved (System)
- Downtime: 8m 42s
Timeline Retention
| Plan | Retention | Export |
|---|---|---|
| Free | 30 days | JSON only |
| Developer | 90 days | JSON, CSV |
| Pro | 1 year | JSON, CSV, Markdown |
| Enterprise | Unlimited | All formats + API access |
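The table above can be turned into a small client-side check before requesting an export. The sketch below is illustrative only: `export_path` is a hypothetical helper, "All formats" on Enterprise is assumed to mean JSON, CSV, and Markdown, and combining `format` with the filter parameters in one request is an assumption the documentation does not confirm.

```python
from typing import Dict, Optional, Set
from urllib.parse import quote, urlencode

# Export formats per plan, transcribed from the retention table.
# Assumption: "All formats" on Enterprise means JSON, CSV, and Markdown.
EXPORT_FORMATS: Dict[str, Set[str]] = {
    "free": {"json"},
    "developer": {"json", "csv"},
    "pro": {"json", "csv", "markdown"},
    "enterprise": {"json", "csv", "markdown"},
}


def export_path(incident_id: str, fmt: str, plan: str,
                filters: Optional[Dict[str, str]] = None) -> str:
    """Build the request path for a timeline export, rejecting formats the plan lacks."""
    if fmt not in EXPORT_FORMATS[plan]:
        raise ValueError(f"{fmt!r} export is not available on the {plan!r} plan")
    params = {"format": fmt}
    # Optional filters such as type, actor, from, to (assumed to combine with format).
    params.update(filters or {})
    return f"/v1/incidents/{quote(incident_id, safe='')}/timeline?{urlencode(params)}"
```

For example, `export_path("inc_abc123", "csv", "developer", {"type": "note_added"})` returns `/v1/incidents/inc_abc123/timeline?format=csv&type=note_added`, while requesting `"markdown"` on the `"developer"` plan raises `ValueError`.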
Best Practices
1. Add Context-Rich Notes
Include commands, logs, findings:
Good Note:
"Database connection pool at 50/50. Ran `SHOW PROCESSLIST`, found 12 queries >30s.
Top offender: analytics dashboard query (45s avg). Killed PID 12345."
Bad Note:
"Checked database"
2. Link to External Resources
- Datadog dashboard URLs
- Sentry error IDs
- CloudWatch log streams
- GitHub PR/commit links
3. Export for Postmortems
Use the Markdown export as a starting point for the postmortem:
- Export timeline as Markdown
- Add root cause analysis
- Add impact metrics
- Add action items
- Publish to wiki/docs
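The steps above can be sketched as a small helper that wraps an exported Markdown timeline in a postmortem skeleton. The function name, the section titles, and the `TODO` placeholders are illustrative choices, not a format the product prescribes.

```python
def postmortem_skeleton(title: str, timeline_markdown: str) -> str:
    """Wrap an exported Markdown timeline in empty postmortem sections."""
    sections = ["Root Cause Analysis", "Impact Metrics", "Action Items"]
    # The Markdown export already starts with its own "## Incident Timeline" heading.
    parts = [f"# Postmortem: {title}", "", timeline_markdown.strip(), ""]
    for section in sections:
        # Each section starts as a placeholder for the team to fill in.
        parts += [f"## {section}", "", "TODO", ""]
    return "\n".join(parts)
```

Passing the body of a `?format=markdown` export as `timeline_markdown` yields a document that can be filled in and published to a wiki.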
4. Review Metrics Regularly
Analyze timeline data to improve:
- Time to acknowledge trending up? → Improve on-call
- Frequent escalations? → Adjust alert thresholds
- Long MTTR? → Better runbooks needed
Next Steps
- Incident Lifecycle: Full incident flow
- Incident Management: Overview and features
- Incidents API: Programmatic access
- Best Practices: Incident response tips