Alert System

Get notified when it matters. BlackTide's smart alert system helps you stay informed without alert fatigue.

How Alerts Work

The BlackTide alert system consists of three main components that work together:

  1. Monitor - Runs periodic checks on your services and records the results
  2. Alert Rule - Defines the conditions for when to send notifications
  3. Alert Channel - The destination where notifications are sent (Email, Slack, etc.)
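The three components above can be pictured as plain records. This is a minimal sketch for orientation only; the field names here are illustrative and not BlackTide's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Monitor:
    """Runs periodic checks on a service and records the results."""
    name: str
    url: str
    interval_seconds: int = 60

@dataclass
class AlertRule:
    """Conditions for when to notify, e.g. N consecutive failures."""
    monitor: str
    consecutive_failures: int = 3
    latency_threshold_ms: Optional[int] = None

@dataclass
class AlertChannel:
    """Destination for notifications: email, Slack, webhook, ..."""
    kind: str    # e.g. "email", "slack", "webhook"
    target: str  # address, channel name, or endpoint URL
```

A rule references a monitor by name and fans out to one or more channels, which is what makes per-monitor conditions and multi-destination routing possible.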

Alert Components

Alert Rules

Alert Rules define when you should be notified. They contain conditions like "alert after 3 consecutive failures" or "alert if latency exceeds 2 seconds."

Key features:

  • Consecutive failure thresholds (prevent false alarms)
  • Latency-based alerts (catch performance degradation)
  • Custom conditions per monitor
  • Recovery notifications
  • Silence windows for maintenance

Learn more about Alert Rules →
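Evaluating a rule like the two quoted above ("alert after 3 consecutive failures", "alert if latency exceeds 2 seconds") might look like the following sketch. The function name and parameters are hypothetical, not part of BlackTide's API:

```python
def should_alert(recent_failures, latencies_ms,
                 failure_threshold=3, latency_threshold_ms=2000):
    """Return True if either alert condition is met.

    recent_failures: number of consecutive failed checks.
    latencies_ms: response times of recent successful checks.
    """
    if recent_failures >= failure_threshold:
        return True  # e.g. "alert after 3 consecutive failures"
    if latencies_ms and max(latencies_ms) > latency_threshold_ms:
        return True  # e.g. "alert if latency exceeds 2 seconds"
    return False
```

Note that the two conditions are independent: a service can trip the latency condition while every check still succeeds, which is how performance degradation is caught before an outage.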

Alert Channels

Alert Channels define where notifications are sent. You can have multiple channels and route different alerts to different destinations.

Supported channels:

  • Email - Individual addresses or distribution lists
  • Slack - Specific channels with threaded replies
  • Discord - Community or engineering channels
  • Telegram - Instant mobile notifications
  • PagerDuty - On-call rotation with escalation
  • Opsgenie - Alert scheduling and escalation
  • Webhooks - Custom HTTP endpoints

Learn more about Alert Channels →
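For the Webhooks channel, your endpoint receives an HTTP POST with a JSON body. The payload below is a generic illustration, not BlackTide's actual webhook schema:

```python
import json

def build_webhook_payload(monitor, status, error=None):
    """Construct a generic alert payload for a custom HTTP endpoint.

    Field names are illustrative; consult the channel docs for the
    real schema.
    """
    return json.dumps({
        "monitor": monitor,
        "status": status,  # e.g. "down" or "up"
        "error": error,
    })

# Delivery is a plain HTTP POST with a JSON content type, e.g.:
#   POST https://example.com/hooks/blacktide
#   Content-Type: application/json
```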

Alert Flow

Here's how an alert flows through the system:

  1. Monitor fails - A check returns an error or times out
  2. Condition evaluated - Alert rule checks if conditions are met (e.g., 3 consecutive failures)
  3. Alert triggered - If conditions match, an alert is created
  4. Notification sent - Alert is sent to all configured channels
  5. Recovery detected - When monitor recovers, recovery notification is sent (if enabled)
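The five steps above amount to a small state machine per monitor. A sketch, with a hypothetical `AlertState` class standing in for the real evaluator:

```python
class AlertState:
    """Tracks one monitor through the alert flow (illustrative)."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.alerting = False

    def record_check(self, ok):
        """Return the notification to send for this check, if any."""
        if ok:
            self.consecutive_failures = 0
            if self.alerting:
                self.alerting = False
                return "recovery"  # step 5: recovery detected
            return None
        self.consecutive_failures += 1  # step 1: monitor fails
        if (not self.alerting
                and self.consecutive_failures >= self.failure_threshold):
            self.alerting = True  # steps 2-3: condition met, alert created
            return "down"         # step 4: notification sent
        return None
```

Note that once the monitor is alerting, further failures produce no new notifications; only the transition into and out of the down state does.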

Preventing Alert Fatigue

Too many alerts can be just as bad as no alerts. BlackTide helps you avoid alert fatigue:

Consecutive Failures

Instead of alerting on the first failure, wait for N consecutive failures. This filters out transient network blips.

Alert Silencing

Temporarily silence alerts during planned maintenance or known issues:

  • Schedule maintenance windows in advance
  • Silence specific monitors or all monitors
  • Auto-resume alerting when maintenance ends

Learn more about Alert Silencing →
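Conceptually, a maintenance window is just a time range that suppresses notifications while it is active. A sketch (the function is hypothetical, not BlackTide's API):

```python
from datetime import datetime

def is_silenced(now, windows):
    """True if `now` falls inside any scheduled maintenance window.

    windows: list of (start, end) datetime pairs scheduled in advance.
    Alerting auto-resumes once `now` passes a window's end.
    """
    return any(start <= now < end for start, end in windows)
```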

Smart Routing

Route different alerts to different teams:

  • Production API down → PagerDuty + Slack #incidents
  • Staging API down → Email + Slack #engineering
  • Non-critical service degraded → Email only
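The routing examples above boil down to a lookup table keyed on environment and severity. A minimal sketch with hypothetical keys and channel names:

```python
# Hypothetical routing table: (environment, severity) -> channels.
ROUTES = {
    ("production", "down"): ["pagerduty", "slack:#incidents"],
    ("staging", "down"): ["email", "slack:#engineering"],
    ("production", "degraded"): ["email"],
}

def channels_for(environment, severity):
    """Look up destinations; fall back to email so nothing is dropped."""
    return ROUTES.get((environment, severity), ["email"])
```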

Alert Types

Monitor Down

Sent when a monitor fails consecutive checks and is marked as down. This is the most critical alert type.

Monitor Up (Recovery)

Sent when a monitor recovers after being down. Helps you track Mean Time To Recovery (MTTR).

Performance Degradation

Sent when response times exceed a threshold. Helps you catch slow services before they fail completely.

SSL Certificate Expiry

Sent 30, 14, and 7 days before an SSL certificate expires. Prevents unexpected SSL errors.
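The 30/14/7-day schedule can be read as a set of thresholds, each firing once as it is crossed. A sketch of the threshold check (hypothetical helper, not BlackTide's implementation):

```python
def expiry_warnings(days_until_expiry, thresholds=(30, 14, 7)):
    """Return the warning thresholds already crossed, most urgent last.

    With the defaults, a certificate 10 days from expiry has crossed
    the 30- and 14-day marks but not yet the 7-day mark.
    """
    return [t for t in thresholds if days_until_expiry <= t]
```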

Best Practices

Alert Metrics

Track the effectiveness of your alert system:

  • Mean Time To Detect (MTTD) - How quickly you're notified of issues
  • Mean Time To Acknowledge (MTTA) - How quickly someone responds
  • Mean Time To Resolve (MTTR) - How long it takes to fix issues
  • False Positive Rate - Percentage of alerts that weren't real issues
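Given a log of resolved alerts, these metrics are simple averages. A sketch over an illustrative record schema (epoch-second timestamps; the field names are assumptions, not an export format BlackTide defines):

```python
def alert_metrics(alerts):
    """Compute aggregate alert metrics from resolved alert records.

    Each record: {"issue_start", "alerted_at", "resolved_at"} as
    epoch seconds, plus a boolean "false_positive" flag.
    """
    n = len(alerts)
    mttd = sum(a["alerted_at"] - a["issue_start"] for a in alerts) / n
    mttr = sum(a["resolved_at"] - a["issue_start"] for a in alerts) / n
    fp_rate = sum(a["false_positive"] for a in alerts) / n
    return {"mttd_s": mttd, "mttr_s": mttr, "false_positive_rate": fp_rate}
```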

Quick Start

Ready to set up your first alert? Follow these steps:

  1. Create a monitor (if you haven't already)
  2. Configure an alert channel (Email is the fastest)
  3. Create an alert rule that uses your channel
  4. Test the integration and wait for your first check

Further Reading