Alert System

Get notified when it matters. BlackTide's smart alert system helps you stay informed without alert fatigue.

How Alerts Work

The BlackTide alert system consists of three main components that work together:

  1. Monitor - Runs periodic checks on your services and records the results
  2. Alert Rule - Defines the conditions for when to send notifications
  3. Alert Channel - The destination where notifications are sent (Email, Slack, etc.)
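The three components above can be pictured as plain records. This is a minimal sketch for orientation only; the field names here are illustrative and not BlackTide's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Monitor:
    """Runs periodic checks on a service and records the results."""
    name: str
    url: str
    interval_seconds: int = 60

@dataclass
class AlertRule:
    """Conditions for when to notify, e.g. N consecutive failures."""
    monitor: str
    consecutive_failures: int = 3
    latency_threshold_ms: Optional[int] = None

@dataclass
class AlertChannel:
    """Destination for notifications: email, Slack, webhook, ..."""
    kind: str    # e.g. "email", "slack", "webhook"
    target: str  # address, channel name, or endpoint URL
```

A rule references a monitor by name and fans out to one or more channels, which is what makes per-monitor conditions and multi-destination routing possible.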

Alert Components

Alert Rules

Alert Rules define when you should be notified. They contain conditions like "alert after 3 consecutive failures" or "alert if latency exceeds 2 seconds."

Key features:

  • Consecutive failure thresholds (prevent false alarms)
  • Latency-based alerts (catch performance degradation)
  • Custom conditions per monitor
  • Recovery notifications
  • Silence windows for maintenance

Learn more about Alert Rules →
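Evaluating a rule like the two quoted above ("alert after 3 consecutive failures", "alert if latency exceeds 2 seconds") might look like the following sketch. The function name and parameters are hypothetical, not part of BlackTide's API:

```python
def should_alert(recent_failures, latencies_ms,
                 failure_threshold=3, latency_threshold_ms=2000):
    """Return True if either alert condition is met.

    recent_failures: number of consecutive failed checks.
    latencies_ms: response times of recent successful checks.
    """
    if recent_failures >= failure_threshold:
        return True  # e.g. "alert after 3 consecutive failures"
    if latencies_ms and max(latencies_ms) > latency_threshold_ms:
        return True  # e.g. "alert if latency exceeds 2 seconds"
    return False
```

Note that the two conditions are independent: a service can trip the latency condition while every check still succeeds, which is how performance degradation is caught before an outage.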

Alert Channels

Alert Channels define where notifications are sent. You can have multiple channels and route different alerts to different destinations.

Supported channels:

  • Email - Individual addresses or distribution lists
  • Slack - Specific channels with threaded replies
  • Discord - Community or engineering channels
  • Telegram - Instant mobile notifications
  • PagerDuty - On-call rotation with escalation
  • Opsgenie - Alert scheduling and escalation
  • Webhooks - Custom HTTP endpoints

Learn more about Alert Channels →
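For the Webhooks channel, your endpoint receives an HTTP POST with a JSON body. The payload below is a generic illustration, not BlackTide's actual webhook schema:

```python
import json

def build_webhook_payload(monitor, status, error=None):
    """Construct a generic alert payload for a custom HTTP endpoint.

    Field names are illustrative; consult the channel docs for the
    real schema.
    """
    return json.dumps({
        "monitor": monitor,
        "status": status,  # e.g. "down" or "up"
        "error": error,
    })

# Delivery is a plain HTTP POST with a JSON content type, e.g.:
#   POST https://example.com/hooks/blacktide
#   Content-Type: application/json
```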

Alert Flow

Here's how an alert flows through the system:

  1. Monitor fails - A check returns an error or times out
  2. Condition evaluated - Alert rule checks if conditions are met (e.g., 3 consecutive failures)
  3. Alert triggered - If conditions match, an alert is created
  4. Notification sent - Alert is sent to all configured channels
  5. Recovery detected - When monitor recovers, recovery notification is sent (if enabled)
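The five steps above amount to a small state machine per monitor. A sketch, with a hypothetical `AlertState` class standing in for the real evaluator:

```python
class AlertState:
    """Tracks one monitor through the alert flow (illustrative)."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.alerting = False

    def record_check(self, ok):
        """Return the notification to send for this check, if any."""
        if ok:
            self.consecutive_failures = 0
            if self.alerting:
                self.alerting = False
                return "recovery"  # step 5: recovery detected
            return None
        self.consecutive_failures += 1  # step 1: monitor fails
        if (not self.alerting
                and self.consecutive_failures >= self.failure_threshold):
            self.alerting = True  # steps 2-3: condition met, alert created
            return "down"         # step 4: notification sent
        return None
```

Note that once the monitor is alerting, further failures produce no new notifications; only the transition into and out of the down state does.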

Preventing Alert Fatigue

Too many alerts can be just as bad as no alerts. BlackTide helps you avoid alert fatigue:

Consecutive Failures

Instead of alerting on the first failure, wait for N consecutive failures. This filters out transient network blips.

Alert Silencing

Temporarily silence alerts during planned maintenance or known issues:

  • Schedule maintenance windows in advance
  • Silence specific monitors or all monitors
  • Auto-resume alerting when maintenance ends

Learn more about Alert Silencing →
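Conceptually, a maintenance window is just a time range that suppresses notifications while it is active. A sketch (the function is hypothetical, not BlackTide's API):

```python
from datetime import datetime

def is_silenced(now, windows):
    """True if `now` falls inside any scheduled maintenance window.

    windows: list of (start, end) datetime pairs scheduled in advance.
    Alerting auto-resumes once `now` passes a window's end.
    """
    return any(start <= now < end for start, end in windows)
```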

Smart Routing

Route different alerts to different teams:

  • Production API down → PagerDuty + Slack #incidents
  • Staging API down → Email + Slack #engineering
  • Non-critical service degraded → Email only
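The routing examples above boil down to a lookup table keyed on environment and severity. A minimal sketch with hypothetical keys and channel names:

```python
# Hypothetical routing table: (environment, severity) -> channels.
ROUTES = {
    ("production", "down"): ["pagerduty", "slack:#incidents"],
    ("staging", "down"): ["email", "slack:#engineering"],
    ("production", "degraded"): ["email"],
}

def channels_for(environment, severity):
    """Look up destinations; fall back to email so nothing is dropped."""
    return ROUTES.get((environment, severity), ["email"])
```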

Alert Types

Monitor Down

Sent when a monitor fails consecutive checks and is marked as down. This is the most critical alert type.

Monitor Up (Recovery)

Sent when a monitor recovers after being down. Helps you track Mean Time To Recovery (MTTR).

Performance Degradation

Sent when response times exceed a threshold. Helps you catch slow services before they fail completely.

SSL Certificate Expiry

Sent 30, 14, and 7 days before an SSL certificate expires. Prevents unexpected SSL errors.
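The 30/14/7-day schedule can be read as a set of thresholds, each firing once as it is crossed. A sketch of the threshold check (hypothetical helper, not BlackTide's implementation):

```python
def expiry_warnings(days_until_expiry, thresholds=(30, 14, 7)):
    """Return the warning thresholds already crossed, most urgent last.

    With the defaults, a certificate 10 days from expiry has crossed
    the 30- and 14-day marks but not yet the 7-day mark.
    """
    return [t for t in thresholds if days_until_expiry <= t]
```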

Best Practices

Alert Metrics

Track the effectiveness of your alert system:

  • Mean Time To Detect (MTTD) - How quickly you're notified of issues
  • Mean Time To Acknowledge (MTTA) - How quickly someone responds
  • Mean Time To Resolve (MTTR) - How long it takes to fix issues
  • False Positive Rate - Percentage of alerts that weren't real issues
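Given a log of resolved alerts, these metrics are simple averages. A sketch over an illustrative record schema (epoch-second timestamps; the field names are assumptions, not an export format BlackTide defines):

```python
def alert_metrics(alerts):
    """Compute aggregate alert metrics from resolved alert records.

    Each record: {"issue_start", "alerted_at", "resolved_at"} as
    epoch seconds, plus a boolean "false_positive" flag.
    """
    n = len(alerts)
    mttd = sum(a["alerted_at"] - a["issue_start"] for a in alerts) / n
    mttr = sum(a["resolved_at"] - a["issue_start"] for a in alerts) / n
    fp_rate = sum(a["false_positive"] for a in alerts) / n
    return {"mttd_s": mttd, "mttr_s": mttr, "false_positive_rate": fp_rate}
```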

Quick Start

Ready to set up your first alert? Follow these steps:

  1. Create a monitor (if you haven't already)
  2. Configure an alert channel (Email is the fastest)
  3. Create an alert rule that uses your channel
  4. Test the integration and wait for your first check

Further Reading