Alerting Overview
Use alerts to automatically detect conditions in your log and event data.
As you use ChaosSearch and its Search Analytics interface to search and visualize the data from your log and event files, you typically see conditions such as a peak or a drop in a value, an error message, or a behavior that service administrators might want to investigate.
Alerting is a Search Analytics feature that automates the detection of these types of important conditions and "pushes" an alert to associated application/service managers or troubleshooting personnel. You can send notifications to commonly used tools like a Slack channel or a team's monitoring tool using a custom webhook interface. When you apply alerting monitors to Live Index groups, you can configure the system to watch for and send an alert when conditions are detected.
You use the ChaosSearch Search Analytics > Alerts page to review a list of all alerts for detected conditions, and to manage the monitors, triggers, and destinations that configure the rules. You can also acknowledge an alert to show other ChaosSearch users that someone is investigating the alert.
The alerting process requires you to configure and manage the following resources:
- Monitors define a condition or behavior that you want to watch for and be notified about when it occurs. You can define a monitor with an extraction query or a visual graph.
- Triggers specify a threshold at which the condition or behavior is of concern, how frequently to run the check, and an associated destination for the notification. A monitor must have at least one trigger to be enabled, and can have up to 10 triggers to define special conditions with specific priorities or destinations. When a monitor condition is detected, an alert enters the Active state and a notification is sent.
- Destinations define a location to which an alert is sent when triggered. You can send messages to a Slack channel or to a designated application via a custom webhook.
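The relationship between the three resources can be sketched in a few lines of Python. This is only an illustrative model, not the ChaosSearch API: the class names, field names, the `fires` comparison (value above threshold), and the example URL are all assumptions made for the sketch. The only rules taken from the text above are that a monitor needs at least one trigger to be enabled and can have up to 10.

```python
from dataclasses import dataclass, field
from typing import List

MAX_TRIGGERS_PER_MONITOR = 10  # limit stated in the overview text


@dataclass
class Destination:
    """Where a notification is delivered (hypothetical model)."""
    name: str
    kind: str  # assumed values: "slack" or "custom_webhook"
    url: str


@dataclass
class Trigger:
    """A threshold, a check frequency, and the destination to notify."""
    name: str
    threshold: float
    severity: int                # assumed scale: 1 is most severe
    check_interval_minutes: int  # how often the check runs (assumed unit)
    destination: Destination

    def fires(self, value: float) -> bool:
        # Assumption: the condition is "observed value exceeds threshold".
        return value > self.threshold


@dataclass
class Monitor:
    """The condition to watch, with the triggers attached to it."""
    name: str
    query: str
    triggers: List[Trigger] = field(default_factory=list)

    def can_enable(self) -> bool:
        # At least one trigger, and no more than the stated maximum.
        return 1 <= len(self.triggers) <= MAX_TRIGGERS_PER_MONITOR

    def evaluate(self, value: float) -> List[Trigger]:
        """Return the triggers whose threshold the observed value crosses."""
        if not self.can_enable():
            raise ValueError("monitor needs between 1 and 10 triggers")
        return [t for t in self.triggers if t.fires(value)]
```

For example, a monitor with a warning trigger at 100 and a critical trigger at 500 would return only the warning trigger for an observed value of 250, and both triggers for a value of 600.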
Best Practices for Alerts and Notifications
Alerts are a useful tool, but it is easy to under-configure them and miss important conditions, or to over-configure them and distract teams with too many notifications to investigate. There is a practical middle ground where the choice of alerts, the frequency of checks, and the severity settings give teams the right level of visibility to monitor and respond proactively. As part of alert planning, be sure to consider:
- The important conditions to monitor and how frequently to check for them based on the condition and the ingest rate of the data that contains the information
- The correct destination to notify the appropriate personnel for the conditions
- The message information and severity to help responders prioritize and take action
A well-structured alerting plan can help teams to manage conditions proactively, use resources more efficiently, and improve end-user experience with faster detection and time-to-resolution.
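As an illustration of the third consideration, the sketch below builds a notification message that leads with severity and includes the observed value and threshold so that responders can prioritize at a glance. The function name, parameter names, and the five-level severity scale are assumptions for this example, not ChaosSearch settings. The JSON shape, an object with a `text` field, is the basic format that Slack incoming webhooks accept; a custom webhook would likely need its own format.

```python
import json

# Assumed severity scale for illustration only: 1 is most urgent.
SEVERITY_LABELS = {1: "critical", 2: "high", 3: "medium", 4: "low", 5: "info"}


def build_slack_payload(monitor: str, trigger: str, severity: int,
                        value: float, threshold: float) -> str:
    """Return a JSON body suitable for a Slack incoming webhook."""
    label = SEVERITY_LABELS.get(severity, "unknown")
    text = (
        f"[{label.upper()}] {monitor} / {trigger}: "
        f"observed {value}, threshold {threshold}"
    )
    return json.dumps({"text": text})


if __name__ == "__main__":
    print(build_slack_payload("api-errors", "5xx spike", 1, 742, 500))
```

Putting the severity label first keeps it visible in channel previews and truncated notifications, which helps when several alerts arrive close together.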
Avoid the Alert Firehose Effect
As a good practice, start with a smaller set of very specific alert conditions, then tune and grow the monitors over time as you identify the conditions that are most important and detectable from your log and event data.
For assistance with your alert configuration, contact your ChaosSearch Customer Success representative.