Alert Monitors

Use monitors to define a condition or behavior that you want to watch for in your log and event data.

A monitor is a definition of an important condition or behavior that you want to watch for in your log and event files indexed by ChaosSearch. You can define a condition using a query DSL definition or a visual graph.

A monitor requires at least one associated trigger to define the actions when an alert is raised. When teamed with a trigger and a destination, the monitor can be enabled to watch for the defined condition and to raise an alert that sends a notification.

Creating a Monitor

To create a monitor:

  1. In Search Analytics > Alerting, click the Monitors tab.

The Monitors page opens with a list of the currently defined monitors. The page includes a summary of information such as whether monitors are enabled or disabled, their last update, and information about related alerts and errors. From this page you can perform actions such as acknowledging alerts and managing monitor definitions.

  2. Click Create monitor. The Create Monitor window opens.
  3. In the Monitor name field, type a name for the monitor.
  4. In the Monitor defining method field, select how you want to define the monitor:
    • Select Visual editor to create a monitor that watches for when a value is above or below a threshold for a period of time.
    • Select Extraction query editor to use Elasticsearch query DSL to specify the conditions to watch for.
  5. In the Schedule field, configure how frequently you want the monitor to run to check for the condition. You can select a number and a time unit (minutes, hours, or days).
  6. In the Data source section, in the Index field, select the ChaosSearch Refinery view to use with the monitor.
  7. Specify the monitor details for the method that you selected. The fields differ for an extraction query and a visual query; see Using an Extraction Query or Using a Visual Editor later in this topic.
  8. Specify one or more triggers for the monitor so that it can be enabled. See Define One or More Triggers for a Monitor later in this topic.
  9. Click Create to save the new monitor.

Review the following sections for more details about the monitor options.

Using an Extraction Query

If you select the Extraction query editor method to define a monitor, the window updates to show the fields that you must define for the monitor.

  1. In the Data Source > Index field, select the Refinery view to use with the monitor. The window displays a Query section with two columns. The left column, Define extraction query, defaults to a full match_all query DSL operation. The right column, Extraction query response, is initially empty.
  2. Click Run to populate the right column. The response shows the values from the selected view that you could use in the extraction query that you build in the left column.
  3. Specify your query DSL extraction query in the left column to define the monitoring condition. The query currently supports the query{} section syntax; the aggs{} syntax is not yet supported. For more information about the query syntax, see Elasticsearch API Support.
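
The restriction to the query{} section noted above can be checked before a query is pasted into the editor. The following is a minimal Python sketch and is not part of ChaosSearch. The function name unsupported_sections is hypothetical, and the sketch assumes that aggregations appear under a top-level aggs or aggregations key, as in standard Elasticsearch request bodies.

```python
import json

# Top-level request sections that the text above says are not yet supported.
# "aggregations" is the long-form spelling of "aggs" in Elasticsearch requests.
UNSUPPORTED_SECTIONS = ("aggs", "aggregations")


def unsupported_sections(query_text: str) -> list:
    """Return the unsupported top-level sections found in a query DSL string."""
    body = json.loads(query_text)
    if not isinstance(body, dict):
        raise ValueError("query DSL body must be a JSON object")
    return [name for name in UNSUPPORTED_SECTIONS if name in body]


ok_query = '{"query": {"match_all": {}}}'
bad_query = (
    '{"query": {"match_all": {}},'
    ' "aggs": {"by_host": {"terms": {"field": "host"}}}}'
)

print(unsupported_sections(ok_query))   # []
print(unsupported_sections(bad_query))  # ['aggs']
```

An empty list means that the query uses only sections this check knows about; it does not confirm that every clause inside query{} is supported.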

As an example, the following query DSL searches for a domain value in the log and event files from the last 15 minutes:

{
    "query": {
        "bool": {
            "filter": [
                {
                    "match_phrase": {
                        "domain": {
                            "query": "test.domain.com"
                        }
                    }
                },
                {
                    "range": {
                        "@timestamp": {
                            "gte": "now-15m"
                        }
                    }
                }
            ]
        }
    }
}
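
If the same query is needed for several domains or time windows, it can be generated rather than edited by hand. The following Python sketch builds the query shown above and prints it as JSON for pasting into the Define extraction query column. The function name recent_domain_query and its parameters are hypothetical; the m, h, and d suffixes are standard Elasticsearch date math units, mapped here to the minutes, hours, and days units that the monitor window offers.

```python
import json

# Elasticsearch date math suffixes for minutes, hours, and days.
UNIT_SUFFIX = {"minutes": "m", "hours": "h", "days": "d"}


def recent_domain_query(domain: str, amount: int, unit: str,
                        time_field: str = "@timestamp") -> dict:
    """Build a bool/filter query: phrase match on domain within a recent window."""
    if amount <= 0:
        raise ValueError("amount must be a positive integer")
    suffix = UNIT_SUFFIX[unit]  # raises KeyError for an unknown unit
    return {
        "query": {
            "bool": {
                "filter": [
                    {"match_phrase": {"domain": {"query": domain}}},
                    {"range": {time_field: {"gte": f"now-{amount}{suffix}"}}},
                ]
            }
        }
    }


query = recent_domain_query("test.domain.com", 15, "minutes")
print(json.dumps(query, indent=4))
```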

Another simple query DSL example for a specific field match follows:

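
A field match of this kind can be written with the same match_phrase clause used in the earlier example. The Python sketch below is illustrative only: the function name field_match_query is hypothetical, and the field name status and value error are placeholders, not values taken from this topic.

```python
import json


def field_match_query(field: str, value: str) -> dict:
    """Build a query that phrase-matches a single field against a value."""
    return {"query": {"match_phrase": {field: {"query": value}}}}


# "status" and "error" are placeholder values for illustration only.
print(json.dumps(field_match_query("status", "error"), indent=4))
```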

Using a Visual Editor

If you select the Visual editor method for a monitor definition, the monitor window updates to show new fields for the visualization.

  1. In the Data Source > Index field, select the Refinery view to use with the monitor. The window displays new fields for the visualization details.
  2. In the Time field, select the time column that you want to use for the X axis date histogram. Click the field to select one of the timeval fields in the view.
  3. In the Query pane, the default metric is a count of documents (in ChaosSearch, that means a hits/records count). Click Add metric to add a specific metric from the view to visualize instead of the hits count.
  4. In the Time range for the last field, specify the monitoring time range. For example, to check the last 1 hour of the log and event files related to the view that you selected, select 1 from the number drop-down, and hour(s) from the units drop-down list. The available units are minutes, hours, or days.
  5. In the Data filter field, you can specify one optional filter rule to apply to the monitoring query.
  6. In the Group by field, you can specify one optional group by rule to group the results of the monitor query.
  7. Click Preview query and performance. The window updates to display a graph area, and populates the data for the graph within the time range you specified. The chart information also shows the monitor duration, runtime, and hits for the monitor query.

If a visual graph does not appear, there might not be data for your selected time period. Try adjusting the time range to a value where data is expected, such as the last hour or the last 12 hours.

Monitor Permissions and Users

When you create and save a monitor, the monitor definition is updated with the information for the ChaosSearch groups associated with your user account. Use caution when reviewing monitor definitions, because saving a monitor as a different user could break the monitor. If RBAC group assignments change, or if permissions assigned to the RBAC groups used for a monitor change, the monitor might not work after those updates.

See the troubleshooting section for more information.

Define One or More Triggers for a Monitor

A monitor does not run or watch for conditions unless at least one trigger is defined for that monitor.

To define a trigger:

  1. When you are creating or editing a monitor, scroll to the Triggers section at the bottom of the window.
  2. Click Add trigger to open a panel with the fields for a new trigger.
  3. In the Trigger name field, type a name for the trigger.
  4. In the Severity level field, specify an alert severity level from 1 (Highest) to 5 (Lowest).
  5. In the Trigger condition field, specify the condition that causes the trigger to fire. The default trigger rule is that the monitor query must return at least one hit/result. Click Info for more information about the scripting variables.
  6. In the Actions panel, define the actions to take when the Trigger condition is met.
    • In the Action name field, type a name for the action.
    • In the Destinations list, select a destination to which the alert is sent.
    • In the Message subject field, type a clear subject for the message that is sent in the alert.
    • In the Message field, update the content as needed to provide helpful information to the alert system user about the monitoring condition and problem.
  7. Optionally, add more actions, up to 10 separate actions on the monitor.
  8. When you finish specifying the trigger(s) for the monitor, create or update the monitor.
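
The rules in the steps above (a severity from 1 to 5, a limit of 10 actions, and a default condition of at least one hit) can be modeled in a few lines. The following Python sketch is only a way to reason about those rules; it is not the trigger condition script that the product evaluates. The Trigger class and default_condition function are hypothetical names, and applying the 10-action limit to a single trigger is an assumption, because the text states the limit for the monitor.

```python
from dataclasses import dataclass, field

# Assumption: the limit of 10 actions is applied to one trigger in this sketch.
MAX_ACTIONS = 10


@dataclass
class Trigger:
    name: str
    severity: int                      # 1 (Highest) to 5 (Lowest)
    actions: list = field(default_factory=list)

    def problems(self) -> list:
        """Return readable validation problems; an empty list means none."""
        found = []
        if not self.name.strip():
            found.append("trigger name is empty")
        if self.severity not in range(1, 6):
            found.append("severity must be between 1 and 5")
        if len(self.actions) > MAX_ACTIONS:
            found.append(f"more than {MAX_ACTIONS} actions")
        return found


def default_condition(hit_count: int) -> bool:
    """Default rule from the text: fire when the query returns at least one hit."""
    return hit_count >= 1


trigger = Trigger(name="domain-seen", severity=2, actions=["notify-ops"])
print(trigger.problems())     # []
print(default_condition(0))   # False
print(default_condition(3))   # True
```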

Troubleshooting Monitor Authorization Permissions

If a configured monitor that was working previously begins to raise an alert with the following message, there is a groups permission error to troubleshoot:

Error: chaossumo.util.akka.http.ChaosDirectives$Exceptions$AuthorizationException$: Authorization failed.

The problem could be that the groups for the user who created or last updated the monitor do not have the kibana-opendistro:alerting:alerts:read permission, so the monitor does not have permission to run. It is also possible that the RBAC groups associated with the user who last saved the monitor changed and no longer have the proper permissions to use the views, object groups, or query associated with the monitor, or that the groups associated with the monitor definition changed when a user updated the monitor.

When troubleshooting this error, it can be very helpful to obtain the groups that are configured for the monitor that raised the alert, so that one or more groups could be investigated for the alerts permission. One way to obtain the group IDs for a monitor is to use the Browser DevTools window to display more information about the monitor.

  1. Navigate to the Monitors page and open the DevTools window.
  2. Select the monitor that triggered the alert. You should see a Name with the same Monitor ID value in the DevTools left frame of the Network tab.
  3. Select the monitor ID name element, then click Preview. The right pane updates with information about the resource. Click to expand the groupIds property.

In the example above, groupIds is set to default, which is common for monitors created by the root user, especially during the ChaosSearch trial phase. During the production transition, the default group is usually updated to have a smaller set of basic permissions for new users who are not otherwise assigned to groups. Each site administrator typically creates new groups for the production environment to specify the various levels of user access that are needed.

The resolution for this problem is to update and save the monitor while logged in as a user who has proper group assignments with the full complement of permissions so that the monitor can run successfully. As an alternative, if the current user is the person who must manage the monitors, the solution is to ensure that the monitor administrator has the correct group assignments to fully manage and run monitors.

Sometimes the groupIds assigned to a monitor are a sequence of one or more internal group IDs for the user who last saved the monitor. In this case, if a monitor sends Authorization alerts, a group is missing the alerts permission, or the user might not be assigned to the correct groups needed for the monitor to run. The group IDs list can provide the ChaosSearch Customer Success team member with information needed to diagnose the root cause of the authorization alert.
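
When the monitor resource is visible in DevTools, its JSON can be copied and the group IDs extracted with a short script, which is convenient when the list must be shared for diagnosis. The following Python sketch searches a parsed response for any groupIds property instead of assuming where it sits in the document. The function name find_group_ids and the sample payload are hypothetical; the real response shape may differ.

```python
import json


def find_group_ids(node) -> list:
    """Recursively collect every value stored under a "groupIds" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "groupIds":
                # Accept either a single ID or a sequence of IDs.
                found.extend(value if isinstance(value, list) else [value])
            else:
                found.extend(find_group_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_group_ids(item))
    return found


# Hypothetical payload for illustration; not copied from a real monitor.
sample = '{"monitor": {"name": "example-monitor", "groupIds": ["default"]}}'
print(find_group_ids(json.loads(sample)))  # ['default']
```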