After at least one object group exists to index your cloud-storage content, and at least one Refinery view exists to access the related Chaos® Index data, click Search Analytics in the ChaosSearch console. The Discover page is the default starting point to run a basic search of your indexed data.
There are five steps to run a basic Discover search:
Make sure that you are on the Discover page.
Select a Refinery view in the drop-down list to specify the indexed data that you want to search.
Specify one or more search terms in the Search field. (If you leave Search blank, as in this example, the search returns all of the records for the time frame. Such a wide search might work for smaller indexes but it is not practical for searching across millions or billions of records.)
Specify a time frame. The default is the last 15 minutes of data. For live indexes, the last 15 minutes is usually a good sample size. For static indexes, the events and log data could have much older timestamps, so you might need to adjust the time range to one that is aligned with the timestamps for events and log records inside the indexed data in the target view (as in this example).
Click Refresh (or Refresh data) to run the search.
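The steps above amount to three inputs: a target view, a search string, and a time range. The following Python sketch shows how those inputs could map onto an OpenSearch-style search body. It is an illustration under assumptions, not the request that the Discover page actually sends: the helper name `build_discover_request`, the `timestamp` field name, and the `size` default of 500 are all hypothetical, and the body is only printed, never sent to a server.

```python
import json


def build_discover_request(search_terms, start, end,
                           time_field="timestamp", size=500):
    """Return an OpenSearch-style search body that mirrors the Discover inputs.

    All parameter names and defaults here are illustrative assumptions.
    """
    if search_terms.strip():
        # Step 3: free-text or field-level terms typed in the Search field.
        text_query = {"query_string": {"query": search_terms}}
    else:
        # A blank Search field matches every record in the time frame.
        text_query = {"match_all": {}}

    return {
        "size": size,  # how many matching records to return for display
        "query": {
            "bool": {
                "must": [text_query],
                # Step 4: restrict results to the selected time frame.
                "filter": [{"range": {time_field: {"gte": start, "lte": end}}}],
            }
        },
    }


# Blank search over the default window (the last 15 minutes).
body = build_discover_request("", "now-15m", "now")
print(json.dumps(body, indent=2))
```

For a static index, the same helper could be called with absolute dates and whichever field holds the record timestamps in your view, for example `build_discover_request("", "1992-01-01", "1998-12-31", time_field="o_orderdate")`.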
In this example, the Search field was left empty to return all records. That is practical here only because the data set is very small: an orders table based on TPC-H data. ChaosSearch is designed to index data at scale, including complex application logs, which means that an unbounded search over a longer time range could return billions of records. A free-text search with no terms, or with very common terms, could take a long time and return a result set that is too large to scan easily.
Important Discover behaviors:
When you run a Discover search, ChaosSearch uses the Refinery view to find all of the records within the indexed data that match the query criteria and time frame. The hits value in the Discover UI is the number of matching records that ChaosSearch found. The number of records displayed in the UI is limited by a configuration setting, which usually has a default of 500. (That limit could be capped at the system level or at the subaccount level.)
If the matching records for the query exceed the display limit, the displayed results are the first records returned from the ChaosSearch distributed query engine. If you re-run the same query, you could see a different subset of matching results. The limit controls the display in the UI. Any metrics or data aggregations used in the Discover query are calculated from the entire result set including rows that are not shown in the UI.
If users typically re-run the same query and their analysis depends on the Discover UI and a consistent set of displayed results, it is important to use filtering options to narrow the matching result set to a value below the configured display limit. You could also increase the display limit, but use caution; a larger number of displayed records affects the Discover performance and could increase query time. If the display number is too large for the environment, Discover could fail due to browser memory limitations.
It is important to remember that new files might be added to customer cloud storage and indexed over time. The same query could return a larger hits value or new/different records in the display because of newly available matching records. It is also possible that queries re-run for a time period in the past could show a lower hits value and results because some indexed data might have aged out since the last time that query ran. (The data retention period is a setting in the object group definition.)
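The relationship between the hits value, the display limit, and aggregations can be simulated in a few lines of Python. This is a toy model, not ChaosSearch code: the `run_query` helper is hypothetical, the shuffle only stands in for the unspecified arrival order of results from a distributed engine, and the record fields borrow TPC-H column names for illustration.

```python
import random

DISPLAY_LIMIT = 500  # assumed default display limit


def run_query(records, predicate, display_limit=DISPLAY_LIMIT, seed=None):
    """Return (hits, displayed_records, total_price) for a simulated search."""
    matches = [r for r in records if predicate(r)]
    hits = len(matches)  # every matching record counts toward hits

    # Simulate an arrival order that is not guaranteed between runs.
    arrival = list(matches)
    random.Random(seed).shuffle(arrival)
    displayed = arrival[:display_limit]  # only the first records are shown

    # Aggregations use the full match set, not just the displayed rows.
    total_price = sum(r["o_totalprice"] for r in matches)
    return hits, displayed, total_price


records = [
    {"o_orderkey": i, "o_clerk": f"Clerk#{i % 8:09d}", "o_totalprice": 10.0}
    for i in range(2000)
]

# Unbounded search: hits exceed the limit, so the shown subset can vary.
hits, shown, total = run_query(records, lambda r: True, seed=1)
print(hits, len(shown), total)

# Field-level filter: hits fall below the limit, so every match is shown.
def by_clerk(r):
    return r["o_clerk"] == "Clerk#000000001"

narrow_hits, narrow_shown, _ = run_query(records, by_clerk, seed=2)
print(narrow_hits, len(narrow_shown))
```

In this model, a filter that brings the hits value under the display limit makes the displayed set independent of arrival order, which is the reason for narrowing a query when analysis depends on a consistent display.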
As an example of a more bounded search using the orders data, you might want to see records that reference a specific clerk ID. If you know which field (o_clerk) contains the information that you are searching for, field-level searches return a more granular set of results, and in less time. For example, to find the orders processed by a specific clerk, type o_clerk:Clerk#000000497 in the Search field and click Refresh. Search values can be case-sensitive or case-insensitive, depending on how the Refinery view was created.
For this sample data, five results are returned.
You can combine the field-level search criteria with AND, OR, and NOT syntax to create even more granular searches.
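As a sketch of how such combinations are composed, the following Python helpers assemble field-level clauses into a single search string that could be pasted into the Search field. The helper names are hypothetical, the second clerk ID is made up for illustration, and `o_orderstatus` is a standard TPC-H orders column that is only assumed to be present in the view. The quoting rule for values with spaces is likewise an assumption.

```python
def field_term(field, value):
    """Render one field-level clause, such as o_clerk:Clerk#000000497."""
    # Assumption: values that contain whitespace need double quotes.
    if any(ch.isspace() for ch in value):
        value = '"' + value.replace('"', '\\"') + '"'
    return f"{field}:{value}"


def all_of(*clauses):
    """Require every clause to match."""
    return "(" + " AND ".join(clauses) + ")"


def any_of(*clauses):
    """Require at least one clause to match."""
    return "(" + " OR ".join(clauses) + ")"


def none_of(clause):
    """Exclude records that match the clause."""
    return f"NOT {clause}"


query = all_of(
    any_of(
        field_term("o_clerk", "Clerk#000000497"),
        field_term("o_clerk", "Clerk#000000498"),  # hypothetical second clerk
    ),
    none_of(field_term("o_orderstatus", "F")),
)
print(query)
# ((o_clerk:Clerk#000000497 OR o_clerk:Clerk#000000498) AND NOT o_orderstatus:F)
```

The operators are written in uppercase because that form is accepted by both DQL and Lucene syntax.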
OpenSearch Dashboards supports many search value options such as wildcards, field-level searches, filter searches and combinations. More information is available in the Search Analytics help topics.
The Discover page uses Dashboards Query Language (DQL) by default. If you click the DQL link, you can turn off DQL and use Lucene search syntax instead.
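One place where the two syntaxes visibly differ is numeric ranges: DQL uses comparison operators, while Lucene uses bracket notation. The sketch below renders the same half-open range both ways. The helper names are hypothetical, `o_totalprice` is a TPC-H orders column assumed to be a numeric field in the view, and whether a range search works on a given field depends on how the view defines it.

```python
def dql_range(field, low, high):
    """Render low <= field < high using DQL comparison operators."""
    return f"{field} >= {low} and {field} < {high}"


def lucene_range(field, low, high):
    """Render the same range using Lucene bracket notation."""
    # '[' includes the lower bound; '}' excludes the upper bound.
    return f"{field}:[{low} TO {high}}}"


print(dql_range("o_totalprice", 1000, 5000))
# o_totalprice >= 1000 and o_totalprice < 5000

print(lucene_range("o_totalprice", 1000, 5000))
# o_totalprice:[1000 TO 5000}
```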
Some users might have pre-built dashboards or visualizations that offer graphical or tabular representations of data analysis for their log and event files, built by ChaosSearch Customer Success or customer data analysts.
Click the Visualize or the Dashboard options in the left menu to see if there are pre-built visualizations that you can use for your data. The following image is a sample visualization of the orders data showing a trend of orders by priority:
Visualizations offer another way to turn your search queries into graphical or tabular displays that can be scanned quickly for important information in your indexed data.
Dashboards combine multiple visualizations (either saved visualizations or ad-hoc ones created during dashboard development) on one page so that users can compare important factors for the data in a side-by-side summary. A sample dashboard for the orders data follows, combining the bar chart visualization with a pie chart visualization:
Creating visualizations and dashboards can take some time and practice for new users to learn, especially for developing the analytics that support them. The OpenSearch Dashboards interface offers guidance to help with the process for creating them.