When you define an object group to specify the cloud-storage files that you want to index and the rules for indexing them, and then start indexing, the Chaos Index service creates one or more daily intervals. The intervals are listed in the Intervals areas for object groups and are also used in views.
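Intervals are core components of the ChaosSearch environment: they organize your indexed data, determine which data a view includes, and control how long that data is kept. This topic provides a closer look at intervals, what they do, and how they are managed within the ChaosSearch ecosystem.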
Intervals are created by the Chaos Index service, and are based on two key inputs:
- The object group that defines your raw cloud storage files to index, and how to index them
- The modification date (in cloud storage) of matching raw storage files that were indexed
When ChaosSearch starts to index an object group, it searches the defined cloud storage locations (in the object group) to find the matching files to index. For each matching file, it captures the modification date for the file and indexes its contents to create the patented, lossless index data used by the ChaosSearch Refinery for queries and analytics.
Using the object group name, and the modification date of the cloud storage file, ChaosSearch creates daily intervals to organize the indexed data. Each daily interval contains the indexed data and related information for all of the matching files with the same modification date. A sample Intervals page for an object group follows:
As shown in the example, interval names include the object group and modification date of the indexed files, for example:
This is the daily interval for the cloudtrail_data object group with the indexed log and event files that have a modification date of October 4, 2022.
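The grouping described above can be sketched in a few lines of Python. This is only an illustration, not the Chaos Index implementation: the file keys are hypothetical, the `<object-group-name>_<YYYY-MM-DD>` name pattern is assumed from the examples in this topic, and the day is taken from each timestamp as given (the service's actual time zone handling is not covered here).

```python
from collections import defaultdict
from datetime import datetime, timezone


def interval_name(object_group: str, modified: datetime) -> str:
    """Build an illustrative daily interval name.

    Assumed pattern: <object-group-name>_<YYYY-MM-DD>.
    """
    return f"{object_group}_{modified.date().isoformat()}"


def group_by_interval(object_group: str, files: dict) -> dict:
    """Map each interval name to the file keys that share its modification date."""
    intervals = defaultdict(list)
    for key, modified in files.items():
        intervals[interval_name(object_group, modified)].append(key)
    # Return a plain dict with interval names and file keys in sorted order.
    return {name: sorted(keys) for name, keys in sorted(intervals.items())}


# Hypothetical file keys and modification times, for illustration only.
sample_files = {
    "logs/a.json.gz": datetime(2022, 10, 4, 1, 15, tzinfo=timezone.utc),
    "logs/b.json.gz": datetime(2022, 10, 4, 23, 50, tzinfo=timezone.utc),
    "logs/c.json.gz": datetime(2022, 10, 5, 0, 5, tzinfo=timezone.utc),
}

grouped = group_by_interval("cloudtrail_data", sample_files)
for name, keys in grouped.items():
    print(name, keys)
```

In this sketch, the two files modified on October 4 land in one interval and the file modified on October 5 lands in the next, regardless of what timestamps the files contain.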
Keep in mind that the indexed data for your cloud storage object files is stored in the read-write bucket that you own. The intervals provide the organization and control to manage that indexed data and its related metadata, its use in views, and its lifespan.
ChaosSearch's unique indexing design allows users to keep their indexed data for a very long time, even indefinitely if needed. The timeline of the analysis that you want to keep is up to you.
In some cases, customers eventually age out the bulkier original cloud files and move them to less-expensive long-term archival storage, while keeping their more compact Chaos Index data in accessible storage for longer periods to enable historical analysis and querying.
The daily intervals are the key to that historical data retention. When you create an object group, the Retention Policy setting specifies how long to keep the indexed data. The data clean-up process uses the date in the daily interval name to identify the indexed data to delete. As an example, if an object group uses the default retention policy of 14 Days, ChaosSearch automatically cleans up and removes any daily intervals for that object group whose name date falls before the two-week period. That is, on October 14, 2022, the object group daily intervals <object-group-name>_2022-09-30 and earlier are deleted in the clean-up process.
Be careful when changing (especially reducing) retention periods for daily intervals.
Always use caution when decreasing the retention period for an object group. If a group changes from Unlimited to 3 Days, for example, the work to delete a large number of daily intervals could impact system performance. Also, you cannot restore those daily intervals except by re-indexing the original cloud storage files.
If the object group has an Unlimited retention period, the object group's daily intervals are never automatically deleted.
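The retention arithmetic can be checked with a short sketch. This is not the ChaosSearch clean-up process itself. The function names are hypothetical, the interval names are assumed to end in `_YYYY-MM-DD`, and the inclusive cutoff (today minus the retention period) is inferred from the 14 Days example above. An Unlimited policy is modeled here as `None`.

```python
from datetime import date, timedelta
from typing import List, Optional


def interval_date(name: str) -> date:
    """Parse the date suffix of an interval name (assumed to end in _YYYY-MM-DD)."""
    return date.fromisoformat(name.rsplit("_", 1)[1])


def expired_intervals(names: List[str], today: date,
                      retention_days: Optional[int]) -> List[str]:
    """Return the interval names that a clean-up run on `today` would remove.

    retention_days=None models an Unlimited retention policy.
    """
    if retention_days is None:
        return []  # Unlimited: nothing is deleted automatically.
    cutoff = today - timedelta(days=retention_days)
    # Inclusive cutoff, matching the example: Oct 14 minus 14 days is Sep 30.
    return [n for n in names if interval_date(n) <= cutoff]


names = [
    "my_group_2022-09-29",
    "my_group_2022-09-30",
    "my_group_2022-10-01",
]
print(expired_intervals(names, date(2022, 10, 14), 14))
print(expired_intervals(names, date(2022, 10, 14), None))
```

Running a check like this before reducing a retention period gives a rough idea of how many daily intervals the change would remove.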
Administrators can also delete daily intervals manually from the Storage > Intervals page. Typically, an administrator does this to clean up old intervals or to prepare an object group for deletion. (You must delete any daily intervals for an object group before you can delete the group.) Sometimes, intervals might be deleted as part of a trial and setup process with ChaosSearch Customer Success when re-indexing an object group is necessary.
When you create a Chaos Refinery view to query and analyze the indexed data for one or more object groups, part of the view definition is the list of daily intervals to include in the view. From all the daily intervals related to the selected object group(s), you can select:
- One, several, or all of the daily intervals
- Daily intervals that match an interval name pattern specified by a regular expression
- A rolling time window of daily intervals, such as those for the last 7 days, last 45 days, or last 12 months
The interval pattern and window both match on the daily interval name to determine which intervals to include in the view (and thus in the analysis).
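The pattern and window selections can be illustrated with a minimal sketch. It is not the Refinery's matching logic. The function names and sample interval names are hypothetical, names are assumed to end in `_YYYY-MM-DD`, the pattern is applied with Python's `re.search`, and a window of N days is assumed to cover today plus the N-1 days before it.

```python
import re
from datetime import date, timedelta
from typing import List


def match_pattern(names: List[str], pattern: str) -> List[str]:
    """Keep the interval names that match a regular expression."""
    regex = re.compile(pattern)
    return [n for n in names if regex.search(n)]


def rolling_window(names: List[str], today: date, days: int) -> List[str]:
    """Keep the interval names whose date suffix falls in the last `days` days.

    Assumption: the window is (today - days, today], that is, today and the
    days - 1 days before it.
    """
    start = today - timedelta(days=days)
    selected = []
    for n in names:
        interval_day = date.fromisoformat(n.rsplit("_", 1)[1])
        if start < interval_day <= today:
            selected.append(n)
    return selected


# Hypothetical interval names, for illustration only.
names = [
    "cloudtrail_data_2022-10-01",
    "cloudtrail_data_2022-10-04",
    "elb_logs_2022-10-04",
]

# All October 2022 intervals for the cloudtrail_data object group.
print(match_pattern(names, r"^cloudtrail_data_2022-10-\d{2}$"))

# Intervals from the 7 days ending on October 10, 2022.
print(rolling_window(names, date(2022, 10, 10), 7))
```

Because both selections work only on the date in the interval name, they follow the cloud storage modification date of the indexed files.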
Cloud storage modification dates and timestamps inside the log and event files
In most cases, the modification date for a log or event file written to cloud storage is very close to, or the same as, the day in the event timestamps captured in the events and log entries inside the file. However, sometimes the modification (saved) date of a file differs from the timestamps written inside the log and event file. The daily interval file names use the modification date, not the dates from timestamps inside the files.
As an example, a log file for events that occurred on January 1, 2022 could have a later cloud storage modification date such as March 15, 2022. When ChaosSearch indexes the file, the date used for the daily interval file name will be 2022-03-15 (the storage modification date). Queries against the view show the January 1, 2022 event timestamps.