Indexing Your Data

Index your cloud storage objects to create the ChaosSearch index files for visualizing and analyzing the information in the content.

ChaosSearch indexing is a deep analysis of your cloud object storage content, scoped by an object group filter. After you create an object group, start the indexing process to create the files that power the Chaos Refinery® index views and Analytics visualizations.

Start Indexing

To start the indexing process:

  1. In the Storage area, select the object group for which you want to start indexing the cloud objects.
  2. Click Start Indexing.

When indexing begins, the service performs a deep analysis of the associated content, generates a proprietary data representation, and infers the data structure (schema).

View Index Information

Once the analysis is complete, the system displays a report on your data. The report includes:

  • Object Group Details: name, creation date, source type, compression, retention policy, predicate, regex
  • Trending Schema: a pie chart with one slice per data type found, and the count of each type
  • Indexed Structure: column/field mapping IDs, names, and data types

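
To make the schema and type-count ideas concrete, here is a toy Python sketch that infers a field-to-type mapping from a few JSON records and counts how many fields have each type. It is only an illustration: ChaosSearch's analysis is proprietary and is not shown here, and the function name `infer_schema` and the sample records are hypothetical.

```python
import json
from collections import Counter


def infer_schema(lines):
    """Return (field -> type name, type name -> number of fields)."""
    # Hypothetical helper for illustration; not a ChaosSearch API.
    field_types = {}
    for line in lines:
        record = json.loads(line)
        for name, value in record.items():
            # The first type seen for a field wins in this simple model.
            field_types.setdefault(name, type(value).__name__)
    type_counts = Counter(field_types.values())
    return field_types, type_counts


# Made-up sample records, one JSON object per line.
sample = [
    '{"host": "web-1", "status": 200, "latency_ms": 12.5}',
    '{"host": "web-2", "status": 404, "latency_ms": 3.1, "cached": true}',
]

schema, counts = infer_schema(sample)
print(schema)
print(dict(counts))
```

The per-type counts are the kind of numbers a pie chart of data types would be drawn from.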
Indexing Options

Live Indexing

With the Live Indexing option, ChaosSearch first indexes the existing content in the cloud storage bucket that matches the object group file filtering criteria, and then watches for and indexes new matching files as they are written to the bucket. Live Indexing is the typical choice for environments where the bucket receives new objects to index on a regular cadence.

Live indexing requires the configuration of an Amazon Web Services (AWS) SQS integration or Google Pub/Sub integration (depending on the source cloud storage service) to notify the indexing service when new objects are available. For more information, see Live Indexing - Amazon SQS or Live Indexing - Google PubSub.
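
On AWS, notifications of this kind are commonly set up by having the S3 bucket publish object-created events to an SQS queue. The Python sketch below builds a notification configuration in the shape that boto3's `put_bucket_notification_configuration` accepts. The queue ARN, the bucket name, and the helper `build_notification_config` are hypothetical, and the sketch does not cover the queue access policy or any ChaosSearch-specific steps; follow the linked Live Indexing guide for the actual procedure.

```python
import re

# SQS queue ARNs have the form arn:<partition>:sqs:<region>:<account-id>:<queue-name>.
SQS_ARN = re.compile(r"^arn:aws[\w-]*:sqs:[a-z0-9-]+:\d{12}:[A-Za-z0-9_-]{1,80}$")


def build_notification_config(queue_arn):
    """Build an S3 notification configuration that targets one SQS queue."""
    # Hypothetical helper for illustration; not a ChaosSearch or AWS API.
    if not SQS_ARN.match(queue_arn):
        raise ValueError(f"not a valid SQS queue ARN: {queue_arn!r}")
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                # Notify on every newly created object in the bucket.
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


# Hypothetical ARN used only as sample input.
config = build_notification_config(
    "arn:aws:sqs:us-east-1:123456789012:example-live-index-queue"
)
print(config)

# With AWS credentials configured, the configuration could be applied like this:
#   import boto3
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="example-bucket", NotificationConfiguration=config)
```

Note that `put_bucket_notification_configuration` replaces the bucket's existing notification configuration, so any notifications already defined on the bucket need to be included in the same call.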

When you define the object group for the files that you want to index, ChaosSearch lets you specify how to index your data. You first specify the file patterns and define any needed object filters, schema filters, and column transformations. You then reach the Create Object Group window.

To configure Live Indexing, select the Live Indexing option and paste in the ARN of the AWS SQS queue, or the GCP Pub/Sub Project ID.

If you do not select Live Indexing, the indexing style defaults to static indexing. Static indexing causes the system to index the matching objects defined by the object group only when you click Start Indexing. It assumes that the bucket contents are relatively static. If new objects are added to the bucket, you can click Start Indexing again to run a new static index for any new files added since the last time the index service ran.
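
The static behavior can be pictured with a toy Python model in which each Start Indexing run picks up only the matching objects that no earlier run has indexed. The class `StaticIndexModel`, its regex filter, and the object keys are hypothetical; this is a conceptual sketch, not ChaosSearch's implementation.

```python
import re


class StaticIndexModel:
    """Toy model of an object group that is indexed only on demand."""

    def __init__(self, pattern):
        self.pattern = re.compile(pattern)  # stand-in for the object group filter
        self.indexed = set()                # keys indexed by earlier runs

    def start_indexing(self, bucket_keys):
        """Index matching keys not seen before; return the newly indexed keys."""
        new_keys = sorted(
            key for key in bucket_keys
            if self.pattern.search(key) and key not in self.indexed
        )
        self.indexed.update(new_keys)
        return new_keys


group = StaticIndexModel(r"^logs/.*\.json$")
bucket = ["logs/a.json", "logs/b.json", "images/c.png"]

first_run = group.start_indexing(bucket)   # indexes the two matching log files
bucket.append("logs/d.json")               # a new object arrives later
second_run = group.start_indexing(bucket)  # a second run picks up only the new file

print(first_run)
print(second_run)
```

In this model nothing happens between runs, which is the difference from Live Indexing: there, a notification for `logs/d.json` would trigger indexing without a second manual run.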

Within the Create Object Group window, you can choose a Retention Policy for Index Lifecycle Management.