An object group is a virtual filter. It uses file name prefixes and similar rules to associate data files that share a common content format, so that the files can be indexed with the same rules.
Most sites have a few object groups that categorize the different types of source data files. As a best practice, work with ChaosSearch team members to plan the object groups that organize your data sources.
To create an object group:
- Log in to the ChaosSearch console and click Storage.
- Click Create Object Group in the top right corner.
The Object Group Preview window appears:
- In the left list, select the bucket for which you want to add the object group. The Object Group Preview pane shows the files and folders at the top level of the bucket.
The following sections describe the options for selecting the bucket objects to include in the object group and the filtering controls that help you to tune the indexed contents for analytics.
The Object Group Preview is where you select and organize the files to include in your virtual grouping. You can specify the files to include by using the following methods:
- Prefix – Type a prefix string to filter the bucket file list and show only those file names that begin with the specified prefix.
- Regex Filter – Type a regular expression that selects the matching file names to include.
Regular Expression Editing
Selecting a file in the preview list populates the regular expression with that file name, which gives you a starting expression to refine. Click the pencil icon to open a regex editor where you can modify the expression and see the effect on the file selections. Click the X icon to clear the prefix and regex settings if needed.
- Object Filter – Click Object Filter to display a window of additional options such as filters by file modification date, file size, cloud storage class, partitions, or custom object tags/metadata values.
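To make the prefix and regex selection methods above more concrete, the following Python sketch imitates them on a list of file names. It is only an illustration of the concept: the object keys and the `select_keys` helper are hypothetical, and the way ChaosSearch anchors or evaluates a regex may differ from Python's `re.search`.

```python
import re

# Hypothetical object keys in a bucket (illustrative names only).
keys = [
    "logs/elb/2023-01-01.log.gz",
    "logs/elb/2023-01-02.log.gz",
    "logs/app/2023-01-01.json",
    "exports/users.csv",
]


def select_keys(keys, prefix="", pattern=None):
    """Return the keys that start with `prefix` and, if given, match `pattern`."""
    regex = re.compile(pattern) if pattern else None
    return [
        key
        for key in keys
        if key.startswith(prefix) and (regex is None or regex.search(key))
    ]


# Prefix filter: only file names that begin with the prefix.
by_prefix = select_keys(keys, prefix="logs/elb/")

# Regex filter: only file names that match the expression.
by_regex = select_keys(keys, pattern=r"^logs/.*\.json$")
```

Testing an expression against a handful of known file names in this way can help before typing it into the regex editor.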
After you select the files that you want to include in the object group, click Next. The Content Preview window appears. The following window shows a sample CSV format type.
The content preview summarizes the format of the selected file(s) (such as log, JSON, CSV, or unknown) and the compression types (such as none, GZIP, or snappy). CSV, JSON, and LOG files have options to help with their index processing.
The Schema Filter window provides options for overriding the data type of one or more columns for virtual data transformations of the source content, and also offers a way to create a list to include or exclude specific columns for the index.
In the sample CSV window above, note that CSV format files have options for the file record delimiter (comma, tab, space, or other), and to specify whether the file has a first row with the column name headers.
Review the file schema format to make sure that your object group file inclusion rules are correct.
ChaosSearch analyzes the file contents for patterns and generates a format recommendation.
Compressed Data Files Have a Content Preview
ChaosSearch can provide a content preview even if the files are compressed. This lets you stay on the page while you construct regular expressions for further refinement.
ChaosSearch supports multiple format types that each have their own formatting options:
If Format is unknown, select LOG and type a regular expression to parse the file contents for indexing.
ChaosSearch supports multiple compression types:
A sample window for the CSV format is shown earlier in this page. It has a compression and delimiter field, as well as a setting that specifies whether the first row in the file contains column header values.
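As a rough illustration of what the delimiter and header-row settings mean for parsing, the sketch below reads a small tab-delimited sample with Python's standard `csv` module. The sample data, the `read_rows` helper, and the `column_N` fallback names are assumptions made for this example; they do not describe how ChaosSearch names columns internally.

```python
import csv
import io

# Hypothetical tab-delimited sample whose first row holds the column names.
sample = "host\tstatus\tbytes\nweb-1\t200\t512\nweb-2\t404\t0\n"


def read_rows(text, delimiter=",", has_header=True):
    """Parse delimited text into a list of dicts keyed by column name."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if has_header:
        header, data = rows[0], rows[1:]
    else:
        # No header row: invent positional names (an assumption of this sketch).
        header = [f"column_{i}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]


records = read_rows(sample, delimiter="\t", has_header=True)
no_header = read_rows("a,b\nc,d\n", delimiter=",", has_header=False)
```

If the header setting is wrong, the first data row is either lost as column names or the column names are indexed as data, which is why it is worth checking the preview.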
A sample window for a LOG format follows. The window contains fields for the compression detected in the files, and a regular expression field to parse the information contained in the log file.
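The sketch below shows the general idea of parsing a log line with a regular expression. The log line, the pattern, and the field names are hypothetical, and the `(?P<name>...)` named-group syntax is Python's; the syntax that the ChaosSearch regex field accepts may differ, so treat this as a way to prototype an expression rather than as the exact value to enter.

```python
import re

# Hypothetical access-log line (illustrative only).
LINE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326'
)

# Each named group captures one field of the line.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def parse_line(line):
    """Return a dict of captured fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


fields = parse_line(LINE)
```

Lines that do not match return `None` here, which makes it easy to spot log variants that the expression does not yet cover.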
For a JSON file, the Content Preview window shows options for processing the index including array flattening and expansion. If you plan to index JSON files, review [JSON File Processing].
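As a generic illustration of the two ideas, the sketch below flattens nested objects into dotted column names and expands an array into one row per element. This is one common interpretation and an assumption of this example; the `flatten` and `expand` helpers are hypothetical, and the actual ChaosSearch behavior is defined in the JSON file processing documentation.

```python
def flatten(obj, parent=""):
    """Flatten nested dicts into dotted keys; lists are left unchanged."""
    flat = {}
    for key, value in obj.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def expand(record, array_key):
    """Return one row per element of record[array_key]."""
    rest = {k: v for k, v in record.items() if k != array_key}
    rows = []
    for item in record.get(array_key, []):
        row = dict(rest)
        if isinstance(item, dict):
            # Object elements become dotted columns under the array name.
            row.update(flatten(item, array_key))
        else:
            row[array_key] = item
        rows.append(row)
    return rows


# Hypothetical source document.
doc = {"user": {"id": 7}, "tags": ["a", "b"]}

flat_doc = flatten(doc)
rows = expand(flat_doc, "tags")
```

Expansion multiplies the row count by the array length, so it is worth considering how large the arrays in the source files can get.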
Object groups have a number of options and powerful filtering and data transformation controls. As a summary of the creation process:
- Select the bucket and start the create process.
- Select the file(s) to include in the object group, and specify the content options to complete the content preview.
- Finalize any needed schema transformations and content filters.
- Click Create Object Group to add the group.
- Type a name for the object group.
- Select the indexing options (static versus live indexing) and the index interval (Daily is currently the only supported value), specify how long to retain the index files that ChaosSearch creates, and then click Create. For more information about Live Indexing support, see Indexing Your Data.
The Schema Filter window lets you override the data type of one or more columns in the data file. You can create virtual data transformations of the source content, for example to change strings to IP addresses, or to change numeric values from integers to strings when the values represent IDs. You can also create and input a JSON definition file to include or exclude specific columns for the index.
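The effect of type overrides and a column exclusion list can be pictured with the following sketch. The column names, the override map, and the `apply_schema` helper are hypothetical and exist only for this example; the sketch does not show the format of the JSON definition file.

```python
import ipaddress

# Hypothetical override map: column name -> converter function.
TYPE_OVERRIDES = {
    "client_ip": ipaddress.ip_address,  # string -> IP address
    "account_id": str,                  # integer ID -> string
}

# Hypothetical list of columns to leave out of the index.
EXCLUDED_COLUMNS = {"debug_blob"}


def apply_schema(row, overrides=TYPE_OVERRIDES, excluded=EXCLUDED_COLUMNS):
    """Drop excluded columns and convert overridden columns."""
    shaped = {}
    for column, value in row.items():
        if column in excluded:
            continue
        convert = overrides.get(column)
        shaped[column] = convert(value) if convert else value
    return shaped


row = {
    "client_ip": "198.51.100.4",
    "account_id": 10023,
    "bytes": 512,
    "debug_blob": "verbose payload",
}
shaped = apply_schema(row)
```

Treating IDs as strings avoids numeric operations such as sums or averages on values that only serve as labels.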
The advanced filtering options allow you to create more complex data file filters for your index based on additional object metadata.
You can restrict data file inclusions by values such as:
- File last modification date
- File size in bytes within the specified range
- Storage class (cloud storage types)
- Partition key values
You can also use object tags and metadata to filter the files. Select one or more filtering options and click Submit.
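The metadata criteria above combine as a simple conjunction: an object is included only if it satisfies every condition that is set. The sketch below models that logic with a hypothetical `ObjectInfo` record and `matches` helper; the names and sample values are assumptions for illustration, not a ChaosSearch API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ObjectInfo:
    """Hypothetical subset of object metadata used for filtering."""
    key: str
    last_modified: datetime
    size_bytes: int
    storage_class: str


def matches(obj, modified_after=None, min_size=None, max_size=None,
            storage_classes=None):
    """Return True if the object passes every filter that is set."""
    if modified_after is not None and obj.last_modified < modified_after:
        return False
    if min_size is not None and obj.size_bytes < min_size:
        return False
    if max_size is not None and obj.size_bytes > max_size:
        return False
    if storage_classes is not None and obj.storage_class not in storage_classes:
        return False
    return True


utc = timezone.utc
objects = [
    ObjectInfo("logs/a.log.gz", datetime(2023, 1, 5, tzinfo=utc), 2048, "STANDARD"),
    ObjectInfo("logs/b.log.gz", datetime(2022, 6, 1, tzinfo=utc), 4096, "STANDARD"),
    ObjectInfo("logs/c.log.gz", datetime(2023, 2, 1, tzinfo=utc), 10, "GLACIER"),
]

selected = [
    obj.key
    for obj in objects
    if matches(
        obj,
        modified_after=datetime(2023, 1, 1, tzinfo=utc),
        min_size=1024,
        storage_classes={"STANDARD"},
    )
]
```

Filters that are left unset are skipped, so adding a criterion can only narrow the selection.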