Object groups are virtual filters. They use file name prefixes and similar rules to group data files that share a common content format so that the files can be indexed with the same rules.
Most sites have a few object groups that categorize different types of source data files. As a best practice, work with ChaosSearch team members to plan the object groups that organize your data sources.
To create an object group:
- Log in to the ChaosSearch console and click Storage.
- Click Create Object Group in the top right corner.
The Object Group Preview window appears.
- In the left list, select the bucket for which you want to add the object group. The Object Group Preview pane shows the files and folders at the top level of the bucket.
The following sections describe the options for selecting the bucket objects to include in the object group and the filtering controls that help you to tune the indexed contents for analytics.
The Object Group Preview is where you select and organize the files to include in your virtual grouping. You can specify the files to include by using the following methods:
- Prefix – Type a prefix string to filter the bucket file list and show only those file names that begin with the specified prefix.
- Regex Filter – Type a regular expression that selects the matching file names to include.
Regular Expression Editing
Selecting a file in the preview list populates the regular expression for that file name. This is a convenient way to pre-populate a regex that you can then refine. Click the pencil icon to open a regex editor, where you can modify the expression and see its effect on the file selection. Click the X icon to clear the prefix and regex settings if needed.
- Object Filter – Click Object Filter to display a window of additional options such as filters by file modification date, file size, cloud storage class, partitions, or custom object tags/metadata values.
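As a rough illustration of how a prefix and a regex filter combine, the following Python sketch narrows a list of object names. The `select_objects` helper and the sample file names are hypothetical, and the sketch assumes that the regex is searched against the full object name; the matching semantics in the ChaosSearch console may differ.

```python
import re

def select_objects(keys, prefix="", regex=None):
    """Return the keys that start with `prefix` and, if given, match `regex`."""
    pattern = re.compile(regex) if regex else None
    selected = []
    for key in keys:
        # Prefix filter: keep only names that begin with the prefix.
        if not key.startswith(prefix):
            continue
        # Regex filter (assumption: searched against the whole name).
        if pattern and not pattern.search(key):
            continue
        selected.append(key)
    return selected

# Hypothetical bucket listing, for illustration only.
keys = [
    "logs/app/2023-01-01.log.gz",
    "logs/app/2023-01-02.log.gz",
    "logs/elb/2023-01-01.csv",
    "exports/users.json",
]

app_logs = select_objects(keys, prefix="logs/", regex=r"\.log\.gz$")
```

Trying a pattern against a small sample of names like this can help confirm that the object group includes only files with the same content format.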
After you select the files that you want to include in the object group, click Next. The Content Preview window appears. The following window shows a sample CSV format type.
The content preview summarizes the format of the selected file(s) (such as log, JSON, CSV, or unknown) and the compression types (such as none, GZIP, or snappy). CSV, JSON, and LOG files display options to help with their index processing.
The Schema Filter window provides options for overriding the auto-detected data type of one or more fields within the source files. ChaosSearch includes routines that scan the matching storage files and auto-detect data types for numbers, strings, time values, and periods. Administrators can refine or lock in the data type for one or more fields using Field Type Overrides. These overrides enable virtual data transformations of the source content. Additional controls on the schema transformation page let you include or exclude a list of specific fields for the index.
In the sample CSV window above, note that CSV format files have options for the field delimiter (comma, tab, space, or other) and for specifying whether the CSV file includes a header row with the field names.
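To see why the delimiter and header-row settings matter, here is a minimal Python sketch that reads the same text with and without a header row. The `read_csv_records` helper and the generated `column_N` names are assumptions made for this example; they are not ChaosSearch behavior.

```python
import csv
import io

def read_csv_records(text, delimiter=",", has_header=True):
    """Parse delimited text into a list of dicts keyed by field name."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if has_header:
        # The first row supplies the field names.
        header, data = rows[0], rows[1:]
    else:
        # Hypothetical placeholder names when there is no header row.
        header = [f"column_{i}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]

# Tab-delimited sample with a header row (illustrative data).
sample = "host\tstatus\nweb-1\t200\nweb-2\t404\n"
records = read_csv_records(sample, delimiter="\t", has_header=True)
no_header = read_csv_records(sample, delimiter="\t", has_header=False)
```

With the header setting off, the first row is treated as data, so the field names would be indexed as if they were values.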
Review the file schema format to make sure that your object group file inclusion rules are correct.
ChaosSearch analyzes the file contents for patterns and generates a format recommendation.
Compressed Data Files Have a Content Preview
ChaosSearch can provide a source file content preview even if the files are compressed. This allows you to stay on the page while you construct and refine regular expressions.
ChaosSearch supports multiple format types that each have their own formatting options:
If Format is unknown, select LOG and type a regular expression to parse the file contents for indexing.
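As a sketch of what a parsing regex for a LOG format can look like, the following Python example splits a line into named fields. The line layout, the field names, and the use of Python named groups are all assumptions for illustration; check which regex syntax the ChaosSearch LOG format field accepts before reusing a pattern.

```python
import re

# Hypothetical layout: "<timestamp> <LEVEL> <message>".
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$"
)

def parse_line(line):
    """Return a dict of captured fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None

parsed = parse_line("2023-01-01T12:00:00Z ERROR disk quota exceeded")
unmatched = parse_line("not a log line")
```

Each capture group corresponds to one candidate field, and a line that does not match the pattern yields no fields.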
ChaosSearch supports multiple compression types:
A sample window for the CSV format is shown earlier on this page. It has compression and delimiter fields, as well as a setting that specifies whether the first row in the file contains column header values.
A sample window for a LOG format follows. The window contains fields for the compression detected in the files, and a regular expression field to parse the information contained in the log file.
For a JSON file, the Content Preview window has options for indexing JSON's nested arrays including expansion options and depth levels. Review JSON Flex Processing for more information on the ChaosSearch JSON indexing features.
Object groups have a number of options and powerful filtering and field specification controls. In summary, the creation process is:
- Select the bucket and start the create process.
- Select the file(s) to include in the object group, and specify the content options to complete the content preview.
- Finalize any needed field and indexing filters.
- Click Create Object Group to add the group.
- Type a name for the object group.
- Select indexing options (static versus live indexing) and how long to retain the daily interval files that ChaosSearch creates, then click Create.
The Schema Filter window provides options for overriding the data type of one or more fields in the index. You can assign specific data types to fields, for example to change strings to IP addresses, or to change certain numeric values from integers to strings when the values represent string IDs. You can also create and supply a JSON definition file for much more granular field processing of the resulting indexed data.
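The effect of a field type override can be pictured with a short Python sketch. The `OVERRIDES` map, the type names `ip` and `string`, and the `apply_overrides` helper are hypothetical; they do not reflect the format of the ChaosSearch JSON definition file, which is not described here.

```python
import ipaddress

# Hypothetical override map: field name -> target type.
OVERRIDES = {"client_ip": "ip", "account_id": "string"}

# Conversion functions for each hypothetical target type.
CASTS = {
    "ip": ipaddress.ip_address,  # string -> IPv4Address or IPv6Address
    "string": str,               # for example, a numeric ID kept as text
}

def apply_overrides(record, overrides=OVERRIDES):
    """Return a copy of `record` with the overridden fields converted."""
    result = dict(record)
    for field, target in overrides.items():
        if field in result:
            result[field] = CASTS[target](result[field])
    return result

converted = apply_overrides(
    {"client_ip": "10.0.0.7", "account_id": 42, "bytes": 512}
)
```

Fields without an override, such as `bytes` in this example, keep their detected type.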
The advanced filtering options allow you to create more complex data file filters for your index based on additional object metadata.
You can restrict data file inclusions by values such as:
- File last modification date
- File size in bytes within the specified range
- Storage class (cloud storage types)
- Partition key values
You can also use object tags and metadata to filter the files. Select one or more filtering options and click Submit.
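The metadata filters above can be thought of as a set of conditions that each object must satisfy. The following Python sketch shows that idea; the `ObjectInfo` class, the `passes_filter` helper, and the storage class names are assumptions for illustration and are not part of any ChaosSearch API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ObjectInfo:
    key: str
    last_modified: datetime
    size: int           # size in bytes
    storage_class: str  # illustrative cloud storage class name

def passes_filter(obj, modified_after=None, min_size=0,
                  max_size=None, storage_classes=None):
    """Return True if the object satisfies every filter that is set."""
    if modified_after is not None and obj.last_modified < modified_after:
        return False
    if obj.size < min_size:
        return False
    if max_size is not None and obj.size > max_size:
        return False
    if storage_classes is not None and obj.storage_class not in storage_classes:
        return False
    return True

# Hypothetical objects, for illustration only.
objects = [
    ObjectInfo("logs/a.log", datetime(2023, 1, 5, tzinfo=timezone.utc), 2_048, "STANDARD"),
    ObjectInfo("logs/b.log", datetime(2022, 6, 1, tzinfo=timezone.utc), 4_096, "STANDARD"),
    ObjectInfo("logs/c.log", datetime(2023, 2, 1, tzinfo=timezone.utc), 10, "GLACIER"),
]

cutoff = datetime(2023, 1, 1, tzinfo=timezone.utc)
kept = [
    o.key for o in objects
    if passes_filter(o, modified_after=cutoff, min_size=1_024,
                     storage_classes={"STANDARD"})
]
```

In this sketch the filters are combined with a logical AND, so an object is kept only if it passes every condition that is set.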