Recommendations for Business Analytics and Complex SQL

Suggested steps for best SQL query performance

When you start running SQL queries against ChaosSearch indexed data, begin with simpler queries to see how quickly you can obtain results. Review those results to learn what information your log and event files provide, then plan more advanced queries to draw insights from your data.

Be sure to run initial test queries filtered for a typical analysis range, such as a few hours or a day for time-series data, or for a specific key attribute if the data is not time-based. For example, the following queries show data for a specific day, or for a specific hour of a day:

SELECT DISTINCT sc_status
FROM "abc-cloudfront-xlarge-view"
WHERE timestamp >= TIMESTAMP '2024-01-20' AND timestamp < TIMESTAMP '2024-01-21'
LIMIT 10

SELECT DISTINCT sc_status
FROM "abc-cloudfront-xlarge-view"
WHERE timestamp >= TIMESTAMP '2024-01-23T14:00:00Z' AND timestamp < TIMESTAMP '2024-01-23T15:00:00Z'
LIMIT 10
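
For data that is not time-based, the same approach applies with an equality filter on a key attribute instead of a time range. The following is a minimal sketch only: it assumes the view exposes the standard CloudFront log fields `cs_method` and `cs_uri_stem`, which this page does not confirm, so substitute a key column from your own view.

```sql
-- Assumption: cs_method and cs_uri_stem are columns in this view;
-- replace them with a key attribute from your own indexed data.
SELECT DISTINCT cs_uri_stem
FROM "abc-cloudfront-xlarge-view"
WHERE cs_method = 'POST'
LIMIT 10
```

As with the time-range examples, the LIMIT clause keeps the initial test small while you confirm that the filter returns the records you expect.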

ChaosSearch indexed data offers several advantages:

  • ChaosSearch indexed data is much more compact and optimized for all types of searches.

  • The indexed data can benefit from valuable schema-on-write field transformations to manage key columns.

  • Indexed data records can be pared down to the key analysis columns, because the include/exclude features of the object group schema policy controls let users omit unnecessary data from the generated indexed data.

The indexed data can help you avoid the SQL workarounds needed in other environments and focus on well-tuned, performant queries that return records and results to your SQL applications much more quickly.

See Unsupported SQL Syntax for information about SQL syntax and tokens that are not supported.