Query Performance Considerations

Review this topic for guidance on the factors that contribute to query performance.

There are many factors that can influence overall query performance, such as:

  • Total size of the data set being searched
  • Length of the query window
  • Specificity or scoping of the request
  • Number of requests in a single query
  • Active concurrency and competition for worker resources
  • Total workers available

In most cases, the QueryStats.context.query field provides most of the context needed to evaluate query behavior. A basic query value resembles the following sample. From this block, you can see what was queried (the target indices and the filters), the length of the query window (the range filter on Records.eventTime), and how many aggregations were requested.

{
  "pauseState": null,
  "queryAuthority": {
    "_type": "chaossumo.query.QueryAuthority.SubAccount",
    "_workerLimit": 1,
    "context": {},
    "indexAuthority": [
      {
        "effect": true,
        "glob": "*"
      }
    ],
    "principal": {
      "name": "[email protected]"
    }
  },
  "queryId": "795b0a33-b3cf-4968-a88d-fedd55d59c2f",
  "requests": [
    {
      "body": {
        "aggregations": {
          "2": {
            "date_histogram": {
              "extended_bounds": null,
              "field": "Records.eventTime",
              "interval": "3h"
            },
            "meta": null
          }
        },
        "query": {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "match_phrase": {
                        "Records.eventName": {
                          "query": "CreateBucket"
                        }
                      }
                    }
                  ]
                }
              },
              {
                "range": {
                  "Records.eventTime": {
                    "gte": "1634941475909",
                    "lte": "1635805475909"
                  }
                }
              }
            ],
            "must": [],
            "must_not": [],
            "should": []
          }
        },
        "sort": [
          {
            "Records.eventTime": "desc"
          }
        ]
      },
      "params": {
        "size": 500
      },
      "target": {
        "indices": "cloudtrail-logging-view"
      }
    }
  ]
}
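As a rough illustration of reading this value, the following Python sketch summarizes each request in a query block: the target indices, the number of aggregations, and the query window in days. It assumes the range bounds are epoch-millisecond strings, as they appear in the sample above. The function name summarize_query and the trimmed sample are illustrative only and are not part of any ChaosSearch API.

```python
import json

MS_PER_DAY = 86_400_000

def summarize_query(query):
    """Summarize each request in a QueryStats.context.query value (illustrative helper)."""
    summaries = []
    for request in query.get("requests", []):
        body = request.get("body", {})
        window_ms = None
        # Look for a range filter with both bounds to work out the query window.
        for clause in body.get("query", {}).get("bool", {}).get("filter", []):
            for bounds in clause.get("range", {}).values():
                if "gte" in bounds and "lte" in bounds:
                    # Assumption: bounds are epoch milliseconds, as in the sample.
                    window_ms = int(bounds["lte"]) - int(bounds["gte"])
        summaries.append({
            "indices": request.get("target", {}).get("indices"),
            "aggregations": len(body.get("aggregations") or {}),
            "window_days": None if window_ms is None else window_ms / MS_PER_DAY,
        })
    return summaries

# Trimmed copy of the sample query value shown above.
sample = json.loads("""
{
  "requests": [
    {
      "body": {
        "aggregations": {"2": {"date_histogram": {"field": "Records.eventTime", "interval": "3h"}}},
        "query": {"bool": {"filter": [
          {"range": {"Records.eventTime": {"gte": "1634941475909", "lte": "1635805475909"}}}
        ]}}
      },
      "target": {"indices": "cloudtrail-logging-view"}
    }
  ]
}
""")

print(summarize_query(sample))
```

For the sample, the two range bounds are 864,000,000 ms apart, so the query window is 10 days with a single date_histogram aggregation.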

The QueryStats.context.segment_stats.totalChunks field often correlates with the total query time in QueryStats.timing.query_total_ms. The field indicates the number of segments read from object storage and analyzed to service a query. A large totalChunks value can indicate a very wide query window or a very loose search constraint, and the query might benefit from a more specific approach. However, the chunk count does not map to a specific duration, because the number and complexity of aggregations and the specificity of the search parameters on a given set of data also influence overall query time.
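One way to put these two fields side by side is to compute the time spent per chunk for each query. The Python sketch below assumes each QueryStats record is available as a dictionary nested in the same way as the field paths named above. The helper name ms_per_chunk and the numbers in the two records are invented for illustration and are not real measurements.

```python
def ms_per_chunk(stats):
    """Return query_total_ms divided by totalChunks, or None when no chunks were read."""
    chunks = stats["context"]["segment_stats"]["totalChunks"]
    total_ms = stats["timing"]["query_total_ms"]
    return total_ms / chunks if chunks else None

# Hypothetical records: the same chunk count, but different total times.
simple_search = {
    "context": {"segment_stats": {"totalChunks": 1200}},
    "timing": {"query_total_ms": 6000},
}
heavy_aggregation = {
    "context": {"segment_stats": {"totalChunks": 1200}},
    "timing": {"query_total_ms": 18000},
}

for name, record in [("simple_search", simple_search),
                     ("heavy_aggregation", heavy_aggregation)]:
    print(name, ms_per_chunk(record))
```

A large difference in time per chunk between two queries that read a similar number of chunks suggests that aggregations or search specificity, rather than data volume, account for the difference.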

When concurrent requests are made from different principals, the underlying compute worker resources are automatically balanced to service all of the requests. Reviewing the QueryStats.timestamp and QueryStats.timing.query_total_ms fields together can help you identify when a query might have taken longer because workers were shared across multiple requests.
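A minimal sketch of that review is to treat each query as a time interval and look for intervals that overlap. The Python example below rests on assumptions the text does not confirm: that QueryStats.timestamp is an epoch-millisecond value marking the start of the query, and that each record carries an identifier under a hypothetical "id" key. Adjust both to match your actual records. The sample values are invented.

```python
def overlapping_pairs(records):
    """Return pairs of query IDs whose [start, end) intervals overlap."""
    # Assumption: timestamp is the query start in epoch milliseconds.
    intervals = sorted(
        (r["timestamp"], r["timestamp"] + r["timing"]["query_total_ms"], r["id"])
        for r in records
    )
    pairs = []
    for i, (_start_a, end_a, id_a) in enumerate(intervals):
        for start_b, _end_b, id_b in intervals[i + 1:]:
            if start_b >= end_a:
                break  # sorted by start time, so no later interval overlaps either
            pairs.append((id_a, id_b))
    return pairs

# Hypothetical records: "a" and "b" run at the same time, "c" runs alone.
records = [
    {"id": "a", "timestamp": 0, "timing": {"query_total_ms": 5000}},
    {"id": "b", "timestamp": 3000, "timing": {"query_total_ms": 4000}},
    {"id": "c", "timestamp": 10000, "timing": {"query_total_ms": 1000}},
]

print(overlapping_pairs(records))
```

Queries that appear in an overlapping pair are candidates for having shared workers. An overlap alone does not prove contention, so compare the timing against runs of the same query that did not overlap with others.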

If a regular query changes in behavior, whether through an unexpected failure or performance that differs from what you have seen before, the ChaosSearch team can help you diagnose the cause of the change. Provide the QueryStats.context.query_id value to your Customer Success partner for assistance with troubleshooting the unexpected change in behavior.