OpenSearch Python Client

The OpenSearch Python client can be used to query ChaosSearch indexed data programmatically.

ChaosSearch supports programmatic search of Chaos Index data through OpenSearch clients such as the low-level Python client (opensearch-py).

Client Methods and Parameters

ChaosSearch supports the use of the multi-search (_msearch) and search (_search) OpenSearch API operations to run queries against the Chaos indexed data. The _msearch method is generally recommended for search API calls.
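As a rough illustration of what an _msearch request looks like, the sketch below pairs each query with a header line that names the view to search, then serializes the result as newline-delimited JSON. The helper names (build_msearch_body, to_ndjson), the view name, and the sample queries are hypothetical and only show the shape of the request; when you pass a list of dictionaries to the Python client's msearch method, the client handles the serialization for you.

```python
import json


def build_msearch_body(view, queries):
    """Pair each query with a header line naming the view to search."""
    body = []
    for query in queries:
        body.append({"index": view})  # header line: which index/view to query
        body.append(query)            # body line: the query itself
    return body


def to_ndjson(lines):
    """Serialize as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(line) + "\n" for line in lines)


# Hypothetical view name and queries, for illustration only.
queries = [
    {"size": 20, "query": {"match_all": {}}},
    {"size": 0, "query": {"match_phrase": {"domain": "example.com"}}},
]
body = build_msearch_body("my-view", queries)
print(to_ndjson(body))
```

Each query contributes two lines to the request, so two queries produce four lines in total.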

Other OpenSearch API methods related to index administration and document management are not supported for use with ChaosSearch.

The supported query DSL syntax is documented in Elasticsearch API Support. Within the request, ChaosSearch supports the following parameters:

  • params.size
  • body.query
  • body.aggregations
  • body.sort
  • headers -- to specify indices/views

Other parameters are not supported. If you are interested in other search parameters, contact ChaosSearch Customer Success for information about programmatic support for those search operations or behaviors.
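One way to catch unsupported parameters before sending a request is a small client-side check, sketched below. The SUPPORTED_BODY_KEYS set and the unsupported_keys helper are hypothetical: the set is drawn from the list above, plus the "aggs" and "size" body keys as they appear in the examples on this page, and it is not an authoritative list maintained by ChaosSearch.

```python
# Assumption: keys taken from the supported-parameter list on this page,
# plus "aggs" and "size" as used in the page's own examples.
SUPPORTED_BODY_KEYS = {"query", "aggregations", "aggs", "sort", "size"}


def unsupported_keys(search_body):
    """Return the top-level body keys that are not in the supported set."""
    return sorted(set(search_body) - SUPPORTED_BODY_KEYS)


# Hypothetical body that includes a key outside the supported set.
body = {
    "size": 20,
    "query": {"match_all": {}},
    "highlight": {"fields": {"message": {}}},
}
print(unsupported_keys(body))  # ['highlight']
```

A script could log or reject the request when the returned list is not empty, rather than discovering the problem from the server response.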

URL Parameters and Metadata Options

URL parameters and metadata options are not supported for use with ChaosSearch querying.

Timeouts and Redirects

In your Python scripts, plan for timeouts and for the redirects that ChaosSearch returns as heartbeats for long-running operations. When an _msearch request has been running for two minutes, ChaosSearch replies with a redirect. Following that redirect reconnects you to the query; if two more minutes elapse, the process repeats.

The Python client must be configured to allow queries to run for longer than the 10-second default, and to follow redirects from ChaosSearch. If you observe read timeout and response timeout errors on the client side, add the timeout and response_timeout settings shown below, starting with a value of 120 seconds.

from opensearchpy import OpenSearch, helpers, exceptions, RequestsHttpConnection
import json
from requests_aws4auth import AWS4Auth

# Sign requests with AWS Signature Version 4 using your access key and secret key.
awsauth = AWS4Auth('Access-Key-ID', 'Secret-Access-Key', 'us-east-1', 's3')
os = OpenSearch(
  hosts=[{'host': 'test.chaossearch.io', 'port': 443, 'url_prefix': '/elastic', 'use_ssl': True}],
  http_auth=awsauth,
  connection_class=RequestsHttpConnection,
  timeout=120,  # seconds; raises the client's 10-second default
  response_timeout=120,
  verify_certs=True
)
...
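Even with longer timeouts configured, a script may still see an occasional read timeout. The sketch below shows one possible way to retry a query with exponential backoff. The run_with_retries helper and its parameters are hypothetical and not part of opensearch-py or ChaosSearch; the demo uses the built-in TimeoutError so that it runs without a cluster. With the Python client you would more likely pass retry_on=(exceptions.ConnectionTimeout,) and a callable such as lambda: client.msearch(body=search_body).

```python
import time


def run_with_retries(run_query, retry_on=(TimeoutError,), attempts=3,
                     base_delay=1.0, sleep=time.sleep):
    """Call run_query(), retrying on the given exceptions with backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a simulated query that times out twice, then succeeds.
calls = {"count": 0}
delays = []


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated read timeout")
    return {"responses": []}


# delays.append stands in for time.sleep so the demo does not actually wait.
result = run_with_retries(flaky_query, sleep=delays.append)
print(calls["count"], delays)
```

Keeping the retry count small avoids resubmitting an expensive query many times; whether a retry is appropriate at all depends on the query and on how your deployment handles repeated requests.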

Search Example

from opensearchpy import OpenSearch, helpers, exceptions, RequestsHttpConnection
import json
from requests_aws4auth import AWS4Auth


access_key = '<access-key>'
secret_key = '<secret-key>'
region = 'us-east-1'

awsauth = AWS4Auth(access_key, secret_key, region, 's3')
os = OpenSearch(
  hosts = [{'host': '<company>.chaossearch.io', 'port': 443, 'url_prefix': '/elastic', 'use_ssl': True}],
  http_auth=awsauth,
  connection_class=RequestsHttpConnection,
  verify_certs=True
)

client = os

try:
    # info() is a lightweight call that confirms the endpoint is reachable.
    client_info = client.info()
except exceptions.ConnectionError as err:
    print('OpenSearch client error:', err)
    client = None

if client is not None:
    search_body = {
        "size": 500,
        "aggs": {
            "2": {
                "date_histogram": {
                    "field": "@timestamp",
                    "fixed_interval": "30s",
                    "time_zone": "America/Los_Angeles",
                    "min_doc_count": 1
                }
            }
        },
        "query": {
            "bool": {
                "must": [],
                "filter": [
                    {
                        "match_all": {}
                    },
                    {
                        "match_phrase": {
                            "domain": "<company.com>"
                        }
                    },
                    {
                        "range": {
                            "@timestamp": {
                                "gte": "2024-04-23T16:19:41.333Z",
                                "lte": "2024-04-23T16:34:41.333Z",
                                "format": "strict_date_optional_time"
                            }
                        }
                    }
                ],
                "should": [],
                "must_not": []
            }
        }
    }

    # Run the search only when the client connected successfully.
    resp = client.search(
        index='<my-view-name>',  # <---- SPECIFY ChaosSearch Refinery View HERE
        body=search_body,
        scroll='30s',
        size=2000,
    )

    hits = resp['hits']['hits']
    for num, hit in enumerate(hits):
        print('\n', num, '', hit)
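The search example requests a date_histogram aggregation named "2" but prints only the hits. As a minimal sketch, assuming the response follows the standard OpenSearch aggregation layout (aggregations, then the aggregation name, then buckets), the histogram counts could be read as shown below. The histogram_buckets helper and the sample response are hypothetical, made-up data for illustration.

```python
def histogram_buckets(resp, agg_name="2"):
    """Return (timestamp, doc_count) pairs for a date_histogram aggregation."""
    buckets = resp.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    # Prefer the formatted timestamp; fall back to the epoch-millisecond key.
    return [(b.get("key_as_string", b.get("key")), b["doc_count"])
            for b in buckets]


# Made-up response fragment in the standard aggregation layout.
sample = {
    "hits": {"hits": []},
    "aggregations": {
        "2": {
            "buckets": [
                {"key_as_string": "2024-04-23T09:20:00.000-07:00",
                 "key": 1713889200000, "doc_count": 12},
                {"key_as_string": "2024-04-23T09:20:30.000-07:00",
                 "key": 1713889230000, "doc_count": 7},
            ]
        }
    },
}

for timestamp, count in histogram_buckets(sample):
    print(timestamp, count)
```

The helper returns an empty list when the response has no aggregations, so it can be called on any search response without extra checks.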

Multi-Search Example

from opensearchpy import OpenSearch, helpers, exceptions, RequestsHttpConnection
import json
from requests_aws4auth import AWS4Auth


access_key = '<Access-Key>'
secret_key = '<Secret-Key>'
region = 'us-east-1'

awsauth = AWS4Auth(access_key, secret_key, region, 's3')
os = OpenSearch(
  hosts=[{'host': '<mycompany>.chaossearch.io', 'port': 443, 'url_prefix': '/elastic', 'use_ssl': True}],
  http_auth=awsauth,
  connection_class=RequestsHttpConnection,
  verify_certs=True
)

client = os

try:
    # info() is a lightweight call that confirms the endpoint is reachable.
    client_info = client.info()
    print('OpenSearch client info:', json.dumps(client_info, indent=4))
except exceptions.ConnectionError as err:
    print('OpenSearch client error:', err)
    client = None

if client is not None:
    search_body = [
        {"index": "<my-view-name>"},
        {
            "size": 20,
            "aggs": {
                "2": {
                    "date_histogram": {
                        "field": "@timestamp",
                        "fixed_interval": "30s",
                        "time_zone": "America/Los_Angeles",
                        "min_doc_count": 1
                    }
                }
            },
            "query": {
                "bool": {
                    "must": [],
                    "filter": [
                        {
                            "match_all": {}
                        },
                        {
                            "match_phrase": {
                                "domain": "<my-search-domain>"
                            }
                        },
                        {
                            "range": {
                                "@timestamp": {
                                    "gte": "2024-04-30T16:42:00.680Z",
                                    "lte": "2024-04-30T16:57:00.680Z",
                                    "format": "strict_date_optional_time"
                                }
                            }
                        }
                    ],
                    "should": [],
                    "must_not": []
                }
            }
        }
    ]

    # Run the multi-search only when the client connected successfully.
    resp = client.msearch(
        body=search_body
    )

    responses = resp['responses']
    for response in responses:
        print(response)
        hits = response['hits']['hits']
        for num, hit in enumerate(hits):
            print('\n', num, '', hit)
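In the standard OpenSearch _msearch response, each entry in responses corresponds to one query, and a failed query carries an error object instead of hits. The loop above would raise a KeyError on such an entry. The sketch below separates the two cases; the split_responses helper and the sample data are hypothetical, and the exact error shape that ChaosSearch returns is an assumption here.

```python
def split_responses(msearch_resp):
    """Separate successful and failed entries of an _msearch response."""
    ok, failed = [], []
    for position, item in enumerate(msearch_resp.get("responses", [])):
        if "error" in item:
            failed.append((position, item["error"]))
        else:
            ok.append((position, item["hits"]["hits"]))
    return ok, failed


# Made-up response: the first query succeeded, the second failed.
sample = {
    "responses": [
        {"status": 200, "hits": {"hits": [{"_id": "1", "_source": {"domain": "example.com"}}]}},
        {"status": 404, "error": {"type": "index_not_found_exception", "reason": "no such index"}},
    ]
}

ok, failed = split_responses(sample)
for position, hits in ok:
    print("query", position, "returned", len(hits), "hits")
for position, error in failed:
    print("query", position, "failed:", error.get("reason"))
```

Tracking the position of each entry makes it possible to tell which query in the original request body failed.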