Minimize Cost and Log Shipping Delays

With ChaosSearch, business analytics become an easy-add capability for your existing data lake.

ChaosSearch offers a new approach to indexing and searching data at scale. The differences begin with the log shipping and preparation work. Other solutions require time and money to prepare and ship content for analysis, but ChaosSearch bypasses a lot of that complexity and expense.

Traditional Log Shipping

For businesses that want to make their log sources queryable, the typical process is to funnel logs from different sources and services into a log shipper service. The shipper correlates, aggregates, and normalizes the log content into a common schema, then saves the transformed files to a target cloud object store such as Amazon S3.
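The normalization step described above can be sketched as a small function that maps differently shaped log records onto one shared schema. The source names and field mappings below are illustrative assumptions; real shippers such as Fluentd or Logstash make these mappings configurable per source.

```python
import json

def normalize_record(source: str, record: dict) -> dict:
    """Map a source-specific log record onto a common schema.

    The field mappings are illustrative; a production shipper would
    load these from per-source configuration.
    """
    field_map = {
        # source name -> (timestamp field, message field, level field)
        "nginx": ("time_local", "request", "severity"),
        "app": ("ts", "msg", "level"),
    }
    ts_field, msg_field, level_field = field_map[source]
    return {
        "timestamp": record.get(ts_field),
        "message": record.get(msg_field),
        "level": (record.get(level_field) or "INFO").upper(),
        "source": source,
    }

# Two differently shaped records collapse into the same schema,
# ready to be serialized and written to the object store:
nginx_event = {"time_local": "2024-05-01T12:00:00Z",
               "request": "GET /health", "severity": "info"}
app_event = {"ts": "2024-05-01T12:00:01Z",
             "msg": "cache miss", "level": "warn"}
lines = [json.dumps(normalize_record("nginx", nginx_event)),
         json.dumps(normalize_record("app", app_event))]
```

Every downstream query then runs against one predictable set of field names, which is precisely the schema work (and rework) that this preprocessing pipeline demands.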


In Google Cloud Platform (GCP) environments, the process is similar: data travels from source apps through log shippers into a Google Cloud Storage bucket (rather than an S3 bucket).

The log and data transfer can be very processing-intensive, especially for customers with log files at scale. The processing time and business cycles required to move, prepare, and store massive volumes of log files can be prohibitive, and the need for data architects to plan, define, and refine the schemas that enable querying adds time and cost during project planning.

If changes or new data schemas are needed later for a different analysis, the process starts over, duplicating the time and cost for the data architects, analysts, and IT teams, as well as the cloud storage costs, especially if all the raw data variations are kept for analysis.

Once the prepared data is stored in your AWS or GCP cloud storage bucket, its journey might not be over. Many search and querying applications require you to move or replicate your cloud data yet again, with costly transfers to import or ingest it into the remote databases or repositories those applications require.

The ChaosSearch Approach

The ChaosSearch solution and its patented Chaos Index® are designed to work with your cloud storage data where it already lives, creating a hot analytics data lake. ChaosSearch removes the need for content movement and outside ETL schema preprocessing before you can query your object store content, saving time and cost. ChaosSearch connects seamlessly to your cloud storage to scan and index your files as they exist in raw storage, and it offers tools to manage common schema changes to tune the indexed data for analytics.

Using services like Amazon SQS (or Google Pub/Sub) for notifications when new content is stored in the raw buckets, ChaosSearch can automatically watch for and index new content to augment the indexed data that is processed and stored inside the customer-owned Chaos Index bucket. With the compact indexes, SaaS convenience, and automatic indexing of new log and data files through cloud notifications, the up-front work is minimized and the time-to-value maximized for log data analytics at scale.
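The notification-driven flow above relies on the standard S3 event notification format, in which S3 publishes a JSON message to SQS for each object-level event. A minimal sketch of consuming such a message follows; the `index_object` hook is a hypothetical placeholder for whatever indexing step the consumer performs.

```python
import json
from urllib.parse import unquote_plus

def new_objects_from_s3_event(message_body: str) -> list:
    """Extract (bucket, key) pairs for newly created objects from an
    S3 event notification message (the JSON format S3 publishes to SQS).
    """
    event = json.loads(message_body)
    objects = []
    for record in event.get("Records", []):
        # Only act on object-creation events, not deletes or restores.
        if record.get("eventName", "").startswith("ObjectCreated"):
            bucket = record["s3"]["bucket"]["name"]
            # S3 URL-encodes object keys in notifications
            # (e.g. spaces arrive as '+').
            key = unquote_plus(record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects

# A consumer loop would poll the SQS queue and hand each new object
# to the indexer, e.g.:
#   for bucket, key in new_objects_from_s3_event(sqs_message.body):
#       index_object(bucket, key)   # hypothetical indexing hook
```

This is what lets indexing keep pace with arriving data without any scheduled batch jobs: the raw bucket itself announces new content as it lands.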


Within the ChaosSearch account space, the services that support ChaosSearch operations (indexing, analysis, authentication/access, and related functions) all run as Docker containers on EC2 (or GCE) workers in the same regions as the customer's data. ChaosSearch leverages the benefits of a stateless architecture: any worker can support all typical workload needs, and the worker count can autoscale to cover volume increases and special cases. Workers are also resilient; if one fails, the scheduler redirects any indexing and querying requests to the remaining workers while a replacement worker spins up, all without data loss.

ChaosSearch includes analytics applications, OpenSearch Dashboards (Kibana) and Apache Superset, that let users quickly investigate and visualize their data. The same indexed data and Refinery views can support either onboard application, and can also integrate with other observability and analysis tools, enabling users of different applications to quickly and easily get to the business value within their log and event files.
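Since OpenSearch Dashboards speaks the Elasticsearch query DSL, the kinds of requests a dashboard panel issues against an indexed view can be sketched as ordinary query bodies. The view name and field names below are assumptions for illustration, not ChaosSearch defaults.

```python
def error_count_query(view: str, since: str) -> dict:
    """Build an Elasticsearch-style query body counting recent ERROR
    events grouped by source. The index/view name would go in the
    request URL (e.g. POST /<view>/_search), not in the body.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": "ERROR"}},
                    {"range": {"timestamp": {"gte": since}}},
                ]
            }
        },
        # Bucket matching events by their source field.
        "aggs": {"by_source": {"terms": {"field": "source"}}},
        # We only want the aggregation counts, not the raw hits.
        "size": 0,
    }

body = error_count_query("app-logs-view", "now-1h")
```

Whether such a body is sent by a Kibana visualization, a Superset chart, or a custom client, it runs against the same underlying indexed data and views.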