Organization Reference Example 1

A sample case study for log file and object group planning

Customer Scenario

An AWS customer's log sources are mostly JSON, either natively or after conversion to JSON files in the customer's logging pipelines.

Good Design

A single object group that uses partitioning handles all of the JSON use cases. Because of partitioning, new applications and environments can log to the bucket and be processed as part of this object group without reconfiguring the object group, updating any ChaosSearch configuration, or provisioning anything in AWS. The CSV and custom regex use cases have their own object groups.

📘

Pathname Spacing in Documentation

In the good and poor examples below, note that extra spaces are included around the forward slash (/) characters to highlight the file organization hierarchy within the documentation. Do not add these extra spaces to pathnames in the actual bucket, or to the regular expressions that you specify within the ChaosSearch UIs when creating object group filters or partition expressions.
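As a minimal sketch of the note above, the following Python helper turns a pathname as printed in this documentation into the form it would have in the bucket. The function name `strip_doc_spacing` is hypothetical and is not part of any ChaosSearch or AWS tooling.

```python
import re


def strip_doc_spacing(doc_path: str) -> str:
    """Remove the documentation-only spaces around each forward slash."""
    return re.sub(r"\s*/\s*", "/", doc_path.strip())


# A pathname as it is printed in the examples on this page.
print(strip_doc_spacing("jsonapps / app1 / prod / account-1 / subservice-1 / date / logs.gz"))
# prints: jsonapps/app1/prod/account-1/subservice-1/date/logs.gz
```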

# AWS S3 "central-logging-bucket"

# Object Group 1 - JSON use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "jsonapps".
# Partitioning is configured on the second and third directories to allow narrow query scoping with the following regex: "jsonapps\/(\S+?)\/(\S+?)\/.*"
# Partitioning also allows additional use cases to be enabled without any reconfiguration.
jsonapps / app1 / prod / account-1 / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                       / account-2 / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / dev  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / uat  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
         / app2 / prod / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / dev  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / uat  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
         / app3 / prod / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / dev  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / uat  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
         / app4 / prod / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz
                / dev  / account-n / subservice-1 / date / logs.gz
                                   / subservice-2 / date / logs.gz

# Object Group 2 - CSV use case. Objects are compressed.
# SQS queue is receiving notifications for the prefix "csvapps/gzipped" on the bucket.
# Partitioning is configured on the third and fourth directories to allow narrow query scoping with the following regex: "csvapps\/gzipped\/(\S+?)\/(\S+?)\/.*"
csvapps / gzipped / app5 / prod / account-1 / subservice-1 / date / logs.gz
                                            / subservice-2 / date / logs.gz
                         / dev  / account-1 / subservice-1 / date / logs.gz
                                            / subservice-2 / date / logs.gz
                         
# Object Group 3 - CSV use case. Objects are not compressed. 
# SQS queue is receiving notifications for the prefix "csvapps/unzipped" on the bucket.
# Partitioning is configured on the third and fourth directories to allow narrow query scoping with the following regex: "csvapps\/unzipped\/(\S+?)\/(\S+?)\/.*".
# Note: If possible, it would be better to gzip these logs and fold them into Object Group 2.
csvapps / unzipped / app6 / prod / account-n / date / logs
                          / dev  / account-n / date / logs
                         

# Object Group 4 - Custom regex format use cases. Objects are not compressed.
# SQS queue is receiving notifications for the prefix "regex" on the bucket.
# Partitioning is configured on the second and third directories to allow narrow query scoping with the following regex: "regex\/(\S+?)\/(\S+?)\/.*"
# Note: If possible, it would be better to convert these logs to JSON by applying the regex in the logging pipeline and to include them as another partition in Object Group 1.
regex / app-7 / prod / date / logs
              / dev  / date / logs
              / uat  / date / logs
regex / app-8 / prod / date / logs
              / dev  / date / logs
              / uat  / date / logs
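To illustrate how the partition expressions in the good design pick out the application and environment directories, here is a minimal Python sketch. The four regexes are copied from the object group comments above; the helper `partition_for`, the use of Python's `re.fullmatch`, and the sample key for a new application (`app9`, `staging`) are assumptions made for illustration. The ChaosSearch regex engine may anchor or evaluate expressions differently from Python.

```python
import re
from typing import Optional, Tuple

# Partition regexes copied from the object group comments above.
PARTITION_REGEXES = {
    "Object Group 1": r"jsonapps\/(\S+?)\/(\S+?)\/.*",
    "Object Group 2": r"csvapps\/gzipped\/(\S+?)\/(\S+?)\/.*",
    "Object Group 3": r"csvapps\/unzipped\/(\S+?)\/(\S+?)\/.*",
    "Object Group 4": r"regex\/(\S+?)\/(\S+?)\/.*",
}


def partition_for(key: str) -> Optional[Tuple[str, str, str]]:
    """Return (object group, application, environment) for a key, or None."""
    for group, pattern in PARTITION_REGEXES.items():
        match = re.fullmatch(pattern, key)
        if match:
            return group, match.group(1), match.group(2)
    return None


# Keys are written without the documentation-only spaces around "/".
sample_keys = [
    "jsonapps/app1/prod/account-1/subservice-1/date/logs.gz",
    "csvapps/unzipped/app6/prod/account-n/date/logs",
    "regex/app-7/uat/date/logs",
    # Hypothetical new application and environment: no regex change is needed.
    "jsonapps/app9/staging/account-5/subservice-1/date/logs.gz",
]
for key in sample_keys:
    print(key, "->", partition_for(key))
```

In this sketch the hypothetical `app9` key falls into Object Group 1 with its own partition values, which is the behavior the good design relies on when new applications start logging to the bucket.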

Poor Design

Rather than being organized by log format, the logs are organized by account, and each object group covers one log format within one account. This organization significantly increases the number of bucket notifications relative to the AWS limits, requires new notifications to be configured each time new applications are onboarded, and consumes live indexing resources that could otherwise be available to service user queries. Because the data is organized by account, the Refinery views must span multiple object groups and need to be updated each time a new account is onboarded.

# AWS S3 "central-logging-bucket"  (WIP)

# Object Group 1 - Account-1 JSON use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "account-1/jsonapps".
account-1 / jsonapps / app1 / date / logs.gz
                     / app2 / date / logs.gz
                     / appn / date / logs.gz

# Object Group 2 - Account-1 CSV use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "account-1/csvapps/gzipped".
account-1 / csvapps / gzipped / app5 / date / logs.gz


# Object Group 3 - Account-1 CSV use cases. Objects are not compressed. 
# SQS queue is receiving notifications for the prefix "account-1/csvapps/unzipped".
account-1 / csvapps / unzipped / app6 / date / logs


# Object Group 4 - Account-2 JSON use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "account-2/jsonapps".
account-2 / jsonapps / app1 / date / logs.gz
                     / app2 / date / logs.gz
                     / appn / date / logs.gz

# Object Group 5 - Account-2 CSV use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "account-2/csvapps/gzipped".
account-2 / csvapps / gzipped / app5 / date / logs.gz


# Object Group 6 - Account-2 CSV use cases. Objects are not compressed. 
# SQS queue is receiving notifications for the prefix "account-2/csvapps/unzipped".
account-2 / csvapps / unzipped / app6 / date / logs

# Object Group 7 - Account-3 JSON use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "account-3/jsonapps".
account-3 / jsonapps / app1 / date / logs.gz
                     / app2 / date / logs.gz
                     / appn / date / logs.gz

# Object Group 8 - Account-3 CSV use cases. Objects are compressed. 
# SQS queue is receiving notifications for the prefix "account-3/csvapps/gzipped".
account-3 / csvapps / gzipped / app5 / date / logs.gz


# Object Group 9 - Account-3 CSV use cases. Objects are not compressed. 
# SQS queue is receiving notifications for the prefix "account-3/csvapps/unzipped".
account-3 / csvapps / unzipped / app6 / date / logs
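The difference in notification overhead between the two layouts can be sketched with a short Python calculation. It only counts the notification prefixes listed in the comments on this page; the function names are hypothetical, and the account counts beyond three are illustrative assumptions rather than figures from the scenario.

```python
from typing import List


def good_design_prefixes() -> List[str]:
    """Notification prefixes from the good design: one per object group."""
    return ["jsonapps", "csvapps/gzipped", "csvapps/unzipped", "regex"]


def poor_design_prefixes(accounts: List[str]) -> List[str]:
    """Notification prefixes from the poor design: one per account and format."""
    formats = ["jsonapps", "csvapps/gzipped", "csvapps/unzipped"]
    return [f"{account}/{fmt}" for account in accounts for fmt in formats]


for account_count in (3, 10, 50):
    accounts = [f"account-{n}" for n in range(1, account_count + 1)]
    print(
        f"{account_count} accounts: "
        f"good design = {len(good_design_prefixes())} prefixes, "
        f"poor design = {len(poor_design_prefixes(accounts))} prefixes"
    )
```

With the three accounts shown above, the poor design yields the nine object groups in the listing, and the count keeps growing with every onboarded account, while the good design stays at four prefixes.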
