Indexing AWS CloudTrail JSON Files

Suggestions and good practices for indexing and querying CloudTrail JSON log files

CloudTrail is an AWS service that supports operational and risk auditing, governance, and compliance for an AWS account. Actions taken by a user, role, or AWS service are recorded as events in CloudTrail. For users who create a "trail", CloudTrail delivers those events as JSON log files to an Amazon S3 bucket.

JSON Flex, with its options for indexing, selecting, and Refinery view transforms, is well suited to indexing and searching CloudTrail JSON log files.

About CloudTrail Event Files

AWS CloudTrail log files usually consist of a single top-level Records array that holds one object per event. Each event object can contain nested objects and arrays. The shape, fields, and complexity of the Records structure can vary with the log file setup and with the applications or services captured in the log.

A sample CloudTrail event log follows. The example shows that an IAM user named Alice used the AWS CLI to call the Amazon EC2 StartInstances action by using the ec2-start-instances command for instance i-ebeaf9e2.

{"Records": [{
    "eventVersion": "1.0",
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "EX_PRINCIPAL_ID",
        "arn": "arn:aws:iam::123456789012:user/Alice",
        "accessKeyId": "EXAMPLE_KEY_ID",
        "accountId": "123456789012",
        "userName": "Alice"
    },
    "eventTime": "2014-03-06T21:22:54Z",
    "eventSource": "",
    "eventName": "StartInstances",
    "awsRegion": "us-east-2",
    "sourceIPAddress": "",
    "userAgent": "ec2-api-tools",
    "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-ebeaf9e2"}]}},
    "responseElements": {"instancesSet": {"items": [{
        "instanceId": "i-ebeaf9e2",
        "currentState": {
            "code": 0,
            "name": "pending"
        },
        "previousState": {
            "code": 80,
            "name": "stopped"
        }
    }]}}
}]}
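
To make the nesting in the sample concrete, the following is a minimal Python sketch (standard library only) that reads a trimmed copy of the event and pulls out who did what to which instance. The function name summarize_events and the trimmed SAMPLE_LOG string are illustrative assumptions, not part of CloudTrail or JSON Flex. The sketch also assumes that requestParameters may be null or missing for some events, so it guards against that case.

```python
import json

# Trimmed copy of the sample event; only the fields used below are kept.
SAMPLE_LOG = """
{"Records": [{
    "eventVersion": "1.0",
    "userIdentity": {"type": "IAMUser", "userName": "Alice"},
    "eventTime": "2014-03-06T21:22:54Z",
    "eventName": "StartInstances",
    "awsRegion": "us-east-2",
    "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-ebeaf9e2"}]}}
}]}
"""


def summarize_events(log_text):
    """Return one summary dict per event in the top-level Records array."""
    log = json.loads(log_text)
    summaries = []
    for event in log.get("Records", []):
        # Assumption: requestParameters can be null or absent for some events.
        params = event.get("requestParameters") or {}
        items = params.get("instancesSet", {}).get("items", [])
        summaries.append({
            "user": event.get("userIdentity", {}).get("userName"),
            "action": event.get("eventName"),
            "time": event.get("eventTime"),
            "instances": [item.get("instanceId") for item in items],
        })
    return summaries


for summary in summarize_events(SAMPLE_LOG):
    print(summary)
```

The instance ID sits four levels below the event object (requestParameters, instancesSet, items, then each array element), which is the kind of nesting that the indexing options discussed in this topic need to account for.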


About the CloudTrail Records Contents

For an overview of the record format and fields, see the Amazon Web Services online help for the CloudTrail record contents.

More information on CloudTrail log files and the CloudTrail service is available in the AWS topic Working with CloudTrail log files.

As shown in the sample, the Records array is usually at the top level and contains varying levels of nested arrays and objects. It can be helpful to review the shape and content of the CloudTrail logs in use at your site.
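One way to review the shape of your own logs is to list every field path that appears under Records, along with its value type and how often it occurs. The following is a minimal Python sketch under stated assumptions: the helper names field_paths and shape_report are hypothetical, the second event (ExampleEvent) in the sample string is invented purely to show a differing shape, and in practice the text would come from a log file downloaded from your S3 bucket.

```python
import json
from collections import Counter

# Two small events with different shapes. "ExampleEvent" is a made-up
# event name used only for illustration.
SAMPLE_LOG = """
{"Records": [
    {"eventName": "StartInstances",
     "userIdentity": {"type": "IAMUser", "userName": "Alice"},
     "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-ebeaf9e2"}]}}},
    {"eventName": "ExampleEvent",
     "userIdentity": {"type": "Root"},
     "requestParameters": null}
]}
"""


def field_paths(value, prefix=""):
    """Yield (path, type name) pairs for every leaf under a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        if not value:
            yield prefix + "[]", "empty list"
        for child in value:
            # Array elements share one path, marked with [].
            yield from field_paths(child, prefix + "[]")
    else:
        yield prefix, type(value).__name__


def shape_report(log_text):
    """Count how many times each (path, type) pair occurs across all events."""
    counts = Counter()
    for event in json.loads(log_text).get("Records", []):
        counts.update(field_paths(event))
    return counts


for (path, kind), count in sorted(shape_report(SAMPLE_LOG).items()):
    print(f"{count:>3}  {path}  ({kind})")
```

Paths that appear in only some events, or that appear with more than one type (here, requestParameters is an object in one event and null in the other), are the ones worth checking before you settle on indexing options.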

The following topics describe recommended JSON Flex techniques that can improve indexing, querying, and visualization of CloudTrail logs.