SQL Functions

A summary of built-in functions currently supported for SQL Analytics

This topic provides more information about supported ChaosSearch built-in SQL functions. For a summary of supported functions by category, see SQL Functions Support Reference.

Aggregate Functions

Aggregate functions operate on a set of values to compute a single result.

AVG

Returns the average (arithmetic mean) of all input values.

select avg(cs_bytes),c_ip from "abc-cloudfront-xlarge-view" group by c_ip limit 3

avg(cs_bytes)   c_ip
277             123.125.71.113
249             217.69.143.64
298             217.69.143.66

COUNT

Returns the number of input rows.

SELECT count("Records.eventName") FROM "cloudtrail-xl-view"

count("Records.eventName")
53243


SELECT count(DISTINCT "Records.eventName") FROM "cloudtrail-xl-view"

count(DISTINCT "Records.eventName")
207

πŸ“˜

The COUNT and DISTINCT operations are exact counts. DISTINCT is limited to a maximum of 1 million distinct values.

MAX

Returns the maximum value of all input values.

select max(cs_bytes) from "abc-cloudfront-xlarge-view" 

max(cs_bytes)
8171

MIN

Returns the minimum value of all input values.

select min(cs_bytes) from "abc-cloudfront-xlarge-view" 

min(cs_bytes)
15

SUM

Returns the sum of all input values.

select sum(cs_bytes) from "abc-cloudfront-xlarge-view" 

sum(cs_bytes)
2565315             

Binary Functions

FROM_HEX

Use from_hex to convert/decode binary data from a hexadecimal encoded string.

select from_hex('666f6f626172')

from_hex('666f6f626172')
foobar

TO_HEX

Use to_hex to convert/decode binary data into a hexadecimal string representation.

select c_ip,to_hex(c_ip) as "hex coded IP" from "abc-cloudfront-xlarge-view" limit 3

c_ip           hex coded IP
66.249.65.218  36362e3234392e36352e323138
66.249.65.217  36362e3234392e36352e323137
81.249.170.45  38312e3234392e3137302e3435

Conversion Functions

CAST

Use the cast function to explicitly cast a value as a type. This function casts an extracted value to one of the supported data types such as varchar, double, or timestamp.

πŸ“˜

Implicit Casting in Trino Connectors

Trino implicitly converts numeric and character values to the correct type if a conversion is possible. Trino will not convert between character and numeric types. For example, a query that expects a varchar will not automatically convert a double value to an equivalent varchar.

Casting is very useful with SQL functions such as json_value and regexp_extract where it can cast an extracted string of characters to another type, such as double or timestamp data.

cast(value AS type)

For example, you can use cast to change a timestamp to an epoch value:

SELECT timestamp, cast(timestamp as double) from "sample-elb-view" limit 5;
timestamp                 CAST(timestamp AS double)
2017-11-04 01:46:05.833   1509759965833
2017-11-04 01:46:22.089   1509759982089
2017-11-04 01:46:53.396   1509760013396
2017-11-04 01:47:20.990   1509760040990
2017-11-04 01:48:11.679   1509760091679

As an example with regexp_extract, the following query extracts the trailing digit characters from a clerk_id varchar field, casts the numbers to a double, and adds 1000 to the ID value:

SELECT o_clerk, cast(regexp_extract(o_clerk,'Clerk#00000(\d\d\d\d)') as double) as 
clerk_number, clerk_number+1000 FROM "orders-view"  LIMIT 5;

o_clerk          clerk_number  (clerk_number + 1000)
Clerk#000000951  951           1951
Clerk#000000880  880           1880
Clerk#000000955  955           1955
Clerk#000000124  124           1124
Clerk#000000925  925           1925

Note that if you do not cast the digit characters as in the example above, the query still works because of implicit casting for the extracted values:

SELECT o_clerk, regexp_extract(o_clerk,'Clerk#00000(\d\d\d\d)') as 
clerk_number, clerk_number+1000 FROM "orders-view"  LIMIT 5;

o_clerk          clerk_number  (clerk_number + 1000)
Clerk#000000951  951           1951
Clerk#000000880  880           1880
Clerk#000000955  955           1955
Clerk#000000124  124           1124
Clerk#000000925  925           1925

If the cast function cannot perform the requested cast, the function returns a null value, similar to the Trino try_cast function, as in the following example:

SELECT cast("Records.eventName" as timestamp) from "cloud-records-all" limit 3;

CAST("Records.eventName" AS timestamp)
NULL
NULL
NULL

Date and Time Functions

DATE

The date function is an alias for CAST(x as date).

DATE_ADD

Use date_add(unit, value, timestamp) to add an interval value of type unit to timestamp. You can subtract the interval with a negative value.

select timestamp, date_add('year',1,timestamp) from "sample-elb-view" limit 5;

timestamp                    date_add('year', 1, timestamp)
2017-11-04T01:46:05.833+0000 2018-11-04T01:46:05.833+0000
2017-11-04T01:46:22.089+0000 2018-11-04T01:46:22.089+0000
2017-11-04T01:46:53.396+0000 2018-11-04T01:46:53.396+0000
2017-11-04T01:47:20.990+0000 2018-11-04T01:47:20.990+0000
2017-11-04T01:48:03.565+0000 2018-11-04T01:48:03.565+0000

DATE_FORMAT

Use the date_format(string,format) function to format a timestamp value as a string using the format controls. The format controls are a series of specifiers as in the following table:

SpecifierDescription
%aAbbreviated weekday name (Sun .. Sat)
%bAbbreviated month name (Jan .. Dec)
%cMonth, numeric (1 .. 12)
%dDay of the month, numeric (01 .. 31)
%eDay of the month, numeric (1 .. 31)
%fFraction of second (6 digits for printing: 000000 .. 999000; 1 - 9 digits for parsing: 0 .. 999999999)
Note that timestamp is truncated to milliseconds.
%HHour (00 .. 23)
%hHour (01 .. 12)
%IHour (01 .. 12)
%iMinutes, numeric (00 .. 59)
%jDay of year (001 .. 366)
%kHour (0 .. 23)
%lHour (1 .. 12)
%MMonth name (January .. December)
%mMonth, numeric (01 .. 12)
%pAM or PM
%rTime of day, 12-hour (equivalent to %h:%i:%s %p)
%S or %sSeconds (00 .. 59)
%TTime of day, 24-hour (equivalent to %H:%i:%s)
%vWeek (01 .. 53), where Monday is the first day of the week; used with %x
%WWeekday name (Sunday .. Saturday)
%xYear for the week, where Monday is the first day of the week, numeric, four digits; used with %v
%YYear, numeric, four digits
%yYear, numeric (two digits)
When parsing, two-digit year format assumes range 1970 .. 2069, so β€œ70” will result in year 1970 but β€œ69” will result in 2069.
%%A literal % character
SELECT DATE_FORMAT(timestamp, '%W %M %d %Y') as test from "sample-elb-view" limit 5;
select timestamp, date_add('year',1,timestamp) from "sample-elb-view" limit 5;

timestamp                    date_add('year', 1, timestamp)
2017-11-04T01:46:05.833+0000 2018-11-04T01:46:05.833+0000
2017-11-04T01:46:22.089+0000 2018-11-04T01:46:22.089+0000
2017-11-04T01:46:53.396+0000 2018-11-04T01:46:53.396+0000
2017-11-04T01:47:20.990+0000 2018-11-04T01:47:20.990+0000
2017-11-04T01:48:03.565+0000 2018-11-04T01:48:03.565+0000

DATE_TRUNC

Use the date_trunc(unit,x) function to truncate the value x to the specifed unit level.

select timestamp, date_trunc('second',timestamp) from "sample-elb-view" limit 1;
Unit ValueSample Truncated Value
secondOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T01:46:05.000+0000
minuteOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T01:46:00.000+0000
hourOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T01:00:00.000+0000
dayOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T00:00:00.000+0000
weekOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-10-30T00:00:00.000+0000
monthOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-01T00:00:00.000+0000
quarterOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-10-01T00:00:00.000+0000
yearOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-01-01T00:00:00.000+0000
SELECT DATE_FORMAT(fieldName, '%Y-%m-%d %T')
 FROM tableName
  WHERE DATE_TRUNC('day', fieldName)
    BETWEEN
      DATE_TRUNC('day', DATE_ADD('day', -30, NOW())) AND
      DATE_TRUNC('day', DATE_ADD('day', -1, NOW()));

FROM_ISO8601_DATE

Use the from_iso8601_date(string) function to convert the ISO 8601 formatted date string into a epoch value. The date can be a calendar date, a week date using ISO week numbering, or year and day of year combined:

SELECT from_iso8601_date('2020-05-11');

from_iso8601_date('2020-05-11')
1589155200000

FROM_ISO8601_TIMESTAMP

Use the from_iso8601_timestamp (string) to convert the ISO 8601 formatted date string, optionally with time and time zone, into an epoch value. The time defaults to 00:00:00.000, and the time zone defaults to the session time zone.

SELECT from_iso8601_timestamp('2020-06-11T11:15:05.055+00:00');

from_iso8601_timestamp('2020-06-11T11:15:05.055+00:00')
1591874105055

FROM_ISO8601_TIMESTAMP_NANOS

Use the from_iso8601_timestamp_nanos(string) to convert the ISO 8601 formatted date and time string to an epoch value. The time zone defaults to the session time zone.

SELECT from_iso8601_timestamp_nanos('2020-05-11T11:15:05');

from_iso8601_timestamp_nanos('2020-05-11T11:15:05')
1589195705000

SECOND

Use the second(field) function to extract the seconds value from a field.

select timestamp, second(timestamp) from "sample-elb-view" limit 1;

timestamp                       second(timestamp)
2017-11-04T01:46:05.833+0000    5

MINUTE

Use the minute(field) function to extract the minute value from a field.

select timestamp, minute(timestamp) from "sample-elb-view" limit 1;

timestamp                       minute(timestamp)
2017-11-04T01:46:05.833+0000    46

HOUR

Use the hour(field) function to extract the hour value from a field.

select timestamp, hour(timestamp) from "sample-elb-view" limit 1;

timestamp										  hour(timestamp)
2017-11-04T01:46:05.833+0000  1

DAY

Use the day(field) function to extract the day value from a field.

select timestamp, day(timestamp) from "sample-elb-view" limit 1;

timestamp											day(timestamp)
2017-11-04T01:46:05.833+0000  4

DAY_OF_MONTH

Use the day_of_month(field) function as an alias for day() to extract the day of the month value from a field.

select timestamp, day_of_month(timestamp) from "sample-elb-view" limit 1;

timestamp											day_of_month(timestamp)
2017-11-04T01:46:05.833+0000  4

DAY_OF_WEEK or DOW

Use the day_of_week(field) function to return the ISO day of the week from field. The value ranges from 1 (Monday) to 7 (Sunday). The alias dow() also works for this function.

select timestamp, day_of_week(timestamp) from "sample-elb-view" limit 1;

timestamp											day_of_week(timestamp)
2017-11-04T01:46:05.833+0000  6

DAY_OF_YEAR or DOY

Use the day_of_year(field) function to extract the day of the year value from a field. The value ranges from 1 to 366.

select timestamp, day_of_year(timestamp) from "sample-elb-view" limit 1;

timestamp											day_of_year(timestamp)
2017-11-04T01:46:05.833+0000  308

WEEK

Use the week(field) function to extract the week value from a field. The value ranges from 1 to 53.

select timestamp, week(timestamp) from "sample-elb-view" limit 1;

timestamp										  week(timestamp)
2017-11-04T01:46:07.244+0000  44

WEEK_OF_YEAR

Use the week_of_year(field) function to extract the ISO week of the year value from a field. The value ranges from 1 to 53.

select timestamp, week_of_year(timestamp) from "sample-elb-view" limit 1;

timestamp											week_of_year(timestamp)
2017-11-04T01:46:05.833+0000  44

MONTH

Use the month(field) function to extract the month value from a field.

select timestamp, month(timestamp) from "sample-elb-view" limit 1;

timestamp										  month(timestamp)
2017-11-04T01:46:07.244+0000  11

QUARTER

Use the quarter(field) function to extract the calendar quarter value from a field. The value ranges from 1 to 4.

select timestamp, quarter(timestamp) from "sample-elb-view" limit 1;

timestamp										  quarter(timestamp)
2017-11-04T01:46:07.244+0000  4

YEAR

Use the year(field) function to extract the year value from a field.

select timestamp, year(timestamp) from "sample-elb-view" limit 1;

timestamp										  year(timestamp)
2017-11-04T01:46:05.833+0000  2017

YEAR_OF_WEEK or YOW

Use the year_of_week(field) function to extract the year of the ISO week from a field. The yow() function is an alias for year_of_week(). The value ranges from 1 to 53.

select timestamp, year_of_week(timestamp) from "sample-elb-view" limit 1;

timestamp											year_of_week(timestamp)
2017-11-04T01:46:05.833+0000  2017

NOW

Use the now() function to display the epoch time when the query executes.

select now();

now()
1670444032565

TO_ISO8601

Use the to_iso8601(field) to convert a timestamp in milliseconds to an ISO 8601 string field.

select to_iso8601(1690469506000);

to_iso8601(1690469506000)
2023-07-27T14:51:46.000+0000

JSON Functions

JSON_VALUE

The json_value function extracts a scalar SQL value from a JSON string value.

json_value(json_string_column, 'JSONPath-expression')

For the json_string_column, specify a ChaosSearch view column that contains JSON string data. Enclose the column name in double quotation marks if the name includes a period separator, such as "Records.requestParameters".

For JSONPath-expression, specify a JSONPath expression enclosed in single quotations for the desired JSON property to return. See Query JSON String Fields in SQL for more information about using the function.

The following example shows how to use json_value to extract a field from a JSON string column:

SELECT "Records.requestParameters", json_value("Records.requestParameters",'$.includeShadowTrails') as includeShadowTrails from "cloudtrail-json-flex-view" where includeShadowTrails is not null;

Records.requestParameters                          includeShadowTrails
{"trailNameList":[],"includeShadowTrails":false}   false
{"trailNameList":[],"includeShadowTrails":false}   false
{"trailNameList":[],"includeShadowTrails":true}    true
{"trailNameList":[],"includeShadowTrails":false}   false
{"trailNameList":[],"includeShadowTrails":false}   false
{"trailNameList":[],"includeShadowTrails":true}    true

πŸ“˜

Partial Trino json_value Implementation

Note that the current json_value implementation does not support the PASSING or RETURNING keywords. The strict and lax modes have been collapsed to align with the behavior of the Materialize with JSONPath transform, so those keywords should not be used in current testing.

Math Functions

ROUND

Returns x rounded to d decimal places.

select round(time_taken, 2), time_taken  from "abc-cloudfront-xlarge-view" limit 10
round(time_taken, 2)  time_taken
0.03                  0.029
0.14                  0.138
0.01                  0.014
0.16                  0.158
0                     0
0.04                  0.035
0                     0.001
0.01                  0.005
0.11                  0.113
0                     0.003

Regular Expression Functions

ChaosSearch supports the Trino connector regexp_extract function.

REGEXP_EXTRACT

The regexp_extract function returns the first substring within a JSON string blob that matches the specified regular expression pattern.

regexp_extract(json_string_column, 'regular_expression')

For the json_string_column, specify a ChaosSearch view column that contains JSON string data. Enclose the column name in double quotation marks if the name includes a period separator, such as "Records.requestParameters".

For regular_expression, specify a Java pattern regular expression enclosed in single quotations to extract the matching string from the JSON string column.

πŸ“˜

The regular expression function uses the Java pattern syntax, with some exceptions. See Regular expression functions.

String Functions

CONCAT

Use concat(string,string[,string...]) or the || operator to concatenate strings within a query. Two sample queries follow that return the same results:

select client_ip, client_port, concat(client_ip,':', client_port) as client from "sample-elb-view" limit 5;

select client_ip, client_port, (client_ip || ':' || client_port) as client from "sample-elb-view" limit 5;

LENGTH

Use length(string) to return the length of a string in characters. For example:

select user_agent, length(user_agent) from "sample-elb-view";

LOWER

Use lower(string) to convert a string to all lowercase letters. For example:

SELECT lower('AbCdE');

select lower(user_agent) from "sample-elb-view" limit 10;

SUBSTR and SUBSTRING

Use substr(string,position) or its alias substring() to return the remaining characters in a string starting with the specified position character. This can be helpful for removing leading common characters like https:// from a URL, for example:

select substr('abcdefg',5)
efg

select substring('abcdefg', -5)
cdefg

select substr(cs_uri_stem,9) from "sample-elb-view";

UPPER

Use upper(string) to convert a string to all uppercase letters. For example:

select upper(cs_uri_stem) from "sample-elb-view";

System Functions

VERSION

Returns the ChaosSearch internal version information for the user deployment.

select version();

version()
1.0.0.52366 (105c5d0564)