SQL Function Support

A summary of the built-in functions currently supported for SQL Analytics

ChaosSearch supports the following built-in SQL functions.

Conversion Functions

CAST

Use the cast function to explicitly cast a value as a type. This function casts an extracted value to one of the supported data types such as varchar, double, or timestamp.

📘

Implicit Casting in Trino Connectors

Trino implicitly converts numeric and character values to the correct type if a conversion is possible. Trino will not convert between character and numeric types. For example, a query that expects a varchar will not automatically convert a double value to an equivalent varchar.

Casting is very useful with SQL functions such as json_value and regexp_extract where it can cast an extracted string of characters to another type, such as double or timestamp data.

cast(value AS type)

For example, you can use cast to change a timestamp to an epoch value:

SELECT timestamp, cast(timestamp as double) from "sample-elb-view" limit 5;
timestamp                 CAST(timestamp AS double)
2017-11-04 01:46:05.833   1509759965833
2017-11-04 01:46:22.089   1509759982089
2017-11-04 01:46:53.396   1509760013396
2017-11-04 01:47:20.990   1509760040990
2017-11-04 01:48:11.679   1509760091679

As an example with regexp_extract, the following query extracts the trailing digit characters from a clerk_id varchar field, casts the numbers to a double, and adds 1000 to the ID value:

SELECT o_clerk, cast(regexp_extract(o_clerk,'Clerk#00000(\d\d\d\d)') as double) as 
clerk_number, clerk_number+1000 FROM "orders-view"  LIMIT 5;

o_clerk          clerk_number  (clerk_number + 1000)
Clerk#000000951  951           1951
Clerk#000000880  880           1880
Clerk#000000955  955           1955
Clerk#000000124  124           1124
Clerk#000000925  925           1925

Note that if you do not cast the digit characters as in the example above, the query still works because of implicit casting for the extracted values:

SELECT o_clerk, regexp_extract(o_clerk,'Clerk#00000(\d\d\d\d)') as 
clerk_number, clerk_number+1000 FROM "orders-view"  LIMIT 5;

o_clerk          clerk_number  (clerk_number + 1000)
Clerk#000000951  951           1951
Clerk#000000880  880           1880
Clerk#000000955  955           1955
Clerk#000000124  124           1124
Clerk#000000925  925           1925

If the cast function cannot perform the requested cast, the function returns a null value, similar to the Trino try_cast function:

SELECT cast("Records.eventName" as timestamp) from "cloud-records-all" limit 3;

CAST("Records.eventName" AS timestamp)
NULL
NULL
NULL

JSON Functions

JSON_VALUE

The json_value function extracts a scalar SQL value from a JSON string value.

json_value(json_string_column, 'JSONPath-expression')

For the json_string_column, specify a ChaosSearch view column that contains JSON string data. Enclose the column name in double quotation marks if the name includes a period separator, such as "Records.requestParameters".

For JSONPath-expression, specify a JSONPath expression enclosed in single quotations for the desired JSON property to return.

📘

Currently a Partial Trino json_value Implementation

Note that the current json_value implementation does not support the PASSING or RETURNING keywords. The strict and lax modes have been collapsed to align with the behavior of the Materialize with JSONPath transform, so those keywords should not be used in current testing.

See Query JSON String Fields in SQL for more information about using the function.

Regular Expression Functions

Currently, ChaosSearch supports only the Trino connector regexp_extract function.

REGEXP_EXTRACT

The regexp_extract function returns the first substring within a JSON string blob that matches the specified regular expression pattern.

regexp_extract(json_string_column, 'regular_expression')

For the json_string_column, specify a ChaosSearch view column that contains JSON string data. Enclose the column name in double quotation marks if the name includes a period separator, such as "Records.requestParameters".

For regular_expression, specify a Java pattern regular expression enclosed in single quotations to extract the matching string from the JSON string column.

📘

The regular expression function uses the Java pattern syntax, with some exceptions. See Regular expression functions.

String Functions

CONCAT

Use concat(string,string[,string...]) or the || operator to concatenate strings within a query. Two sample queries follow that return the same results:

select client_ip, client_port, concat(client_ip,':', client_port) as client from "sample-elb-view" limit 5;

select client_ip, client_port, (client_ip || ':' || client_port) as client from "sample-elb-view" limit 5;
3330

LENGTH

Use length(string) to return the length of a string in characters. For example:

select user_agent, length(user_agent) from "sample-elb-view";
3330

LOWER

Use lower(string) to convert a string to all lowercase letters. For example:

SELECT lower('AbCdE');

select lower(user_agent) from "sample-elb-view" limit 10;
3330

SUBSTR

Use substr(string,position) to return the remaining characters in a string starting with the specified position character. This can be helpful for removing leading common characters like https:// from a URL, for example:

select substr('abcdefg',5)
efg


select substr(cs_uri_stem,9) from "sample-elb-view";
3330

UPPER

Use upper(string) to convert a string to all uppercase letters. For example:

select upper(cs_uri_stem) from "sample-elb-view";
3330

Time/Date Functions

DATE_FORMAT

Use the date_format(string,format) function to format a timestamp value as a string using the format controls. The format controls are a series of specifiers as in the following table:

SpecifierDescription
%aAbbreviated weekday name (Sun .. Sat)
%bAbbreviated month name (Jan .. Dec)
%cMonth, numeric (1 .. 12)
%dDay of the month, numeric (01 .. 31)
%eDay of the month, numeric (1 .. 31)
%fFraction of second (6 digits for printing: 000000 .. 999000; 1 - 9 digits for parsing: 0 .. 999999999)
Note that timestamp is truncated to milliseconds.
%HHour (00 .. 23)
%hHour (01 .. 12)
%IHour (01 .. 12)
%iMinutes, numeric (00 .. 59)
%jDay of year (001 .. 366)
%kHour (0 .. 23)
%lHour (1 .. 12)
%MMonth name (January .. December)
%mMonth, numeric (01 .. 12)
%pAM or PM
%rTime of day, 12-hour (equivalent to %h:%i:%s %p)
%S or %sSeconds (00 .. 59)
%TTime of day, 24-hour (equivalent to %H:%i:%s)
%vWeek (01 .. 53), where Monday is the first day of the week; used with %x
%WWeekday name (Sunday .. Saturday)
%xYear for the week, where Monday is the first day of the week, numeric, four digits; used with %v
%YYear, numeric, four digits
%yYear, numeric (two digits)
When parsing, two-digit year format assumes range 1970 .. 2069, so “70” will result in year 1970 but “69” will result in 2069.
%%A literal % character
SELECT DATE_FORMAT(timestamp, '%W %M %d %Y') as test from "sample-elb-view" limit 5;
2902

DATE_ADD

Use date_add(unit, value, timestamp) to add an interval value of type unit to timestamp. You can subtract the interval with a negative value.

select timestamp, date_add('year',1,timestamp) from "sample-elb-view" limit 5;

timestamp                    date_add('year', 1, timestamp)
2017-11-04T01:46:05.833+0000 2018-11-04T01:46:05.833+0000
2017-11-04T01:46:22.089+0000 2018-11-04T01:46:22.089+0000
2017-11-04T01:46:53.396+0000 2018-11-04T01:46:53.396+0000
2017-11-04T01:47:20.990+0000 2018-11-04T01:47:20.990+0000
2017-11-04T01:48:03.565+0000 2018-11-04T01:48:03.565+0000

DATE_TRUNC

Use the date_trunc(unit,x) function to truncate the value x to the specifed unit level.

select timestamp, date_trunc('second',timestamp) from "sample-elb-view" limit 1;
Unit ValueSample Truncated Value
secondOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T01:46:05.000+0000
minuteOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T01:46:00.000+0000
hourOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T01:00:00.000+0000
dayOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-04T00:00:00.000+0000
weekOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-10-30T00:00:00.000+0000
monthOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-11-01T00:00:00.000+0000
quarterOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-10-01T00:00:00.000+0000
yearOriginal: 2017-11-04T01:46:05.833+0000
Truncated: 2017-01-01T00:00:00.000+0000
SELECT DATE_FORMAT(fieldName, '%Y-%m-%d %T')
 FROM tableName
  WHERE DATE_TRUNC('day', fieldName)
    BETWEEN
      DATE_TRUNC('day', DATE_ADD('day', -30, NOW())) AND
      DATE_TRUNC('day', DATE_ADD('day', -1, NOW()));

FROM_ISO8601_TIMESTAMP

Use the from_iso8601_timestamp (string) to convert the ISO 8601 formatted date string, optionally with time and time zone, into an epoch value. The time defaults to 00:00:00.000, and the time zone defaults to the session time zone.

SELECT from_iso8601_timestamp('2020-06-11T11:15:05.055+00:00');

from_iso8601_timestamp('2020-06-11T11:15:05.055+00:00')
1591874105055

FROM_ISO8601_TIMESTAMP_NANOS

Use the from_iso8601_timestamp_nanos(string) to convert the ISO 8601 formatted date and time string to an epoch value. The time zone defaults to the session time zone.

SELECT from_iso8601_timestamp_nanos('2020-05-11T11:15:05');

from_iso8601_timestamp_nanos('2020-05-11T11:15:05')
1589195705000

FROM_ISO8601_DATE

Use the from_iso8601_date(string) function to convert the ISO 8601 formatted date string into a epoch value. The date can be a calendar date, a week date using ISO week numbering, or year and day of year combined:

SELECT from_iso8601_date('2020-05-11');

from_iso8601_date('2020-05-11')
1589155200000

TO_ISO8601

Use the to_iso8601(field) to convert an epoch value into ISO 8601 string field. Note that ChaosSearch formats timestamp values into this format by default.

MINUTE

Use the minute(field) function to extract the minute value from a field.

select timestamp, minute(timestamp) from "sample-elb-view" limit 1;

timestamp                       minute(timestamp)
2017-11-04T01:46:05.833+0000    46

HOUR

Use the hour(field) function to extract the hour value from a field.

select timestamp, hour(timestamp) from "sample-elb-view" limit 1;

timestamp                                         hour(timestamp)
2017-11-04T01:46:05.833+0000  1

DAY

Use the day(field) function to extract the day value from a field.

select timestamp, day(timestamp) from "sample-elb-view" limit 1;

timestamp                                           day(timestamp)
2017-11-04T01:46:05.833+0000  4

DAY_OF_WEEK

Use the day_of_week(field) function to extract the day of the week value from a field.

select timestamp, day_of_week(timestamp) from "sample-elb-view" limit 1;

timestamp                                           day_of_week(timestamp)
2017-11-04T01:46:05.833+0000  6

DAY_OF_YEAR

Use the day_of_year(field) function to extract the day of the year value from a field.

select timestamp, day_of_year(timestamp) from "sample-elb-view" limit 1;

timestamp                                           day_of_year(timestamp)
2017-11-04T01:46:05.833+0000  308

MONTH

Use the month(field) function to extract the month value from a field.

select timestamp, month(timestamp) from "sample-elb-view" limit 1;

timestamp                                         month(timestamp)
2017-11-04T01:46:07.244+0000  11

YEAR

Use the year(field) function to extract the year value from a field.

select timestamp, year(timestamp) from "sample-elb-view" limit 1;

timestamp                                         year(timestamp)
2017-11-04T01:46:05.833+0000  2017

NOW

Use the now() function to display the epoch time when the query executes.

select now();

now()
1670444032565