Expressions#

Expressions#

col(name)

Creates an expression which refers to a table's column.

lit(v[, data_type])

Creates a SQL literal.

range_(start, end)

Indicates a range from 'start' to 'end', which can be used in columns selection.

and_(predicate0, predicate1, *predicates)

Boolean AND in three-valued logic.

or_(predicate0, predicate1, *predicates)

Boolean OR in three-valued logic.

not_(expression)

Inverts a given boolean expression.

current_database()

Returns the current database

current_date()

Returns the current SQL date in local time zone.

current_time()

Returns the current SQL time in local time zone.

current_timestamp()

Returns the current SQL timestamp in local time zone, the return type of this expression is TIMESTAMP_LTZ.

current_watermark(rowtimeAttribute)

Returns the current watermark for the given rowtime attribute, or NULL if no common watermark of all upstream operations is available at the current operation in the pipeline.

local_time()

Returns the current SQL time in local time zone.

local_timestamp()

Returns the current SQL timestamp in local time zone, the return type of this expression s TIMESTAMP.

to_date(date_str[, format])

Converts the date string with the given format (by default 'yyyy-MM-dd') to a date.

to_timestamp(timestamp_str[, format])

Converts the date time string with the given format (by default: 'yyyy-MM-dd HH:mm:ss') under the 'UTC+0' time zone to a timestamp.

to_timestamp_ltz(numeric_epoch_time, precision)

Converts a numeric type epoch time to TIMESTAMP_LTZ.

temporal_overlaps(left_time_point, ...)

Determines whether two anchored time intervals overlap.

date_format(timestamp, format)

Formats a timestamp as a string using a specified format.

timestamp_diff(time_point_unit, time_point1, ...)

Returns the (signed) number of TimePointUnit between time_point1 and time_point2.

convert_tz(date_str, tz_from, tz_to)

Converts a datetime string date_str (with default ISO timestamp format 'yyyy-MM-dd HH:mm:ss') from time zone tz_from to time zone tz_to.

from_unixtime(unixtime[, format])

Converts unix timestamp (seconds since '1970-01-01 00:00:00' UTC) to datetime string the given format.

unix_timestamp([date_str, format])

Gets the current unix timestamp in seconds if no arguments are not specified.

array(head, *tail)

Creates an array of literals.

row(head, *tail)

Creates a row of expressions.

map_(key, value, *tail)

Creates a map of expressions.

map_from_arrays(key, value)

Creates a map from an array of keys and an array of values.

row_interval(rows)

Creates an interval of rows.

pi()

Returns a value that is closer than any other value to pi.

e()

Returns a value that is closer than any other value to e.

rand([seed])

Returns a pseudorandom double value between 0.0 (inclusive) and 1.0 (exclusive) with a initial seed if specified.

rand_integer(bound[, seed])

Returns a pseudorandom integer value between 0 (inclusive) and the specified value (exclusive) with a initial seed if specified.

atan2(y, x)

Calculates the arc tangent of a given coordinate.

negative(v)

Returns negative numeric.

concat(first, *others)

Returns the string that results from concatenating the arguments.

concat_ws(separator, first, *others)

Returns the string that results from concatenating the arguments and separator.

uuid()

Returns an UUID (Universally Unique Identifier) string (e.g., "3d3c68f7-f608-473f-b60c-b0c44ad4cc4e") according to RFC 4122 type 4 (pseudo randomly generated) UUID.

null_of(data_type)

Returns a null literal value of a given data type.

log(v[, base])

If base is specified, calculates the logarithm of the given value to the given base.

source_watermark()

Source watermark declaration for schema.

if_then_else(condition, if_true, if_false)

Ternary conditional operator that decides which of two other expressions should be evaluated based on a evaluated boolean condition.

coalesce(*args)

Returns the first argument that is not NULL.

with_columns(head, *tails)

Creates an expression that selects a range of columns.

without_columns(head, *tails)

Creates an expression that selects all columns except for the given range of columns.

json_string(value)

Serializes a value into JSON.

json_object([on_null])

Builds a JSON object string from a list of key-value pairs.

json_object_agg(on_null, key_expr, value_expr)

Builds a JSON object string by aggregating key-value expressions into a single JSON object.

json_array([on_null])

Builds a JSON array string from a list of values.

json_array_agg(on_null, item_expr)

Builds a JSON object string by aggregating items into an array.

call(f, *args)

The first parameter f could be a str or a Python user-defined function.

call_sql(sql_expression)

A call to a SQL expression.

Expression#

arithmetic functions#

Expression.exp

Calculates the Euler's number raised to the given power.

Expression.log10

Calculates the base 10 logarithm of the given value.

Expression.log2

Calculates the base 2 logarithm of the given value.

Expression.ln

Calculates the natural logarithm of the given value.

Expression.log([base])

Calculates the natural logarithm of the given value if base is not specified.

Expression.cosh

Calculates the hyperbolic cosine of a given number.

Expression.sinh

Calculates the hyperbolic sine of a given number.

Expression.sin

Calculates the sine of a given number.

Expression.cos

Calculates the cosine of a given number.

Expression.tan

Calculates the tangent of a given number.

Expression.cot

Calculates the cotangent of a given number.

Expression.asin

Calculates the arc sine of a given number.

Expression.acos

Calculates the arc cosine of a given number.

Expression.atan

Calculates the arc tangent of a given number.

Expression.tanh

Calculates the hyperbolic tangent of a given number.

Expression.degrees

Converts numeric from radians to degrees.

Expression.radians

Converts numeric from degrees to radians.

Expression.sqrt

Calculates the square root of a given value.

Expression.abs

Calculates the absolute value of given value.

Expression.sign

Calculates the signum of a given number.

Expression.round(places)

Rounds the given number to integer places right to the decimal point.

Expression.between(lower_bound, upper_bound)

Returns true if the given expression is between lower_bound and upper_bound (both inclusive).

Expression.not_between(lower_bound, upper_bound)

Returns true if the given expression is not between lower_bound and upper_bound (both inclusive).

Expression.then(if_true, if_false)

Ternary conditional operator that decides which of two other expressions should be evaluated based on a evaluated boolean condition.

Expression.if_null(null_replacement)

Returns null_replacement if the given expression is null; otherwise the expression is returned.

Expression.is_null

Returns true if the given expression is null.

Expression.is_not_null

Returns true if the given expression is not null.

Expression.is_true

Returns true if given boolean expression is true.

Expression.is_false

Returns true if given boolean expression is false.

Expression.is_not_true

Returns true if given boolean expression is not true (for null and false).

Expression.is_not_false

Returns true if given boolean expression is not false (for null and true).

Expression.distinct

Similar to a SQL distinct aggregation clause such as COUNT(DISTINCT a), declares that an aggregation function is only applied on distinct input values.

Expression.sum

Returns the sum of the numeric field across all input values.

Expression.sum0

Returns the sum of the numeric field across all input values.

Expression.min

Returns the minimum value of field across all input values.

Expression.max

Returns the maximum value of field across all input values.

Expression.count

Returns the number of input rows for which the field is not null.

Expression.avg

Returns the average (arithmetic mean) of the numeric field across all input values.

Expression.first_value

Returns the first value of field across all input values.

Expression.last_value

Returns the last value of field across all input values.

Expression.list_agg([separator])

Concatenates the values of string expressions and places separator values between them.

Expression.stddev_pop

Returns the population standard deviation of an expression(the square root of var_pop).

Expression.stddev_samp

Returns the sample standard deviation of an expression(the square root of var_samp).

Expression.var_pop

Returns the population standard variance of an expression.

Expression.var_samp

Returns the sample variance of a given expression.

Expression.collect

Returns multiset aggregate of a given expression.

Expression.array_agg

New in version 1.12.0.

Expression.alias(name, *extra_names)

Specifies a name for an expression i.e. a field.

Expression.cast(data_type)

Returns a new value being cast to type type.

Expression.try_cast(data_type)

Like cast, but in case of error, returns NULL rather than failing the job.

Expression.asc

Specifies ascending order of an expression i.e. a field for order_by.

Expression.desc

Specifies descending order of an expression i.e. a field for order_by.

Expression.in_(first_element_or_table, ...)

If first_element_or_table is a Table, Returns true if an expression exists in a given table sub-query.

Expression.start

Returns the start time (inclusive) of a window when applied on a window reference.

Expression.end

Returns the end time (exclusive) of a window when applied on a window reference.

Expression.bin

Returns a string representation of an integer numeric value in binary format.

Expression.hex

Returns a string representation of an integer numeric value or a string in hex format.

Expression.unhex

Converts hexadecimal string expr to BINARY.

Expression.truncate([n])

Returns a number of truncated to n decimal places.

Expression.percentile(percentage[, frequency])

Returns the exact percentile value of expr at the specified percentage in a group.

string functions#

Expression.starts_with(start_expr)

Returns whether expr starts with start_expr.

Expression.ends_with(end_expr)

Returns whether expr ends with end_expr.

Expression.substring(begin_index[, length])

Creates a substring of the given string at given index for a given length.

Expression.substr(begin_index[, length])

Creates a substring of the given string at given index for a given length.

Expression.trim_leading([character])

Removes leading space characters from the given string if character is None.

Expression.trim_trailing([character])

Removes trailing space characters from the given string if character is None.

Expression.trim([character])

Removes leading and trailing space characters from the given string if character is None.

Expression.replace([search, replacement])

Returns a new string which replaces all the occurrences of the search target with the replacement string (non-overlapping).

Expression.translate(from_str, to_str)

Translate an expr where all characters in from_str have been replaced with those in to_str.

Expression.char_length

Returns the length of a string.

Expression.upper_case

Returns all of the characters in a string in upper case using the rules of the default locale.

Expression.lower_case

Returns all of the characters in a string in lower case using the rules of the default locale.

Expression.init_cap

Converts the initial letter of each word in a string to uppercase.

Expression.like([pattern, escape])

Returns true, if a string matches the specified LIKE pattern e.g.

Expression.similar([pattern])

Returns true, if a string matches the specified SQL regex pattern.

Expression.position([haystack])

Returns the position of string in an other string starting at 1.

Expression.lpad(length, pad)

Returns a string left-padded with the given pad string to a length of len characters.

Expression.rpad(length, pad)

Returns a string right-padded with the given pad string to a length of len characters.

Expression.overlay(new_string, starting[, ...])

Replaces a substring of string with a string starting at a position (starting at 1).

Expression.regexp(regex)

Returns True if any (possibly empty) substring matches the regular expression, otherwise False.

Expression.regexp_count(regex)

Returns the number of times str matches the regex pattern.

Expression.regexp_replace(regex, replacement)

Returns a string with all substrings that match the regular expression consecutively being replaced.

Expression.regexp_extract(regex[, extract_index])

Returns a string extracted with a specified regular expression and a regex match group index.

Expression.regexp_extract_all(regex[, ...])

Extracts all the substrings in str that match the regex expression and correspond to the regex group extract_index.

Expression.regexp_instr(regex)

Returns the position of the first substring in str that matches regex.

Expression.regexp_substr(regex)

Returns the first substring in str that matches regex.

Expression.from_base64

Returns the base string decoded with base64.

Expression.to_base64

Returns the base64-encoded result of the input string.

Expression.ascii

Returns the numeric value of the first character of the input string.

Expression.chr

Returns the ASCII character result of the input integer.

Expression.decode(charset)

Decodes the first argument into a String using the provided character set.

Expression.encode(charset)

Encodes the string into a BINARY using the provided character set.

Expression.left(length)

Returns the leftmost integer characters from the input string.

Expression.right(length)

Returns the rightmost integer characters from the input string.

Expression.instr(s)

Returns the position of the first occurrence in the input string.

Expression.locate(s[, pos])

Returns the position of the first occurrence in the input string after position integer.

Expression.url_decode()

Decodes a given string in 'application/x-www-form-urlencoded' format using the UTF-8 encoding scheme.

Expression.url_encode()

Translates a string into 'application/x-www-form-urlencoded' format using the UTF-8 encoding scheme.

Expression.parse_url(part_to_extract[, key])

Parse url and return various parameter of the URL.

Expression.printf(*obj)

Returns a formatted string from printf-style format strings.

Expression.ltrim([trim_str])

Removes any leading characters within trim_str from str.

Expression.rtrim([trim_str])

Removes any trailing characters within trim_str from str.

Expression.btrim([trim_str])

Removes any leading and trailing characters within trim_str from str.

Expression.repeat(n)

Returns a string that repeats the base string n times.

Expression.over(alias)

Defines an aggregation to be used for a previously specified over window.

Expression.reverse

Reverse each character in current string.

Expression.split_index(separator, index)

Split target string with custom separator and pick the index-th(start with 0) result.

Expression.str_to_map([list_delimiter, ...])

Creates a map by parsing text.

Expression.elt(expr, *exprs)

Returns the index-th expression.

temporal functions#

Expression.to_date

Parses a date string in the form "yyyy-MM-dd" to a SQL Date.

Expression.to_time

Parses a time string in the form "HH:mm:ss" to a SQL Time.

Expression.to_timestamp

Parses a timestamp string in the form "yyyy-MM-dd HH:mm:ss[.SSS]" to a SQL Timestamp.

Expression.extract(time_interval_unit)

Extracts parts of a time point or time interval.

Expression.floor([time_interval_unit])

If time_interval_unit is specified, it rounds down a time point to the given unit, e.g.

Expression.ceil([time_interval_unit])

If time_interval_unit is specified, it rounds up a time point to the given unit, e.g.

advanced type helper functions#

Expression.get(name_or_index)

Accesses the field of a Flink composite type (such as Tuple, POJO, etc.) by name or index and returns it's value.

Expression.flatten

Converts a Flink composite type (such as Tuple, POJO, etc.) and all of its direct subtypes into a flat representation where every subtype is a separate field.

Expression.at(index)

Accesses the element of an array or map based on a key or an index (starting at 1).

Expression.cardinality

Returns the number of elements of an array or number of entries of a map.

Expression.element

Returns the sole element of an array with a single element.

Expression.array_concat(*arrays)

Returns an array that is the result of concatenating at least one array.

Expression.array_contains(needle)

Returns whether the given element exists in an array.

Expression.array_distinct()

Returns an array with unique elements.

Expression.array_join(delimiter[, ...])

Returns a string that represents the concatenation of the elements in the given array and the elements' data type in the given array is string.

Expression.array_position(needle)

Returns the position of the first occurrence of element in the given array as int.

Expression.array_remove(needle)

Removes all elements that equal to element from array.

Expression.array_reverse()

Returns an array in reverse order.

Expression.array_max()

Returns the maximum value from the array.

Expression.array_slice(start_offset[, ...])

Returns a subarray of the input array between 'start_offset' and 'end_offset' inclusive.

Expression.array_min()

Returns the minimum value from the array.

Expression.array_sort([ascending_order, ...])

Returns the array in sorted order.

Expression.array_union(array)

Returns an array of the elements in the union of array1 and array2, without duplicates.

Expression.map_entries

Returns an array of all entries in the given map.

Expression.map_keys

Returns the keys of the map as an array.

Expression.map_union(*maps)

Returns a map created by merging at least one map.

Expression.map_values

Returns the values of the map as an array.

Expression.array_except(array)

Returns an ARRAY that contains the elements from array1 that are not in array2.

Expression.array_intersect(array)

Returns an ARRAY that contains the elements from array1 that are also in array2, without duplicates.

Expression.split(delimiter)

Returns an array of substrings by splitting the input string based on the given delimiter.

time definition functions#

Expression.rowtime

Declares a field as the rowtime attribute for indicating, accessing, and working in Flink's event time.

Expression.proctime

Declares a field as the proctime attribute for indicating, accessing, and working in Flink's processing time.

Expression.year

Creates an interval of the given number of years.

Expression.years

Creates an interval of the given number of years.

Expression.quarter

Creates an interval of the given number of quarters.

Expression.quarters

Creates an interval of the given number of quarters.

Expression.month

Creates an interval of the given number of months.

Expression.months

Creates an interval of the given number of months.

Expression.week

Creates an interval of the given number of weeks.

Expression.weeks

Creates an interval of the given number of weeks.

Expression.day

Creates an interval of the given number of days.

Expression.days

Creates an interval of the given number of days.

Expression.hour

Creates an interval of the given number of hours.

Expression.hours

Creates an interval of the given number of hours.

Expression.minute

Creates an interval of the given number of minutes.

Expression.minutes

Creates an interval of the given number of minutes.

Expression.second

Creates an interval of the given number of seconds.

Expression.seconds

Creates an interval of the given number of seconds.

Expression.milli

Creates an interval of the given number of millis.

Expression.millis

Creates an interval of the given number of millis.

hash functions#

Expression.md5

Returns the MD5 hash of the string argument; null if string is null.

Expression.sha1

Returns the SHA-1 hash of the string argument; null if string is null.

Expression.sha224

Returns the SHA-224 hash of the string argument; null if string is null.

Expression.sha256

Returns the SHA-256 hash of the string argument; null if string is null.

Expression.sha384

Returns the SHA-384 hash of the string argument; null if string is null.

Expression.sha512

Returns the SHA-512 hash of the string argument; null if string is null.

Expression.sha2(hash_length)

Returns the hash for the given string expression using the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, or SHA-512).

JSON functions#

Expression.is_json([json_type])

Determine whether a given string is valid JSON.

Expression.json_exists(path[, on_error])

Determines whether a JSON string satisfies a given search criterion.

Expression.json_value(path[, ...])

Extracts a scalar from a JSON string.

Expression.json_query(path[, ...])

Extracts JSON values from a JSON string.

Expression.json_quote()

Quotes a string as a JSON value by wrapping it with double quote characters, escaping interior quote and special characters ('"', '', '/', 'b', 'f', 'n', 'r', 't'), and returning the result as a string.

Expression.json_unquote()

Unquotes JSON value, unescapes escaped special characters ('"', '', '/', 'b', 'f', 'n', 'r', 't', 'u' hex hex hex hex) and returns the result as a string.