Class DataTypes


  • @PublicEvolving
    public final class DataTypes
    extends Object
    A DataType can be used to declare input and/or output types of operations. This class enumerates all pre-defined data types of the Table & SQL API.

    For convenience, this class also contains methods for creating UnresolvedDataTypes that need to be resolved at later stages. This is particularly useful for more complex types that are expressed as a Class (see of(Class)) or types that need to be looked up in a catalog (see of(String)).

    NOTE: Planners might not support every data type with the desired precision or parameter. Please see the planner compatibility and limitations section in the website documentation before using a data type.

    • Method Detail

      • of

        public static UnresolvedDataType of(Class<?> unresolvedClass)
        Creates an unresolved type that will be resolved to a DataType by analyzing the given class later.

        Resolution uses Java reflection, which can be supported by DataTypeHint annotations for nested, structured types.

        A ValidationException is thrown in cases where the reflective extraction needs more information or simply fails.

        The following examples show how to use and enrich the extraction process:

        
         // returns INT
         of(Integer.class)
        
         // returns TIMESTAMP(9)
         of(java.time.LocalDateTime.class)
        
         // returns an anonymous, unregistered structured type
         // that is deeply integrated into the API compared to opaque RAW types
         class User {
        
           // extract fields automatically
           public String name;
           public int age;
        
           // enrich the extraction with precision information
           public @DataTypeHint("DECIMAL(10,2)") BigDecimal accountBalance;
        
           // enrich the extraction with forcing using RAW types
           public @DataTypeHint(forceRawPattern = "scala.") Address address;
        
           // enrich the extraction by specifying defaults
           public @DataTypeHint(defaultSecondPrecision = 3) Log log;
         }
         of(User.class)
         

        Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

      • of

        public static UnresolvedDataType of(String unresolvedName)
        Creates an unresolved type that will be resolved to a DataType by using a fully or partially defined name.

        It includes both built-in types (e.g. "INT") as well as user-defined types (e.g. "mycat.mydb.Money").

        Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

      • CHAR

        public static DataType CHAR(int n)
        Data type of a fixed-length character string CHAR(n) where n is the number of code points. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
        See Also:
        CharType
      • VARCHAR

        public static DataType VARCHAR(int n)
        Data type of a variable-length character string VARCHAR(n) where n is the maximum number of code points. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
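
        Because n counts code points rather than UTF-16 units, String.length() can overstate the length of a value. The following plain-Java sketch illustrates the difference; fitsVarChar is a hypothetical helper written for this illustration and is not part of the Flink API.

```java
final class CodePointLengthSketch {

    // Counts Unicode code points, not UTF-16 char units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical helper (an assumption for illustration):
    // true if the value has at most n code points, as VARCHAR(n) requires.
    static boolean fitsVarChar(String s, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be between 1 and Integer.MAX_VALUE");
        }
        return codePoints(s) <= n;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which needs a surrogate pair in UTF-16
        String value = "a\uD83D\uDE00";
        System.out.println(value.length());        // 3 UTF-16 units
        System.out.println(codePoints(value));     // 2 code points
        System.out.println(fitsVarChar(value, 2)); // true
    }
}
```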
        See Also:
        VarCharType
      • STRING

        public static DataType STRING()
        Data type of a variable-length character string with defined maximum length. This is a shortcut for VARCHAR(2147483647) for representing JVM strings.
        See Also:
        VarCharType
      • BOOLEAN

        public static DataType BOOLEAN()
        Data type of a boolean with a (possibly) three-valued logic of TRUE, FALSE, UNKNOWN.
        See Also:
        BooleanType
      • BINARY

        public static DataType BINARY(int n)
        Data type of a fixed-length binary string (=a sequence of bytes) BINARY(n) where n is the number of bytes. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
        See Also:
        BinaryType
      • VARBINARY

        public static DataType VARBINARY(int n)
        Data type of a variable-length binary string (=a sequence of bytes) VARBINARY(n) where n is the maximum number of bytes. n must have a value between 1 and Integer.MAX_VALUE (both inclusive).
        See Also:
        VarBinaryType
      • BYTES

        public static DataType BYTES()
        Data type of a variable-length binary string (=a sequence of bytes) with defined maximum length. This is a shortcut for VARBINARY(2147483647) for representing JVM byte arrays.
        See Also:
        VarBinaryType
      • DECIMAL

        public static DataType DECIMAL(int precision,
                                       int scale)
        Data type of a decimal number with fixed precision and scale DECIMAL(p, s) where p is the number of digits in a number (=precision) and s is the number of digits to the right of the decimal point in a number (=scale). p must have a value between 1 and 38 (both inclusive). s must have a value between 0 and p (both inclusive).
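
        The parameter rules above can be checked with java.math.BigDecimal. This is a minimal plain-Java sketch; isValidDeclaration and fits are hypothetical helpers for illustration and do not describe how a planner validates or rounds values.

```java
import java.math.BigDecimal;

final class DecimalBoundsSketch {

    // Checks the DECIMAL(p, s) parameter ranges: 1 <= p <= 38 and 0 <= s <= p.
    static boolean isValidDeclaration(int p, int s) {
        return p >= 1 && p <= 38 && s >= 0 && s <= p;
    }

    // Hypothetical helper (an assumption for illustration):
    // true if the value is representable as DECIMAL(p, s) without rounding.
    static boolean fits(BigDecimal value, int p, int s) {
        if (!isValidDeclaration(p, s)) {
            throw new IllegalArgumentException("invalid DECIMAL(" + p + ", " + s + ")");
        }
        if (value.signum() == 0) {
            return true; // zero fits every valid declaration
        }
        BigDecimal stripped = value.stripTrailingZeros();
        if (stripped.scale() > s) {
            return false; // more fractional digits than the scale allows
        }
        // Digits to the left of the decimal point must fit into p - s positions.
        int integerDigits = stripped.precision() - stripped.scale();
        return integerDigits <= p - s;
    }

    public static void main(String[] args) {
        System.out.println(fits(new BigDecimal("12345678.90"), 10, 2)); // true
        System.out.println(fits(new BigDecimal("123456789"), 10, 2));   // false
        System.out.println(fits(new BigDecimal("1.234"), 10, 2));       // false
    }
}
```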
        See Also:
        DecimalType
      • TINYINT

        public static DataType TINYINT()
        Data type of a 1-byte signed integer with values from -128 to 127.
        See Also:
        TinyIntType
      • SMALLINT

        public static DataType SMALLINT()
        Data type of a 2-byte signed integer with values from -32,768 to 32,767.
        See Also:
        SmallIntType
      • INT

        public static DataType INT()
        Data type of a 4-byte signed integer with values from -2,147,483,648 to 2,147,483,647.
        See Also:
        IntType
      • BIGINT

        public static DataType BIGINT()
        Data type of an 8-byte signed integer with values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
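
        The value ranges of TINYINT, SMALLINT, INT, and BIGINT coincide with those of Java's byte, short, int, and long. The sketch below uses that correspondence to pick the narrowest type name for a value; narrowestType is a hypothetical helper for illustration only.

```java
final class IntegerRangesSketch {

    // Hypothetical helper (an assumption for illustration):
    // returns the name of the narrowest integer type whose range contains v.
    static String narrowestType(long v) {
        if (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) {
            return "TINYINT";   // -128 to 127
        }
        if (v >= Short.MIN_VALUE && v <= Short.MAX_VALUE) {
            return "SMALLINT";  // -32,768 to 32,767
        }
        if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) {
            return "INT";       // -2,147,483,648 to 2,147,483,647
        }
        return "BIGINT";        // everything else that fits a long
    }

    public static void main(String[] args) {
        System.out.println(narrowestType(127));         // TINYINT
        System.out.println(narrowestType(128));         // SMALLINT
        System.out.println(narrowestType(2147483648L)); // BIGINT
    }
}
```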
        See Also:
        BigIntType
      • FLOAT

        public static DataType FLOAT()
        Data type of a 4-byte single precision floating point number.
        See Also:
        FloatType
      • DOUBLE

        public static DataType DOUBLE()
        Data type of an 8-byte double precision floating point number.
        See Also:
        DoubleType
      • DATE

        public static DataType DATE()
        Data type of a date consisting of year-month-day with values ranging from 0000-01-01 to 9999-12-31.

        Compared to the SQL standard, the range starts at year 0000.

        See Also:
        DateType
      • TIME

        public static DataType TIME(int precision)
        Data type of a time WITHOUT time zone TIME(p) where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

        An instance consists of hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 00:00:00.000000000 to 23:59:59.999999999.

        Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalTime. A time WITH time zone is not provided.
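
        The value range can be observed directly on java.time.LocalTime, which the text above names as the closer semantic model. The sketch below is plain Java and does not call the Flink API; isRepresentable is a hypothetical helper for illustration.

```java
import java.time.DateTimeException;
import java.time.LocalTime;

final class TimeRangeSketch {

    // Hypothetical helper (an assumption for illustration):
    // true if LocalTime accepts the given hour, minute, and second.
    static boolean isRepresentable(int hour, int minute, int second) {
        try {
            LocalTime.of(hour, minute, second);
            return true;
        } catch (DateTimeException e) {
            return false; // e.g. a leap second such as 23:59:60
        }
    }

    public static void main(String[] args) {
        System.out.println(LocalTime.MAX);                 // 23:59:59.999999999
        System.out.println(isRepresentable(23, 59, 59));   // true
        System.out.println(isRepresentable(23, 59, 60));   // false
    }
}
```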

        See Also:
        TIME(), TimeType
      • TIME

        public static DataType TIME()
        Data type of a time WITHOUT time zone TIME with no fractional seconds by default.

        An instance consists of hour:minute:second with up to second precision and values ranging from 00:00:00 to 23:59:59.

        Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalTime. A time WITH time zone is not provided.

        See Also:
        TIME(int), TimeType
      • TIMESTAMP

        public static DataType TIMESTAMP(int precision)
        Data type of a timestamp WITHOUT time zone TIMESTAMP(p) where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

        An instance consists of year-month-day hour:minute:second[.fractional] with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 to 9999-12-31 23:59:59.999999999.

        Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalDateTime.

        See Also:
        TIMESTAMP_WITH_TIME_ZONE(int), TIMESTAMP_WITH_LOCAL_TIME_ZONE(int), TimestampType
      • TIMESTAMP

        public static DataType TIMESTAMP()
        Data type of a timestamp WITHOUT time zone TIMESTAMP with 6 digits of fractional seconds by default.

        An instance consists of year-month-day hour:minute:second[.fractional] with up to microsecond precision and values ranging from 0000-01-01 00:00:00.000000 to 9999-12-31 23:59:59.999999.

        Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to LocalDateTime.

        See Also:
        TIMESTAMP(int), TIMESTAMP_WITH_TIME_ZONE(int), TIMESTAMP_WITH_LOCAL_TIME_ZONE(int), TimestampType
      • TIMESTAMP_WITH_TIME_ZONE

        public static DataType TIMESTAMP_WITH_TIME_ZONE(int precision)
        Data type of a timestamp WITH time zone TIMESTAMP(p) WITH TIME ZONE where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

        An instance consists of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59.

        Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

        See Also:
        TIMESTAMP(int), TIMESTAMP_WITH_LOCAL_TIME_ZONE(int), ZonedTimestampType
      • TIMESTAMP_WITH_TIME_ZONE

        public static DataType TIMESTAMP_WITH_TIME_ZONE()
        Data type of a timestamp WITH time zone TIMESTAMP WITH TIME ZONE with 6 digits of fractional seconds by default.

        An instance consists of year-month-day hour:minute:second[.fractional] zone with up to microsecond precision and values ranging from 0000-01-01 00:00:00.000000 +14:59 to 9999-12-31 23:59:59.999999 -14:59.

        Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

        See Also:
        TIMESTAMP_WITH_TIME_ZONE(int), TIMESTAMP(int), TIMESTAMP_WITH_LOCAL_TIME_ZONE(int), ZonedTimestampType
      • TIMESTAMP_WITH_LOCAL_TIME_ZONE

        public static DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE(int precision)
        Data type of a timestamp WITH LOCAL time zone TIMESTAMP(p) WITH LOCAL TIME ZONE where p is the number of digits of fractional seconds (=precision). p must have a value between 0 and 9 (both inclusive).

        An instance consists of year-month-day hour:minute:second[.fractional] zone with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to 9999-12-31 23:59:59.999999999 -14:59. Leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

        Compared to ZonedTimestampType, the time zone offset information is not stored physically in every datum. Instead, the type assumes Instant semantics in UTC time zone at the edges of the table ecosystem. Every datum is interpreted in the local time zone configured in the current session for computation and visualization.

        This type fills the gap between time zone free and time zone mandatory timestamp types by allowing the interpretation of UTC timestamps according to the configured session timezone.
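
        The idea of storing one UTC instant and interpreting it per session can be sketched with java.time alone. The example below is an illustration under that assumption, not Flink code; render is a hypothetical helper, and the zone IDs stand in for a configured session time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

final class LocalZonedTimestampSketch {

    // Hypothetical helper (an assumption for illustration):
    // interprets a stored UTC instant in the given session time zone.
    static LocalDateTime render(Instant stored, ZoneId sessionZone) {
        return LocalDateTime.ofInstant(stored, sessionZone);
    }

    public static void main(String[] args) {
        // The datum itself carries no offset; only the instant is kept.
        Instant stored = Instant.parse("2020-01-01T00:00:00Z");

        // The same datum shown under two different session time zones.
        System.out.println(render(stored, ZoneId.of("UTC")));        // 2020-01-01T00:00
        System.out.println(render(stored, ZoneId.of("Asia/Tokyo"))); // 2020-01-01T09:00
    }
}
```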

        See Also:
        TIMESTAMP(int), TIMESTAMP_WITH_TIME_ZONE(int), LocalZonedTimestampType
      • TIMESTAMP_WITH_LOCAL_TIME_ZONE

        public static DataType TIMESTAMP_WITH_LOCAL_TIME_ZONE()
        Data type of a timestamp WITH LOCAL time zone TIMESTAMP WITH LOCAL TIME ZONE with 6 digits of fractional seconds by default.

        An instance consists of year-month-day hour:minute:second[.fractional] zone with up to microsecond precision and values ranging from 0000-01-01 00:00:00.000000 +14:59 to 9999-12-31 23:59:59.999999 -14:59. Leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to OffsetDateTime.

        Compared to ZonedTimestampType, the time zone offset information is not stored physically in every datum. Instead, the type assumes Instant semantics in UTC time zone at the edges of the table ecosystem. Every datum is interpreted in the local time zone configured in the current session for computation and visualization.

        This type fills the gap between time zone free and time zone mandatory timestamp types by allowing the interpretation of UTC timestamps according to the configured session timezone.

        See Also:
        TIMESTAMP_WITH_LOCAL_TIME_ZONE(int), TIMESTAMP(int), TIMESTAMP_WITH_TIME_ZONE(int), LocalZonedTimestampType
      • INTERVAL

        public static DataType INTERVAL(DataTypes.Resolution resolution)
        Data type of a temporal interval. There are two types of temporal intervals: day-time intervals with up to nanosecond granularity or year-month intervals with up to month granularity.

        An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from -999999 23:59:59.999999999 to +999999 23:59:59.999999999. The type must be parameterized to one of the following resolutions: interval of days, interval of days to hours, interval of days to minutes, interval of days to seconds, interval of hours, interval of hours to minutes, interval of hours to seconds, interval of minutes, interval of minutes to seconds, or interval of seconds. The value representation is the same for all types of resolutions. For example, an interval of seconds of 70 is always represented in an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000.

        An interval of year-month consists of +years-months with values ranging from -9999-11 to +9999-11. The type must be parameterized to one of the following resolutions: interval of years, interval of years to months, or interval of months. The value representation is the same for all types of resolutions. For example, an interval of months of 50 is always represented in an interval-of-years-to-months format (with default year precision): +04-02.

        Examples: INTERVAL(DAY(2)) for a day-time interval or INTERVAL(YEAR(4)) for a year-month interval.
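
        The two value representations described above can be reproduced with java.time.Duration and java.time.Period. This is a plain-Java sketch of the formatting only, assuming the default precisions (2 day digits, 6 fractional digits, 2 year digits); both method names are hypothetical and not part of the Flink API.

```java
import java.time.Duration;
import java.time.Period;
import java.util.Locale;

final class IntervalFormatSketch {

    // Formats a day-time interval as "+days hours:minutes:seconds.fractional".
    static String formatDayTime(Duration d) {
        String sign = d.isNegative() ? "-" : "+";
        Duration abs = d.abs();
        long totalSeconds = abs.getSeconds();
        long days = totalSeconds / 86_400;
        long hours = (totalSeconds % 86_400) / 3_600;
        long minutes = (totalSeconds % 3_600) / 60;
        long seconds = totalSeconds % 60;
        long micros = abs.getNano() / 1_000; // keep 6 fractional digits
        return String.format(Locale.ROOT, "%s%02d %02d:%02d:%02d.%06d",
                sign, days, hours, minutes, seconds, micros);
    }

    // Formats a year-month interval as "+years-months".
    static String formatYearMonth(Period p) {
        long totalMonths = p.toTotalMonths();
        String sign = totalMonths < 0 ? "-" : "+";
        long abs = Math.abs(totalMonths);
        return String.format(Locale.ROOT, "%s%02d-%02d", sign, abs / 12, abs % 12);
    }

    public static void main(String[] args) {
        System.out.println(formatDayTime(Duration.ofSeconds(70))); // +00 00:01:10.000000
        System.out.println(formatYearMonth(Period.ofMonths(50)));  // +04-02
    }
}
```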

        See Also:
        DayTimeIntervalType, YearMonthIntervalType
      • INTERVAL

        public static DataType INTERVAL(DataTypes.Resolution upperResolution,
                                        DataTypes.Resolution lowerResolution)
        Data type of a temporal interval. There are two types of temporal intervals: day-time intervals with up to nanosecond granularity or year-month intervals with up to month granularity.

        An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from -999999 23:59:59.999999999 to +999999 23:59:59.999999999. The type must be parameterized to one of the following resolutions: interval of days, interval of days to hours, interval of days to minutes, interval of days to seconds, interval of hours, interval of hours to minutes, interval of hours to seconds, interval of minutes, interval of minutes to seconds, or interval of seconds. The value representation is the same for all types of resolutions. For example, an interval of seconds of 70 is always represented in an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000.

        An interval of year-month consists of +years-months with values ranging from -9999-11 to +9999-11. The type must be parameterized to one of the following resolutions: interval of years, interval of years to months, or interval of months. The value representation is the same for all types of resolutions. For example, an interval of months of 50 is always represented in an interval-of-years-to-months format (with default year precision): +04-02.

        Examples: INTERVAL(DAY(2), SECOND(9)) for a day-time interval or INTERVAL(YEAR(4), MONTH()) for a year-month interval.

        See Also:
        DayTimeIntervalType, YearMonthIntervalType
      • ARRAY

        public static DataType ARRAY(DataType elementDataType)
        Data type of an array of elements with same subtype.

        Compared to the SQL standard, the maximum cardinality of an array cannot be specified but is fixed at Integer.MAX_VALUE. Also, any valid type is supported as a subtype.

        See Also:
        ArrayType
      • MULTISET

        public static DataType MULTISET(DataType elementDataType)
        Data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity.

        There is no restriction of element types; it is the responsibility of the user to ensure uniqueness.
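
        The mapping from each unique value (including NULL) to a multiplicity can be pictured as a map of counts. The sketch below is plain Java for illustration and makes no claim about how the runtime stores multisets; toMultiset is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MultisetSketch {

    // Hypothetical helper (an assumption for illustration):
    // maps each unique element, including null, to the number of its occurrences.
    static <T> Map<T, Integer> toMultiset(List<T> elements) {
        Map<T, Integer> counts = new HashMap<>(); // HashMap permits a null key
        for (T element : elements) {
            counts.merge(element, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> bag = toMultiset(Arrays.asList("a", "b", "a", null));
        System.out.println(bag.get("a"));  // 2
        System.out.println(bag.get("b"));  // 1
        System.out.println(bag.get(null)); // 1
    }
}
```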

        See Also:
        MultisetType
      • MULTISET

        public static UnresolvedDataType MULTISET(AbstractDataType<?> elementDataType)
        Unresolved data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity.

        There is no restriction of element types; it is the responsibility of the user to ensure uniqueness.

        Note: Compared to MULTISET(DataType), this method produces an UnresolvedDataType. In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

        See Also:
        MultisetType
      • MAP

        public static DataType MAP(DataType keyDataType,
                                   DataType valueDataType)
        Data type of an associative array that maps keys (including NULL) to values (including NULL). A map cannot contain duplicate keys; each key can map to at most one value.

        There is no restriction of key types; it is the responsibility of the user to ensure uniqueness. The map type is an extension to the SQL standard.

        See Also:
        MapType
      • MAP

        public static UnresolvedDataType MAP(AbstractDataType<?> keyDataType,
                                             AbstractDataType<?> valueDataType)
        Unresolved data type of an associative array that maps keys (including NULL) to values (including NULL). A map cannot contain duplicate keys; each key can map to at most one value.

        There is no restriction of key types; it is the responsibility of the user to ensure uniqueness. The map type is an extension to the SQL standard.

        Note: Compared to MAP(DataType, DataType), this method produces an UnresolvedDataType. In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

        See Also:
        MapType
      • ROW

        public static DataType ROW(DataTypes.Field... fields)
        Data type of a sequence of fields. A field consists of a field name, field type, and an optional description. The most specific type of a row of a table is a row type. In this case, each column of the row corresponds to the field of the row type that has the same ordinal position as the column.

        Compared to the SQL standard, an optional field description simplifies the handling of complex structures.

        Use FIELD(String, DataType) or FIELD(String, DataType, String) to construct fields.

        See Also:
        RowType
      • ROW

        public static DataType ROW(DataType... fieldDataTypes)
        Data type of a sequence of fields.

        This is a shortcut for ROW(Field...) where the field names will be generated using f0, f1, f2, ....
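
        The naming scheme f0, f1, f2, ... can be sketched in plain Java as follows. generatedNames is a hypothetical helper that only mirrors the pattern described above; it is not the method Flink uses internally.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class RowFieldNamesSketch {

    // Hypothetical helper (an assumption for illustration):
    // produces the positional names f0, f1, ... for the given number of fields.
    static List<String> generatedNames(int fieldCount) {
        return IntStream.range(0, fieldCount)
                .mapToObj(i -> "f" + i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(generatedNames(3)); // [f0, f1, f2]
    }
}
```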

      • ROW

        public static DataType ROW()
        Data type of a row type with no fields. It only exists for completeness.
        See Also:
        ROW(Field...)
      • NULL

        public static DataType NULL()
        Data type for representing untyped NULL values. A null type has no other value except NULL, thus, it can be cast to any nullable type similar to JVM semantics.

        This type helps in representing unknown types in API calls that use a NULL literal as well as bridging to formats such as JSON or Avro that define such a type as well.

        The null type is an extension to the SQL standard.

        Note: The runtime does not support this type. It is a pure helper type during translation and planning. Table columns cannot be declared with this type. Functions cannot declare return types of this type.

        See Also:
        NullType
      • RAW

        public static <T> DataType RAW(Class<T> clazz,
                                       TypeSerializer<T> serializer)
        Data type of an arbitrary serialized type. This type is a black box within the table ecosystem and is only deserialized at the edges.

        The raw type is an extension to the SQL standard.

        This method assumes that a TypeSerializer instance is present. Use RAW(Class) for automatically generating a serializer.

        Parameters:
        clazz - originating value class
        serializer - type serializer
        See Also:
        RawType
      • RAW

        public static <T> UnresolvedDataType RAW(Class<T> clazz)
        Unresolved data type of an arbitrary serialized type. This type is a black box within the table ecosystem and is only deserialized at the edges.

        The raw type is an extension to the SQL standard.

        Compared to RAW(Class, TypeSerializer), this method produces an UnresolvedDataType where no serializer is known and a generic serializer should be used. During the resolution, a RAW(Class, TypeSerializer) with Flink's default RAW serializer is created and automatically configured.

        Note: In most cases, the UnresolvedDataType will be automatically resolved by the API. At other locations, a DataTypeFactory is provided.

        See Also:
        RawType
      • STRUCTURED

        public static <T> DataType STRUCTURED(Class<T> implementationClass,
                                              DataTypes.Field... fields)
        Data type of a user-defined object structured type. Structured types contain zero, one or more attributes. Each attribute consists of a name and a type. A type cannot be defined so that one of its attribute types (transitively) uses itself.

        There are two kinds of structured types: types that are stored in a catalog and identified by an ObjectIdentifier, and anonymously defined, unregistered types (usually reflectively extracted) that are identified by an implementation Class.

        This method helps in manually constructing anonymous, unregistered types. This is useful in cases where the reflective extraction using of(Class) is not applicable. However, of(Class) is the recommended way of creating inline structured types as it also considers DataTypeHints.

        Structured types are converted to internal data structures by the runtime. The given implementation class is only used at the edges of the table ecosystem (e.g. when bridging to a function or connector). Serialization and equality (hashCode/equals) are handled by the runtime based on the logical type. An implementation class must offer a default constructor with zero arguments or a full constructor that assigns all attributes.

        Note: A caller of this method must make sure that the DataType.getConversionClass() of the given fields matches the attributes of the given implementation class; otherwise, an exception might be thrown at runtime.

        See Also:
        of(Class), StructuredType
      • SECOND

        public static DataTypes.Resolution SECOND(int precision)
        Resolution in seconds and (possibly) fractional seconds. The precision is the number of digits of fractional seconds. It must have a value between 0 and 9 (both inclusive). If no precision is specified, it is equal to 6 by default.
        See Also:
        SECOND()
      • DAY

        public static DataTypes.Resolution DAY(int precision)
        Resolution in days. The precision is the number of digits of days. It must have a value between 1 and 6 (both inclusive). If no precision is specified, it is equal to 2 by default.
        See Also:
        DAY()
      • YEAR

        public static DataTypes.Resolution YEAR(int precision)
        Resolution in years. The precision is the number of digits of years. It must have a value between 1 and 4 (both inclusive). If no precision is specified, it is equal to 2 by default.
        See Also:
        YEAR()