Modules allow users to extend Flink’s built-in objects, such as defining functions that behave like Flink built-in functions. They are pluggable, and while Flink provides a few pre-built modules, users can write their own.
For example, users can define their own geo functions and plug them into Flink as built-in functions to be used in Flink SQL and the Table API. Another example is loading an off-the-shelf Hive module to use Hive built-in functions as Flink built-in functions.
The CoreModule contains all of Flink’s system (built-in) functions and is loaded by default.
The HiveModule provides Hive built-in functions as Flink’s system functions to SQL and Table API users.
Flink’s Hive documentation provides full details on setting up the module.
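As a sketch, the HiveModule can also be loaded programmatically through the Table API; the module name myhive and the Hive version string below are illustrative, and the example assumes the flink-connector-hive dependency (which provides HiveModule) is on the classpath:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.module.hive.HiveModule;

public class LoadHiveModuleExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());

        // Expose Hive built-in functions as Flink system functions.
        // "2.3.4" is an illustrative Hive version; pass the version of your Hive setup.
        tableEnv.loadModule("myhive", new HiveModule("2.3.4"));

        // Hive built-in functions can now be called in SQL
        // just like Flink's own built-in functions.
    }
}
```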
Users can develop custom modules by implementing the Module interface.
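The sketch below shows what such a module could look like, assuming a Flink version in which ScalarFunction implements FunctionDefinition; GeoModule, GeoContains, and the geo_contains function name are hypothetical and used only for illustration:

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

import org.apache.flink.table.functions.FunctionDefinition;
import org.apache.flink.table.functions.ScalarFunction;
import org.apache.flink.table.module.Module;

/** A custom module exposing geo functions as Flink system (built-in) functions. */
public class GeoModule implements Module {

    /** Hypothetical scalar function; real geo logic omitted. */
    public static class GeoContains extends ScalarFunction {
        public boolean eval(double lon, double lat) {
            return false; // placeholder implementation
        }
    }

    @Override
    public Set<String> listFunctions() {
        Set<String> functions = new HashSet<>();
        functions.add("geo_contains");
        return functions;
    }

    @Override
    public Optional<FunctionDefinition> getFunctionDefinition(String name) {
        if ("geo_contains".equalsIgnoreCase(name)) {
            // Assumes a Flink version where ScalarFunction implements FunctionDefinition.
            return Optional.of(new GeoContains());
        }
        return Optional.empty();
    }
}
```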
To use custom modules in the SQL CLI, users should develop both a module and its corresponding module factory by implementing the ModuleFactory interface.
A module factory defines a set of properties for configuring the module when the SQL CLI bootstraps.
These properties are passed to a discovery service, which tries to match them to a ModuleFactory and instantiates the corresponding module.
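A corresponding factory could look like the following sketch, which assumes the legacy, property-map-based ModuleFactory used by the YAML-driven SQL CLI (newer Flink releases replace it with an identifier-based factory); GeoModuleFactory and the geo type value are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.flink.table.factories.ModuleFactory;
import org.apache.flink.table.module.Module;

/** Factory that lets the SQL CLI discover and instantiate the GeoModule sketched above. */
public class GeoModuleFactory implements ModuleFactory {

    @Override
    public Module createModule(Map<String, String> properties) {
        return new GeoModule();
    }

    @Override
    public Map<String, String> requiredContext() {
        // Matched against the 'type' property of the YAML module definition.
        Map<String, String> context = new HashMap<>();
        context.put("type", "geo");
        return context;
    }

    @Override
    public List<String> supportedProperties() {
        // Additional module-specific properties would be listed here.
        return new ArrayList<>();
    }
}
```

For the discovery service to find the factory, it is typically registered via Java SPI, e.g. by listing its fully qualified class name in META-INF/services/org.apache.flink.table.factories.TableFactory.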
Objects provided by modules are considered part of Flink’s system (built-in) objects; thus, they don’t have any namespaces.
When two modules contain an object with the same name, Flink always resolves the object reference to the one in the module that was loaded first.
Users can load and unload modules in an existing Flink session.
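With the Table API this looks roughly as follows (the module names are user-chosen, and GeoModule refers to the hypothetical module sketched above):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ModuleLifecycleExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());

        // Only the core module is loaded by default.
        System.out.println(String.join(", ", tableEnv.listModules()));

        // Load an additional module into the running session.
        tableEnv.loadModule("geo", new GeoModule());
        System.out.println(String.join(", ", tableEnv.listModules()));

        // If both modules provided a function with the same name, the reference
        // would resolve to the core module, because it was loaded first.

        // Unload the module again when it is no longer needed.
        tableEnv.unloadModule("geo");
    }
}
```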
All modules defined using YAML must provide a type property that specifies the type of module.
The following types are supported out of the box.
Module | Type Value |
---|---|
CoreModule | core |
HiveModule | hive |
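Putting it together, a module section in the SQL CLI YAML configuration (e.g. sql-client-defaults.yaml) might look like the sketch below; the exact keys depend on the Flink version, and the myhive name and the geo type (matching the hypothetical factory above) are illustrative:

```yaml
modules:
  - name: core     # Flink's CoreModule, loaded by default
    type: core
  - name: myhive   # Hive built-in functions via the HiveModule
    type: hive
  - name: mygeo    # custom module, matched to its ModuleFactory through 'type'
    type: geo
```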