Interface SupportsPartitionPushDown

  • All Known Implementing Classes:
    FileSystemTableSource

    @PublicEvolving
    public interface SupportsPartitionPushDown
    Enables to pass available partitions to the planner and push down partitions into a ScanTableSource.

    Partitions split the data stored in an external system into smaller portions that are identified by one or more string-based partition keys. A single partition is represented as a Map < String, String > which maps each partition key to a partition value. Partition keys and their order is defined by the catalog table.

    For example, data can be partitioned by region and within a region partitioned by month. The order of the partition keys (in the example: first by region then by month) is defined by the catalog table. A list of partitions could be:

       List(
         ['region'='europe', 'month'='2020-01'],
         ['region'='europe', 'month'='2020-02'],
         ['region'='asia', 'month'='2020-01'],
         ['region'='asia', 'month'='2020-02']
       )
     

    By default, if this interface is not implemented, the data is read entirely with a subsequent filter operation after the source.

    For efficiency, the planner can pass the number of required partitions and a source must exclude those partitions from reading (including reading the metadata). See applyPartitions(List).

    By default, the list of all partitions is queried from the catalog if necessary. However, depending on the external system, it might be necessary to query the list of partitions in a connector-specific way instead of using the catalog information. See listPartitions().

    Note: After partitions are pushed into the source, the runtime will not perform a subsequent filter operation for partition keys.

    • Method Detail

      • listPartitions

        Optional<List<Map<String,​String>>> listPartitions()
        Returns a list of all partitions that a source can read if available.

        A single partition maps each partition key to a partition value.

        If Optional.empty() is returned, the list of partitions is queried from the catalog.

      • applyPartitions

        void applyPartitions​(List<Map<String,​String>> remainingPartitions)
        Provides a list of remaining partitions. After those partitions are applied, a source must not read the data of other partitions during runtime.

        See the documentation of SupportsPartitionPushDown for more information.