Interface SplitReader<E, SplitT extends SourceSplit>

  • Type Parameters:
    E - the element type.
    SplitT - the split type.
    All Superinterfaces:
    AutoCloseable

    @PublicEvolving
    public interface SplitReader<E, SplitT extends SourceSplit>
    extends AutoCloseable
    An interface used to read from splits. An implementation may read from a single split or from multiple splits.
    • Method Detail

      • fetch

        RecordsWithSplitIds<E> fetch()
                              throws IOException
        Fetches elements into the blocking queue for the given splits. The call may block, but it should unblock when wakeUp() is invoked. In that case, the implementation may either return without throwing an exception or throw an interrupted exception. In either case, this method should be reentrant: the next fetch call should resume from where the previous call was woken up or interrupted.

        It is up to the implementer whether to read all the records of a split or to stop at some point (for example, when a given threshold is exceeded). In the latter case, when fetch is called again, reading should resume at the record where it left off, based on the SplitState.
        Returns:
        the fetched records, keyed by split ID, together with the IDs of the finished splits.
        Throws:
        IOException - when an I/O error is encountered, such as a deserialization failure.
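
        The threshold-based resumption described for fetch can be illustrated with a minimal, self-contained sketch. It deliberately does not use the Flink classes: BoundedFetchSketch, maxRecordsPerFetch, nextOffset and isFinished are hypothetical names, and the split is assumed to be an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a split reader; it does not use the Flink classes.
class BoundedFetchSketch {
    private final List<String> splitRecords; // records of one split (assumed to be in memory)
    private final int maxRecordsPerFetch;    // threshold after which fetch() returns
    private int nextOffset = 0;              // per-split state: index of the next record to read

    BoundedFetchSketch(List<String> splitRecords, int maxRecordsPerFetch) {
        this.splitRecords = splitRecords;
        this.maxRecordsPerFetch = maxRecordsPerFetch;
    }

    // Returns at most maxRecordsPerFetch records, continuing from nextOffset.
    List<String> fetch() {
        List<String> batch = new ArrayList<>();
        while (nextOffset < splitRecords.size() && batch.size() < maxRecordsPerFetch) {
            batch.add(splitRecords.get(nextOffset));
            nextOffset++; // advance only after the record is in the batch
        }
        return batch;
    }

    // True once every record has been handed out; a real reader would then
    // report the split as finished in the value returned by fetch().
    boolean isFinished() {
        return nextOffset >= splitRecords.size();
    }
}
```

        With three records and a threshold of two, the first call returns two records and the second call returns the remaining one, because the offset is kept between calls.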
      • handleSplitsChanges

        void handleSplitsChanges(SplitsChange<SplitT> splitsChanges)
        Handles the split changes. This call should be non-blocking.

        To keep the internal state of SourceReaderBase consistent, if an invalid split is added to the reader (for example, a split without any records), it should be put back into RecordsWithSplitIds as a finished split so that SourceReaderBase can clean up the resources created for it.

        For the same reason, if a split is removed, it should be put back into RecordsWithSplitIds as a finished split so that SourceReaderBase can clean up the resources created for it.

        Parameters:
        splitsChanges - the split changes that the SplitReader needs to handle.
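
        The bookkeeping described above can be sketched as follows. This is a self-contained illustration with stand-in types, not the Flink API: SplitChangeSketch, Split, handleSplitsAdded, handleSplitsRemoved and drainFinishedSplits are hypothetical names, and drainFinishedSplits stands in for the finished-split part of a fetch() result.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical stand-in types; none of these are Flink classes.
class SplitChangeSketch {
    // A split modeled as an ID plus its records.
    static class Split {
        final String id;
        final List<String> records;
        Split(String id, List<String> records) { this.id = id; this.records = records; }
    }

    private final Queue<Split> assigned = new ArrayDeque<>();
    private final Set<String> finishedToReport = new HashSet<>();

    // Non-blocking: only updates in-memory bookkeeping and performs no I/O.
    void handleSplitsAdded(List<Split> splits) {
        for (Split s : splits) {
            if (s.records.isEmpty()) {
                finishedToReport.add(s.id); // invalid split: report it as finished
            } else {
                assigned.add(s);
            }
        }
    }

    // Removed splits are also reported as finished so that the caller can clean up.
    void handleSplitsRemoved(List<String> splitIds) {
        assigned.removeIf(s -> splitIds.contains(s.id));
        finishedToReport.addAll(splitIds);
    }

    // Stand-in for the finished-split part of a fetch() result.
    Set<String> drainFinishedSplits() {
        Set<String> result = new HashSet<>(finishedToReport);
        finishedToReport.clear();
        return result;
    }

    int assignedCount() { return assigned.size(); }
}
```

        The design choice here is to defer reporting to the next fetch: the change handler only records which split IDs must be reported, which keeps it non-blocking.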
      • wakeUp

        void wakeUp()
        Wakes up the split reader if the fetcher thread is blocked in fetch().
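
        One possible way to make a blocking fetch respond to wakeUp is a flag that the fetch loop checks between short waits. The sketch below is self-contained and does not use the Flink classes: WakeUpSketch, offer and the 10 ms poll interval are assumptions made for illustration, and a real reader would more likely call the wake-up facility of its underlying client.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in; it does not use the Flink classes.
class WakeUpSketch {
    private final BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
    private final AtomicBoolean wakeUpRequested = new AtomicBoolean(false);

    // Simulates a record arriving from the external system.
    void offer(String record) { incoming.add(record); }

    // Blocks until a record arrives or wakeUp() is called.
    // Returns null when woken up; nothing is consumed in that case,
    // so the next call simply resumes waiting where this one stopped.
    String fetch() {
        while (true) {
            // Consume and reset the flag so that a later fetch() can block again.
            if (wakeUpRequested.compareAndSet(true, false)) {
                return null;
            }
            try {
                String record = incoming.poll(10, TimeUnit.MILLISECONDS);
                if (record != null) {
                    return record;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return null;
            }
        }
    }

    // Safe to call from a thread other than the one running fetch().
    void wakeUp() { wakeUpRequested.set(true); }
}
```

        This variant returns without throwing when woken up, which is one of the two behaviors the fetch contract allows.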
      • pauseOrResumeSplits

        default void pauseOrResumeSplits(Collection<SplitT> splitsToPause,
                                         Collection<SplitT> splitsToResume)
        Pauses or resumes reading of individual splits.

        Note that no other methods can be called in parallel, so it is fine to update subscriptions non-atomically. This method simply gives connectors with more expressive APIs the opportunity to update all subscriptions at once.

        This is currently used to align the watermarks of splits, if watermark alignment is used and the source reads from more than one split.

        The default implementation throws an UnsupportedOperationException and will be removed in a future release. To remain compatible with future releases, it is recommended to override this method.

        Parameters:
        splitsToPause - the splits to pause
        splitsToResume - the splits to resume
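
        A minimal sketch of an override, assuming the reader tracks paused splits by ID, might look like the following. It is self-contained and does not use the Flink classes: PauseResumeSketch, pausedSplitIds and shouldRead are hypothetical names, and splits are represented by plain String IDs rather than SplitT instances.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in; it does not use the Flink classes.
class PauseResumeSketch {
    private final Set<String> pausedSplitIds = new HashSet<>();

    // The two updates are not atomic, which is acceptable because no other
    // method is called in parallel. A split listed in both collections ends
    // up resumed, since the resume step runs last.
    void pauseOrResumeSplits(Collection<String> splitsToPause,
                             Collection<String> splitsToResume) {
        pausedSplitIds.addAll(splitsToPause);
        pausedSplitIds.removeAll(splitsToResume);
    }

    // A fetch loop would skip splits for which this returns false.
    boolean shouldRead(String splitId) {
        return !pausedSplitIds.contains(splitId);
    }
}
```

        A fetch implementation could consult shouldRead before reading from each split, so that paused splits stop emitting records until they are resumed.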