Interface HighAvailabilityServices

    • Field Detail

      • DEFAULT_LEADER_ID

        static final UUID DEFAULT_LEADER_ID
        This UUID should be used when no proper leader election happens, but a simple pre-configured leader is used. That is for example the case in non-highly-available standalone setups.
      • DEFAULT_JOB_ID

        static final JobID DEFAULT_JOB_ID
        This JobID should be used to identify the old JobManager when using the HighAvailabilityServices. With the new mode every JobMaster will have a distinct JobID assigned.
    • Method Detail

      • getResourceManagerLeaderRetriever

        LeaderRetrievalService getResourceManagerLeaderRetriever()
        Gets the leader retriever for the cluster's resource manager.
      • getDispatcherLeaderRetriever

        LeaderRetrievalService getDispatcherLeaderRetriever()
        Gets the leader retriever for the dispatcher. This leader retrieval service is not always accessible.
      • getJobManagerLeaderRetriever

        @Deprecated
        LeaderRetrievalService getJobManagerLeaderRetriever​(JobID jobID)
        Deprecated.
        This method should only be used by the legacy code where the JobManager acts as the master.
        Gets the leader retriever for the job JobMaster which is responsible for the given job.
        Parameters:
        jobID - The identifier of the job.
        Returns:
        Leader retrieval service to retrieve the job manager for the given job
      • getJobManagerLeaderRetriever

        LeaderRetrievalService getJobManagerLeaderRetriever​(JobID jobID,
                                                            String defaultJobManagerAddress)
        Gets the leader retriever for the job JobMaster which is responsible for the given job.
        Parameters:
        jobID - The identifier of the job.
        defaultJobManagerAddress - JobManager address which will be returned by a static leader retrieval service.
        Returns:
        Leader retrieval service to retrieve the job manager for the given job
      • getResourceManagerLeaderElection

        LeaderElection getResourceManagerLeaderElection()
        Gets the LeaderElection for the cluster's resource manager.
      • getCheckpointRecoveryFactory

        CheckpointRecoveryFactory getCheckpointRecoveryFactory()
                                                        throws Exception
        Gets the checkpoint recovery factory for the job manager.
        Returns:
        Checkpoint recovery factory
        Throws:
        Exception
      • getExecutionPlanStore

        ExecutionPlanStore getExecutionPlanStore()
                                          throws Exception
        Gets the submitted execution plan store for the job manager.
        Returns:
        Submitted execution plan store
        Throws:
        Exception - if the submitted execution plan store could not be created
      • getJobResultStore

        JobResultStore getJobResultStore()
                                  throws Exception
        Gets the store that holds information about the state of finished jobs.
        Returns:
        Store of finished job results
        Throws:
        Exception - if job result store could not be created
      • createBlobStore

        BlobStore createBlobStore()
                           throws IOException
        Creates the BLOB store in which BLOBs are stored in a highly-available fashion.
        Returns:
        Blob store
        Throws:
        IOException - if the blob store could not be created
      • getClusterRestEndpointLeaderElection

        default LeaderElection getClusterRestEndpointLeaderElection()
        Gets the LeaderElection for the cluster's rest endpoint.
      • close

        void close()
            throws Exception
        Closes the high availability services, releasing all resources.

        This method does not delete or clean up any data stored in external stores (file systems, ZooKeeper, etc). Another instance of the high availability services will be able to recover the job.

        If an exception occurs during closing services, this method will attempt to continue closing other services and report exceptions only after all services have been attempted to be closed.

        Specified by:
        close in interface AutoCloseable
        Throws:
        Exception - Thrown, if an exception occurred while closing these services.
      • cleanupAllData

        void cleanupAllData()
                     throws Exception
        Deletes all data stored by high availability services in external stores.

        After this method was called, any job or session that was managed by these high availability services will be unrecoverable.

        If an exception occurs during cleanup, this method will attempt to continue the cleanup and report exceptions only after all cleanup steps have been attempted.

        Throws:
        Exception - if an error occurred while cleaning up data stored by them.
      • closeWithOptionalClean

        default void closeWithOptionalClean​(boolean cleanupData)
                                     throws Exception
        Calls cleanupAllData() (if true is passed as a parameter) before calling close() on this instance. Any error that appeared during the cleanup will be propagated after calling close().
        Throws:
        Exception
      • globalCleanupAsync

        default CompletableFuture<Void> globalCleanupAsync​(JobID jobId,
                                                           Executor executor)
        Description copied from interface: GloballyCleanableResource
        globalCleanupAsync is expected to be called from the main thread. Heavy IO tasks should be outsourced into the passed cleanupExecutor. Thread-safety must be ensured.
        Specified by:
        globalCleanupAsync in interface GloballyCleanableResource
        Parameters:
        jobId - The JobID of the job for which the local data should be cleaned up.
        executor - The fallback executor for IO-heavy operations.
        Returns:
        The cleanup result future.