YarnIntraNonHaMasterServices (flink 1.8-SNAPSHOT API)

java.lang.Object
- org.apache.flink.yarn.highavailability.YarnHighAvailabilityServices
  - org.apache.flink.yarn.highavailability.AbstractYarnNonHaServices
    - org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServices

All Implemented Interfaces:

AutoCloseable, HighAvailabilityServices
```
public class YarnIntraNonHaMasterServices
extends AbstractYarnNonHaServices
```
These YarnHighAvailabilityServices are for the Application Master in setups where there is one ResourceManager that is statically configured in the Flink configuration.
Handled failure types
- User code & operator failures: Failed operators are recovered from checkpoints.
- TaskManager failures: Failed TaskManagers are restarted and their tasks are recovered from checkpoints.
Non-recoverable failure types
- Application Master failures: These failures cannot be recovered, because TaskManagers have no way to discover the new Application Master's address.
Internally, these services put their recovery data into YARN's working directory, except for checkpoints, which are in the configured checkpoint directory. That way, checkpoints can be resumed with a new job/application, even if the complete YARN application is killed and cleaned up.
Because the ResourceManager and the JobManager both run in the same process (the Application Master), they use an embedded leader election service to find each other.
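To illustrate what an embedded (in-process) leader election amounts to, here is a minimal sketch. It is not Flink's implementation: `InProcessElection`, `LeaderListener`, and the address string are hypothetical names chosen for this example. It only shows the idea that a contender and its listeners share one in-memory object instead of an external coordination service.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these types are hypothetical and are not Flink classes.
final class EmbeddedElectionSketch {

    /** Callback for a component that wants to learn the current leader's address. */
    interface LeaderListener {
        void onLeader(String address);
    }

    /** In-process election: the first contender becomes leader and all listeners are told. */
    static final class InProcessElection {
        private final List<LeaderListener> listeners = new ArrayList<>();
        private String leaderAddress; // null until a contender has been granted leadership

        /** Returns true if the caller was granted leadership. */
        synchronized boolean contend(String address) {
            if (leaderAddress != null) {
                return false;
            }
            leaderAddress = address;
            for (LeaderListener listener : listeners) {
                listener.onLeader(address);
            }
            return true;
        }

        /** Registers a listener; it is notified immediately if a leader already exists. */
        synchronized void listen(LeaderListener listener) {
            listeners.add(listener);
            if (leaderAddress != null) {
                listener.onLeader(leaderAddress);
            }
        }
    }

    /** Wires one contender and two listeners together and returns what the listeners saw. */
    static List<String> demo() {
        InProcessElection election = new InProcessElection();
        List<String> seen = new ArrayList<>();
        election.listen(address -> seen.add("early:" + address)); // registered before the election
        election.contend("rpc://localhost/resourcemanager");      // hypothetical address
        election.listen(address -> seen.add("late:" + address));  // registered after the election
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both listeners receive the same address regardless of whether they registered before or after the election, which is the property two components in one process need in order to find each other.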
A typical YARN setup that uses these HA services first starts the ResourceManager inside the Application Master and puts its RPC endpoint address into the configuration with which the TaskManagers are started. Because of this static addressing scheme, the setup cannot handle failures of the JobManager and ResourceManager, which run as part of the Application Master.
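The consequence of static addressing can be sketched as follows. This is an assumption-laden illustration, not Flink code: `StaticRetrieval`, `LeaderListener`, and the address are hypothetical. The point is that a listener configured with a fixed address is told that address once and never hears about a replacement.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these types are hypothetical and are not Flink classes.
final class StaticAddressSketch {

    /** Callback for a component that wants to learn the leader's address. */
    interface LeaderListener {
        void onLeader(String address);
    }

    /** Retrieval backed by one fixed address that was read from configuration at startup. */
    static final class StaticRetrieval {
        private final String configuredAddress;

        StaticRetrieval(String configuredAddress) {
            this.configuredAddress = configuredAddress;
        }

        /** Announces the configured address exactly once; there is no update path. */
        void start(LeaderListener listener) {
            listener.onLeader(configuredAddress);
        }
    }

    /** Returns every address the listener was ever told about. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        StaticRetrieval retrieval = new StaticRetrieval("rpc://am-host-1/resourcemanager");
        retrieval.start(seen::add);
        // If the master were restarted at a different address, nothing here would notify
        // the listener, so it would keep using the address it was started with.
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```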
See Also:

HighAvailabilityServices

Field Summary
- Fields inherited from class org.apache.flink.yarn.highavailability.YarnHighAvailabilityServices
  blobStoreService, FLINK_RECOVERY_DATA_DIR, flinkFileSystem, haDataDirectory, hadoopFileSystem, LOG, workingDirectory
- Fields inherited from interface org.apache.flink.runtime.highavailability.HighAvailabilityServices
  DEFAULT_JOB_ID, DEFAULT_LEADER_ID

Constructor Summary

Constructors
Constructor and Description
`YarnIntraNonHaMasterServices(Configuration config, Configuration hadoopConf)` Creates new YarnIntraNonHaMasterServices for the given Flink and YARN configuration.

Method Summary

Modifier and Type	Method and Description
`void`	`close()` Closes the high availability services, releasing all resources.
`LeaderElectionService`	`getDispatcherLeaderElectionService()` Gets the leader election service for the cluster's dispatcher.
`LeaderRetrievalService`	`getDispatcherLeaderRetriever()` Gets the leader retriever for the dispatcher.
`LeaderElectionService`	`getJobManagerLeaderElectionService(JobID jobID)` Gets the leader election service for the given job.
`LeaderRetrievalService`	`getJobManagerLeaderRetriever(JobID jobID)` Gets the leader retriever for the JobMaster that is responsible for the given job.
`LeaderRetrievalService`	`getJobManagerLeaderRetriever(JobID jobID, String defaultJobManagerAddress)` Gets the leader retriever for the JobMaster that is responsible for the given job.
`LeaderElectionService`	`getResourceManagerLeaderElectionService()` Gets the leader election service for the cluster's resource manager.
`LeaderRetrievalService`	`getResourceManagerLeaderRetriever()` Gets the leader retriever for the cluster's resource manager.
`LeaderElectionService`	`getWebMonitorLeaderElectionService()`
`LeaderRetrievalService`	`getWebMonitorLeaderRetriever()`

Methods inherited from class org.apache.flink.yarn.highavailability.AbstractYarnNonHaServices
getCheckpointRecoveryFactory, getRunningJobsRegistry, getSubmittedJobGraphStore

Methods inherited from class org.apache.flink.yarn.highavailability.YarnHighAvailabilityServices
closeAndCleanupAllData, createBlobStore, forSingleJobAppMaster, forYarnTaskManager, isClosed

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - YarnIntraNonHaMasterServices
```
public YarnIntraNonHaMasterServices(Configuration config,
                                    Configuration hadoopConf)
                             throws IOException
```
    Creates new YarnIntraNonHaMasterServices for the given Flink and YARN configuration.
    This constructor initializes access to HDFS to store recovery data, and creates the embedded leader election services through which the ResourceManager and JobManager find and confirm each other.
    
    Parameters:
    
    config - The Flink configuration of this component / process.
    
    hadoopConf - The Hadoop configuration for the YARN cluster.
    
    Throws:
    
    IOException - Thrown, if the initialization of the Hadoop file system used by YARN fails.
    
    IllegalConfigurationException - Thrown, if the Flink configuration does not properly describe the ResourceManager address and port.
- Method Detail
  - getResourceManagerLeaderRetriever
```
public LeaderRetrievalService getResourceManagerLeaderRetriever()
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader retriever for the cluster's resource manager.
  - getDispatcherLeaderRetriever
```
public LeaderRetrievalService getDispatcherLeaderRetriever()
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader retriever for the dispatcher. This leader retrieval service is not always accessible.
  - getResourceManagerLeaderElectionService
```
public LeaderElectionService getResourceManagerLeaderElectionService()
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader election service for the cluster's resource manager.
    
    Returns:
    
    Leader election service for the resource manager leader election
  - getDispatcherLeaderElectionService
```
public LeaderElectionService getDispatcherLeaderElectionService()
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader election service for the cluster's dispatcher.
    
    Returns:
    
    Leader election service for the dispatcher leader election
  - getJobManagerLeaderElectionService
```
public LeaderElectionService getJobManagerLeaderElectionService(JobID jobID)
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader election service for the given job.
    
    Parameters:
    
    jobID - The identifier of the job running the election.
    
    Returns:
    
    Leader election service for the job manager leader election
  - getWebMonitorLeaderElectionService
```
public LeaderElectionService getWebMonitorLeaderElectionService()
```
  - getJobManagerLeaderRetriever
```
public LeaderRetrievalService getJobManagerLeaderRetriever(JobID jobID)
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader retriever for the JobMaster that is responsible for the given job.
    
    Parameters:
    
    jobID - The identifier of the job.
    
    Returns:
    
    Leader retrieval service to retrieve the job manager for the given job
  - getJobManagerLeaderRetriever
```
public LeaderRetrievalService getJobManagerLeaderRetriever(JobID jobID,
                                                           String defaultJobManagerAddress)
```
    Description copied from interface: HighAvailabilityServices
    
    Gets the leader retriever for the JobMaster that is responsible for the given job.
    
    Parameters:
    
    jobID - The identifier of the job.
    
    defaultJobManagerAddress - JobManager address which will be returned by a static leader retrieval service.
    
    Returns:
    
    Leader retrieval service to retrieve the job manager for the given job
  - getWebMonitorLeaderRetriever
```
public LeaderRetrievalService getWebMonitorLeaderRetriever()
```
  - close
```
public void close()
           throws Exception
```
    Description copied from interface: HighAvailabilityServices
    
    Closes the high availability services, releasing all resources.
    This method does not delete or clean up any data stored in external stores (file systems, ZooKeeper, etc.). Another instance of the high availability services will be able to recover the job.
    If an exception occurs while closing a service, this method continues closing the remaining services and reports exceptions only after it has attempted to close all of them.
    
    Specified by:
    
    close in interface AutoCloseable
    
    Specified by:
    
    close in interface HighAvailabilityServices
    
    Overrides:
    
    close in class YarnHighAvailabilityServices
    
    Throws:
    
    Exception - Thrown, if an exception occurred while closing these services.
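The "keep closing, report at the end" behavior described for `close()` follows a common Java pattern. The sketch below shows one way to write it; it is not taken from the Flink source, and `closeAll` is a hypothetical helper. It assumes the first failure is rethrown with later failures attached as suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the names are hypothetical and this is not Flink's implementation.
final class CloseAllSketch {

    /** Closes every resource in order; failures are collected and thrown only at the end. */
    static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;              // remember the first failure
                } else {
                    first.addSuppressed(e); // attach later failures to it
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    /** Closes three resources, the second of which fails, and records what happened. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        AutoCloseable a = () -> log.add("closed a");
        AutoCloseable b = () -> {
            log.add("closing b");
            throw new IllegalStateException("b failed");
        };
        AutoCloseable c = () -> log.add("closed c");
        try {
            closeAll(Arrays.asList(a, b, c));
        } catch (Exception e) {
            log.add("reported: " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, resource `c` is still closed even though `b` throws, and the failure surfaces only after the loop has finished.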
