HadoopFileSystem (Flink : 1.17-SNAPSHOT API)

java.lang.Object
- org.apache.flink.core.fs.FileSystem
- - org.apache.flink.runtime.fs.hdfs.HadoopFileSystem

Direct Known Subclasses:

FlinkOSSFileSystem, FlinkS3FileSystem
```
public class HadoopFileSystem
extends FileSystem
```
A FileSystem that wraps a Hadoop File System.

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.flink.core.fs.FileSystem
  FileSystem.WriteMode

Constructor Summary

Constructors
Constructor and Description

HadoopFileSystem(org.apache.hadoop.fs.FileSystem hadoopFileSystem)
Wraps the given Hadoop File System object as a Flink File System object.

Method Summary

Modifier and Type	Method and Description
`HadoopDataOutputStream`	`create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize)` Opens an FSDataOutputStream at the indicated Path.
`HadoopDataOutputStream`	`create(Path f, FileSystem.WriteMode overwrite)` Opens an FSDataOutputStream to a new file at the given path.
`RecoverableWriter`	`createRecoverableWriter()` Creates a new `RecoverableWriter`.
`boolean`	`delete(Path f, boolean recursive)` Delete a file.
`boolean`	`exists(Path f)` Checks whether the given path exists.
`long`	`getDefaultBlockSize()` Return the number of bytes that large input files should optimally be split into to minimize I/O time.
`BlockLocation[]`	`getFileBlockLocations(FileStatus file, long start, long len)` Return an array containing hostnames, offset and size of portions of the given file.
`FileStatus`	`getFileStatus(Path f)` Return a file status object that represents the path.
`org.apache.hadoop.fs.FileSystem`	`getHadoopFileSystem()` Gets the underlying Hadoop FileSystem.
`Path`	`getHomeDirectory()` Returns the path of the user's home directory in this file system.
`FileSystemKind`	`getKind()` Gets a description of the characteristics of this file system.
`URI`	`getUri()` Returns a URI whose scheme and authority identify this file system.
`Path`	`getWorkingDirectory()` Returns the path of the file system's current working directory.
`boolean`	`isDistributedFS()` Returns true if this is a distributed file system.
`FileStatus[]`	`listStatus(Path f)` List the statuses of the files/directories in the given path if the path is a directory.
`boolean`	`mkdirs(Path f)` Make the given file and all non-existent parents into directories.
`HadoopDataInputStream`	`open(Path f)` Opens an FSDataInputStream at the indicated Path.
`HadoopDataInputStream`	`open(Path f, int bufferSize)` Opens an FSDataInputStream at the indicated Path.
`boolean`	`rename(Path src, Path dst)` Renames the file/directory src to dst.
`static org.apache.hadoop.fs.Path`	`toHadoopPath(Path path)`

Methods inherited from class org.apache.flink.core.fs.FileSystem
create, get, getDefaultFsUri, getLocalFileSystem, getUnguardedFileSystem, initialize, initialize, initOutPathDistFS, initOutPathLocalFS

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - HadoopFileSystem
```
public HadoopFileSystem(org.apache.hadoop.fs.FileSystem hadoopFileSystem)
```
    Wraps the given Hadoop File System object as a Flink File System object. The given Hadoop file system object is expected to be initialized already.
    
    Parameters:
    
    hadoopFileSystem - The Hadoop FileSystem that will be used under the hood.
- Method Detail
  - getHadoopFileSystem
```
public org.apache.hadoop.fs.FileSystem getHadoopFileSystem()
```
    Gets the underlying Hadoop FileSystem.
    
    Returns:
    
    The underlying Hadoop FileSystem.
  - getWorkingDirectory
```
public Path getWorkingDirectory()
```
    Description copied from class: FileSystem
    
    Returns the path of the file system's current working directory.
    
    Specified by:
    
    getWorkingDirectory in class FileSystem
    
    Returns:
    
    the path of the file system's current working directory
  - getHomeDirectory
```
public Path getHomeDirectory()
```
    Description copied from class: FileSystem
    
    Returns the path of the user's home directory in this file system.
    
    Specified by:
    
    getHomeDirectory in class FileSystem
    
    Returns:
    
    the path of the user's home directory in this file system.
  - getUri
```
public URI getUri()
```
    Description copied from class: FileSystem
    
    Returns a URI whose scheme and authority identify this file system.
    
    Specified by:
    
    getUri in class FileSystem
    
    Returns:
    
    a URI whose scheme and authority identify this file system
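    
    As an illustration of what "scheme and authority" means for the returned java.net.URI, here is a minimal JDK-only sketch. FsUriParts is a hypothetical helper, not part of Flink, and the host name in the usage note is made up.

```java
import java.net.URI;

// Hypothetical helper, not part of Flink or Hadoop.
final class FsUriParts {
    /** Returns "scheme://authority" for a file system URI, dropping the path. */
    static String identity(URI uri) {
        String authority = uri.getAuthority();
        // URIs such as file:///tmp have no authority component.
        return uri.getScheme() + "://" + (authority == null ? "" : authority);
    }
}
```

    For a URI such as hdfs://namenode:8020/user/flink, the scheme is "hdfs" and the authority is "namenode:8020"; the path component does not take part in identifying the file system.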
  - getFileStatus
```
public FileStatus getFileStatus(Path f)
                         throws IOException
```
    Description copied from class: FileSystem
    
    Return a file status object that represents the path.
    
    Specified by:
    
    getFileStatus in class FileSystem
    
    Parameters:
    
    f - The path we want information from
    
    Returns:
    
    a FileStatus object
    
    Throws:
    
    FileNotFoundException - when the path does not exist
    
    IOException - see specific implementation
  - getFileBlockLocations
```
public BlockLocation[] getFileBlockLocations(FileStatus file,
                                             long start,
                                             long len)
                                      throws IOException
```
    Description copied from class: FileSystem
    
    Return an array containing hostnames, offset and size of portions of the given file. For a nonexistent file or region, null will be returned. This call is most helpful with a distributed file system (DFS), where it returns the hostnames of the machines that contain the given file. The FileSystem will simply return an element containing 'localhost'.
    
    Specified by:
    
    getFileBlockLocations in class FileSystem
    
    Throws:
    
    IOException
  - open
```
public HadoopDataInputStream open(Path f,
                                  int bufferSize)
                           throws IOException
```
    Description copied from class: FileSystem
    
    Opens an FSDataInputStream at the indicated Path.
    
    Specified by:
    
    open in class FileSystem
    
    Parameters:
    
    f - the file name to open
    
    bufferSize - the size of the buffer to be used.
    
    Throws:
    
    IOException
  - open
```
public HadoopDataInputStream open(Path f)
                           throws IOException
```
    Description copied from class: FileSystem
    
    Opens an FSDataInputStream at the indicated Path.
    
    Specified by:
    
    open in class FileSystem
    
    Parameters:
    
    f - the file to open
    
    Throws:
    
    IOException
  - create
```
public HadoopDataOutputStream create(Path f,
                                     boolean overwrite,
                                     int bufferSize,
                                     short replication,
                                     long blockSize)
                              throws IOException
```
    Description copied from class: FileSystem
    
    Opens an FSDataOutputStream at the indicated Path.
    This method is deprecated because most of its parameters are ignored by most file systems. To control, for example, the replication factor and block size in the Hadoop Distributed File System, make sure that the respective Hadoop configuration file is either linked from the Flink configuration or on the classpath of either Flink or the user code.
    
    Overrides:
    
    create in class FileSystem
    
    Parameters:
    
    f - the file name to open
    
    overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
    
    bufferSize - the size of the buffer to be used.
    
    replication - required block replication for the file.
    
    blockSize - the size of the file blocks
    
    Throws:
    
    IOException - Thrown if the stream could not be opened because of an I/O error, or because a file already exists at that path and the write mode indicates to not overwrite the file.
  - create
```
public HadoopDataOutputStream create(Path f,
                                     FileSystem.WriteMode overwrite)
                              throws IOException
```
    Description copied from class: FileSystem
    
    Opens an FSDataOutputStream to a new file at the given path.
    If the file already exists, the behavior depends on the given WriteMode. If the mode is set to FileSystem.WriteMode.NO_OVERWRITE, then this method fails with an exception.
    
    Specified by:
    
    create in class FileSystem
    
    Parameters:
    
    f - The file path to write to
    
    overwrite - The action to take if a file or directory already exists at the given path.
    
    Returns:
    
    The stream to the new file at the target path.
    
    Throws:
    
    IOException - Thrown if the stream could not be opened because of an I/O error, or because a file already exists at that path and the write mode indicates to not overwrite the file.
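    
    The effect of the write mode can be pictured with a JDK-only analogy. The sketch below does not call Flink or Hadoop: WriteModeSketch and its Mode enum are hypothetical stand-ins, and java.nio.file open options play the role of FileSystem.WriteMode. It assumes that overwriting means truncating and replacing the existing file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in, not Flink's FileSystem.WriteMode.
final class WriteModeSketch {
    enum Mode { NO_OVERWRITE, OVERWRITE }

    /** Writes data to target; in NO_OVERWRITE mode an existing file causes a failure. */
    static void write(Path target, byte[] data, Mode mode) {
        StandardOpenOption[] options = mode == Mode.NO_OVERWRITE
                // CREATE_NEW fails with FileAlreadyExistsException if the file exists.
                ? new StandardOpenOption[] {StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE}
                // Otherwise replace whatever content is already there.
                : new StandardOpenOption[] {StandardOpenOption.CREATE,
                        StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE};
        try (OutputStream out = Files.newOutputStream(target, options)) {
            out.write(data);
        } catch (IOException e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
    }
}
```

    Calling write twice on the same path with Mode.NO_OVERWRITE fails on the second call, which mirrors the documented behavior of FileSystem.WriteMode.NO_OVERWRITE.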
  - delete
```
public boolean delete(Path f,
                      boolean recursive)
               throws IOException
```
    Description copied from class: FileSystem
    
    Delete a file.
    
    Specified by:
    
    delete in class FileSystem
    
    Parameters:
    
    f - the path to delete
    
    recursive - if the path is a directory and this is set to true, the directory is deleted; otherwise an exception is thrown. For a file, recursive can be set to either true or false.
    
    Returns:
    
    true if delete is successful, false otherwise
    
    Throws:
    
    IOException
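    
    The role of the recursive flag can be sketched against the local file system with the JDK alone. DeleteSketch is a hypothetical helper, not Flink code; returning false for a missing path is an assumption of this sketch, and a concrete file system may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink or Hadoop.
final class DeleteSketch {
    /** Deletes a file, or a directory tree when recursive is true. */
    static boolean delete(Path path, boolean recursive) {
        try {
            if (Files.notExists(path)) {
                return false; // assumption: nothing to delete counts as "not successful"
            }
            if (Files.isDirectory(path) && recursive) {
                try (Stream<Path> walk = Files.walk(path)) {
                    // Reverse order puts children before their parent directories.
                    List<Path> childrenFirst =
                            walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
                    for (Path p : childrenFirst) {
                        Files.delete(p);
                    }
                }
                return true;
            }
            // Throws DirectoryNotEmptyException for a non-empty directory.
            Files.delete(path);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```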
  - exists
```
public boolean exists(Path f)
               throws IOException
```
    Description copied from class: FileSystem
    
    Checks whether the given path exists.
    
    Overrides:
    
    exists in class FileSystem
    
    Parameters:
    
    f - the path to check
    
    Throws:
    
    IOException
  - listStatus
```
public FileStatus[] listStatus(Path f)
                        throws IOException
```
    Description copied from class: FileSystem
    
    List the statuses of the files/directories in the given path if the path is a directory.
    
    Specified by:
    
    listStatus in class FileSystem
    
    Parameters:
    
    f - given path
    
    Returns:
    
    the statuses of the files/directories in the given path
    
    Throws:
    
    IOException
  - mkdirs
```
public boolean mkdirs(Path f)
               throws IOException
```
    Description copied from class: FileSystem
    
    Make the given file and all non-existent parents into directories. Has the semantics of Unix 'mkdir -p'. Existence of the directory hierarchy is not an error.
    
    Specified by:
    
    mkdirs in class FileSystem
    
    Parameters:
    
    f - the directory/directories to be created
    
    Returns:
    
    true if at least one new directory has been created, false otherwise
    
    Throws:
    
    IOException - thrown if an I/O error occurs while creating the directory
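    
    The 'mkdir -p' semantics and the documented return value can be sketched with the JDK on the local file system. MkdirsSketch is a hypothetical helper, not Flink code, and it ignores races with concurrent writers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Flink or Hadoop.
final class MkdirsSketch {
    /** Creates dir and any missing parents; returns true only if something new was created. */
    static boolean mkdirs(Path dir) {
        boolean existedBefore = Files.isDirectory(dir);
        try {
            // Like 'mkdir -p': an already existing hierarchy is not an error.
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return !existedBefore;
    }
}
```

    Calling mkdirs twice on the same nested path returns true the first time and false the second time, and neither call throws.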
  - rename
```
public boolean rename(Path src,
                      Path dst)
               throws IOException
```
    Description copied from class: FileSystem
    
    Renames the file/directory src to dst.
    
    Specified by:
    
    rename in class FileSystem
    
    Parameters:
    
    src - the file/directory to rename
    
    dst - the new name of the file/directory
    
    Returns:
    
    true if the renaming was successful, false otherwise
    
    Throws:
    
    IOException
  - getDefaultBlockSize
```
public long getDefaultBlockSize()
```
    Description copied from class: FileSystem
    
    Return the number of bytes that large input files should optimally be split into to minimize I/O time.
    
    Overrides:
    
    getDefaultBlockSize in class FileSystem
    
    Returns:
    
    the number of bytes that large input files should optimally be split into to minimize I/O time
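    
    To make the role of the block size concrete, the following sketch cuts a file length into block-sized ranges. BlockSplitSketch is a hypothetical helper used only for the arithmetic; how Flink actually computes input splits may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: cuts a file into block-aligned ranges.
final class BlockSplitSketch {
    /** Returns {offset, length} pairs covering [0, fileLength) in blockSize steps. */
    static List<long[]> splits(long fileLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last range may be shorter than a full block.
            long length = Math.min(blockSize, fileLength - offset);
            result.add(new long[] {offset, length});
        }
        return result;
    }
}
```

    With a block size of 128 MiB, a 300 MiB file yields three ranges: two full blocks and a final range of 44 MiB starting at offset 256 MiB.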
  - isDistributedFS
```
public boolean isDistributedFS()
```
    Description copied from class: FileSystem
    
    Returns true if this is a distributed file system. A distributed file system here means that the file system is shared among all Flink processes that participate in a cluster or job and that all these processes can see the same files.
    
    Specified by:
    
    isDistributedFS in class FileSystem
    
    Returns:
    
    True, if this is a distributed file system, false otherwise.
  - getKind
```
public FileSystemKind getKind()
```
    Description copied from class: FileSystem
    
    Gets a description of the characteristics of this file system.
    
    Specified by:
    
    getKind in class FileSystem
  - createRecoverableWriter
```
public RecoverableWriter createRecoverableWriter()
                                          throws IOException
```
    Description copied from class: FileSystem
    
    Creates a new RecoverableWriter. A recoverable writer creates streams that can persist and recover their intermediate state. Persisting and recovering intermediate state is a core building block for writing to files that span multiple checkpoints.
    The returned object can act as a shared factory to open and recover multiple streams.
    This method is optional on file systems and various file system implementations may not support this method, throwing an UnsupportedOperationException.
    
    Overrides:
    
    createRecoverableWriter in class FileSystem
    
    Returns:
    
    A RecoverableWriter for this file system.
    
    Throws:
    
    IOException - Thrown, if the recoverable writer cannot be instantiated.
  - toHadoopPath
```
public static org.apache.hadoop.fs.Path toHadoopPath(Path path)
```
