@Public public abstract class FileSystem extends Object
Flink implements and supports some file system types directly (for example the default machine-local file system). Other file system types are accessed by an implementation that bridges to the suite of file systems supported by Hadoop (such as for example HDFS).
Modifier and Type | Class and Description |
---|---|
static class |
FileSystem.FSKey
An identifier of a file system, via its scheme and its authority.
|
static class |
FileSystem.WriteMode
The possible write modes.
|
Constructor and Description |
---|
FileSystem() |
Modifier and Type | Method and Description |
---|---|
static void |
closeAndDisposeFileSystemCloseableRegistryForThread()
Create a SafetyNetCloseableRegistry for a Task.
|
abstract FSDataOutputStream |
create(Path f,
boolean overwrite)
Opens an FSDataOutputStream at the indicated Path.
|
abstract FSDataOutputStream |
create(Path f,
boolean overwrite,
int bufferSize,
short replication,
long blockSize)
Opens an FSDataOutputStream at the indicated Path.
|
static void |
createAndSetFileSystemCloseableRegistryForThread()
Create a SafetyNetCloseableRegistry for a Task.
|
abstract boolean |
delete(Path f,
boolean recursive)
Delete a file.
|
boolean |
exists(Path f)
Check if exists.
|
static FileSystem |
get(URI uri)
Returns a reference to the
FileSystem instance for accessing the
file system identified by the given URI . |
long |
getDefaultBlockSize()
Return the number of bytes that large input files should be optimally be split into to minimize I/O time.
|
abstract BlockLocation[] |
getFileBlockLocations(FileStatus file,
long start,
long len)
Return an array containing hostnames, offset and size of
portions of the given file.
|
abstract FileStatus |
getFileStatus(Path f)
Return a file status object that represents the path.
|
abstract Path |
getHomeDirectory()
Returns the path of the user's home directory in this file system.
|
static FileSystem |
getLocalFileSystem()
Returns a reference to the
FileSystem instance for accessing the
local file system. |
static FileSystem |
getUnguardedFileSystem(URI uri) |
abstract URI |
getUri()
Returns a URI whose scheme and authority identify this file system.
|
abstract Path |
getWorkingDirectory()
Returns the path of the file system's current working directory.
|
abstract void |
initialize(URI name)
Called after a new FileSystem instance is constructed.
|
boolean |
initOutPathDistFS(Path outPath,
FileSystem.WriteMode writeMode,
boolean createDirectory)
Initializes output directories on distributed file systems according to the given write mode.
|
boolean |
initOutPathLocalFS(Path outPath,
FileSystem.WriteMode writeMode,
boolean createDirectory)
Initializes output directories on local file systems according to the given write mode.
|
abstract boolean |
isDistributedFS()
Returns true if this is a distributed file system, false otherwise.
|
static boolean |
isFlinkSupportedScheme(String scheme)
Returns a boolean indicating whether a scheme has built-in Flink support.
|
abstract FileStatus[] |
listStatus(Path f)
List the statuses of the files/directories in the given path if the path is
a directory.
|
abstract boolean |
mkdirs(Path f)
Make the given file and all non-existent parents into directories.
|
abstract FSDataInputStream |
open(Path f)
Opens an FSDataInputStream at the indicated Path.
|
abstract FSDataInputStream |
open(Path f,
int bufferSize)
Opens an FSDataInputStream at the indicated Path.
|
abstract boolean |
rename(Path src,
Path dst)
Renames the file/directory src to dst.
|
static void |
setDefaultScheme(Configuration config)
Sets the default filesystem scheme based on the user-specified configuration parameter
fs.default-scheme . |
@Internal public static void createAndSetFileSystemCloseableRegistryForThread()
@Internal public static void closeAndDisposeFileSystemCloseableRegistryForThread()
public static FileSystem getLocalFileSystem()
FileSystem
instance for accessing the
local file system.FileSystem
instance for accessing the
local file system.public static void setDefaultScheme(Configuration config) throws IOException
Sets the default filesystem scheme based on the user-specified configuration parameter
fs.default-scheme
. By default this is set to file:///
(see ConfigConstants.FILESYSTEM_SCHEME
and
ConfigConstants.DEFAULT_FILESYSTEM_SCHEME
),
and the local filesystem is used.
As an example, if set to hdfs://localhost:9000/
, then an HDFS deployment
with the namenode being on the local node and listening to port 9000 is going to be used.
In this case, a file path specified as /user/USERNAME/in.txt
is going to be transformed into hdfs://localhost:9000/user/USERNAME/in.txt
. By
default this is set to file:///
which points to the local filesystem.
config
- the configuration from where to fetch the parameter.IOException
@Internal public static FileSystem getUnguardedFileSystem(URI uri) throws IOException
IOException
public static FileSystem get(URI uri) throws IOException
FileSystem
instance for accessing the
file system identified by the given URI
.uri
- the URI
identifying the file systemFileSystem
instance for accessing the file system identified by the given
URI
.IOException
- thrown if a reference to the file system instance could not be obtainedpublic static boolean isFlinkSupportedScheme(String scheme)
scheme
- a file system schemepublic abstract Path getWorkingDirectory()
public abstract Path getHomeDirectory()
public abstract URI getUri()
public abstract void initialize(URI name) throws IOException
name
- a URI
whose authority section names the host, port, etc. for this file systemIOException
public abstract FileStatus getFileStatus(Path f) throws IOException
f
- The path we want information fromFileNotFoundException
- when the path does not exist;
IOException see specific implementationIOException
public abstract BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len) throws IOException
IOException
public abstract FSDataInputStream open(Path f, int bufferSize) throws IOException
f
- the file name to openbufferSize
- the size of the buffer to be used.IOException
public abstract FSDataInputStream open(Path f) throws IOException
f
- the file to openIOException
public long getDefaultBlockSize()
public abstract FileStatus[] listStatus(Path f) throws IOException
f
- given pathIOException
public boolean exists(Path f) throws IOException
f
- source fileIOException
public abstract boolean delete(Path f, boolean recursive) throws IOException
f
- the path to deleterecursive
- if path is a directory and set to true
, the directory is deleted else throws an exception. In
case of a file the recursive can be set to either true
or false
true
if delete is successful, false
otherwiseIOException
public abstract boolean mkdirs(Path f) throws IOException
f
- the directory/directories to be createdtrue
if at least one new directory has been created, false
otherwiseIOException
- thrown if an I/O error occurs while creating the directorypublic abstract FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize) throws IOException
f
- the file name to openoverwrite
- if a file with this name already exists, then if true,
the file will be overwritten, and if false an error will be thrown.bufferSize
- the size of the buffer to be used.replication
- required block replication for the file.blockSize
- the size of the file blocksIOException
public abstract FSDataOutputStream create(Path f, boolean overwrite) throws IOException
f
- the file name to openoverwrite
- if a file with this name already exists, then if true,
the file will be overwritten, and if false an error will be thrown.IOException
public abstract boolean rename(Path src, Path dst) throws IOException
src
- the file/directory to renamedst
- the new name of the file/directorytrue
if the renaming was successful, false
otherwiseIOException
public abstract boolean isDistributedFS()
public boolean initOutPathLocalFS(Path outPath, FileSystem.WriteMode writeMode, boolean createDirectory) throws IOException
Files contained in an existing directory are not deleted, because multiple instances of a DataSinkTask might call this function at the same time and hence might perform concurrent delete operations on the file system (possibly deleting output files of concurrently running tasks). Since concurrent DataSinkTasks are not aware of each other, coordination of delete and create operations would be difficult.
outPath
- Output path that should be prepared.writeMode
- Write mode to consider.createDirectory
- True, to initialize a directory at the given path, false to prepare space for a file.IOException
- Thrown, if any of the file system access operations failed.public boolean initOutPathDistFS(Path outPath, FileSystem.WriteMode writeMode, boolean createDirectory) throws IOException
outPath
- Output path that should be prepared.writeMode
- Write mode to consider.createDirectory
- True, to initialize a directory at the given path, false otherwise.IOException
- Thrown, if any of the file system access operations failed.Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.