public final class HadoopFileSystem extends FileSystem implements HadoopFileSystemWrapper
FileSystem
base class for the Hadoop File System. The
class is a wrapper class which encapsulated the original Hadoop HDFS API.
If no file system class is specified, the wrapper will automatically load the Hadoop
distributed file system (HDFS).FileSystem.FSKey, FileSystem.WriteMode
Constructor and Description |
---|
HadoopFileSystem(Class<? extends org.apache.hadoop.fs.FileSystem> fsClass)
Creates a new DistributedFileSystem object to access HDFS
|
Modifier and Type | Method and Description |
---|---|
HadoopDataOutputStream |
create(Path f,
boolean overwrite)
Opens an FSDataOutputStream at the indicated Path.
|
HadoopDataOutputStream |
create(Path f,
boolean overwrite,
int bufferSize,
short replication,
long blockSize)
Opens an FSDataOutputStream at the indicated Path.
|
boolean |
delete(Path f,
boolean recursive)
Delete a file.
|
long |
getDefaultBlockSize()
Return the number of bytes that large input files should be optimally be split into to minimize I/O time.
|
BlockLocation[] |
getFileBlockLocations(FileStatus file,
long start,
long len)
Return an array containing hostnames, offset and size of
portions of the given file.
|
FileStatus |
getFileStatus(Path f)
Return a file status object that represents the path.
|
static org.apache.hadoop.conf.Configuration |
getHadoopConfiguration()
Returns a new Hadoop Configuration object using the path to the hadoop conf configured
in the main configuration (flink-conf.yaml).
|
org.apache.hadoop.fs.FileSystem |
getHadoopFileSystem()
Gets the underlying Hadoop FileSystem.
|
Class<?> |
getHadoopWrapperClassNameForFileSystem(String scheme)
Test whether the HadoopWrapper can wrap the given file system scheme.
|
Path |
getHomeDirectory()
Returns the path of the user's home directory in this file system.
|
URI |
getUri()
Returns a URI whose scheme and authority identify this file system.
|
Path |
getWorkingDirectory()
Returns the path of the file system's current working directory.
|
void |
initialize(URI path)
Called after a new FileSystem instance is constructed.
|
boolean |
isDistributedFS()
Returns true if this is a distributed file system, false otherwise.
|
FileStatus[] |
listStatus(Path f)
List the statuses of the files/directories in the given path if the path is
a directory.
|
boolean |
mkdirs(Path f)
Make the given file and all non-existent parents into directories.
|
HadoopDataInputStream |
open(Path f)
Opens an FSDataInputStream at the indicated Path.
|
HadoopDataInputStream |
open(Path f,
int bufferSize)
Opens an FSDataInputStream at the indicated Path.
|
boolean |
rename(Path src,
Path dst)
Renames the file/directory src to dst.
|
exists, get, getLocalFileSystem, initOutPathDistFS, initOutPathLocalFS, setDefaultScheme
public HadoopFileSystem(Class<? extends org.apache.hadoop.fs.FileSystem> fsClass) throws IOException
IOException
- throw if the required HDFS classes cannot be instantiatedpublic static org.apache.hadoop.conf.Configuration getHadoopConfiguration()
public Path getWorkingDirectory()
FileSystem
getWorkingDirectory
in class FileSystem
public Path getHomeDirectory()
FileSystem
getHomeDirectory
in class FileSystem
public URI getUri()
FileSystem
getUri
in class FileSystem
public org.apache.hadoop.fs.FileSystem getHadoopFileSystem()
public void initialize(URI path) throws IOException
FileSystem
initialize
in class FileSystem
path
- a URI
whose authority section names the host, port, etc. for this file systemIOException
public FileStatus getFileStatus(Path f) throws IOException
FileSystem
getFileStatus
in class FileSystem
f
- The path we want information fromFileNotFoundException
- when the path does not exist;
IOException see specific implementationIOException
public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len) throws IOException
FileSystem
getFileBlockLocations
in class FileSystem
IOException
public HadoopDataInputStream open(Path f, int bufferSize) throws IOException
FileSystem
open
in class FileSystem
f
- the file name to openbufferSize
- the size of the buffer to be used.IOException
public HadoopDataInputStream open(Path f) throws IOException
FileSystem
open
in class FileSystem
f
- the file to openIOException
public HadoopDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize) throws IOException
FileSystem
create
in class FileSystem
f
- the file name to openoverwrite
- if a file with this name already exists, then if true,
the file will be overwritten, and if false an error will be thrown.bufferSize
- the size of the buffer to be used.replication
- required block replication for the file.blockSize
- the size of the file blocksIOException
public HadoopDataOutputStream create(Path f, boolean overwrite) throws IOException
FileSystem
create
in class FileSystem
f
- the file name to openoverwrite
- if a file with this name already exists, then if true,
the file will be overwritten, and if false an error will be thrown.IOException
public boolean delete(Path f, boolean recursive) throws IOException
FileSystem
delete
in class FileSystem
f
- the path to deleterecursive
- if path is a directory and set to true
, the directory is deleted else throws an exception. In
case of a file the recursive can be set to either true
or false
true
if delete is successful, false
otherwiseIOException
public FileStatus[] listStatus(Path f) throws IOException
FileSystem
listStatus
in class FileSystem
f
- given pathIOException
public boolean mkdirs(Path f) throws IOException
FileSystem
mkdirs
in class FileSystem
f
- the directory/directories to be createdtrue
if at least one new directory has been created, false
otherwiseIOException
- thrown if an I/O error occurs while creating the directorypublic boolean rename(Path src, Path dst) throws IOException
FileSystem
rename
in class FileSystem
src
- the file/directory to renamedst
- the new name of the file/directorytrue
if the renaming was successful, false
otherwiseIOException
public long getDefaultBlockSize()
FileSystem
getDefaultBlockSize
in class FileSystem
public boolean isDistributedFS()
FileSystem
isDistributedFS
in class FileSystem
public Class<?> getHadoopWrapperClassNameForFileSystem(String scheme)
HadoopFileSystemWrapper
getHadoopWrapperClassNameForFileSystem
in interface HadoopFileSystemWrapper
scheme
- The scheme of the file system.Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.