public final class HadoopDataInputStream extends FSDataInputStream implements ByteBufferReadable

Concrete subclass of FSDataInputStream for Hadoop's input streams. This supports all file systems supported by Hadoop, such as HDFS and S3 (S3a/S3n).

| Modifier and Type | Field and Description |
|---|---|
| static int | MIN_SKIP_BYTES — Minimum number of bytes to skip forward before we issue a seek instead of discarding read. |
| Constructor and Description |
|---|
| HadoopDataInputStream(org.apache.hadoop.fs.FSDataInputStream fsDataInputStream) — Creates a new data input stream from the given Hadoop input stream. |
| Modifier and Type | Method and Description |
|---|---|
| int | available() |
| void | close() |
| void | forceSeek(long seekPos) — Positions the stream to the given location. |
| org.apache.hadoop.fs.FSDataInputStream | getHadoopInputStream() — Gets the wrapped Hadoop input stream. |
| long | getPos() — Gets the current position in the input stream. |
| int | read() |
| int | read(byte[] buffer, int offset, int length) |
| int | read(ByteBuffer byteBuffer) — Reads up to byteBuffer.remaining() bytes into byteBuffer. |
| int | read(long position, ByteBuffer byteBuffer) — Reads up to byteBuffer.remaining() bytes into byteBuffer from a given position in the file and returns the number of bytes read. |
| void | seek(long seekPos) — Seek to the given offset from the start of the file. |
| long | skip(long n) |
| void | skipFully(long bytes) — Skips over a given number of bytes in the stream. |
Methods inherited from class java.io.InputStream: mark, markSupported, read, reset
public static final int MIN_SKIP_BYTES

Minimum number of bytes to skip forward before we issue a seek instead of discarding read.

The current value is just a magic number. In the long run, this value could become configurable, but for now it is a conservative, relatively small value that should bring safe improvements for small skips (e.g., when reading metadata), which would otherwise hurt the most with frequent seeks.

The optimal value depends on the DFS implementation and configuration, plus the underlying file system. For now, this number is chosen "big enough" to provide improvements for smaller seeks, and "small enough" to avoid disadvantages over real seeks. While the minimum should be the page size, a true optimum per system would be the number of bytes that can be consumed sequentially within the seek time. Unfortunately, seek time is not constant, and devices, the OS, and the DFS may also use read buffers and read-ahead.
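The skip-versus-seek trade-off described above can be sketched as a small decision helper. This is an illustration only, not Flink's actual implementation: the class name SeekHeuristic, the method decide, and the concrete 1 MiB threshold are assumptions for the example (the source only says the real value is a conservative "magic number"), and the real stream keeps this logic inside its seek(long) implementation.

```java
// Sketch of the skip-vs-seek decision. Names and the threshold value
// are illustrative assumptions, not the real Flink code.
public class SeekHeuristic {

    // Hypothetical threshold; the real MIN_SKIP_BYTES value is an
    // unspecified conservative "magic number".
    static final int MIN_SKIP_BYTES = 1024 * 1024;

    /**
     * Decides how to move from currentPos to targetPos: small forward
     * moves are served by sequentially skipping (discarding read data),
     * everything else warrants a real seek against the DFS.
     */
    static String decide(long currentPos, long targetPos) {
        long delta = targetPos - currentPos;
        if (delta == 0L) {
            return "noop";   // already at the target position
        } else if (delta > 0L && delta <= MIN_SKIP_BYTES) {
            return "skip";   // discard-and-read beats an expensive DFS seek
        } else {
            return "seek";   // backwards move or large forward jump
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(0, 0));          // noop
        System.out.println(decide(0, 4096));       // skip
        System.out.println(decide(4096, 0));       // seek
        System.out.println(decide(0, 10_000_000)); // seek
    }
}
```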
public HadoopDataInputStream(org.apache.hadoop.fs.FSDataInputStream fsDataInputStream)

Creates a new data input stream from the given Hadoop input stream.

Parameters:
fsDataInputStream - the Hadoop input stream

public void seek(long seekPos) throws IOException
Description copied from class: FSDataInputStream

Seek to the given offset from the start of the file.

Specified by:
seek in class FSDataInputStream
Parameters:
seekPos - the desired offset
Throws:
IOException - Thrown if an error occurred while seeking inside the input stream.

public long getPos() throws IOException
Description copied from class: FSDataInputStream

Gets the current position in the input stream.

Specified by:
getPos in class FSDataInputStream
Throws:
IOException - Thrown if an I/O error occurred in the underlying stream implementation while accessing the stream's position.

public int read() throws IOException
Specified by:
read in class InputStream
Throws:
IOException
public void close() throws IOException
close
in interface Closeable
close
in interface AutoCloseable
close
in class InputStream
IOException
public int read(@Nonnull byte[] buffer, int offset, int length) throws IOException
Overrides:
read in class InputStream
Throws:
IOException
public int available() throws IOException
Overrides:
available in class InputStream
Throws:
IOException
public long skip(long n) throws IOException
Overrides:
skip in class InputStream
Throws:
IOException
public org.apache.hadoop.fs.FSDataInputStream getHadoopInputStream()

Gets the wrapped Hadoop input stream.

Returns:
the wrapped Hadoop input stream
public void forceSeek(long seekPos) throws IOException
Positions the stream to the given location. In contrast to seek(long), this method will always issue a "seek" command to the dfs and may not replace it by skip(long) for small seeks.

Notice that the underlying DFS implementation can still decide to do skip instead of seek.

Parameters:
seekPos - the position to seek to.
Throws:
IOException
public void skipFully(long bytes) throws IOException
Skips over a given number of bytes in the stream.

Parameters:
bytes - the number of bytes to skip.
Throws:
IOException
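A skipFully-style helper is needed because java.io.InputStream.skip(long) is allowed to skip fewer bytes than requested, so the call must be retried until the full count is consumed. The sketch below demonstrates this against a plain InputStream rather than the Hadoop stream; the helper's name and its end-of-stream handling are assumptions for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipFullyDemo {

    /** Keeps calling skip() until exactly 'bytes' bytes are consumed. */
    static void skipFully(InputStream in, long bytes) throws IOException {
        while (bytes > 0) {
            long skipped = in.skip(bytes);
            if (skipped > 0) {
                bytes -= skipped;
            } else if (in.read() >= 0) {
                bytes -= 1; // skip() made no progress; consume one byte directly
            } else {
                throw new IOException("Unexpected end of stream while skipping");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        skipFully(in, 4);
        System.out.println(in.read()); // 5: the first four bytes were skipped
    }
}
```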
public int read(ByteBuffer byteBuffer) throws IOException
Description copied from interface: ByteBufferReadable

Reads up to byteBuffer.remaining() bytes into byteBuffer. After a successful call, byteBuffer.position() will be advanced by the number of bytes read and byteBuffer.limit() should be unchanged.

In the case of an exception, the values of byteBuffer.position() and byteBuffer.limit() are undefined, and callers should be prepared to recover from this eventuality. Implementations should treat 0-length requests as legitimate and must not signal an error upon their receipt.
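The position/limit contract above can be observed with any JDK channel that fills a ByteBuffer. The snippet below uses java.nio.channels.Channels over an in-memory stream purely as a stand-in for the Hadoop-backed stream; the semantics shown (position advanced by the bytes read, limit untouched) are the same ones this method promises.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ByteBufferReadDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for the Hadoop stream: any ReadableByteChannel
        // follows the same ByteBuffer read contract.
        ReadableByteChannel ch =
                Channels.newChannel(new ByteArrayInputStream("hello world".getBytes()));

        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.limit(5);                       // cap the read at buf.remaining() == 5 bytes

        int n = ch.read(buf);               // reads up to buf.remaining() bytes

        System.out.println(n);              // 5
        System.out.println(buf.position()); // advanced by the bytes read: 5
        System.out.println(buf.limit());    // unchanged: 5
    }
}
```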
Specified by:
read in interface ByteBufferReadable
Parameters:
byteBuffer - the ByteBuffer to receive the results of the read operation.
Throws:
IOException - if there is some error performing the read

public int read(long position, ByteBuffer byteBuffer) throws IOException
Description copied from interface: ByteBufferReadable

Reads up to byteBuffer.remaining() bytes into byteBuffer from a given position in the file and returns the number of bytes read. Callers should use byteBuffer.limit(...) to control the size of the desired read and byteBuffer.position(...) to control the offset into the buffer the data should be written to.

After a successful call, byteBuffer.position() will be advanced by the number of bytes read and byteBuffer.limit() will be unchanged.

In the case of an exception, the state of the buffer (the contents of the buffer, the buf.position(), the buf.limit(), etc.) is undefined, and callers should be prepared to recover from this eventuality.

Implementations should treat 0-length requests as legitimate and must not signal an error upon their receipt.
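The JDK offers the same positional-read shape as FileChannel.read(ByteBuffer, long), which makes a convenient stand-in for demonstrating the contract: limit(...) caps the read size, position(...) picks where the data lands in the buffer, and the file offset is independent of the stream's current position. The temp-file setup is just scaffolding for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("positional-read", ".bin");
        try {
            Files.write(tmp, "abcdefghij".getBytes());

            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ByteBuffer buf = ByteBuffer.allocate(10);
                buf.position(2);          // data lands starting at buffer index 2
                buf.limit(6);             // request at most 4 bytes

                int n = ch.read(buf, 5L); // read from file offset 5

                System.out.println(n);                 // 4
                System.out.println((char) buf.get(2)); // f (the byte at file offset 5)
                System.out.println(buf.limit());       // unchanged: 6
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```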
Specified by:
read in interface ByteBufferReadable
Parameters:
position - position within file
byteBuffer - the ByteBuffer to receive the results of the read operation.
Throws:
IOException - if there is some error performing the read

Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.