Class AbstractColumnReader<VECTOR extends WritableColumnVector>
- java.lang.Object
-
- org.apache.flink.formats.parquet.vector.reader.AbstractColumnReader<VECTOR>
-
- All Implemented Interfaces:
ColumnReader<VECTOR>
- Direct Known Subclasses:
BooleanColumnReader, ByteColumnReader, BytesColumnReader, DoubleColumnReader, FixedLenBytesColumnReader, FloatColumnReader, IntColumnReader, LongColumnReader, ShortColumnReader, TimestampColumnReader
public abstract class AbstractColumnReader<VECTOR extends WritableColumnVector> extends Object implements ColumnReader<VECTOR>
Abstract ColumnReader. See ColumnReaderImpl; part of the code is adapted from Apache Spark and Apache Parquet.
-
-
Field Summary
Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| protected org.apache.parquet.column.ColumnDescriptor | descriptor | |
| protected org.apache.parquet.column.Dictionary | dictionary | The dictionary, if this column has dictionary encoding. |
| protected int | maxDefLevel | Maximum definition level for this column. |
| protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder | runLenDecoder | Run length decoder for data and dictionary. |
-
Constructor Summary
Constructors

| Constructor | Description |
| --- | --- |
| AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) | |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected void | afterReadPage() | After reading a page, some initialization may be needed. |
| protected void | checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName) | |
| protected abstract void | readBatch(int rowId, int num, VECTOR column) | Reads a batch from runLenDecoder and dataInputStream. |
| protected abstract void | readBatchFromDictionaryIds(int rowId, int num, VECTOR column, WritableIntVector dictionaryIds) | Decodes dictionary ids to data. |
| void | readToVector(int readNumber, VECTOR vector) | Reads `readNumber` values from this column reader into `vector`. |
| protected boolean | supportLazyDecode() | Whether lazy decoding of dictionary ids is supported. |
-
-
-
Field Detail
-
dictionary
protected final org.apache.parquet.column.Dictionary dictionary
The dictionary, if this column has dictionary encoding.
-
maxDefLevel
protected final int maxDefLevel
Maximum definition level for this column.
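To illustrate what the maximum definition level is used for, here is a minimal sketch. It assumes a flat, optional column in which non-null values are stored densely and a slot is non-null only when its definition level equals the maximum. The class `DefinitionLevelSketch` and its `expand` method are hypothetical helpers for illustration and are not part of Flink.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Flink: shows how definition levels mark nulls. */
final class DefinitionLevelSketch {

    /**
     * Expands densely stored non-null values into a nullable array.
     * Assumption: a slot holds a value only when its definition level equals maxDefLevel.
     */
    static Integer[] expand(int[] defLevels, int maxDefLevel, int[] denseValues) {
        Integer[] out = new Integer[defLevels.length];
        int next = 0; // index of the next unread non-null value
        for (int i = 0; i < defLevels.length; i++) {
            if (defLevels[i] == maxDefLevel) {
                out[i] = denseValues[next++];
            } // otherwise the slot stays null
        }
        return out;
    }

    public static void main(String[] args) {
        Integer[] column = expand(new int[] {1, 0, 1, 1, 0}, 1, new int[] {10, 20, 30});
        System.out.println(Arrays.toString(column)); // prints [10, null, 20, 30, null]
    }
}
```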
-
descriptor
protected final org.apache.parquet.column.ColumnDescriptor descriptor
-
runLenDecoder
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder runLenDecoder
Run length decoder for data and dictionary.
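The idea behind run-length decoding can be sketched as follows. This is a deliberately simplified illustration that assumes plain (runLength, value) pairs; it is not Parquet's RLE/bit-packing hybrid format and not the RunLengthDecoder implementation. `RunLengthSketch` and `decode` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical, simplified run-length decoder; not Flink's RunLengthDecoder. */
final class RunLengthSketch {

    /** Expands pairs where runs[2k] is the run length and runs[2k + 1] is the repeated value. */
    static int[] decode(int[] runs) {
        if (runs.length % 2 != 0) {
            throw new IllegalArgumentException("expected (runLength, value) pairs");
        }
        int total = 0;
        for (int i = 0; i < runs.length; i += 2) {
            total += runs[i];
        }
        int[] out = new int[total];
        int pos = 0;
        for (int i = 0; i < runs.length; i += 2) {
            // Write the value runs[i + 1] into the next runs[i] slots.
            Arrays.fill(out, pos, pos + runs[i], runs[i + 1]);
            pos += runs[i];
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decode(new int[] {3, 1, 2, 0}))); // prints [1, 1, 1, 0, 0]
    }
}
```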
-
-
Constructor Detail
-
AbstractColumnReader
public AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) throws IOException
- Throws:
IOException
-
-
Method Detail
-
checkTypeName
protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)
-
readToVector
public final void readToVector(int readNumber, VECTOR vector) throws IOException
Reads `readNumber` values from this column reader into `vector`.
- Specified by:
readToVector in interface ColumnReader<VECTOR extends WritableColumnVector>
- Parameters:
readNumber - number of values to read.
vector - vector to write to.
- Throws:
IOException
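Because readToVector is final while readBatch is abstract, the class follows a template-method shape: the base class drives paging and subclasses decode each batch. The sketch below shows that shape under simplifying assumptions: pages are plain `int[]` arrays, there are no nulls or dictionaries, and the names `ReadToVectorSketch`, `PagedReader` and `PlainIntReader` are hypothetical. It is not the Flink implementation.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical illustration of a final driver method delegating to an abstract readBatch. */
final class ReadToVectorSketch {

    abstract static class PagedReader {
        private final Deque<int[]> pages = new ArrayDeque<>();
        protected int[] currentPage = new int[0];
        protected int posInPage = 0;

        PagedReader(int[]... pages) {
            this.pages.addAll(Arrays.asList(pages));
        }

        /** Fixed driver loop: fetch the next page when needed, then delegate decoding. */
        public final void readToVector(int readNumber, int[] vector) {
            int rowId = 0;
            while (readNumber > 0) {
                if (posInPage == currentPage.length) {
                    if (pages.isEmpty()) {
                        throw new IllegalStateException("no more values to read");
                    }
                    currentPage = pages.poll();
                    posInPage = 0;
                    afterReadPage();
                }
                int num = Math.min(readNumber, currentPage.length - posInPage);
                readBatch(rowId, num, vector);
                posInPage += num;
                rowId += num;
                readNumber -= num;
            }
        }

        /** Hook for per-page initialization; does nothing by default. */
        protected void afterReadPage() {}

        /** Copies num values of the current page into column, starting at rowId. */
        protected abstract void readBatch(int rowId, int num, int[] column);
    }

    static final class PlainIntReader extends PagedReader {
        PlainIntReader(int[]... pages) {
            super(pages);
        }

        @Override
        protected void readBatch(int rowId, int num, int[] column) {
            System.arraycopy(currentPage, posInPage, column, rowId, num);
        }
    }

    public static void main(String[] args) {
        int[] vector = new int[4];
        new PlainIntReader(new int[] {1, 2, 3}, new int[] {4, 5}).readToVector(4, vector);
        System.out.println(Arrays.toString(vector)); // prints [1, 2, 3, 4]
    }
}
```

In this sketch the read spans two pages, so readBatch is invoked twice: once for three values and once for the remaining one.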
-
afterReadPage
protected void afterReadPage()
After reading a page, some initialization may be needed.
-
supportLazyDecode
protected boolean supportLazyDecode()
Whether lazy decoding of dictionary ids is supported. See ParquetDictionary for details. If this returns false, all data is decoded first.
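The difference between lazy and eager decoding can be sketched as follows. The vector type here is a hypothetical stand-in (`LazyDecodeSketch.IntVector`), not Flink's WritableColumnVector, and the `read` method only assumes that a lazy reader keeps the dictionary ids and resolves them on access, while an eager reader materializes every value up front.

```java
/** Hypothetical illustration of lazy versus eager dictionary decoding. */
final class LazyDecodeSketch {

    /** Stand-in vector that can hold either decoded values or dictionary ids. */
    static final class IntVector {
        private final int[] values;
        private int[] dictionaryIds; // non-null while the values are still encoded
        private int[] dictionary;

        IntVector(int size) {
            values = new int[size];
        }

        void setDictionary(int[] dictionary, int[] dictionaryIds) {
            this.dictionary = dictionary;
            this.dictionaryIds = dictionaryIds;
        }

        boolean isLazy() {
            return dictionaryIds != null;
        }

        /** Resolves through the dictionary only if the vector is still encoded. */
        int getInt(int i) {
            return dictionaryIds != null ? dictionary[dictionaryIds[i]] : values[i];
        }
    }

    static IntVector read(int[] dictionary, int[] ids, boolean supportLazyDecode) {
        IntVector vector = new IntVector(ids.length);
        if (supportLazyDecode) {
            // Lazy path: keep the ids and decode each value when it is accessed.
            vector.setDictionary(dictionary, ids);
        } else {
            // Eager path: decode all the data first.
            for (int i = 0; i < ids.length; i++) {
                vector.values[i] = dictionary[ids[i]];
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        int[] dictionary = {100, 200, 300};
        int[] ids = {2, 0, 1, 2};
        System.out.println(read(dictionary, ids, true).getInt(0));  // prints 300
        System.out.println(read(dictionary, ids, false).getInt(0)); // prints 300
    }
}
```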
-
readBatch
protected abstract void readBatch(int rowId, int num, VECTOR column)
Reads a batch from runLenDecoder and dataInputStream.
-
readBatchFromDictionaryIds
protected abstract void readBatchFromDictionaryIds(int rowId, int num, VECTOR column, WritableIntVector dictionaryIds)
Decodes dictionary ids to data, reading from runLenDecoder and dictionaryIdsDecoder.
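A subclass implementation typically maps each dictionary id in the range [rowId, rowId + num) to its value. The following is a minimal sketch of that idea using plain arrays; the extra `isNull` and `dictionary` parameters and the class `DictionaryIdDecodeSketch` are assumptions made to keep the example self-contained, and do not reflect the actual Flink signature.

```java
import java.util.Arrays;

/** Hypothetical array-based analogue of decoding dictionary ids to data. */
final class DictionaryIdDecodeSketch {

    /**
     * Replaces ids with dictionary values for rows rowId .. rowId + num - 1.
     * Null rows are skipped because their ids carry no meaning.
     */
    static void readBatchFromDictionaryIds(
            int rowId, int num, long[] column, boolean[] isNull, int[] dictionaryIds, long[] dictionary) {
        for (int i = rowId; i < rowId + num; i++) {
            if (!isNull[i]) {
                column[i] = dictionary[dictionaryIds[i]];
            }
        }
    }

    public static void main(String[] args) {
        long[] column = new long[4];
        boolean[] isNull = {false, true, false, false};
        int[] ids = {1, 0, 0, 2};
        long[] dictionary = {7L, 8L, 9L};
        // Decode rows 1..3 only; row 0 is outside the batch and row 1 is null.
        readBatchFromDictionaryIds(1, 3, column, isNull, ids, dictionary);
        System.out.println(Arrays.toString(column)); // prints [0, 0, 7, 9]
    }
}
```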
-
-