public abstract class AbstractColumnReader<VECTOR extends WritableColumnVector> extends Object implements ColumnReader<VECTOR>

Abstract base implementation of ColumnReader. See ColumnReaderImpl. Part of the code is based on code from Apache Spark and Apache Parquet.

Modifier and Type | Field and Description |
---|---|
protected org.apache.parquet.column.ColumnDescriptor | descriptor |
protected org.apache.parquet.column.Dictionary | dictionary - The dictionary, if this column has dictionary encoding. |
protected int | maxDefLevel - Maximum definition level for this column. |
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder | runLenDecoder - Run length decoder for data and dictionary. |
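
The role of `maxDefLevel` may be easier to see with a small example. In the Parquet format, a value is present only when its definition level equals the column's maximum definition level; any lower level means the value is null at that position. The sketch below illustrates that rule only. `DefinitionLevelSketch` and `nullMask` are hypothetical names invented for this illustration and are not part of Flink or Parquet.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Flink: shows how a maximum definition level marks nulls. */
final class DefinitionLevelSketch {

    /**
     * Returns a mask in which true means "null at this position".
     * A value is present only when its definition level equals maxDefLevel.
     */
    static boolean[] nullMask(int[] definitionLevels, int maxDefLevel) {
        boolean[] isNull = new boolean[definitionLevels.length];
        for (int i = 0; i < definitionLevels.length; i++) {
            isNull[i] = definitionLevels[i] != maxDefLevel;
        }
        return isNull;
    }

    public static void main(String[] args) {
        // Assumed scenario: an optional top-level column, so maxDefLevel is 1.
        // Level 1 means the value is present, level 0 means it is null.
        int[] levels = {1, 0, 1, 1, 0};
        System.out.println(Arrays.toString(nullMask(levels, 1)));
        // prints [false, true, false, false, true]
    }
}
```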
Constructor and Description |
---|
AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) |
Modifier and Type | Method and Description |
---|---|
protected void | afterReadPage() - Called after a page has been read, in case further initialization is needed. |
protected void | checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName) |
protected abstract void | readBatch(int rowId, int num, VECTOR column) - Reads a batch from runLenDecoder and dataInputStream. |
protected abstract void | readBatchFromDictionaryIds(int rowId, int num, VECTOR column, WritableIntVector dictionaryIds) - Decodes dictionary ids to data. |
void | readToVector(int readNumber, VECTOR vector) - Reads `readNumber` values from this column reader into `vector`. |
protected boolean | supportLazyDecode() - Whether lazy decoding of dictionary ids is supported. |
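
The method summary suggests a template-method arrangement: the final `readToVector` drives the read, while subclasses supply `readBatch` for plain data and `readBatchFromDictionaryIds` for dictionary-encoded data. The following is a minimal, self-contained sketch of that arrangement under simplifying assumptions. It uses plain `int[]` arrays in place of Flink's vector types, ignores pages, definition levels, and lazy decoding, and every class name in it (`ReaderSketch`, `SketchReader`, `IntReader`) is hypothetical. It is not the actual Flink implementation.

```java
import java.util.Arrays;

/** Hypothetical illustration only; none of these types exist in Flink. */
final class ReaderSketch {

    /** Simplified stand-in for an abstract column reader. */
    abstract static class SketchReader {
        /** Dictionary values, or null when the data is not dictionary-encoded (assumption). */
        protected final int[] dictionary;
        /** Encoded input: plain values, or dictionary ids when a dictionary exists. */
        protected final int[] pageData;
        /** Index of the next unread entry in pageData. */
        protected int position = 0;

        SketchReader(int[] dictionary, int[] pageData) {
            this.dictionary = dictionary;
            this.pageData = pageData;
        }

        /** Final entry point: picks the decoding path and delegates the typed work. */
        public final void readToVector(int readNumber, int[] vector) {
            if (dictionary == null) {
                readBatch(0, readNumber, vector);
            } else {
                // Collect the ids first, then let the subclass map them to values.
                int[] dictionaryIds = Arrays.copyOfRange(pageData, position, position + readNumber);
                position += readNumber;
                readBatchFromDictionaryIds(0, readNumber, vector, dictionaryIds);
            }
        }

        protected abstract void readBatch(int rowId, int num, int[] column);

        protected abstract void readBatchFromDictionaryIds(
                int rowId, int num, int[] column, int[] dictionaryIds);
    }

    /** Concrete reader for int values. */
    static final class IntReader extends SketchReader {
        IntReader(int[] dictionary, int[] pageData) {
            super(dictionary, pageData);
        }

        @Override
        protected void readBatch(int rowId, int num, int[] column) {
            // Plain data: copy values straight into the target column.
            System.arraycopy(pageData, position, column, rowId, num);
            position += num;
        }

        @Override
        protected void readBatchFromDictionaryIds(
                int rowId, int num, int[] column, int[] dictionaryIds) {
            // In this sketch the ids are indexed by the same row position as the column.
            for (int i = rowId; i < rowId + num; i++) {
                column[i] = dictionary[dictionaryIds[i]];
            }
        }
    }

    public static void main(String[] args) {
        int[] plain = new int[3];
        new IntReader(null, new int[] {7, 8, 9}).readToVector(3, plain);
        System.out.println(Arrays.toString(plain)); // prints [7, 8, 9]

        int[] decoded = new int[4];
        new IntReader(new int[] {100, 200}, new int[] {1, 0, 0, 1}).readToVector(4, decoded);
        System.out.println(Arrays.toString(decoded)); // prints [200, 100, 100, 200]
    }
}
```

Keeping the entry point final and the two decoding hooks abstract means the choice of decoding path lives in one place, and each subclass only has to handle its own value type.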
protected final org.apache.parquet.column.Dictionary dictionary
protected final int maxDefLevel
protected final org.apache.parquet.column.ColumnDescriptor descriptor
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder runLenDecoder
public AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) throws IOException
Throws: IOException
protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)
public final void readToVector(int readNumber, VECTOR vector) throws IOException
Specified by: readToVector in interface ColumnReader<VECTOR extends WritableColumnVector>
Parameters:
readNumber - number of values to read.
vector - vector to write to.
Throws: IOException
protected void afterReadPage()
protected boolean supportLazyDecode()
Whether lazy decoding of dictionary ids is supported; see ParquetDictionary. If this returns false, all the data is decoded first.
protected abstract void readBatch(int rowId, int num, VECTOR column)
Reads a batch from runLenDecoder and dataInputStream.
protected abstract void readBatchFromDictionaryIds(int rowId, int num, VECTOR column, WritableIntVector dictionaryIds)
Decodes dictionary ids to data, using runLenDecoder and dictionaryIdsDecoder.
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.