T - The type of the element
@PublicEvolving public abstract class Vectorizer<T> extends Object implements Serializable
Users have to extend this class and override the vectorize() method with the logic to
transform the element to a VectorizedRowBatch.
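|Modifier and Type||Method and Description|
|void||addUserMetadata(String key, ByteBuffer value): Adds arbitrary user metadata to the outgoing ORC file.|
|org.apache.orc.TypeDescription||getSchema(): Provides the ORC schema.|
|void||setWriter(org.apache.orc.Writer writer): Users are not supposed to call this method; it is intended for internal use only.|
|abstract void||vectorize(T element, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch): Transforms the provided element to ColumnVectors and sets them in the exposed VectorizedRowBatch.|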
public Vectorizer(String schema)
public org.apache.orc.TypeDescription getSchema()
public void setWriter(org.apache.orc.Writer writer)
writer - the underlying ORC Writer.
public void addUserMetadata(String key, ByteBuffer value)
Users who want to add metadata dynamically, based either on the input or on an
external system, can do so by calling
addUserMetadata(...) inside the overridden vectorize() method.
key - a key to label the data with.
value - the contents of the metadata.
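Because addUserMetadata takes its value as a ByteBuffer, string metadata has to be encoded before it is passed in. The following is a minimal sketch of that step using only the JDK. The class and method names (MetadataValues, encode, decode) are hypothetical helpers for illustration and are not part of Flink or ORC.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink or ORC.
final class MetadataValues {
    private MetadataValues() {}

    /** Encodes a string as a UTF-8 ByteBuffer, the value type addUserMetadata expects. */
    static ByteBuffer encode(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes a UTF-8 ByteBuffer back to a string without moving the buffer's position. */
    static String decode(ByteBuffer buffer) {
        // duplicate() shares the bytes but has its own position and limit.
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }
}
```

Inside an overridden vectorize(), a call could then look like addUserMetadata("source", MetadataValues.encode("orders-topic")), where the key and value are illustrative only.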
public abstract void vectorize(T element, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch) throws IOException
element - The input element
batch - The batch to write the ColumnVectors
IOException - if there is an error while transforming the input.
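The following is a minimal sketch of a subclass that overrides vectorize(), assuming the flink-orc module and its ORC and Hive dependencies are on the classpath. The Person type, the PersonVectorizer name, and the schema string struct<name:string,age:int> are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

import org.apache.flink.orc.vector.Vectorizer;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

// Hypothetical element type used only for this sketch.
class Person implements Serializable {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical Vectorizer for the assumed schema "struct<name:string,age:int>".
class PersonVectorizer extends Vectorizer<Person> {

    PersonVectorizer(String schema) {
        super(schema);
    }

    @Override
    public void vectorize(Person element, VectorizedRowBatch batch) throws IOException {
        // Column order follows the field order of the schema string.
        BytesColumnVector nameCol = (BytesColumnVector) batch.cols[0];
        LongColumnVector ageCol = (LongColumnVector) batch.cols[1];

        // Claim the next free row in the batch, then fill each column at that index.
        int row = batch.size++;
        nameCol.setVal(row, element.name.getBytes(StandardCharsets.UTF_8));
        ageCol.vector[row] = element.age;
    }
}
```

An instance for this sketch would be created with new PersonVectorizer("struct<name:string,age:int>"). ORC stores int fields in a LongColumnVector, which is why the age column is cast to that type.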
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.