public class HiveTableSink extends Object implements AppendStreamTableSink, PartitionableTableSink, OverwritableTableSink
Constructor and Description
---
HiveTableSink(boolean userMrWriter, boolean isBounded, org.apache.hadoop.mapred.JobConf jobConf, ObjectIdentifier identifier, CatalogTable table)
Modifier and Type | Method and Description
---|---
TableSink | configure(String[] fieldNames, TypeInformation[] fieldTypes): Returns a copy of this TableSink configured with the field names and types of the table to emit.
boolean | configurePartitionGrouping(boolean supportsGrouping): If this returns true, the sink can trust that all records will be grouped by partition fields before being consumed by the TableSink.
DataStreamSink | consumeDataStream(DataStream dataStream): Consumes the DataStream and returns the sink transformation DataStreamSink.
DataType | getConsumedDataType(): Returns the data type consumed by this TableSink.
TableSchema | getTableSchema(): Returns the schema of the consumed table.
void | setOverwrite(boolean overwrite): Configures whether the insert should overwrite existing data or not.
void | setStaticPartition(Map<String,String> partitionSpec): Sets the static partition into the TableSink.
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface TableSink:
getFieldNames, getFieldTypes, getOutputType
public HiveTableSink(boolean userMrWriter, boolean isBounded, org.apache.hadoop.mapred.JobConf jobConf, ObjectIdentifier identifier, CatalogTable table)
public final DataStreamSink consumeDataStream(DataStream dataStream)
Description copied from interface: StreamTableSink
Consumes the DataStream and returns the sink transformation DataStreamSink. The returned DataStreamSink will be used to set resources for the sink operator.
Specified by:
consumeDataStream in interface StreamTableSink
public DataType getConsumedDataType()
Description copied from interface: TableSink
Returns the data type consumed by this TableSink.
Specified by:
getConsumedDataType in interface TableSink
Returns:
The data type expected by this TableSink.

public TableSchema getTableSchema()
Description copied from interface: TableSink
Returns the schema of the consumed table.
Specified by:
getTableSchema in interface TableSink
Returns:
The TableSchema of the consumed table.

public TableSink configure(String[] fieldNames, TypeInformation[] fieldTypes)
Description copied from interface: TableSink
Returns a copy of this TableSink configured with the field names and types of the table to emit.
Specified by:
configure in interface TableSink

public boolean configurePartitionGrouping(boolean supportsGrouping)
Description copied from interface: PartitionableTableSink
If this returns true, the sink can trust that all records will be grouped by partition fields before being consumed by the TableSink, i.e. the sink will receive all elements of one partition and then all elements of another partition; elements of different partitions will not be mixed. For some sinks, this can be used to reduce the number of partition writers and improve writing performance.
This method is used to configure whether the input should be grouped by partition. If it returns true, the sink should also configure itself accordingly, i.e. set an internal field that changes the writing behavior (writing one partition at a time).
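The grouping contract above can be illustrated with a small plain-Java sketch (this is not Flink code; the `Row` record and `groupByPartition` helper are hypothetical). It shows what a sink may assume when grouping is enabled: all elements of one partition arrive contiguously, typically achieved with a sort in batch mode.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of the configurePartitionGrouping(true) contract.
public class PartitionGroupingSketch {

    // A record is modeled as a partition key plus a payload.
    record Row(String partition, String payload) {}

    // Stable sort by partition key: all rows of one partition arrive together,
    // and rows of different partitions are never interleaved. Batch runtimes
    // usually implement grouping with a sort like this.
    static List<Row> groupByPartition(List<Row> input) {
        return input.stream()
                .sorted(Comparator.comparing(Row::partition))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> input = List.of(
                new Row("p=1", "a"), new Row("p=2", "b"),
                new Row("p=1", "c"), new Row("p=2", "d"));
        // After grouping, only one partition writer needs to be open at a time.
        System.out.println(groupByPartition(input));
    }
}
```

Because the grouped input never interleaves partitions, a sink can close the writer for one partition before opening the next, which is the performance benefit the Javadoc describes.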
Specified by:
configurePartitionGrouping in interface PartitionableTableSink
Parameters:
supportsGrouping - whether the execution mode supports grouping, e.g. grouping (usually implemented with a sort) is only supported in batch mode, not in streaming mode. If supportsGrouping is false, this method should never return true (which requires grouping), otherwise the job will fail.

public void setStaticPartition(Map<String,String> partitionSpec)
Description copied from interface: PartitionableTableSink
Sets the static partition into the TableSink. The static partition may cover only a subset of all partition columns. See the class Javadoc for more details.
The static partition is represented as a Map<String, String> which maps from partition field name to partition value. The partition values are all encoded as strings, i.e. encoded using String.valueOf(...). For example, given a static partition f0=1024, f1="foo", f2="bar", where f0 is an integer type and f1 and f2 are string types, the values will all be encoded as strings: "1024", "foo", "bar". They can be decoded back to the original literals based on the field types.
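The encoding described above can be sketched as follows (the `encode` helper is hypothetical, not part of the Flink API; it simply shows how typed values become the Map<String, String> that setStaticPartition expects).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper showing how typed static-partition values are encoded
// to strings via String.valueOf(...), per the setStaticPartition contract.
public class StaticPartitionEncoding {

    static Map<String, String> encode(Map<String, Object> typedSpec) {
        Map<String, String> spec = new LinkedHashMap<>();
        // Every value, whatever its original type, becomes its string form.
        typedSpec.forEach((field, value) -> spec.put(field, String.valueOf(value)));
        return spec;
    }

    public static void main(String[] args) {
        Map<String, Object> typed = new LinkedHashMap<>();
        typed.put("f0", 1024);    // integer partition column
        typed.put("f1", "foo");   // string partition column
        typed.put("f2", "bar");   // string partition column
        // Prints {f0=1024, f1=foo, f2=bar}: ready to pass to setStaticPartition
        System.out.println(encode(typed));
    }
}
```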
Specified by:
setStaticPartition in interface PartitionableTableSink
Parameters:
partitionSpec - the user-specified static partition

public void setOverwrite(boolean overwrite)
Description copied from interface: OverwritableTableSink
Configures whether the insert should overwrite existing data or not.
Specified by:
setOverwrite in interface OverwritableTableSink
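The semantics this flag controls can be sketched with a minimal in-memory model (not Flink code; the class and its methods are hypothetical): with overwrite enabled, an insert replaces a partition's existing rows, otherwise new rows are appended.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the setOverwrite(boolean) semantics using a
// partition -> rows map in place of real table storage.
public class OverwriteSemanticsSketch {

    private final Map<String, List<String>> storage = new HashMap<>();
    private boolean overwrite = false;

    // Mirrors OverwritableTableSink.setOverwrite(boolean).
    void setOverwrite(boolean overwrite) {
        this.overwrite = overwrite;
    }

    void insert(String partition, List<String> rows) {
        if (overwrite) {
            storage.put(partition, new ArrayList<>(rows));      // replace existing data
        } else {
            storage.computeIfAbsent(partition, p -> new ArrayList<>())
                   .addAll(rows);                               // append to existing data
        }
    }

    List<String> read(String partition) {
        return storage.getOrDefault(partition, List.of());
    }

    public static void main(String[] args) {
        OverwriteSemanticsSketch sink = new OverwriteSemanticsSketch();
        sink.insert("p=1", List.of("a", "b"));
        sink.insert("p=1", List.of("c"));     // append mode: partition holds a, b, c
        sink.setOverwrite(true);
        sink.insert("p=1", List.of("d"));     // overwrite mode: partition holds only d
        System.out.println(sink.read("p=1"));
    }
}
```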
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.