@Experimental public class JdbcInputFormat extends RichInputFormat<Row,InputSplit> implements ResultTypeQueryable<Row>
TypeInformation>[] fieldTypes = new TypeInformation>[] {
BasicTypeInfo.INT_TYPE_INFO,
BasicTypeInfo.STRING_TYPE_INFO,
BasicTypeInfo.STRING_TYPE_INFO,
BasicTypeInfo.DOUBLE_TYPE_INFO,
BasicTypeInfo.INT_TYPE_INFO
};
RowTypeInfo rowTypeInfo = new RowTypeInfo(fieldTypes);
JdbcInputFormat jdbcInputFormat = JdbcInputFormat.buildJdbcInputFormat()
.setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
.setDBUrl("jdbc:derby:memory:ebookshop")
.setQuery("select * from books")
.setRowTypeInfo(rowTypeInfo)
.finish();
In order to query the JDBC source in parallel, you need to provide a parameterized query
template (i.e. a valid PreparedStatement
) and a JdbcParameterValuesProvider
which
provides binding values for the query parameters. E.g.:
Serializable[][] queryParameters = new String[2][1];
queryParameters[0] = new String[]{"Kumar"};
queryParameters[1] = new String[]{"Tan Ah Teck"};
JdbcInputFormat jdbcInputFormat = JdbcInputFormat.buildJdbcInputFormat()
.setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
.setDBUrl("jdbc:derby:memory:ebookshop")
.setQuery("select * from books WHERE author = ?")
.setRowTypeInfo(rowTypeInfo)
.setParametersProvider(new JdbcGenericParameterValuesProvider(queryParameters))
.finish();
Modifier and Type | Class and Description |
---|---|
static class |
JdbcInputFormat.JdbcInputFormatBuilder
Builder for
JdbcInputFormat . |
Modifier and Type | Field and Description |
---|---|
protected Boolean |
autoCommit |
protected JdbcConnectionProvider |
connectionProvider |
protected int |
fetchSize |
protected boolean |
hasNext |
protected static org.slf4j.Logger |
LOG |
protected Object[][] |
parameterValues |
protected String |
queryTemplate |
protected ResultSet |
resultSet |
protected int |
resultSetConcurrency |
protected int |
resultSetType |
protected RowTypeInfo |
rowTypeInfo |
protected static long |
serialVersionUID |
protected PreparedStatement |
statement |
Constructor and Description |
---|
JdbcInputFormat() |
Modifier and Type | Method and Description |
---|---|
static JdbcInputFormat.JdbcInputFormatBuilder |
buildJdbcInputFormat()
A builder used to set parameters to the output format's configuration in a fluent way.
|
void |
close()
Closes all resources used.
|
void |
closeInputFormat()
Closes this InputFormat instance.
|
void |
configure(Configuration parameters)
Configures this input format.
|
InputSplit[] |
createInputSplits(int minNumSplits)
Computes the input splits.
|
protected Connection |
getDbConn() |
InputSplitAssigner |
getInputSplitAssigner(InputSplit[] inputSplits)
Returns the assigner for the input splits.
|
RowTypeInfo |
getProducedType()
Gets the data type (as a
TypeInformation ) produced by this function or input format. |
protected PreparedStatement |
getStatement() |
BaseStatistics |
getStatistics(BaseStatistics cachedStatistics)
Gets the basic statistics from the input described by this format.
|
Row |
nextRecord(Row reuse)
Stores the next resultSet row in a tuple.
|
void |
open(InputSplit inputSplit)
Connects to the source database and executes the query in a parallel fashion if this
InputFormat is built using a parameterized query (i.e. |
void |
openInputFormat()
Opens this InputFormat instance.
|
boolean |
reachedEnd()
Checks whether all data has been read.
|
getRuntimeContext, setRuntimeContext
protected static final long serialVersionUID
protected static final org.slf4j.Logger LOG
protected JdbcConnectionProvider connectionProvider
protected String queryTemplate
protected int resultSetType
protected int resultSetConcurrency
protected RowTypeInfo rowTypeInfo
protected transient PreparedStatement statement
protected transient ResultSet resultSet
protected int fetchSize
protected Boolean autoCommit
protected boolean hasNext
protected Object[][] parameterValues
public RowTypeInfo getProducedType()
ResultTypeQueryable
TypeInformation
) produced by this function or input format.getProducedType
in interface ResultTypeQueryable<Row>
public void configure(Configuration parameters)
InputFormat
This method is always called first on a newly instantiated input format.
configure
in interface InputFormat<Row,InputSplit>
parameters
- The configuration with all parameters (note: not the Flink config but the
TaskConfig).public void openInputFormat()
RichInputFormat
openInputFormat
in class RichInputFormat<Row,InputSplit>
InputFormat
public void closeInputFormat()
RichInputFormat
RichInputFormat.openInputFormat()
should be closed in this method.closeInputFormat
in class RichInputFormat<Row,InputSplit>
InputFormat
public void open(InputSplit inputSplit) throws IOException
InputFormat
is built using a parameterized query (i.e. using a PreparedStatement
) and a proper JdbcParameterValuesProvider
, in a non-parallel
fashion otherwise.open
in interface InputFormat<Row,InputSplit>
inputSplit
- which is ignored if this InputFormat is executed as a non-parallel source,
a "hook" to the query parameters otherwise (using its splitNumber)IOException
- if there's an error during the execution of the querypublic void close() throws IOException
close
in interface InputFormat<Row,InputSplit>
IOException
- Indicates that a resource could not be closed.public boolean reachedEnd() throws IOException
reachedEnd
in interface InputFormat<Row,InputSplit>
IOException
public Row nextRecord(Row reuse) throws IOException
nextRecord
in interface InputFormat<Row,InputSplit>
reuse
- row to be reused.Row
IOException
public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException
InputFormat
When this method is called, the input format is guaranteed to be configured.
getStatistics
in interface InputFormat<Row,InputSplit>
cachedStatistics
- The statistics that were cached. May be null.IOException
public InputSplit[] createInputSplits(int minNumSplits) throws IOException
InputSplitSource
createInputSplits
in interface InputFormat<Row,InputSplit>
createInputSplits
in interface InputSplitSource<InputSplit>
minNumSplits
- Number of minimal input splits, as a hint.IOException
public InputSplitAssigner getInputSplitAssigner(InputSplit[] inputSplits)
InputSplitSource
getInputSplitAssigner
in interface InputFormat<Row,InputSplit>
getInputSplitAssigner
in interface InputSplitSource<InputSplit>
@VisibleForTesting protected PreparedStatement getStatement()
@VisibleForTesting protected Connection getDbConn()
public static JdbcInputFormat.JdbcInputFormatBuilder buildJdbcInputFormat()
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.