public class TPCHQuery10 extends Object
This program implements the following SQL equivalent:
SELECT
c_custkey,
c_name,
c_address,
n_name,
c_acctbal,
SUM(l_extendedprice * (1 - l_discount)) AS revenue
FROM
customer,
orders,
lineitem,
nation
WHERE
c_custkey = o_custkey
AND l_orderkey = o_orderkey
AND YEAR(o_orderdate) > '1990'
AND l_returnflag = 'R'
AND c_nationkey = n_nationkey
GROUP BY
c_custkey,
c_name,
c_acctbal,
n_name,
c_address
Compared to the original TPC-H query, this version does not print c_phone and c_comment, filters on order years later than 1990 instead of a three-month period, and does not sort the result by revenue.
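The join, filter, and aggregation in the query can be sketched with plain Java collections, independent of Flink. This is only an illustration of the logic, not the code of this class: the names `Query10Sketch`, `Order`, `LineItem`, and `revenueByCustomer` are hypothetical, the rows are trimmed to the columns the filter and the SUM need, and the joins to customer and nation are omitted because they only attach descriptive columns to each c_custkey group.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class Query10Sketch {

    /** Hypothetical trimmed orders row: o_orderkey, o_custkey, YEAR(o_orderdate). */
    static final class Order {
        final long orderKey;
        final long custKey;
        final int orderYear;

        Order(long orderKey, long custKey, int orderYear) {
            this.orderKey = orderKey;
            this.custKey = custKey;
            this.orderYear = orderYear;
        }
    }

    /** Hypothetical trimmed lineitem row: l_orderkey, l_extendedprice, l_discount, l_returnflag. */
    static final class LineItem {
        final long orderKey;
        final double extendedPrice;
        final double discount;
        final String returnFlag;

        LineItem(long orderKey, double extendedPrice, double discount, String returnFlag) {
            this.orderKey = orderKey;
            this.extendedPrice = extendedPrice;
            this.discount = discount;
            this.returnFlag = returnFlag;
        }
    }

    /** Computes SUM(l_extendedprice * (1 - l_discount)) per c_custkey. */
    static Map<Long, Double> revenueByCustomer(List<Order> orders, List<LineItem> lineItems) {
        // Keep only orders with YEAR(o_orderdate) > 1990, indexed by o_orderkey.
        Map<Long, Long> custKeyByOrderKey = new HashMap<>();
        for (Order o : orders) {
            if (o.orderYear > 1990) {
                custKeyByOrderKey.put(o.orderKey, o.custKey);
            }
        }

        // Join on l_orderkey = o_orderkey, keep returned items, and sum per customer.
        Map<Long, Double> revenue = new TreeMap<>();
        for (LineItem l : lineItems) {
            Long custKey = custKeyByOrderKey.get(l.orderKey);
            if (custKey != null && "R".equals(l.returnFlag)) {
                revenue.merge(custKey, l.extendedPrice * (1 - l.discount), Double::sum);
            }
        }
        return revenue;
    }
}
```

A hash lookup on the filtered orders stands in for the join here; a distributed engine would choose its own join strategy.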
Input files are plain-text CSV files that use the pipe character ('|') as the field separator, as generated by the TPC-H data generator, which is available at http://www.tpc.org/tpch/.
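As a rough illustration of this input format, a single line can be split into fields as shown below. This is a minimal sketch, not the reader this class uses: `PipeLineParser` and `parseFields` are hypothetical names, the sample row in the usage note is made up, and the code assumes each generated line may end with a trailing '|' terminator.

```java
final class PipeLineParser {

    /** Splits one pipe-delimited line into its fields, keeping empty fields. */
    static String[] parseFields(String line) {
        // Assumption: a line may end with a terminating '|', which is not a field.
        String body = line.endsWith("|") ? line.substring(0, line.length() - 1) : line;
        // The pipe is a regex metacharacter and must be escaped;
        // limit -1 keeps empty fields at the end of the line.
        return body.split("\\|", -1);
    }
}
```

For example, `parseFields("1|Customer#000000001|15|711.56|")` yields the four fields `1`, `Customer#000000001`, `15`, and `711.56`.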
Usage:
TPCHQuery10 --customer <path> --orders <path> --lineitem <path> --nation <path> --output <path>
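The `--name <path>` pairs on this command line can be read into a map with a few lines of plain Java. The sketch below is hypothetical and is not how this class parses its arguments: `Query10Args` and `parse` are illustrative names, and treating all five options as required is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Query10Args {

    // Assumption for this sketch: every option from the usage line is mandatory.
    static final List<String> REQUIRED =
            List.of("customer", "orders", "lineitem", "nation", "output");

    /** Parses "--name value" pairs and checks that all required names are present. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--") || i + 1 >= args.length) {
                throw new IllegalArgumentException("Expected '--name <path>' at position " + i);
            }
            params.put(args[i].substring(2), args[i + 1]);
        }
        for (String name : REQUIRED) {
            if (!params.containsKey(name)) {
                throw new IllegalArgumentException("Missing option --" + name);
            }
        }
        return params;
    }
}
```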
This example shows how to use:
Note: All Flink DataSet APIs are deprecated since Flink 1.18 and will be removed in a future Flink major version. You can still build your application with the DataSet API, but you should move to the DataStream API and/or the Table API. This class is retained for testing purposes.
Constructor and Description |
---|
TPCHQuery10() |
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.