public class DefaultCostEstimator extends CostEstimator
This estimator uses actual estimates where they are available and falls back to relative costs where they are not. That way, plans with different strategies are costed differently even in the absence of estimates. These relative costs represent the estimator's heuristic guidance towards certain strategies.
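The fallback described above can be illustrated with a small, self-contained sketch. It is not the optimizer's code: `SketchCosts`, `UNKNOWN`, `HEURISTIC_BASE`, and the factor of 10 for broadcasting are hypothetical stand-ins chosen for illustration. The idea shown is that a quantifiable cost is recorded only when an estimate exists, while a relative cost is always added so that strategies still rank differently.

```java
// Simplified sketch. SketchCosts and all constants are hypothetical stand-ins,
// not the optimizer's real classes or values.
final class FallbackCostSketch {

    /** Marker meaning "no quantifiable estimate available". */
    static final double UNKNOWN = -1.0;

    /** Assumed base unit for relative (heuristic) costs. */
    static final double HEURISTIC_BASE = 1_000_000_000.0;

    static final class SketchCosts {
        double networkCost = 0.0;          // bytes, or UNKNOWN
        double heuristicNetworkCost = 0.0; // relative cost, always maintained

        void addNetworkCost(double bytes) {
            // An unknown cost stays unknown, regardless of what is added later.
            if (networkCost != UNKNOWN) {
                networkCost += bytes;
            }
        }
    }

    /** Repartitioning: use the estimate if present, always add a relative cost. */
    static void addPartitioningCost(long estimatedBytes, SketchCosts costs) {
        if (estimatedBytes <= 0) {
            costs.networkCost = UNKNOWN;
        } else {
            costs.addNetworkCost(estimatedBytes);
        }
        costs.heuristicNetworkCost += HEURISTIC_BASE;
    }

    /** Broadcasting: assumed here to be relatively more expensive than repartitioning. */
    static void addBroadcastCost(long estimatedBytes, int replicationFactor, SketchCosts costs) {
        if (estimatedBytes <= 0) {
            costs.networkCost = UNKNOWN;
        } else {
            costs.addNetworkCost((double) estimatedBytes * replicationFactor);
        }
        costs.heuristicNetworkCost += HEURISTIC_BASE * 10 * replicationFactor;
    }

    public static void main(String[] args) {
        SketchCosts partition = new SketchCosts();
        SketchCosts broadcast = new SketchCosts();
        addPartitioningCost(-1, partition);  // no estimate available
        addBroadcastCost(-1, 4, broadcast);  // no estimate available
        System.out.println("partition heuristic: " + partition.heuristicNetworkCost);
        System.out.println("broadcast heuristic: " + broadcast.heuristicNetworkCost);
    }
}
```

With no estimates, both quantifiable costs are unknown, yet the two plans remain distinguishable through their relative costs.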
For robustness, we always assume that the whole data set is shipped during a repartition step. We deviate from the typical estimate of (n - 1) / n (with n being the number of nodes) because, for a parallelism of 1, that estimate would yield a shipping of zero bytes. While this is usually correct, the runtime scheduling may still choose to move tasks to different nodes, so we cannot be certain that no data is shipped.
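To make the difference concrete, the following sketch compares the typical (n - 1) / n estimate with the robust whole-data assumption. The class and method names are hypothetical and exist only for this illustration.

```java
// Illustrative sketch; class and method names are hypothetical.
final class ShippingEstimateSketch {

    /** Typical estimate: a fraction (n - 1) / n of the data leaves its node. */
    static long typicalShippedBytes(long totalBytes, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // Equivalent to totalBytes * (n - 1) / n, written to avoid overflow.
        return totalBytes - totalBytes / n;
    }

    /** Robust assumption: all data is shipped, independent of n. */
    static long robustShippedBytes(long totalBytes, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        return totalBytes;
    }

    public static void main(String[] args) {
        long totalBytes = 1000L;
        for (int n : new int[] {1, 4}) {
            System.out.println("n=" + n
                    + " typical=" + typicalShippedBytes(totalBytes, n)
                    + " robust=" + robustShippedBytes(totalBytes, n));
        }
    }
}
```

For n = 1 the typical estimate drops to zero, whereas the robust assumption still charges the full data size, which covers the case where the scheduler places tasks on different nodes.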
Constructor and Description |
---|
DefaultCostEstimator() |
Modifier and Type | Method and Description |
---|---|
void | addArtificialDamCost(EstimateProvider estimates, long bufferSize, Costs costs) |
void | addBlockNestedLoopsCosts(EstimateProvider outerSide, EstimateProvider innerSide, long blockSize, Costs costs, int costWeight) |
void | addBroadcastCost(EstimateProvider estimates, int replicationFactor, Costs costs) |
void | addCachedHybridHashCosts(EstimateProvider buildSideInput, EstimateProvider probeSideInput, Costs costs, int costWeight) Calculates the costs for the cached variant of the hybrid hash join. |
void | addFileInputCost(long fileSizeInBytes, Costs costs) |
void | addHashPartitioningCost(EstimateProvider estimates, Costs costs) |
void | addHybridHashCosts(EstimateProvider buildSideInput, EstimateProvider probeSideInput, Costs costs, int costWeight) |
void | addLocalMergeCost(EstimateProvider input1, EstimateProvider input2, Costs costs, int costWeight) |
void | addLocalSortCost(EstimateProvider estimates, Costs costs) |
void | addRandomPartitioningCost(EstimateProvider estimates, Costs costs) |
void | addRangePartitionCost(EstimateProvider estimates, Costs costs) |
void | addStreamedNestedLoopsCosts(EstimateProvider outerSide, EstimateProvider innerSide, long bufferSize, Costs costs, int costWeight) |
Methods inherited from class CostEstimator: costOperator
public void addRandomPartitioningCost(EstimateProvider estimates, Costs costs)
Specified by: addRandomPartitioningCost in class CostEstimator
public void addHashPartitioningCost(EstimateProvider estimates, Costs costs)
Specified by: addHashPartitioningCost in class CostEstimator
public void addRangePartitionCost(EstimateProvider estimates, Costs costs)
Specified by: addRangePartitionCost in class CostEstimator
public void addBroadcastCost(EstimateProvider estimates, int replicationFactor, Costs costs)
Specified by: addBroadcastCost in class CostEstimator
public void addFileInputCost(long fileSizeInBytes, Costs costs)
Specified by: addFileInputCost in class CostEstimator
public void addLocalSortCost(EstimateProvider estimates, Costs costs)
Specified by: addLocalSortCost in class CostEstimator
public void addLocalMergeCost(EstimateProvider input1, EstimateProvider input2, Costs costs, int costWeight)
Specified by: addLocalMergeCost in class CostEstimator
public void addHybridHashCosts(EstimateProvider buildSideInput, EstimateProvider probeSideInput, Costs costs, int costWeight)
Specified by: addHybridHashCosts in class CostEstimator
public void addCachedHybridHashCosts(EstimateProvider buildSideInput, EstimateProvider probeSideInput, Costs costs, int costWeight)
Specified by: addCachedHybridHashCosts in class CostEstimator
public void addStreamedNestedLoopsCosts(EstimateProvider outerSide, EstimateProvider innerSide, long bufferSize, Costs costs, int costWeight)
Specified by: addStreamedNestedLoopsCosts in class CostEstimator
public void addBlockNestedLoopsCosts(EstimateProvider outerSide, EstimateProvider innerSide, long blockSize, Costs costs, int costWeight)
Specified by: addBlockNestedLoopsCosts in class CostEstimator
public void addArtificialDamCost(EstimateProvider estimates, long bufferSize, Costs costs)
Specified by: addArtificialDamCost in class CostEstimator
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.