Class SamplingSkewReshapingPass
- java.lang.Object
  - org.apache.nemo.common.pass.Pass
    - org.apache.nemo.compiler.optimizer.pass.compiletime.CompileTimePass
      - org.apache.nemo.compiler.optimizer.pass.compiletime.reshaping.ReshapingPass
        - org.apache.nemo.compiler.optimizer.pass.compiletime.reshaping.SamplingSkewReshapingPass
public final class SamplingSkewReshapingPass extends ReshapingPass
Optimizes the PartitionSet property of shuffle edges to handle data skew using SamplingVertex. This pass partitions the IRDAG at its non-oneToOne edges, clones each sub-DAG partition with SamplingVertex objects that process sampled data, and executes each cloned partition before the corresponding original partition.
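The partitioning step can be illustrated with a small standalone sketch. It does not use Nemo's classes: `PartitionSketch`, `Edge`, and the vertex names are hypothetical, and the union-find grouping is only one plausible way to split a DAG at its non-oneToOne edges, not necessarily how this pass is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionSketch {
    /** Hypothetical edge model for illustration; this is not Nemo's IREdge. */
    static final class Edge {
        final String src;
        final String dst;
        final boolean oneToOne;

        Edge(String src, String dst, boolean oneToOne) {
            this.src = src;
            this.dst = dst;
            this.oneToOne = oneToOne;
        }
    }

    /** Union-find lookup with path halving. */
    static String find(Map<String, String> parent, String v) {
        while (!parent.get(v).equals(v)) {
            parent.put(v, parent.get(parent.get(v)));
            v = parent.get(v);
        }
        return v;
    }

    /** Groups vertices joined by one-to-one edges; every other edge is a partition boundary. */
    static List<List<String>> partition(List<String> vertices, List<Edge> edges) {
        Map<String, String> parent = new LinkedHashMap<>();
        for (String v : vertices) {
            parent.put(v, v);
        }
        for (Edge e : edges) {
            if (e.oneToOne) {
                // Merge the destination's group into the source's group.
                parent.put(find(parent, e.dst), find(parent, e.src));
            }
        }
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String v : vertices) {
            groups.computeIfAbsent(find(parent, v), k -> new ArrayList<>()).add(v);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        List<String> vertices = List.of("read", "map", "reduce", "write");
        List<Edge> edges = List.of(
            new Edge("read", "map", true),      // one-to-one: same partition
            new Edge("map", "reduce", false),   // shuffle: partition boundary
            new Edge("reduce", "write", true)); // one-to-one: same partition
        System.out.println(partition(vertices, edges)); // prints [[read, map], [reduce, write]]
    }
}
```

In this toy input the single shuffle edge yields two partitions, which correspond to the Px partitions in the description.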
Suppose the IRDAG is partitioned into three sub-DAG partitions with shuffle dependencies as follows: P1 - P2 - P3
This pass then produces something like: P1' - P1 - P2' - P2 - P3, where each Px' consists of SamplingVertex objects that clone the execution of Px. (P3 is not cloned because it is a sink partition: none of the outgoing edges of its vertices needs to be optimized.)
For each Px', this pass also inserts a TriggerVertex, whose data statistics are used to dynamically optimize the execution behavior of Px.
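The resulting order can be sketched for the linear case. The example below is a toy model under stated assumptions: partitions are plain labels in a linear chain, the last one is the sink, and `SamplingOrderSketch` and `withSampledClones` are hypothetical names that are not part of Nemo. TriggerVertex insertion is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

class SamplingOrderSketch {
    /**
     * Returns the execution order for a linear chain of partitions:
     * each non-sink partition Px is preceded by its sampled clone Px'.
     * Assumption: the last element of the list is the only sink partition.
     */
    static List<String> withSampledClones(List<String> partitions) {
        List<String> order = new ArrayList<>();
        for (int i = 0; i < partitions.size(); i++) {
            String p = partitions.get(i);
            boolean isSink = (i == partitions.size() - 1);
            if (!isSink) {
                order.add(p + "'"); // sampled clone runs first
            }
            order.add(p);           // then the original partition
        }
        return order;
    }

    public static void main(String[] args) {
        // prints [P1', P1, P2', P2, P3]
        System.out.println(withSampledClones(List.of("P1", "P2", "P3")));
    }
}
```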
-
Constructor Summary
Constructor | Description
SamplingSkewReshapingPass() | Default constructor.
-
Method Summary
Modifier and Type | Method | Description
IRDAG | apply(IRDAG dag) |
-
Methods inherited from class org.apache.nemo.compiler.optimizer.pass.compiletime.reshaping.ReshapingPass
getPrerequisiteExecutionProperties
-
Methods inherited from class org.apache.nemo.common.pass.Pass
addCondition, getCondition