Package org.apache.sysds.resource.cost
Class CostEstimator
java.lang.Object
    org.apache.sysds.resource.cost.CostEstimator
public class CostEstimator extends Object
Class for estimating the execution time of a program. To estimate the time for a new set of resources, create a new instance of CostEstimator.
-
-
Constructor Summary
Constructors:
- CostEstimator(Program program, CloudInstance driverNode, CloudInstance executorNode)
-
Method Summary
Modifier and Type, Method, Description:
- static double estimateExecutionTime(Program program, CloudInstance driverNode, CloudInstance executorNode): Entry point for estimating the execution time of a program.
- VarStats getStats(String statsName): Intended to be called only when it is certain that the corresponding variable is not a scalar and its statistics are in _stats already.
- VarStats getStatsWithDefaultScalar(String statsName): Intended to be called when the corresponding variable could be scalar.
- double getTimeEstimate()
- double getTimeEstimateCPInst(CPInstruction inst): Estimates the execution time of a single CP instruction following the formula C(p) = T_w + max(T_r, T_c) with: T_w - instruction write (to mem.) time; T_r - instruction read (to mem.) time; T_c - instruction compute time.
- double getTimeEstimateInst(Instruction inst)
- double getTimeEstimateSparkJob(VarStats varToCollect)
- void maintainFCallInputStats(FunctionCallCPInstruction finst): Creates copies of the VarStats for the function argument.
- void maintainFCallOutputStats(FunctionCallCPInstruction finst, FunctionProgramBlock fpb): Creates copies of the VarStats for the function output parameters.
- void maintainStats(Instruction inst): Keeps the basic-block variable statistics updated and computes I/O cost.
- double parseSPInst(SPInstruction inst): Parses a Spark instruction and stores the cost of computing the output variable in the RDD statistics object related to that variable.
- void putStats(HashMap<String,VarStats> inputStats): Meant to be used for testing purposes.
-
-
-
Constructor Detail
-
CostEstimator
public CostEstimator(Program program, CloudInstance driverNode, CloudInstance executorNode)
-
-
Method Detail
-
estimateExecutionTime
public static double estimateExecutionTime(Program program, CloudInstance driverNode, CloudInstance executorNode) throws CostEstimationException
Entry point for estimating the execution time of a program.
- Parameters:
program - compiled runtime program
driverNode - ?
executorNode - ?
- Returns:
estimated time for execution of the program given the resources set in SparkExecutionContext
- Throws:
CostEstimationException - in case of errors
-
putStats
public void putStats(HashMap<String,VarStats> inputStats)
Meant to be used for testing purposes.
- Parameters:
inputStats - ?
-
getStats
public VarStats getStats(String statsName)
Intended to be called only when it is certain that the corresponding variable is not a scalar and its statistics are in _stats already.
- Parameters:
statsName - the corresponding operand name
- Returns:
VarStats object if the given key is present in the map saving the current variable statistics.
- Throws:
RuntimeException - if the corresponding variable is not in _stats
-
getStatsWithDefaultScalar
public VarStats getStatsWithDefaultScalar(String statsName)
Intended to be called when the corresponding variable could be scalar.
- Parameters:
statsName - the corresponding operand name
- Returns:
VarStats object in any case
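The difference between the two lookups, getStats and getStatsWithDefaultScalar, can be illustrated with a minimal, self-contained sketch. It does not use the SystemDS classes: `Stats`, `getStrict`, and `getWithDefaultScalar` are hypothetical stand-ins, and whether the real method also stores the default scalar entry in the map is not stated here, so this sketch simply returns a fresh object.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: Stats and both lookup methods are hypothetical
// stand-ins, not the SystemDS VarStats class or CostEstimator internals.
final class StatsLookupSketch {
    static final class Stats {
        final String name;
        final boolean scalar;

        Stats(String name, boolean scalar) {
            this.name = name;
            this.scalar = scalar;
        }
    }

    private final Map<String, Stats> stats = new HashMap<>();

    void put(Stats s) {
        stats.put(s.name, s);
    }

    // Strict lookup: a missing entry is treated as an error.
    Stats getStrict(String name) {
        Stats s = stats.get(name);
        if (s == null) {
            throw new RuntimeException("No statistics for variable: " + name);
        }
        return s;
    }

    // Lenient lookup: an unknown name is assumed to refer to a scalar
    // (assumption of this sketch), so a result is returned in any case.
    Stats getWithDefaultScalar(String name) {
        Stats s = stats.get(name);
        return (s != null) ? s : new Stats(name, true);
    }
}
```

The strict variant fails fast when a matrix operand was never registered, while the lenient variant suits operands that may be literals or scalars without stored statistics.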
-
getTimeEstimate
public double getTimeEstimate() throws CostEstimationException
- Throws:
CostEstimationException
-
maintainFCallInputStats
public void maintainFCallInputStats(FunctionCallCPInstruction finst)
Creates copies of the VarStats for the function argument. Meant to be called before estimating the execution time of the function program block of the corresponding function call instruction, otherwise the relevant statistics would not be available for the estimation.
- Parameters:
finst - ?
-
maintainFCallOutputStats
public void maintainFCallOutputStats(FunctionCallCPInstruction finst, FunctionProgramBlock fpb)
Creates copies of the VarStats for the function output parameters. Meant to be called after estimating the execution time of the function program block of the corresponding function call instruction, otherwise the relevant statistics would not have been created yet.
- Parameters:
finst - ?
fpb - ?
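The ordering constraint on maintainFCallInputStats and maintainFCallOutputStats (inputs before costing the function body, outputs after) can be sketched as follows. This is only an illustration under simplifying assumptions: a map from variable name to cell count stands in for the variable statistics, and `FCallStatsSketch` and `copyStats` are hypothetical names, not part of SystemDS.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a map from variable name to cell count stands in
// for the estimator's variable statistics; all names here are hypothetical.
final class FCallStatsSketch {
    final Map<String, Long> stats = new HashMap<>();

    // Makes the statistics stored under each 'from' name available
    // under the matching 'to' name.
    void copyStats(String[] from, String[] to) {
        if (from.length != to.length) {
            throw new IllegalArgumentException("Name lists differ in length");
        }
        for (int i = 0; i < from.length; i++) {
            Long s = stats.get(from[i]);
            if (s == null) {
                throw new IllegalStateException("No statistics for variable: " + from[i]);
            }
            stats.put(to[i], s);
        }
    }

    public static void main(String[] args) {
        FCallStatsSketch est = new FCallStatsSketch();
        est.stats.put("A", 1_000_000L); // caller variable passed as argument

        // 1. Before costing the body: argument A becomes parameter X.
        est.copyStats(new String[] {"A"}, new String[] {"X"});
        // 2. Costing the body would create statistics for the function output Y.
        est.stats.put("Y", est.stats.get("X"));
        // 3. After costing the body: output Y becomes caller variable B.
        est.copyStats(new String[] {"Y"}, new String[] {"B"});

        System.out.println(est.stats.get("B"));
    }
}
```

Swapping steps 1 and 3 around step 2 fails in this sketch because the statistics being copied do not exist yet, which mirrors the reason given for the call order.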
-
maintainStats
public void maintainStats(Instruction inst)
Keeps the basic-block variable statistics updated and computes I/O cost. NOTE: During program execution, files are read only once the matrix is needed, but for cost estimation the place where the cost is added is not relevant.
- Parameters:
inst - ?
-
getTimeEstimateInst
public double getTimeEstimateInst(Instruction inst) throws CostEstimationException
- Throws:
CostEstimationException
-
getTimeEstimateCPInst
public double getTimeEstimateCPInst(CPInstruction inst) throws CostEstimationException
Estimates the execution time of a single CP instruction following the formula C(p) = T_w + max(T_r, T_c) with:
- T_w - instruction write (to mem.) time
- T_r - instruction read (to mem.) time
- T_c - instruction compute time
- Parameters:
inst - instruction for estimation
- Returns:
estimated time in seconds
- Throws:
CostEstimationException - when the hardware configuration is not sufficient
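As a worked instance of the formula C(p) = T_w + max(T_r, T_c), the following minimal sketch computes the estimate from three times given in seconds. The class and method names are hypothetical and how SystemDS derives the individual terms is not shown. One plausible reading of the max term, assumed here, is that reading and computing overlap, so only the slower of the two contributes.

```java
// Illustrative sketch only; not the SystemDS implementation.
final class CpCostSketch {
    private CpCostSketch() {}

    // C(p) = T_w + max(T_r, T_c), with all times in seconds.
    static double estimate(double readTime, double computeTime, double writeTime) {
        if (readTime < 0 || computeTime < 0 || writeTime < 0) {
            throw new IllegalArgumentException("Times must be non-negative");
        }
        return writeTime + Math.max(readTime, computeTime);
    }

    public static void main(String[] args) {
        // Compute-bound case: 0.25 + max(0.5, 2.0) = 2.25
        System.out.println(estimate(0.5, 2.0, 0.25));
        // Read-bound case: 0.25 + max(3.0, 1.0) = 3.25
        System.out.println(estimate(3.0, 1.0, 0.25));
    }
}
```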
-
parseSPInst
public double parseSPInst(SPInstruction inst) throws CostEstimationException
Parses a Spark instruction and stores the cost of computing the output variable in the RDD statistics object related to that variable. This method is responsible for initializing the corresponding RDDStats object for each output variable, including outputs that are explicitly brought back to CP (Spark action within the instruction). It returns the time estimate only for instructions that bring the output explicitly to CP. For the rest, the estimated time (cost) is stored as part of the corresponding RDD statistics, emulating the lazy evaluation of Spark.
- Parameters:
inst - Spark instruction for parsing
- Returns:
if explicit action, estimated time in seconds, else always 0
- Throws:
CostEstimationException - ?
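The deferred-cost bookkeeping described for parseSPInst can be sketched in a few lines. This is a simplified illustration, not the SystemDS logic: `LazyCostSketch`, `transformation`, and `action` are hypothetical names, a plain map of seconds stands in for the RDD statistics, and summing the pending cost of all inputs is an assumption that would double count shared lineage.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a map from variable name to pending cost (seconds)
// stands in for the per-variable RDD statistics; all names are hypothetical.
final class LazyCostSketch {
    private final Map<String, Double> deferred = new HashMap<>();

    // Own cost of the instruction plus the pending cost of its inputs.
    private double pending(double ownCost, String... inputs) {
        double total = ownCost;
        for (String in : inputs) {
            total += deferred.getOrDefault(in, 0.0);
        }
        return total;
    }

    // Transformation: nothing runs yet, so the cost is stored with the
    // output variable and 0 is returned.
    double transformation(String output, double ownCost, String... inputs) {
        deferred.put(output, pending(ownCost, inputs));
        return 0.0;
    }

    // Action: the pending work is triggered, so the accumulated cost is
    // returned; the output still gets an entry, with nothing left pending.
    double action(String output, double ownCost, String... inputs) {
        double total = pending(ownCost, inputs);
        deferred.put(output, 0.0);
        return total;
    }

    double deferredCost(String variable) {
        return deferred.getOrDefault(variable, 0.0);
    }
}
```

For example, two chained transformations costing 1.5 s and 2.0 s both return 0, and a following action costing 0.5 s returns 4.0 s, the total of the whole chain.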
-
getTimeEstimateSparkJob
public double getTimeEstimateSparkJob(VarStats varToCollect)
-
-