Class QDictionary
- java.lang.Object
-
- org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
-
- org.apache.sysds.runtime.compress.colgroup.dictionary.ACachingMBDictionary
-
- org.apache.sysds.runtime.compress.colgroup.dictionary.QDictionary
-
- All Implemented Interfaces:
Serializable, IDictionary
public class QDictionary extends ACachingMBDictionary
This dictionary class aims to encapsulate the storage and operations over unique floating point values of a column group. The primary reason for its introduction was to provide an entry point for specialization such as shared dictionaries, which require additional information.
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.sysds.runtime.compress.colgroup.dictionary.IDictionary
IDictionary.DictType
-
-
Field Summary
-
Fields inherited from interface org.apache.sysds.runtime.compress.colgroup.dictionary.IDictionary
LOG
-
-
Method Summary
- double aggregate(double init, Builtin fn)
Aggregate all the contained values. Useful in value-only computations where the operation iterates through all values contained in the dictionary.
- QDictionary clone()
Returns a deep clone of the dictionary.
- int[] countNNZZeroColumns(int[] counts)
Count the number of non-zero values in each column of the dictionary, multiplied with the counts.
- static QDictionary create(byte[] values, double scale, int nCol, boolean check)
- MatrixBlockDictionary createMBDict(int nCol)
- boolean equals(IDictionary o)
Indicate if the other dictionary is equal to this.
- IDictionary.DictType getDictType()
Get the type of this dictionary.
- long getExactSizeOnDisk()
Calculate the space consumption if the dictionary is stored on disk.
- long getInMemorySize()
Returns the memory usage of the dictionary.
- static long getInMemorySize(int valuesCount)
- MatrixBlockDictionary getMBDict()
- long getNumberNonZeros(int[] counts, int nCol)
Calculate the number of non-zeros in the dictionary.
- int getNumberOfColumns(int nCol)
Get the number of columns in this dictionary, provided you know the number of values, or rows.
- int getNumberOfValues(int nCol)
Get the number of distinct tuples given that the column group has n columns.
- double getSparsity()
Get the sparsity of the dictionary.
- String getString(int colIndexes)
Get a string representation of the dictionary that considers the layout of the data.
- double getValue(int i)
Get the specific value contained in the dictionary at the given index.
- double getValue(int r, int c, int nCol)
Get the specific value contained in the dictionary at the given row and column.
- double[] getValues()
Get all the values contained in the dictionary as a linearized double array.
- static QDictionary read(DataInput in)
- IDictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
Modify the dictionary by removing columns not within the index range.
- double[] sumAllRowsToDouble(int nrColumns)
Pre-aggregates each tuple in the dictionary to a single double value.
- double[] sumAllRowsToDoubleSq(int nrColumns)
Pre-aggregates each tuple in the dictionary to a single double value.
- void write(DataOutput out)
Write the dictionary to a DataOutput.
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.dictionary.ACachingMBDictionary
getMBDict
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
addToEntry, addToEntry, addToEntryVectorized, aggregateCols, aggregateColsWithReference, aggregateRows, aggregateRowsWithDefault, aggregateRowsWithReference, aggregateWithReference, append, applyScalarOp, applyScalarOpAndAppend, applyScalarOpWithReference, applyUnaryOp, applyUnaryOpAndAppend, applyUnaryOpWithReference, binOpLeft, binOpLeftAndAppend, binOpLeftWithReference, binOpRight, binOpRight, binOpRightAndAppend, binOpRightWithReference, cbind, centralMoment, centralMoment, centralMomentWithDefault, centralMomentWithDefault, centralMomentWithReference, centralMomentWithReference, colProduct, colProductWithReference, colSum, colSumSq, colSumSqWithReference, containsValue, containsValueWithReference, correctNan, equals, equals, getNumberNonZerosWithReference, getRow, MMDict, MMDictDense, MMDictScaling, MMDictScalingDense, MMDictScalingSparse, MMDictSparse, multiplyScalar, preaggValuesFromDense, product, productAllRowsToDouble, productAllRowsToDoubleWithDefault, productAllRowsToDoubleWithReference, productWithDefault, productWithReference, putDense, putSparse, reorder, replace, replaceWithReference, rexpandCols, rexpandColsWithReference, rightMMPreAggSparse, scaleTuples, subtractTuple, sum, sumAllRowsToDoubleSqWithDefault, sumAllRowsToDoubleSqWithReference, sumAllRowsToDoubleWithDefault, sumAllRowsToDoubleWithReference, sumSq, sumSqWithReference, TSMMToUpperTriangle, TSMMToUpperTriangleDense, TSMMToUpperTriangleDenseScaling, TSMMToUpperTriangleScaling, TSMMToUpperTriangleSparse, TSMMToUpperTriangleSparseScaling, TSMMWithScaling
-
-
-
-
Method Detail
-
create
public static QDictionary create(byte[] values, double scale, int nCol, boolean check)
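The signature suggests that the dictionary holds quantized byte values together with a single double scale. A minimal sketch of that idea follows, assuming each value is recovered as the stored byte multiplied by the scale. The class name QuantizedValuesSketch and the decoding rule are assumptions for illustration only, not the SystemDS implementation.

```java
// Hypothetical stand-in for a quantized dictionary; not the SystemDS class.
final class QuantizedValuesSketch {
    private final byte[] values; // quantized entries, linearized
    private final double scale;  // assumed common scale factor

    QuantizedValuesSketch(byte[] values, double scale) {
        this.values = values;
        this.scale = scale;
    }

    // Assumed decoding rule: stored byte times scale.
    double getValue(int i) {
        return values[i] * scale;
    }

    // Decode every entry into a linearized double array.
    double[] getValues() {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++)
            out[i] = getValue(i);
        return out;
    }
}
```

Under this assumption, a byte array of {1, -2, 4} with scale 0.5 decodes to 0.5, -1.0 and 2.0.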
-
getValues
public double[] getValues()
Description copied from interface: IDictionary
Get all the values contained in the dictionary as a linearized double array.
- Specified by:
getValues in interface IDictionary
- Overrides:
getValues in class ADictionary
- Returns:
linearized double array
-
getValue
public double getValue(int i)
Description copied from interface: IDictionary
Get the specific value contained in the dictionary at the given index.
- Specified by:
getValue in interface IDictionary
- Overrides:
getValue in class ADictionary
- Parameters:
i - The index to extract the value from
- Returns:
The value contained at the index
-
getValue
public final double getValue(int r, int c, int nCol)
Description copied from interface: IDictionary
Get the specific value contained in the dictionary at the given row and column.
- Specified by:
getValue in interface IDictionary
- Overrides:
getValue in class ADictionary
- Parameters:
r - Row target
c - Column target
nCol - Number of columns in the dictionary
- Returns:
The value at row r and column c
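Because the values are described as a linearized array, a row and column lookup has to be mapped to a single offset. The sketch below assumes a row-major layout, where tuple r occupies the cells from r * nCol up to (r + 1) * nCol. The layout and the class name LinearIndexSketch are assumptions, not something this page states.

```java
// Hypothetical helper illustrating row-major addressing of a linearized dictionary.
final class LinearIndexSketch {
    // Assumption: row-major layout, so cell (r, c) lives at offset r * nCol + c.
    static double getValue(double[] values, int r, int c, int nCol) {
        return values[r * nCol + c];
    }
}
```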
-
getInMemorySize
public long getInMemorySize()
Description copied from interface: IDictionary
Returns the memory usage of the dictionary.
- Returns:
a long value in number of bytes for the dictionary.
-
getInMemorySize
public static long getInMemorySize(int valuesCount)
-
aggregate
public double aggregate(double init, Builtin fn)
Description copied from interface: IDictionary
Aggregate all the contained values. Useful in value-only computations where the operation iterates through all values contained in the dictionary.
- Specified by:
aggregate in interface IDictionary
- Overrides:
aggregate in class ADictionary
- Parameters:
init - The initial value; for an aggregate such as max, this could be -infinity
fn - The function to apply to values
- Returns:
The aggregated value as a double.
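The aggregate described here is a fold over all dictionary values, starting from init. A minimal sketch, using java.util.function.DoubleBinaryOperator as a stand-in for the Builtin function object (the stand-in and the class name AggregateSketch are assumptions for illustration):

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical fold over dictionary values; not the SystemDS implementation.
final class AggregateSketch {
    // DoubleBinaryOperator stands in for the Builtin function object.
    static double aggregate(double[] values, double init, DoubleBinaryOperator fn) {
        double acc = init;
        for (double v : values)
            acc = fn.applyAsDouble(acc, v); // combine running result with next value
        return acc;
    }
}
```

Calling this with Double.NEGATIVE_INFINITY and Math::max yields the maximum of the values, which is why -infinity is a natural initial value for max.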
-
clone
public QDictionary clone()
Description copied from interface: IDictionary
Returns a deep clone of the dictionary.
- Specified by:
clone in interface IDictionary
- Specified by:
clone in class ADictionary
- Returns:
A deep clone
-
write
public void write(DataOutput out) throws IOException
Description copied from interface: IDictionary
Write the dictionary to a DataOutput.
- Parameters:
out - the output sink to write the dictionary to.
- Throws:
IOException - if the sink fails.
-
read
public static QDictionary read(DataInput in) throws IOException
- Throws:
IOException
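The write and read methods form a serialization pair over DataOutput and DataInput. The sketch below shows one way such a round trip could look for a byte array plus a scale. The byte layout (scale, then count, then raw bytes) and the class name DictIoSketch are assumptions; this page does not state the actual serialized form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical serializer; the byte layout below is an assumption.
final class DictIoSketch {
    // Simple holder for what was read back.
    static final class Decoded {
        final byte[] values;
        final double scale;

        Decoded(byte[] values, double scale) {
            this.values = values;
            this.scale = scale;
        }
    }

    // Assumed layout: scale (double), value count (int), then the raw bytes.
    static void write(DataOutput out, byte[] values, double scale) throws IOException {
        out.writeDouble(scale);
        out.writeInt(values.length);
        out.write(values);
    }

    // Reads the fields back in the order write emitted them.
    static Decoded read(DataInput in) throws IOException {
        double scale = in.readDouble();
        byte[] values = new byte[in.readInt()];
        in.readFully(values);
        return new Decoded(values, scale);
    }

    // Round trip through an in-memory buffer, for demonstration.
    static Decoded roundTrip(byte[] values, double scale) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf), values, scale);
            return read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```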
-
getExactSizeOnDisk
public long getExactSizeOnDisk()
Description copied from interface: IDictionary
Calculate the space consumption if the dictionary is stored on disk.
- Returns:
the long count of bytes to store the dictionary.
-
getNumberOfValues
public int getNumberOfValues(int nCol)
Description copied from interface: IDictionary
Get the number of distinct tuples given that the column group has n columns.
- Parameters:
nCol - The number of columns in the ColumnGroup.
- Returns:
the number of value tuples contained in the dictionary.
-
getNumberOfColumns
public int getNumberOfColumns(int nCol)
Description copied from interface: IDictionary
Get the number of columns in this dictionary, provided you know the number of values, or rows.
- Parameters:
nCol - The number of rows/values known inside this dictionary
- Returns:
The number of columns
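Both shape queries follow from the size of the linearized array: dividing the total number of cells by one dimension gives the other. A minimal sketch, assuming the array holds exactly rows times columns cells (the class name DictShapeSketch and the totalCells parameter are hypothetical):

```java
// Hypothetical shape helpers for a linearized dictionary.
final class DictShapeSketch {
    // Number of tuples (rows) when the column count is known.
    static int numberOfValues(int totalCells, int nCol) {
        return totalCells / nCol;
    }

    // Number of columns when the tuple (row) count is known.
    static int numberOfColumns(int totalCells, int nRow) {
        return totalCells / nRow;
    }
}
```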
-
sumAllRowsToDouble
public double[] sumAllRowsToDouble(int nrColumns)
Description copied from interface: IDictionary
Pre-aggregates each tuple in the dictionary to a single double value. Note that if the number of columns is one, the dictionary's actual values are simply returned.
- Specified by:
sumAllRowsToDouble in interface IDictionary
- Overrides:
sumAllRowsToDouble in class ADictionary
- Parameters:
nrColumns - The number of columns in the ColGroup, needed to know how to get the values from the dictionary.
- Returns:
a double array containing the row sums from this dictionary.
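The pre-aggregation can be pictured as summing each tuple of nrColumns consecutive cells. A minimal sketch over an already decoded double array, assuming a row-major layout (the class name RowSumSketch is hypothetical, and the real method works on the dictionary's internal storage):

```java
// Hypothetical row-sum pre-aggregation over a linearized, row-major array.
final class RowSumSketch {
    static double[] sumAllRowsToDouble(double[] values, int nrColumns) {
        int nRows = values.length / nrColumns;
        double[] sums = new double[nRows];
        for (int r = 0; r < nRows; r++)
            for (int c = 0; c < nrColumns; c++)
                sums[r] += values[r * nrColumns + c]; // accumulate tuple r
        return sums;
    }
}
```

With a single column, each sum equals the corresponding value, which matches the note that the dictionary's values are returned in that case.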
-
sumAllRowsToDoubleSq
public double[] sumAllRowsToDoubleSq(int nrColumns)
Description copied from interface: IDictionary
Pre-aggregates each tuple in the dictionary to a single double value. Note that if the number of columns is one, the dictionary's actual values are simply returned.
- Specified by:
sumAllRowsToDoubleSq in interface IDictionary
- Overrides:
sumAllRowsToDoubleSq in class ADictionary
- Parameters:
nrColumns - The number of columns in the ColGroup, needed to know how to get the values from the dictionary.
- Returns:
a double array containing the row sums from this dictionary.
-
getString
public String getString(int colIndexes)
Description copied from interface: IDictionary
Get a string representation of the dictionary that considers the layout of the data.
- Parameters:
colIndexes - The number of columns in the dictionary.
- Returns:
A string that is nicer to print.
-
sliceOutColumnRange
public IDictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
Description copied from interface: IDictionary
Modify the dictionary by removing columns not within the index range.
- Specified by:
sliceOutColumnRange in interface IDictionary
- Overrides:
sliceOutColumnRange in class ADictionary
- Parameters:
idxStart - The column index to start at.
idxEnd - The column index to end at (not inclusive)
previousNumberOfColumns - The number of columns contained in the dictionary.
- Returns:
A dictionary containing the sliced out columns values only.
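Slicing a column range out of a linearized dictionary amounts to copying the half-open range from idxStart to idxEnd out of every tuple. A minimal sketch on a decoded double array, assuming a row-major layout (the class name SliceColumnsSketch is hypothetical):

```java
// Hypothetical column slicing over a linearized, row-major array.
final class SliceColumnsSketch {
    static double[] sliceOutColumnRange(double[] values, int idxStart, int idxEnd,
            int previousNumberOfColumns) {
        int nRows = values.length / previousNumberOfColumns;
        int newCols = idxEnd - idxStart; // idxEnd is exclusive
        double[] out = new double[nRows * newCols];
        for (int r = 0; r < nRows; r++)
            System.arraycopy(values, r * previousNumberOfColumns + idxStart,
                out, r * newCols, newCols);
        return out;
    }
}
```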
-
getNumberNonZeros
public long getNumberNonZeros(int[] counts, int nCol)
Description copied from interface: IDictionary
Calculate the number of non-zeros in the dictionary. The number of non-zeros should be scaled with the counts given. This gives the exact number of non-zero values in the parent column group.
- Parameters:
counts - The counts of each dictionary entry
nCol - The number of columns in this dictionary
- Returns:
The non-zero count
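The scaling by counts can be read as: count the non-zero cells in each tuple, then multiply by how often that tuple occurs in the column group. A minimal sketch on a decoded double array, assuming a row-major layout and one count per tuple (the class name NonZeroCountSketch is hypothetical):

```java
// Hypothetical non-zero count scaled by tuple frequencies.
final class NonZeroCountSketch {
    static long getNumberNonZeros(double[] values, int[] counts, int nCol) {
        long nnz = 0;
        for (int r = 0; r < counts.length; r++) {
            int rowNnz = 0;
            for (int c = 0; c < nCol; c++)
                if (values[r * nCol + c] != 0)
                    rowNnz++;
            nnz += (long) rowNnz * counts[r]; // tuple r appears counts[r] times
        }
        return nnz;
    }
}
```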
-
countNNZZeroColumns
public int[] countNNZZeroColumns(int[] counts)
Description copied from interface: IDictionary
Count the number of non-zero values in each column of the dictionary, multiplied with the counts.
- Specified by:
countNNZZeroColumns in interface IDictionary
- Overrides:
countNNZZeroColumns in class ADictionary
- Parameters:
counts - The counts to multiply with.
- Returns:
The non-zero count of each column in the dictionary.
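The per-column variant keeps one accumulator per column instead of a single total. In the sketch below the number of columns is derived from the array length and the number of counts, which is an assumption made because the method takes no nCol argument. The class name ColumnNonZeroSketch is hypothetical.

```java
// Hypothetical per-column non-zero count scaled by tuple frequencies.
final class ColumnNonZeroSketch {
    static int[] countNNZZeroColumns(double[] values, int[] counts) {
        int nCol = values.length / counts.length; // assumed: one count per tuple
        int[] ret = new int[nCol];
        for (int r = 0; r < counts.length; r++)
            for (int c = 0; c < nCol; c++)
                if (values[r * nCol + c] != 0)
                    ret[c] += counts[r]; // every occurrence of tuple r contributes
        return ret;
    }
}
```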
-
getDictType
public IDictionary.DictType getDictType()
Description copied from interface: IDictionary
Get the type of this dictionary.
- Returns:
The dictionary type.
-
getSparsity
public double getSparsity()
Description copied from interface: IDictionary
Get the sparsity of the dictionary.
- Specified by:
getSparsity in interface IDictionary
- Overrides:
getSparsity in class ADictionary
- Returns:
a sparsity between 0 and 1
-
equals
public boolean equals(IDictionary o)
Description copied from interface: IDictionary
Indicate if the other dictionary is equal to this.
- Parameters:
o - The other dictionary
- Returns:
true if the other dictionary is equal to this one
-
getMBDict
public MatrixBlockDictionary getMBDict()
- Overrides:
getMBDict in class ADictionary
-
createMBDict
public MatrixBlockDictionary createMBDict(int nCol)
- Specified by:
createMBDict in class ACachingMBDictionary
-
-