org.apache.mahout.math
Class FileBasedSparseBinaryMatrix

java.lang.Object
  extended by org.apache.mahout.math.AbstractMatrix
      extended by org.apache.mahout.math.FileBasedSparseBinaryMatrix
All Implemented Interfaces:
Cloneable, Iterable<MatrixSlice>, Matrix, VectorIterable

public final class FileBasedSparseBinaryMatrix
extends AbstractMatrix

Provides a way to get data from a file and treat it as if it were a matrix, but avoids putting all that data onto the Java heap. Instead, the file is mapped into non-heap memory as a DoubleBuffer and we access that instead. The interesting aspect of this is that the values in the matrix are binary and sparse so we don't need to store the actual data, just the location of non-zero values.

Currently file data is formatted as follows:

It would be preferable to use something like protobufs to define the format so that we can use different row formats for different kinds of data. For instance, Golay coding of column numbers or compressed bit vectors might be good representations for some purposes.
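
The idea described above (map a file into non-heap memory and store only the locations of non-zero values) can be sketched with the JDK alone. This is a minimal sketch and not Mahout's implementation or file format: the class name MappedBinaryRows, its methods, the layout it writes (row count, column count, then a count followed by sorted column numbers for each row) and the use of an IntBuffer view are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical illustration only; not Mahout's file format or implementation. */
final class MappedBinaryRows {
  private final IntBuffer data;   // int view over the memory-mapped file (off the Java heap)
  private final int rows;
  private final int columns;
  private final int[] rowStart;   // index in data of each row's first column number
  private final int[] rowLength;  // number of non-zero entries in each row

  private MappedBinaryRows(IntBuffer data) {
    this.data = data;
    this.rows = data.get(0);
    this.columns = data.get(1);
    this.rowStart = new int[rows];
    this.rowLength = new int[rows];
    int position = 2;
    for (int r = 0; r < rows; r++) {
      rowLength[r] = data.get(position);
      rowStart[r] = position + 1;
      position += 1 + rowLength[r];
    }
  }

  /** Writes rows, columns, then for each row a count followed by its sorted column numbers. */
  static void write(Path file, int columns, int[][] nonZeroColumns) throws IOException {
    int ints = 2;
    for (int[] row : nonZeroColumns) {
      ints += 1 + row.length;
    }
    ByteBuffer bytes = ByteBuffer.allocate(ints * Integer.BYTES);
    bytes.putInt(nonZeroColumns.length).putInt(columns);
    for (int[] row : nonZeroColumns) {
      int[] sorted = row.clone();
      Arrays.sort(sorted);
      bytes.putInt(sorted.length);
      for (int c : sorted) {
        bytes.putInt(c);
      }
    }
    Files.write(file, bytes.array());
  }

  /** Maps the file read-only; the mapping remains valid after the channel is closed. */
  static MappedBinaryRows open(Path file) throws IOException {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      return new MappedBinaryRows(
          channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()).asIntBuffer());
    }
  }

  int rows() {
    return rows;
  }

  int columns() {
    return columns;
  }

  /** Returns 1.0 if the column is listed for the row, otherwise 0.0; no bounds checks. */
  double get(int row, int column) {
    int lo = rowStart[row];
    int hi = lo + rowLength[row] - 1;
    while (lo <= hi) {               // binary search over the row's sorted column numbers
      int mid = (lo + hi) >>> 1;
      int c = data.get(mid);
      if (c == column) {
        return 1.0;
      }
      if (c < column) {
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return 0.0;
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("binary-rows", ".bin");
    write(file, 4, new int[][] {{1, 3}, {}, {0}});
    MappedBinaryRows m = open(file);
    System.out.println(m.get(0, 3)); // row 0 lists column 3
    System.out.println(m.get(1, 0)); // row 1 lists no columns
  }
}
```

Because only column numbers are stored, a lookup in this sketch is a binary search over one row's entries, and the per-row bookkeeping arrays are the only data kept on the heap.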


Nested Class Summary
static class FileBasedSparseBinaryMatrix.BinaryReadOnlyElement
           
 
Nested classes/interfaces inherited from class org.apache.mahout.math.AbstractMatrix
AbstractMatrix.TransposeViewVector
 
Field Summary
 
Fields inherited from class org.apache.mahout.math.AbstractMatrix
COL, columnLabelBindings, columns, ROW, rowLabelBindings, rows
 
Constructor Summary
FileBasedSparseBinaryMatrix(int rows, int columns)
          Constructs an empty matrix of the given size.
 
Method Summary
 Matrix assignColumn(int column, Vector other)
          Assign the other vector values to the column of the receiver
 Matrix assignRow(int row, Vector other)
          Assign the other vector values to the row of the receiver
 double getQuick(int rowIndex, int columnIndex)
          Return the value at the given indexes, without checking bounds
 Matrix like()
          Return an empty matrix of the same underlying class as the receiver
 Matrix like(int rows, int columns)
          Returns an empty matrix of the same underlying class as the receiver and of the specified size.
 void setData(File f)
           
 void setQuick(int row, int column, double value)
          Set the value at the given index, without checking bounds
 Matrix viewPart(int[] offset, int[] size)
          Return a view into part of a matrix.
 Vector viewRow(int rowIndex)
          Returns a view of a row.
static void writeMatrix(File f, Matrix m)
           
 
Methods inherited from class org.apache.mahout.math.AbstractMatrix
aggregate, aggregateColumns, aggregateRows, asFormatString, assign, assign, assign, assign, assign, clone, columnSize, determinant, divide, get, get, getColumnLabelBindings, getNumNondefaultElements, getRowLabelBindings, iterateAll, iterator, minus, numCols, numRows, numSlices, plus, plus, rowSize, set, set, set, set, set, set, setColumnLabelBindings, setRowLabelBindings, times, times, times, timesSquared, toString, transpose, viewColumn, viewDiagonal, viewPart, zSum
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

FileBasedSparseBinaryMatrix

public FileBasedSparseBinaryMatrix(int rows,
                                   int columns)
Constructs an empty matrix of the given size.

Parameters:
rows - The number of rows in the result.
columns - The number of columns in the result.
Method Detail

setData

public void setData(File f)
             throws IOException
Throws:
IOException

writeMatrix

public static void writeMatrix(File f,
                               Matrix m)
                        throws IOException
Throws:
IOException

assignColumn

public Matrix assignColumn(int column,
                           Vector other)
Assign the other vector values to the column of the receiver

Parameters:
column - the int column to assign
other - a Vector
Returns:
the modified receiver
Throws:
CardinalityException - if the cardinalities differ

assignRow

public Matrix assignRow(int row,
                        Vector other)
Assign the other vector values to the row of the receiver

Parameters:
row - the int row to assign
other - a Vector
Returns:
the modified receiver
Throws:
CardinalityException - if the cardinalities differ

getQuick

public double getQuick(int rowIndex,
                       int columnIndex)
Return the value at the given indexes, without checking bounds

Parameters:
rowIndex - an int row index
columnIndex - an int column index
Returns:
the double at the index

like

public Matrix like()
Return an empty matrix of the same underlying class as the receiver

Returns:
a Matrix

like

public Matrix like(int rows,
                   int columns)
Returns an empty matrix of the same underlying class as the receiver and of the specified size.

Parameters:
rows - the int number of rows
columns - the int number of columns

setQuick

public void setQuick(int row,
                     int column,
                     double value)
Set the value at the given index, without checking bounds

Parameters:
row - an int row index into the receiver
column - an int column index into the receiver
value - a double value to set

viewPart

public Matrix viewPart(int[] offset,
                       int[] size)
Return a view into part of a matrix. Changes to the view will change the original matrix.

Specified by:
viewPart in interface Matrix
Overrides:
viewPart in class AbstractMatrix
Parameters:
offset - an int[2] offset into the receiver
size - the int[2] size of the desired result
Returns:
a matrix that shares storage with part of the original matrix.
Throws:
CardinalityException - if the size is greater than the cardinality of the receiver
IndexException - if the offset is negative or offset + size is outside of the receiver

viewRow

public Vector viewRow(int rowIndex)
Returns a view of a row. Changes to the view will affect the original.

Specified by:
viewRow in interface Matrix
Overrides:
viewRow in class AbstractMatrix
Parameters:
rowIndex - Which row to return.
Returns:
A vector that references the desired row.


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.