org.apache.mahout.math.neighborhood
Class FastProjectionSearch

java.lang.Object
  extended by org.apache.mahout.math.neighborhood.Searcher
      extended by org.apache.mahout.math.neighborhood.UpdatableSearcher
          extended by org.apache.mahout.math.neighborhood.FastProjectionSearch
All Implemented Interfaces:
Iterable<Vector>

public class FastProjectionSearch
extends UpdatableSearcher

Performs approximate nearest-neighbor search by projecting the vectors, similar to ProjectionSearch. The main difference between this class and ProjectionSearch is the use of sorted arrays instead of binary search trees to implement the sets of scalar projections. Instead of taking log n time to add a vector to each of the projection sets, the pending additions are kept separate and are searched by brute force. When there are "enough" pending additions, they are committed into the main pool of vectors.
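The pending-additions idea described above can be illustrated with a minimal, self-contained sketch. This is not Mahout's implementation: the class name PendingProjectionSketch, the commitThreshold parameter, the use of a single random projection, and plain double[] vectors are all assumptions made for illustration. Committed vectors are kept sorted by their scalar projection, new vectors go into an unsorted pending list that is scanned by brute force, and the pending list is merged into the sorted array once it reaches the threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; this is not Mahout's FastProjectionSearch. */
class PendingProjectionSketch {
    private final double[] direction;   // a single random projection direction (assumption)
    private final int searchSize;       // committed candidates examined on each side of the query
    private final int commitThreshold;  // assumed meaning of "enough" pending additions
    private double[][] committed = new double[0][]; // vectors sorted by scalar projection
    private double[] keys = new double[0];          // scalar projections of the committed vectors
    private final List<double[]> pending = new ArrayList<>();

    PendingProjectionSketch(int dim, int searchSize, int commitThreshold, long seed) {
        Random rnd = new Random(seed);
        direction = new double[dim];
        for (int i = 0; i < dim; i++) direction[i] = rnd.nextGaussian();
        this.searchSize = searchSize;
        this.commitThreshold = commitThreshold;
    }

    /** Constant-time append; the sorted array is only rebuilt on commit. */
    void add(double[] v) {
        pending.add(v);
        if (pending.size() >= commitThreshold) commit();
    }

    /** Merges the pending vectors into the sorted array of committed vectors. */
    private void commit() {
        List<double[]> all = new ArrayList<>(Arrays.asList(committed));
        all.addAll(pending);
        pending.clear();
        all.sort(Comparator.comparingDouble(this::project));
        committed = all.toArray(new double[0][]);
        keys = new double[committed.length];
        for (int i = 0; i < keys.length; i++) keys[i] = project(committed[i]);
    }

    /** Approximate nearest neighbor: a window of committed vectors plus all pending ones. */
    double[] nearest(double[] query) {
        int pos = Arrays.binarySearch(keys, project(query));
        if (pos < 0) pos = -pos - 1; // convert "not found" result to the insertion point
        int lo = Math.max(0, pos - searchSize);
        int hi = Math.min(committed.length, pos + searchSize);
        List<double[]> candidates = new ArrayList<>(pending); // brute-force part
        candidates.addAll(Arrays.asList(committed).subList(lo, hi));
        double[] best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (double[] c : candidates) {
            double d = squaredDistance(query, c);
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return best;
    }

    int size() { return committed.length + pending.size(); }

    int pendingSize() { return pending.size(); }

    private double project(double[] v) {
        double dot = 0;
        for (int i = 0; i < direction.length; i++) dot += direction[i] * v[i];
        return dot;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; sum += d * d; }
        return sum;
    }
}
```

The trade-off in this sketch is that add is cheap while each query pays for a linear scan of the pending list, so the threshold bounds how much brute-force work a query can incur.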


Field Summary
 
Fields inherited from class org.apache.mahout.math.neighborhood.Searcher
distanceMeasure
 
Constructor Summary
FastProjectionSearch(DistanceMeasure distanceMeasure, int numProjections, int searchSize)
           
 
Method Summary
 void add(Vector vector)
          Add a new Vector to the Searcher that will be checked when getting the nearest neighbors.
 void clear()
           
 Iterator<Vector> iterator()
          Iterates over a snapshot of the contents taken when the iterator is created, regardless of any later modifications.
 boolean remove(Vector vector, double epsilon)
           
 List<WeightedThing<Vector>> search(Vector query, int limit)
          When querying the Searcher for the closest vectors, a list of WeightedThings is returned.
 WeightedThing<Vector> searchFirst(Vector query, boolean differentThanQuery)
          Returns the closest vector to the query.
 int size()
          Returns the number of WeightedVectors being searched for nearest neighbors.
 
Methods inherited from class org.apache.mahout.math.neighborhood.Searcher
addAll, addAllMatrixSlices, addAllMatrixSlicesAsWeightedVectors, getCandidateQueue, getDistanceMeasure, search, searchFirst
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FastProjectionSearch

public FastProjectionSearch(DistanceMeasure distanceMeasure,
                            int numProjections,
                            int searchSize)
Method Detail

add

public void add(Vector vector)
Add a new Vector to the Searcher that will be checked when getting the nearest neighbors.

The vector IS NOT CLONED. Do not modify the vector externally; otherwise, the internal Searcher data structures could be invalidated.

Specified by:
add in class Searcher

size

public int size()
Returns the number of WeightedVectors being searched for nearest neighbors.

Specified by:
size in class Searcher

search

public List<WeightedThing<Vector>> search(Vector query,
                                          int limit)
When querying the Searcher for the closest vectors, a list of WeightedThings is returned. The value of the WeightedThing is the neighbor and the weight is the distance (calculated by some metric - see a concrete implementation) between the query and the neighbor. The actual type of vector in the pair is the same as the vector added to the Searcher.

Specified by:
search in class Searcher
Parameters:
query - the vector to search for
limit - the number of results to return
Returns:
the list of weighted vectors closest to the query

searchFirst

public WeightedThing<Vector> searchFirst(Vector query,
                                         boolean differentThanQuery)
Returns the closest vector to the query. When only the nearest vector is needed, use this method rather than search(query, limit), because it is faster (less overhead).

Specified by:
searchFirst in class Searcher
Parameters:
query - the vector to search for
differentThanQuery - if true, returns the closest vector different than the query (this only matters if the query is among the searched vectors); otherwise, returns the closest vector to the query (even if it is the query itself).
Returns:
the weighted vector closest to the query
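
The effect of differentThanQuery can be seen in a small brute-force sketch. It is self-contained and does not use Mahout: the Neighbor class is a hypothetical stand-in for WeightedThing<Vector>, vectors are plain double[] arrays, and "different than the query" is assumed here to mean element-wise inequality, which may not match how the library compares vectors.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the differentThanQuery flag; not Mahout code. */
class SearchFirstSketch {
    /** Stand-in for WeightedThing<Vector>: a neighbor (value) and its distance (weight). */
    static final class Neighbor {
        final double[] value;
        final double weight;

        Neighbor(double[] value, double weight) {
            this.value = value;
            this.weight = weight;
        }
    }

    /** Returns the closest vector, optionally skipping vectors equal to the query. */
    static Neighbor searchFirst(List<double[]> data, double[] query, boolean differentThanQuery) {
        Neighbor best = null;
        for (double[] v : data) {
            // Assumption: "different" means the arrays differ in at least one element.
            if (differentThanQuery && Arrays.equals(v, query)) continue;
            double d = euclidean(v, query);
            if (best == null || d < best.weight) best = new Neighbor(v, d);
        }
        return best; // null when no eligible vector exists
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

With the query itself among the searched vectors, passing false yields the query at distance 0, while passing true yields its nearest distinct neighbor.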

remove

public boolean remove(Vector vector,
                      double epsilon)
Specified by:
remove in class UpdatableSearcher

clear

public void clear()
Specified by:
clear in class UpdatableSearcher

iterator

public Iterator<Vector> iterator()
Iterates over a snapshot of the contents taken when the iterator is created, regardless of any later modifications. Changes made after the iterator is created are not visible to the iterator but are visible when searching.

Returns:
iterator through the vectors in this searcher.
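
The snapshot behavior described for iterator() can be sketched as follows. This is a minimal stand-alone illustration, not the class's actual code: SnapshotIterationSketch is a hypothetical name, vectors are plain double[] arrays, and copying the backing list is just one simple way to obtain a snapshot.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration of snapshot iteration; not Mahout code. */
class SnapshotIterationSketch implements Iterable<double[]> {
    private final List<double[]> vectors = new ArrayList<>();

    void add(double[] v) {
        vectors.add(v);
    }

    int size() {
        return vectors.size();
    }

    @Override
    public Iterator<double[]> iterator() {
        // Copy the current contents; later add() calls change the live list,
        // but not the copy this iterator walks over.
        return new ArrayList<double[]>(vectors).iterator();
    }
}
```

An iterator obtained before a later add() still yields only the vectors present when it was created, while size() (and, by analogy, searching) reflects the addition.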


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.