org.apache.mahout.classifier.naivebayes
Class AbstractNaiveBayesClassifier

java.lang.Object
  extended by org.apache.mahout.classifier.AbstractVectorClassifier
      extended by org.apache.mahout.classifier.naivebayes.AbstractNaiveBayesClassifier
Direct Known Subclasses:
ComplementaryNaiveBayesClassifier, StandardNaiveBayesClassifier

public abstract class AbstractNaiveBayesClassifier
extends AbstractVectorClassifier

Class implementing the Naive Bayes Classifier Algorithm. Note that this class supports classifyFull(org.apache.mahout.math.Vector), but not classify or classifyScalar. These two methods are unsupported because the scores computed by a NaiveBayesClassifier do not represent probabilities.
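As a rough illustration of the contract described above, the following self-contained sketch uses a hypothetical stand-in class. ScoresOnlySketch, its weights, and the use of plain double[] in place of Mahout's Vector are all assumptions made for illustration and are not Mahout code. It exposes full scores and rejects the two unsupported methods. The made-up weights are negative only to show that such scores need not lie in [0, 1] or sum to 1.

```java
// Hypothetical stand-in, not Mahout code: double[] replaces org.apache.mahout.math.Vector.
class ScoresOnlySketch {
  // Made-up per-label, per-feature weights (illustrative values only).
  private final double[][] weights = {
      {-0.5, -2.0},
      {-1.5, -0.25}
  };

  int numCategories() {
    return weights.length;
  }

  // Supported: returns one raw score per category.
  double[] classifyFull(double[] instance) {
    double[] scores = new double[numCategories()];
    for (int label = 0; label < scores.length; label++) {
      for (int feature = 0; feature < instance.length; feature++) {
        scores[label] += instance[feature] * weights[label][feature];
      }
    }
    return scores;
  }

  // Unsupported, mirroring the behavior documented for classify(Vector).
  double[] classify(double[] instance) {
    throw new UnsupportedOperationException("scores are not probabilities");
  }

  // Unsupported, mirroring the behavior documented for classifyScalar(Vector).
  double classifyScalar(double[] instance) {
    throw new UnsupportedOperationException("scores are not probabilities");
  }
}
```

Callers written against such a class would rely on classifyFull alone and treat the returned values as relative scores.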


Field Summary
 
Fields inherited from class org.apache.mahout.classifier.AbstractVectorClassifier
MIN_LOG_LIKELIHOOD
 
Constructor Summary
protected AbstractNaiveBayesClassifier(NaiveBayesModel model)
           
 
Method Summary
 Vector classify(Vector instance)
          Unsupported method.
 Vector classifyFull(Vector instance)
          Computes and returns a vector containing n scores, where n is numCategories(), given an input vector instance.
 Vector classifyFull(Vector r, Vector instance)
          Computes and returns a vector containing n scores, where n is numCategories(), given an input vector instance.
 double classifyScalar(Vector instance)
          Unsupported method.
protected  NaiveBayesModel getModel()
           
protected abstract  double getScoreForLabelFeature(int label, int feature)
           
protected  double getScoreForLabelInstance(int label, Vector instance)
           
 int numCategories()
          Returns the number of categories that a target variable can be assigned to.
 
Methods inherited from class org.apache.mahout.classifier.AbstractVectorClassifier
classify, classifyFull, classifyNoLink, classifyScalar, logLikelihood
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractNaiveBayesClassifier

protected AbstractNaiveBayesClassifier(NaiveBayesModel model)
Method Detail

getModel

protected NaiveBayesModel getModel()

getScoreForLabelFeature

protected abstract double getScoreForLabelFeature(int label,
                                                  int feature)

getScoreForLabelInstance

protected double getScoreForLabelInstance(int label,
                                          Vector instance)
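
This page does not document how the per-instance score relates to the per-feature score. One plausible reading, offered as an assumption rather than as the confirmed implementation, is a weighted sum: each non-zero feature value multiplied by the score for that label and feature. The sketch below shows that composition with hypothetical names (LabelScorerSketch, TableScorerSketch) and plain double[] in place of Mahout's Vector.

```java
// Hypothetical sketch, not Mahout code: double[] replaces org.apache.mahout.math.Vector.
abstract class LabelScorerSketch {
  // Counterpart of getScoreForLabelFeature(int, int); subclasses supply the weighting.
  public abstract double scoreForLabelFeature(int label, int feature);

  // Assumed composition: sum of (feature value * per-feature score) over non-zero features.
  public double scoreForLabelInstance(int label, double[] instance) {
    double result = 0.0;
    for (int feature = 0; feature < instance.length; feature++) {
      if (instance[feature] != 0.0) {
        result += instance[feature] * scoreForLabelFeature(label, feature);
      }
    }
    return result;
  }
}

// Simplest possible subclass: per-feature scores read from a fixed table.
class TableScorerSketch extends LabelScorerSketch {
  private final double[][] table;

  TableScorerSketch(double[][] table) {
    this.table = table;
  }

  @Override
  public double scoreForLabelFeature(int label, int feature) {
    return table[label][feature];
  }
}
```

Under this reading, a concrete subclass only has to define the per-feature score, and the instance-level score follows from it.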

numCategories

public int numCategories()
Description copied from class: AbstractVectorClassifier
Returns the number of categories that a target variable can be assigned to. A vector classifier will encode its output as an integer from 0 to numCategories()-1 (inclusive).

Specified by:
numCategories in class AbstractVectorClassifier
Returns:
The number of categories.
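
Because categories are encoded as integers from 0 to numCategories()-1, callers usually need a way to map an index back to a human-readable label. The following sketch is one way to do that. The labels array, the class name CategoryLabelSketch, and the tie-breaking rule are assumptions made for illustration and are not part of this API.

```java
// Hypothetical helper, not Mahout code: maps a score array to a label name.
class CategoryLabelSketch {
  // Index of the largest score; in this sketch ties resolve to the lowest index.
  static int maxValueIndex(double[] scores) {
    int best = 0;
    for (int i = 1; i < scores.length; i++) {
      if (scores[i] > scores[best]) {
        best = i;
      }
    }
    return best;
  }

  // labels[i] names category i, so labels.length plays the role of numCategories().
  static String labelFor(double[] scores, String[] labels) {
    if (scores.length == 0 || scores.length != labels.length) {
      throw new IllegalArgumentException("expected one score per label");
    }
    return labels[maxValueIndex(scores)];
  }
}
```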

classifyFull

public Vector classifyFull(Vector instance)
Description copied from class: AbstractVectorClassifier
Computes and returns a vector containing n scores, where n is numCategories(), given an input vector instance. Higher scores indicate that the input vector is more likely to belong to the corresponding category. The categories are denoted by the integers 0 through n-1 (inclusive).

Using this method it is possible to classify an input vector, for example, by selecting the category with the largest score. If classifier is an instance of AbstractVectorClassifier and input is a Vector of features describing an element to be classified, then the following code could be used to classify input.

    Vector scores = classifier.classifyFull(input);
    int assignedCategory = scores.maxValueIndex();

Here assignedCategory is the index of the category with the maximum score.

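If an n-1 encoding is acceptable and allocation performance is an issue, the AbstractVectorClassifier.classify(Vector) method is generally the better choice. Note, however, that this class does not support classify(Vector) and throws an UnsupportedOperationException instead.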

Overrides:
classifyFull in class AbstractVectorClassifier
Parameters:
instance - A vector of features to be classified.
Returns:
A vector of scores, one for each category. For this class the scores do not represent probabilities.
See Also:
AbstractVectorClassifier.classify(Vector), AbstractVectorClassifier.classifyFull(Vector r, Vector instance)

classifyFull

public Vector classifyFull(Vector r,
                           Vector instance)
Description copied from class: AbstractVectorClassifier
Computes and returns a vector containing n scores, where n is numCategories(), given an input vector instance. Higher scores indicate that the input vector is more likely to belong to the corresponding category. The categories are denoted by the integers 0 through n-1 (inclusive). The main difference between this method and AbstractVectorClassifier.classifyFull(Vector) is that this method allows a user to provide a previously allocated Vector r to store the returned scores.

Using this method it is possible to classify an input vector, for example, by selecting the category with the largest score. If classifier is an instance of AbstractVectorClassifier, result is a non-null Vector, and input is a Vector of features describing an element to be classified, then the following code could be used to classify input.

    Vector scores = classifier.classifyFull(result, input); // Notice that scores == result
    int assignedCategory = scores.maxValueIndex();

Here assignedCategory is the index of the category with the maximum score.
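
The point of passing a previously allocated result vector is that one buffer can serve many calls. The sketch below illustrates that pattern in a self-contained form. ReusedBufferSketch, its weights parameter, and the use of double[] in place of Mahout's Vector are assumptions for illustration, and the scoring rule inside classifyFull is a placeholder rather than the library's computation.

```java
// Hypothetical sketch, not Mahout code: double[] replaces org.apache.mahout.math.Vector.
class ReusedBufferSketch {
  // Writes one score per category into r, overwriting old values, and returns r itself.
  static double[] classifyFull(double[] r, double[] instance, double[][] weights) {
    for (int label = 0; label < r.length; label++) {
      double score = 0.0;
      for (int feature = 0; feature < instance.length; feature++) {
        score += instance[feature] * weights[label][feature];
      }
      r[label] = score;
    }
    return r;
  }

  // Index of the largest score; in this sketch ties resolve to the lowest index.
  static int maxValueIndex(double[] scores) {
    int best = 0;
    for (int i = 1; i < scores.length; i++) {
      if (scores[i] > scores[best]) {
        best = i;
      }
    }
    return best;
  }

  // Classifies every input while allocating the score buffer only once.
  static int[] classifyAll(double[][] inputs, double[][] weights) {
    double[] result = new double[weights.length];
    int[] assigned = new int[inputs.length];
    for (int i = 0; i < inputs.length; i++) {
      double[] scores = classifyFull(result, inputs[i], weights); // scores == result
      assigned[i] = maxValueIndex(scores);
    }
    return assigned;
  }
}
```

Since the buffer is overwritten on each call, any scores that must outlive the next call need to be copied out first.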

Overrides:
classifyFull in class AbstractVectorClassifier
Parameters:
r - Where to put the results.
instance - A vector of features to be classified.
Returns:
A vector of scores, one for each category. For this class the scores do not represent probabilities.

classifyScalar

public double classifyScalar(Vector instance)
Unsupported method. This implementation simply throws an UnsupportedOperationException.

Specified by:
classifyScalar in class AbstractVectorClassifier
Parameters:
instance - The feature vector to be classified.
Returns:
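The score for category 1, per the inherited contract. This implementation never returns normally; it always throws an UnsupportedOperationException.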
See Also:
AbstractVectorClassifier.classify(Vector)

classify

public Vector classify(Vector instance)
Unsupported method. This implementation simply throws an UnsupportedOperationException.

Specified by:
classify in class AbstractVectorClassifier
Parameters:
instance - A feature vector to be classified.
Returns:
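A vector of probabilities in 1 of n-1 encoding, per the inherited contract. This implementation never returns normally; it always throws an UnsupportedOperationException.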


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.