org.apache.mahout.vectorizer
Class SimpleTextEncodingVectorizer
java.lang.Object
org.apache.mahout.vectorizer.SimpleTextEncodingVectorizer
- All Implemented Interfaces:
- Vectorizer
public class SimpleTextEncodingVectorizer
- extends Object
- implements Vectorizer
Runs a Map/Reduce job that encodes the input using a FeatureVectorEncoder
and writes the result to the output as a sequence file.
Only works on basic text, where the value in the SequenceFile is a blob of text.
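To make the idea of encoding a blob of text into a vector more concrete, here is a minimal, self-contained sketch of hashed feature encoding in plain Java. It is not Mahout's implementation and does not run a Map/Reduce job. The class name TextEncodingSketch, the method encode, and the cardinality parameter are hypothetical names chosen for illustration, and the use of String.hashCode() as the hash function is an assumption made for simplicity.

```java
import java.util.Locale;

// Hypothetical illustration only: this is NOT Mahout's FeatureVectorEncoder.
public class TextEncodingSketch {

    // Encodes a blob of text into a fixed-size vector by hashing each token
    // to an index and adding 1.0 at that index (the "hashing trick").
    // The cardinality parameter is an assumed name for the vector size.
    static double[] encode(String text, int cardinality) {
        double[] vector = new double[cardinality];
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split can yield a leading empty token
            }
            // floorMod keeps the index non-negative for negative hash codes
            int index = Math.floorMod(token.hashCode(), cardinality);
            vector[index] += 1.0;
        }
        return vector;
    }

    public static void main(String[] args) {
        double[] vector = encode("the quick brown fox jumps over the lazy dog", 16);
        StringBuilder line = new StringBuilder();
        for (double weight : vector) {
            line.append(weight).append(' ');
        }
        System.out.println(line.toString().trim());
    }
}
```

Because the vector size is fixed up front, distinct tokens may collide at the same index. A larger cardinality reduces collisions at the cost of a wider vector.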
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SimpleTextEncodingVectorizer
public SimpleTextEncodingVectorizer()
createVectors
public void createVectors(org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path output,
VectorizerConfig config)
throws IOException,
ClassNotFoundException,
InterruptedException
- Specified by:
createVectors
in interface Vectorizer
- Throws:
IOException
ClassNotFoundException
InterruptedException
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.