org.apache.mahout.classifier.sgd
Class SimpleCsvExamples

java.lang.Object
  extended by org.apache.mahout.classifier.sgd.SimpleCsvExamples

public final class SimpleCsvExamples
extends Object

Shows how different encoding choices can lead to large differences in speed.

Run with command line options --generate 1000000 test.csv to generate a million data lines in test.csv.

Run with command line options --parser test.csv to time how long it takes to parse and encode those million data points.

Run with command line options --fast test.csv to time how long it takes to parse and encode those million data points using byte-level parsing and direct value encoding.
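
The contrast between string-based and byte-level parsing can be sketched as follows. This is a minimal illustration of the general idea and not the code in this class. The class name ByteLevelParseSketch, the comma separator, and the restriction to non-negative ASCII integers are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: not part of Mahout.
final class ByteLevelParseSketch {
  // Assumed separator; the actual value of SEPARATOR_CHAR is not shown on this page.
  private static final byte SEPARATOR = (byte) ',';

  private ByteLevelParseSketch() {}

  /** Baseline: decode the line to a String, split it, and parse each field. */
  public static int[] parseWithStrings(byte[] line) {
    String[] fields = new String(line, StandardCharsets.US_ASCII).split(",");
    int[] values = new int[fields.length];
    for (int i = 0; i < fields.length; i++) {
      values[i] = Integer.parseInt(fields[i]);
    }
    return values;
  }

  /**
   * Byte-level: accumulate decimal digits directly from the buffer.
   * Writes into a caller-supplied array and returns the number of fields,
   * so no String or array is allocated per line.
   * Assumes non-negative integers with ASCII digits only.
   */
  public static int parseBytes(byte[] line, int length, int[] out) {
    int count = 0;
    int value = 0;
    for (int i = 0; i < length; i++) {
      byte b = line[i];
      if (b == SEPARATOR) {
        out[count++] = value;   // field finished
        value = 0;
      } else {
        value = value * 10 + (b - '0');
      }
    }
    out[count++] = value;       // last field has no trailing separator
    return count;
  }

  public static void main(String[] args) {
    byte[] line = "12,0,345,7".getBytes(StandardCharsets.US_ASCII);
    int[] out = new int[8];
    int n = parseBytes(line, line.length, out);
    System.out.println(Arrays.toString(Arrays.copyOf(out, n)));
    System.out.println(Arrays.toString(parseWithStrings(line)));
  }
}
```

Both methods yield the same values for the sample line. The byte-level version avoids creating intermediate String objects, which is the kind of saving the --fast option is meant to measure.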

This class does not demonstrate text encoding, which is subject to somewhat different tricks. The basic ideas of caching hash locations and byte-level parsing still very much apply to text, however.
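
Caching hash locations can be sketched in a similar way. The example below is an assumption-laden illustration, not this class's implementation: the class name CachedHashLocationSketch is hypothetical, and String.hashCode() stands in for whatever hash function a real encoder uses. The point is only that the vector slot for each column is computed once, so encoding a row needs array lookups rather than repeated hashing.

```java
// Hypothetical sketch: not part of Mahout.
final class CachedHashLocationSketch {
  // One cached feature-vector slot per input column.
  private final int[] locations;

  /**
   * Hashes each field name once, up front.
   * String.hashCode() is a stand-in hash chosen for this example.
   */
  public CachedHashLocationSketch(String[] fieldNames, int numFeatures) {
    locations = new int[fieldNames.length];
    for (int i = 0; i < fieldNames.length; i++) {
      locations[i] = Math.floorMod(fieldNames[i].hashCode(), numFeatures);
    }
  }

  /** Returns the cached slot for the given column. */
  public int location(int column) {
    return locations[column];
  }

  /**
   * Adds each value of the row directly at its cached slot.
   * Columns that collide on a slot simply add together.
   */
  public void encode(int[] row, double[] vector) {
    for (int i = 0; i < row.length; i++) {
      vector[locations[i]] += row[i];
    }
  }

  public static void main(String[] args) {
    CachedHashLocationSketch encoder =
        new CachedHashLocationSketch(new String[] {"a", "b"}, 16);
    double[] vector = new double[16];
    encoder.encode(new int[] {3, 5}, vector);
    System.out.println(java.util.Arrays.toString(vector));
  }
}
```

Combined with byte-level parsing, a row can go from raw bytes to vector updates without building a String for either the field name or the value.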


Field Summary
static char SEPARATOR_CHAR
           
 
Method Summary
static void main(String[] args)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SEPARATOR_CHAR

public static final char SEPARATOR_CHAR
See Also:
Constant Field Values
Method Detail

main

public static void main(String[] args)
                 throws IOException
Throws:
IOException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.