org.apache.mahout.classifier.sequencelearning.hmm
Class HmmTrainer

java.lang.Object
  extended by org.apache.mahout.classifier.sequencelearning.hmm.HmmTrainer

public final class HmmTrainer
extends Object

Utility class containing several algorithms for training a Hidden Markov Model (HMM). The three main algorithms are supervised learning, unsupervised Viterbi training and unsupervised Baum-Welch training.


Method Summary
static HmmModel trainBaumWelch(HmmModel initialModel, int[] observedSequence, double epsilon, int maxIterations, boolean scaled)
          Iteratively train the parameters of the given initial model with respect to the observed sequence using Baum-Welch training.
static HmmModel trainSupervised(int nrOfHiddenStates, int nrOfOutputStates, int[] observedSequence, int[] hiddenSequence, double pseudoCount)
          Create a supervised initial estimate of an HMM model based on a sequence of observed and hidden states.
static HmmModel trainSupervisedSequence(int nrOfHiddenStates, int nrOfOutputStates, Collection<int[]> hiddenSequences, Collection<int[]> observedSequences, double pseudoCount)
          Create a supervised initial estimate of an HMM model based on a number of sequences of observed and hidden states.
static HmmModel trainViterbi(HmmModel initialModel, int[] observedSequence, double pseudoCount, double epsilon, int maxIterations, boolean scaled)
          Iteratively train the parameters of the given initial model with respect to the observed sequence using Viterbi training.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

trainSupervised

public static HmmModel trainSupervised(int nrOfHiddenStates,
                                       int nrOfOutputStates,
                                       int[] observedSequence,
                                       int[] hiddenSequence,
                                       double pseudoCount)
Create a supervised initial estimate of an HMM model based on a sequence of observed and hidden states.

Parameters:
nrOfHiddenStates - The total number of hidden states
nrOfOutputStates - The total number of output states
observedSequence - Integer array containing the observed sequence
hiddenSequence - Integer array containing the hidden sequence
pseudoCount - Value that is assigned to non-occurring transitions to avoid zero probabilities.
Returns:
An initial model using the estimated parameters
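
The supervised estimate described above comes down to counting. As a rough, self-contained sketch (not Mahout's actual code), the following tallies hidden-state transitions and state-to-output emissions from one labelled sequence, seeds every cell with the pseudocount so that unseen events keep a non-zero probability, and normalises each row. The class SupervisedHmmSketch and its methods are hypothetical names, the result is returned as plain arrays rather than an HmmModel, and seeding every cell (rather than only the empty ones) is an assumption made here for simplicity.

```java
// Illustrative sketch only; this is not Mahout's implementation.
final class SupervisedHmmSketch {

    /** Creates a rows x cols matrix with every cell set to the pseudocount. */
    static double[][] seeded(int rows, int cols, double pseudoCount) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            for (int j = 0; j < cols; j++) {
                row[j] = pseudoCount;
            }
        }
        return m;
    }

    /** Divides each row by its sum so that it becomes a probability distribution. */
    static void normalizeRows(double[][] m) {
        for (double[] row : m) {
            double sum = 0.0;
            for (double v : row) {
                sum += v;
            }
            if (sum > 0.0) {
                for (int j = 0; j < row.length; j++) {
                    row[j] /= sum;
                }
            }
        }
    }

    /** Transition estimate: smoothed, normalised counts of hidden[i-1] -> hidden[i]. */
    static double[][] estimateTransitions(int nrOfHiddenStates, int[] hiddenSequence,
                                          double pseudoCount) {
        double[][] t = seeded(nrOfHiddenStates, nrOfHiddenStates, pseudoCount);
        for (int i = 1; i < hiddenSequence.length; i++) {
            t[hiddenSequence[i - 1]][hiddenSequence[i]] += 1.0;
        }
        normalizeRows(t);
        return t;
    }

    /** Emission estimate: smoothed, normalised counts of hidden[i] emitting observed[i]. */
    static double[][] estimateEmissions(int nrOfHiddenStates, int nrOfOutputStates,
                                        int[] observedSequence, int[] hiddenSequence,
                                        double pseudoCount) {
        if (observedSequence.length != hiddenSequence.length) {
            throw new IllegalArgumentException("sequences must have equal length");
        }
        double[][] e = seeded(nrOfHiddenStates, nrOfOutputStates, pseudoCount);
        for (int i = 0; i < hiddenSequence.length; i++) {
            e[hiddenSequence[i]][observedSequence[i]] += 1.0;
        }
        normalizeRows(e);
        return e;
    }
}
```

For example, with hiddenSequence = {0, 0, 0, 1} and pseudoCount = 1.0, state 0 has smoothed transition counts {3, 2}, so row 0 of the transition matrix becomes {0.6, 0.4}. State 1 has no outgoing transitions in the data, so its row falls back to the uniform {0.5, 0.5} supplied by the pseudocount.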

trainSupervisedSequence

public static HmmModel trainSupervisedSequence(int nrOfHiddenStates,
                                               int nrOfOutputStates,
                                               Collection<int[]> hiddenSequences,
                                               Collection<int[]> observedSequences,
                                               double pseudoCount)
Create a supervised initial estimate of an HMM model based on a number of sequences of observed and hidden states.

Parameters:
nrOfHiddenStates - The total number of hidden states
nrOfOutputStates - The total number of output states
hiddenSequences - Collection of hidden sequences to use for training
observedSequences - Collection of observed sequences to use for training, each associated with one of the hidden sequences.
pseudoCount - Value that is assigned to non-occurring transitions to avoid zero probabilities.
Returns:
An initial model using the estimated parameters
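
Note that, per the signatures on this page, hiddenSequences precedes observedSequences here, whereas trainSupervised takes observedSequence first. With several training sequences, counts are naturally pooled across all of them, and no transition should be counted from the last state of one sequence to the first state of the next. The sketch below illustrates that idea for the hidden sequences only. It is not Mahout's code: the class MultiSequenceCountsSketch and its methods are hypothetical, and estimating the initial distribution from the first state of each sequence is an assumption of this example.

```java
// Illustrative sketch only; this is not Mahout's implementation.
final class MultiSequenceCountsSketch {

    /** Initial-state estimate: smoothed, normalised counts of each sequence's first state. */
    static double[] estimateInitial(int nrOfHiddenStates, Iterable<int[]> hiddenSequences,
                                    double pseudoCount) {
        double[] initial = new double[nrOfHiddenStates];
        double sum = 0.0;
        for (int s = 0; s < nrOfHiddenStates; s++) {
            initial[s] = pseudoCount;
            sum += pseudoCount;
        }
        for (int[] sequence : hiddenSequences) {
            if (sequence.length > 0) {
                initial[sequence[0]] += 1.0;
                sum += 1.0;
            }
        }
        if (sum > 0.0) {
            for (int s = 0; s < nrOfHiddenStates; s++) {
                initial[s] /= sum;
            }
        }
        return initial;
    }

    /** Transition estimate pooled over all sequences, never crossing a sequence boundary. */
    static double[][] estimateTransitions(int nrOfHiddenStates, Iterable<int[]> hiddenSequences,
                                          double pseudoCount) {
        double[][] t = new double[nrOfHiddenStates][nrOfHiddenStates];
        for (double[] row : t) {
            for (int j = 0; j < nrOfHiddenStates; j++) {
                row[j] = pseudoCount;
            }
        }
        for (int[] sequence : hiddenSequences) {
            // The inner loop restarts at index 1 for every sequence, so the
            // pair (end of one sequence, start of the next) is never counted.
            for (int i = 1; i < sequence.length; i++) {
                t[sequence[i - 1]][sequence[i]] += 1.0;
            }
        }
        for (double[] row : t) {
            double sum = 0.0;
            for (double v : row) {
                sum += v;
            }
            if (sum > 0.0) {
                for (int j = 0; j < nrOfHiddenStates; j++) {
                    row[j] /= sum;
                }
            }
        }
        return t;
    }
}
```

Both methods accept an Iterable<int[]>, so any Collection<int[]> (for instance the result of java.util.Arrays.asList) can be passed directly.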

trainViterbi

public static HmmModel trainViterbi(HmmModel initialModel,
                                    int[] observedSequence,
                                    double pseudoCount,
                                    double epsilon,
                                    int maxIterations,
                                    boolean scaled)
Iteratively train the parameters of the given initial model with respect to the observed sequence using Viterbi training.

Parameters:
initialModel - The initial model that gets iterated
observedSequence - The sequence of observed states
pseudoCount - Value that is assigned to non-occurring transitions to avoid zero probabilities.
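epsilon - Convergence threshold; iteration stops once the change in model parameters between successive iterations falls below this value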
maxIterations - The maximum number of training iterations
scaled - Use the log-scaled implementation. This is computationally more expensive, but offers better numerical stability for long observed sequences.
Returns:
The iterated model
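
Viterbi training, in general, alternates two steps: decode the single most likely hidden path for the observed sequence under the current model, then re-estimate the parameters from that path as if it were labelled data (which is where a pseudocount is needed). The sketch below shows only the decoding step, in log space, on plain arrays. It is an illustration under assumptions, not Mahout's code: the class ViterbiSketch, the method decode and the array-based model representation are all hypothetical.

```java
// Illustrative sketch only; this is not Mahout's implementation.
final class ViterbiSketch {

    /**
     * Returns the most likely hidden-state path for the observed sequence.
     * initial[s]       : probability of starting in state s
     * transition[p][s] : probability of moving from state p to state s
     * emission[s][o]   : probability of state s emitting output o
     */
    static int[] decode(double[] initial, double[][] transition, double[][] emission,
                        int[] observed) {
        int n = observed.length;
        int k = initial.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] score = new double[n][k]; // best log-probability of a path ending in s at time t
        int[][] back = new int[n][k];        // predecessor state on that best path

        for (int s = 0; s < k; s++) {
            score[0][s] = Math.log(initial[s]) + Math.log(emission[s][observed[0]]);
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                int best = 0;
                double bestScore = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double candidate = score[t - 1][p] + Math.log(transition[p][s]);
                    if (candidate > bestScore) {
                        bestScore = candidate;
                        best = p;
                    }
                }
                score[t][s] = bestScore + Math.log(emission[s][observed[t]]);
                back[t][s] = best;
            }
        }

        // Pick the best final state, then follow the back-pointers to time 0.
        int last = 0;
        for (int s = 1; s < k; s++) {
            if (score[n - 1][s] > score[n - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }
}
```

Working with sums of logarithms instead of products of probabilities is what keeps the scores representable for long sequences, at the cost of the extra Math.log calls.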

trainBaumWelch

public static HmmModel trainBaumWelch(HmmModel initialModel,
                                      int[] observedSequence,
                                      double epsilon,
                                      int maxIterations,
                                      boolean scaled)
Iteratively train the parameters of the given initial model with respect to the observed sequence using Baum-Welch training.

Parameters:
initialModel - The initial model that gets iterated
observedSequence - The sequence of observed states
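epsilon - Convergence threshold; iteration stops once the change in model parameters between successive iterations falls below this value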
maxIterations - The maximum number of training iterations
scaled - Use log-scaled implementations of the forward/backward algorithms. This is computationally more expensive, but offers better numerical stability for long output sequences.
Returns:
The iterated model
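
To see why a log-scaled forward pass helps on long sequences, the following sketch compares a naive forward algorithm, which multiplies probabilities and can underflow to zero, with a log-space variant based on the log-sum-exp trick. It is a generic illustration and makes no claim about how Mahout implements the scaled option: the class LogForwardSketch, its methods and the array-based model are hypothetical, and a non-empty observed sequence is assumed.

```java
// Illustrative sketch only; this is not Mahout's implementation.
final class LogForwardSketch {

    /** Naive forward pass: returns P(observed | model); may underflow to 0.0. */
    static double likelihood(double[] initial, double[][] transition, double[][] emission,
                             int[] observed) {
        int k = initial.length;
        double[] alpha = new double[k];
        for (int s = 0; s < k; s++) {
            alpha[s] = initial[s] * emission[s][observed[0]];
        }
        for (int t = 1; t < observed.length; t++) {
            double[] next = new double[k];
            for (int s = 0; s < k; s++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += alpha[p] * transition[p][s];
                }
                next[s] = sum * emission[s][observed[t]];
            }
            alpha = next;
        }
        double total = 0.0;
        for (double a : alpha) {
            total += a;
        }
        return total;
    }

    /** Computes log(sum(exp(v[i]))) without leaving the representable range. */
    static double logSumExp(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // every term has probability zero
        }
        double sum = 0.0;
        for (double x : v) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    /** Log-space forward pass: returns log P(observed | model). */
    static double logLikelihood(double[] initial, double[][] transition, double[][] emission,
                                int[] observed) {
        int k = initial.length;
        double[] logAlpha = new double[k];
        for (int s = 0; s < k; s++) {
            logAlpha[s] = Math.log(initial[s]) + Math.log(emission[s][observed[0]]);
        }
        double[] terms = new double[k];
        for (int t = 1; t < observed.length; t++) {
            double[] next = new double[k];
            for (int s = 0; s < k; s++) {
                for (int p = 0; p < k; p++) {
                    terms[p] = logAlpha[p] + Math.log(transition[p][s]);
                }
                next[s] = logSumExp(terms) + Math.log(emission[s][observed[t]]);
            }
            logAlpha = next;
        }
        return logSumExp(logAlpha);
    }
}
```

On short sequences the two methods agree up to rounding. On a sequence of a few thousand observations the naive product typically drops below the smallest representable double, while the log-space value stays finite; the price is the additional Math.log and Math.exp calls in the inner loop.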


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.