org.apache.mahout.cf.taste.impl.recommender.svd
Class ParallelSGDFactorizer
java.lang.Object
org.apache.mahout.cf.taste.impl.recommender.svd.AbstractFactorizer
org.apache.mahout.cf.taste.impl.recommender.svd.ParallelSGDFactorizer
- All Implemented Interfaces:
- Refreshable, Factorizer
public class ParallelSGDFactorizer
- extends AbstractFactorizer
Minimalistic implementation of a parallel SGD factorizer, based on
"Scalable Collaborative Filtering Approaches for Large Recommender Systems"
and
"Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent"
Constructor Summary
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numEpochs)
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent)
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent, double biasMuRatio, double biasLambdaRatio)
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent, double biasMuRatio, double biasLambdaRatio, int numThreads)
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent, int numThreads)
Method Summary
Factorization factorize()
protected void initialize()
protected void update(Preference preference, double mu)
  TODO: this is the vanilla SGD by Takács (2009); I speculate that using the scaling technique proposed in "Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent", section 5, page 6, can be beneficial in terms of both speed and accuracy.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
userVectors
protected volatile double[][] userVectors
- user features
itemVectors
protected volatile double[][] itemVectors
- item features
ParallelSGDFactorizer
public ParallelSGDFactorizer(DataModel dataModel,
int numFeatures,
double lambda,
int numEpochs)
throws TasteException
- Throws:
TasteException
ParallelSGDFactorizer
public ParallelSGDFactorizer(DataModel dataModel,
int numFeatures,
double lambda,
int numIterations,
double mu0,
double decayFactor,
int stepOffset,
double forgettingExponent)
throws TasteException
- Throws:
TasteException
ParallelSGDFactorizer
public ParallelSGDFactorizer(DataModel dataModel,
int numFeatures,
double lambda,
int numIterations,
double mu0,
double decayFactor,
int stepOffset,
double forgettingExponent,
int numThreads)
throws TasteException
- Throws:
TasteException
ParallelSGDFactorizer
public ParallelSGDFactorizer(DataModel dataModel,
int numFeatures,
double lambda,
int numIterations,
double mu0,
double decayFactor,
int stepOffset,
double forgettingExponent,
double biasMuRatio,
double biasLambdaRatio)
throws TasteException
- Throws:
TasteException
ParallelSGDFactorizer
public ParallelSGDFactorizer(DataModel dataModel,
int numFeatures,
double lambda,
int numIterations,
double mu0,
double decayFactor,
int stepOffset,
double forgettingExponent,
double biasMuRatio,
double biasLambdaRatio,
int numThreads)
throws TasteException
- Throws:
TasteException
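For orientation, a hedged sketch of calling the most fully parameterized constructor; the numeric values and the per-parameter comments are assumptions inferred from the parameter names, not documented defaults or tuning advice.

  import org.apache.mahout.cf.taste.common.TasteException;
  import org.apache.mahout.cf.taste.impl.recommender.svd.ParallelSGDFactorizer;
  import org.apache.mahout.cf.taste.model.DataModel;

  class FactorizerSetup {
    // All values and per-parameter notes below are assumptions, not documented defaults.
    static ParallelSGDFactorizer build(DataModel dataModel) throws TasteException {
      return new ParallelSGDFactorizer(
          dataModel,
          10,      // numFeatures: rank of the factorization
          0.01,    // lambda: L2 regularization weight
          20,      // numIterations: passes over the preference data
          0.01,    // mu0: initial learning rate
          1.0,     // decayFactor: learning-rate decay
          0,       // stepOffset: offset in the learning-rate schedule
          0.0,     // forgettingExponent: exponent in the learning-rate schedule
          0.5,     // biasMuRatio: learning-rate ratio for the bias terms
          0.1,     // biasLambdaRatio: regularization ratio for the bias terms
          Runtime.getRuntime().availableProcessors()); // numThreads
    }
  }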
initialize
protected void initialize()
throws TasteException
- Throws:
TasteException
factorize
public Factorization factorize()
throws TasteException
- Throws:
TasteException
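A short sketch of calling factorize() directly and reading the learned vectors back out of the returned Factorization (assuming its getUserFeatures/getItemFeatures accessors); the dot product of the two vectors is the kind of estimate SVDRecommender computes. The user and item IDs are placeholders.

  import org.apache.mahout.cf.taste.common.TasteException;
  import org.apache.mahout.cf.taste.impl.recommender.svd.Factorization;
  import org.apache.mahout.cf.taste.impl.recommender.svd.ParallelSGDFactorizer;

  class FactorizeSketch {
    // Runs the factorization and dots the learned vectors for one (user, item) pair.
    static double estimate(ParallelSGDFactorizer factorizer, long userID, long itemID)
        throws TasteException {
      Factorization factorization = factorizer.factorize();
      double[] userFeatures = factorization.getUserFeatures(userID);
      double[] itemFeatures = factorization.getItemFeatures(itemID);
      double dot = 0.0;
      for (int k = 0; k < userFeatures.length; k++) {
        dot += userFeatures[k] * itemFeatures[k];
      }
      return dot;
    }
  }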
update
protected void update(Preference preference,
double mu)
- TODO: this is the vanilla SGD by Takács (2009); I speculate that using the scaling technique proposed in "Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent", section 5, page 6, can be beneficial in terms of both speed and accuracy.
Takács' method does not compute the gradient of the regularization term correctly: that gradient has non-zero elements everywhere in the matrix, while Takács' method only updates a single row/column per step, so a user with many ratings has her vector affected disproportionately by regularization. Using an isolated scaling factor for both the user vectors and the item vectors removes this issue without adding update cost; it even reduces the cost slightly, since only one addition and one multiplication are needed per step.
BAD SIDE 1: the scaling factor decreases quickly, so it has to be scaled up from time to time before it drops to zero or causes round-off error.
BAD SIDE 2: nobody has experimented with it before, and since people generally use a very small lambda, its impact on accuracy is still unknown.
BAD SIDE 3: it is not clear how to make it work for L1 regularization or
"pseudorank" (sum of singular values) regularization.
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.