java.lang.Object
  org.apache.mahout.math.stats.LogLikelihood
public final class LogLikelihood
Utility methods for working with log-likelihood.
Nested Class Summary | |
---|---|
static class | LogLikelihood.ScoredItem<T> |
Method Summary | | |
---|---|---|
static <T> List<LogLikelihood.ScoredItem<T>> | compareFrequencies(com.google.common.collect.Multiset<T> a, com.google.common.collect.Multiset<T> b, int maxReturn, double threshold) | Compares two sets of counts to see which items are interestingly over-represented in the first set. |
static double | entropy(long... elements) | Calculates the unnormalized Shannon entropy. |
static double | logLikelihoodRatio(long k11, long k12, long k21, long k22) | Calculates the raw log-likelihood ratio for two events, call them A and B. |
static double | rootLogLikelihoodRatio(long k11, long k12, long k21, long k22) | Calculates the root log-likelihood ratio for two events. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Method Detail |
---|
public static double entropy(long... elements)
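The detail text for entropy is not reproduced on this page, so the following is only a sketch of one common reading of "unnormalized Shannon entropy". It assumes the value is N times the entropy of the empirical distribution, using natural logarithms, computed as N ln N minus the sum of x_i ln x_i, where N is the sum of the counts. The class name EntropySketch and the helper xLogX are hypothetical and are not claimed to match the library's source.

```java
final class EntropySketch {
    private EntropySketch() {}

    // x * ln(x), using the usual convention that 0 * ln(0) is 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Assumed definition: N ln N - sum_i x_i ln x_i, with N = sum_i x_i.
    // This equals N times the Shannon entropy (in nats) of the
    // proportions x_i / N, hence "unnormalized".
    static double entropy(long... elements) {
        long sum = 0;
        double result = 0.0;
        for (long element : elements) {
            if (element < 0) {
                throw new IllegalArgumentException("counts must be non-negative");
            }
            result += xLogX(element);
            sum += element;
        }
        return xLogX(sum) - result;
    }
}
```

Under this reading, a single non-zero count carries no uncertainty and yields 0, while two equal counts of 1 yield 2 ln 2.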
public static double logLikelihoodRatio(long k11, long k12, long k21, long k22)
 | Event A | Everything but A |
---|---|---|
Event B | A and B together (k_11) | B, but not A (k_12) |
Everything but B | A without B (k_21) | Neither A nor B (k_22) |
Parameters:
k11 - The number of times the two events occurred together
k12 - The number of times the second event occurred WITHOUT the first event
k21 - The number of times the first event occurred WITHOUT the second event
k22 - The number of times something else occurred (i.e. was neither of these events)
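To make the role of the four counts concrete, here is a minimal sketch of one standard way to compute a log-likelihood ratio (the G² statistic) from the 2x2 table above: twice the sum of the row and column entropies minus the entropy of the whole table, all unnormalized. This is an assumption about the formula, not a copy of the library's code. LlrSketch and its helpers are hypothetical names.

```java
final class LlrSketch {
    private LlrSketch() {}

    // x * ln(x), with 0 * ln(0) taken as 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy: N ln N - sum_i x_i ln x_i, with N = sum_i x_i.
    static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) {
            result += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - result;
    }

    // Assumed formula: 2 * (H(rows) + H(columns) - H(all four cells)).
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Round-off can push the difference slightly below zero; clamp it.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }
}
```

With this sketch, a table whose cells are all equal (A and B independent) scores approximately 0, and the score grows as the counts concentrate on the k11/k22 or k12/k21 diagonal.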
public static double rootLogLikelihoodRatio(long k11, long k12, long k21, long k22)
See logLikelihoodRatio(long, long, long, long).
Parameters:
k11 - The number of times the two events occurred together
k12 - The number of times the second event occurred WITHOUT the first event
k21 - The number of times the first event occurred WITHOUT the second event
k22 - The number of times something else occurred (i.e. was neither of these events)
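The page does not spell out what "root" means here. A common convention, assumed in the sketch below, is to take the square root of the raw log-likelihood ratio and give it a negative sign when the events co-occur less often than expected, so that the result behaves like a signed score. RootLlrSketch and signedRoot are hypothetical names. The raw ratio is passed in as a parameter to keep the example short.

```java
final class RootLlrSketch {
    private RootLlrSketch() {}

    // llr is a raw, non-negative log-likelihood ratio for the same four counts.
    // Assumed sign rule: negative when the first event's share in the first row
    // (k11 / (k11 + k12)) is smaller than its share in the second row
    // (k21 / (k21 + k22)).
    static double signedRoot(double llr, long k11, long k12, long k21, long k22) {
        double root = Math.sqrt(llr);
        double firstRowRate = (double) k11 / (k11 + k12);
        double secondRowRate = (double) k21 / (k21 + k22);
        // If either row is empty the rate is NaN, the comparison is false,
        // and the root stays positive.
        return firstRowRate < secondRowRate ? -root : root;
    }
}
```

Under this convention, the sign distinguishes over-representation from under-representation, which the raw ratio alone cannot do.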
public static <T> List<LogLikelihood.ScoredItem<T>> compareFrequencies(com.google.common.collect.Multiset<T> a, com.google.common.collect.Multiset<T> b, int maxReturn, double threshold)
Parameters:
a - The first counts.
b - The reference counts.
maxReturn - The maximum number of items to return. Use maxReturn >= a.elementSet().size() to return all scores above the threshold.
threshold - The minimum score for items to be returned. Use 0 to return all items more common in a than b. Use -Double.MAX_VALUE (not Double.MIN_VALUE!) to not use a threshold.
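The warning about Double.MIN_VALUE is easy to miss: in Java it is the smallest positive double, not the most negative one, so using it as a threshold would still reject every item with a negative score. The sketch below illustrates this with a hypothetical filter, ThresholdSketch.passes, which assumes items are kept when score >= threshold. The exact comparison used by compareFrequencies is not stated on this page.

```java
final class ThresholdSketch {
    private ThresholdSketch() {}

    // Hypothetical filter: keep an item when its score reaches the threshold.
    static boolean passes(double score, double threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        double negativeScore = -3.0; // an item less common in a than in b

        // Double.MIN_VALUE is a tiny positive number, so negative scores fail.
        System.out.println(passes(negativeScore, Double.MIN_VALUE));

        // -Double.MAX_VALUE is the most negative finite double, so every
        // finite score passes and the threshold is effectively disabled.
        System.out.println(passes(negativeScore, -Double.MAX_VALUE));
    }
}
```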