org.apache.mahout.cf.taste.model
Interface IDMigrator
- All Superinterfaces:
- Refreshable
- All Known Subinterfaces:
- UpdatableIDMigrator
- All Known Implementing Classes:
- AbstractIDMigrator, AbstractJDBCIDMigrator, FileIDMigrator, MemoryIDMigrator, MySQLJDBCIDMigrator
public interface IDMigrator
- extends Refreshable
Mahout 0.2 changed the framework to operate only in terms of numeric (long) ID values for users and items.
This is not compatible with applications that used other key types, most commonly String.
Implementations of this interface support mapping Strings to longs and vice versa, in
order to provide a smoother migration path for applications that must still use strings as IDs.
The mapping from strings to 64-bit numeric values is fixed here, to provide a standard implementation that
is portable, that is, easy to reproduce outside the framework. See toLongID(String).
Because this mapping is deterministically computable, it does not need to be stored. Instead, the job of
implementations is to store the reverse mapping. There are infinitely many strings but only 2^64 distinct
longs, so it is possible for two strings to map to the same value. Implementations do not treat this as an
error but rather retain only the most recent mapping, overwriting the previous one. The probability of
collision in a 64-bit space is quite small, but not zero. However, in the context of a collaborative
filtering problem, the consequence of a collision is small: at worst, perhaps one user receives another
user's recommendations.
- Since:
- 0.2
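To put a rough number on "quite small, but not zero", the following sketch applies the standard birthday approximation to a 64-bit ID space. It assumes the IDs behave like uniformly random 64-bit values, which is an assumption about the hash and not something this page states. The class and method names are hypothetical and not part of Mahout.

```java
class CollisionEstimate {
    // Approximate probability that at least two of n distinct strings share a
    // 64-bit ID, assuming IDs are uniformly distributed (birthday approximation).
    static double collisionProbability(long n) {
        double pairs = 0.5 * n * (n - 1.0);   // number of distinct pairs of strings
        double space = Math.pow(2.0, 64);     // number of possible long values
        return -Math.expm1(-pairs / space);   // equals 1 - exp(-pairs / space)
    }

    public static void main(String[] args) {
        for (long n : new long[] {1_000_000L, 1_000_000_000L}) {
            System.out.printf("n = %d -> p ~ %.3e%n", n, collisionProbability(n));
        }
    }
}
```

Under that assumption, a million distinct IDs give a collision probability on the order of 10^-8, while a billion give a few percent.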
toLongID
long toLongID(String stringID)
- Returns:
- the top 8 bytes of the MD5 hash of the bytes of the given String's UTF-8 encoding, as a long.
- Throws:
TasteException
- if an error occurs while storing the mapping
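Because the mapping is meant to be reproducible outside the framework, it can be sketched with only the JDK. The version below assumes that "top 8 bytes" means the first eight bytes of the MD5 digest, read in big-endian order; that byte order is an assumption, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ToLongIdSketch {
    // Maps a string to a long built from the first 8 bytes of the MD5 hash
    // of its UTF-8 encoding (big-endian byte order is assumed here).
    static long toLongID(String stringID) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
            long hash = 0L;
            for (int i = 0; i < 8; i++) {
                // Shift in one byte at a time, masking to avoid sign extension.
                hash = (hash << 8) | (digest[i] & 0xFFL);
            }
            return hash;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toLongID("user-123"));
    }
}
```

The result is deterministic, so the same string always yields the same long and the forward mapping never needs to be stored.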
toStringID
String toStringID(long longID)
throws TasteException
- Returns:
- the string ID most recently associated with the given long ID, or null if no mapping exists
- Throws:
TasteException
- if an error occurs while retrieving the mapping
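The "most recently associated" contract can be illustrated with a minimal in-memory reverse mapping. This is a standalone sketch and not Mahout's implementation; the class name and the storeMapping helper as written here are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ReverseMappingSketch {
    // Reverse mapping from long ID back to the string it was derived from.
    private final Map<Long, String> longToString = new HashMap<>();

    // Records a mapping; a later call with the same long ID overwrites the
    // earlier one, so a collision silently keeps only the newest string.
    void storeMapping(long longID, String stringID) {
        longToString.put(longID, stringID);
    }

    // Returns the most recently stored string for this long ID, or null.
    String toStringID(long longID) {
        return longToString.get(longID);
    }

    public static void main(String[] args) {
        ReverseMappingSketch migrator = new ReverseMappingSketch();
        migrator.storeMapping(42L, "alice");
        migrator.storeMapping(42L, "bob");            // simulated collision
        System.out.println(migrator.toStringID(42L)); // the newer mapping
        System.out.println(migrator.toStringID(7L));  // never stored
    }
}
```

A persistent implementation would replace the map with a file or database table, which is where an error while storing or retrieving the mapping could arise.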
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.