This presentation is about why initialization of matrix factorization methods is important and proposes an interesting initialization method (coined SimFactor). The method revolves around a similarity preserving dimensionality reduction technique. Context-based initialization is introduced as well.
As most of my recommender systems related research, this presentation focuses on implicit feedback (the case where user preferences are not coded explicitely in the data).
Originally presented at the 2nd workshop on Context-awareness in Retrieval and Recommendations (CaRR 2012) in Lisbon.
1 of 20
More Related Content
Initialization of matrix factorization (CaRR 2012 presentation)
1. ENHANCING MATRIX FACTORIZATION
THROUGH INITIALIZATION FOR
IMPLICIT FEEDBACK DATABASES
Bal叩zs Hidasi
Domonkos Tikk
Gravity R&D Ltd.
Budapest University of Technology and Economics
CARR WORKSHOP, 14TH FEBRUARY 2012, LISBON
3. MATRIX FACTORIZATION
Collaborative Filtering
One of the most common approaches
Approximates the rating matrix as product of low-
rank matrices
Items
Q
Users
R P
3/19
4. MATRIX FACTORIZATION
Initialize P and Q with small random numbers
Teach P and Q
Alternating Least Squares
Gradient Descent
Etc.
Transforms the data to a feature space
Separately for users and items
Noise reduction
Compression
Generalization
4/19
5. IMPLICIT FEEDBACK
No ratings
User-item interactions (events)
Much noisier
Presence of an event might not be positive feedback
Absence of an event does not mean negative
feedback
No negative feedback is available!
More common problem
MF for implicit feedback
Less accurate results due to noise
Mostly ALS is used 5/19
Scalability problems (rating matrix is dense)
6. CONCEPT
Good MF model
The feature vectors of similar entities are similar
If data is too noisy similar entities wont be similar by
their features
Start MF from a good point
Feature vector similarities are OK
Data is more than just events
Metadata
Info about items/users
Contextual data
In what context did the event occured
Can we incorporate those to help implicit MF? 6/19
7. NAIVE APPROACH
Describe items using any data we have (detailed
later)
Long, sparse vectors for item description
Compress these vectors to dense feature vectors
PCA, MLP, MF,
Length of desired vectors = Number of features in MF
Use these features as starting points
7/19
8. NAIVE APPROACH
Compression and also noise reduction
Does not really care about similarities
But often feature similarities are not that bad
If MF is used
Half of the results is thrown out
Descriptors
features Descriptor features
Items
Item
Description of items
8/19
9. SIMFACTOR ALGORITHM
Try to preserve similarities better
Starting from an MF of item description
Descriptors
Descriptors features
features
Description of items
Items
Item
(D)
Similarities of items: DD
Description of items
Some metrics require transformation on D
(D)
Item
Description of items
similarities
(S)
= (D) 9/19
10. SIMFACTOR ALGORITHM
Descriptors features
features
Item Descriptors features Item
Item
(X)
similarities (Y) features
(S) (X)
(Y)
Similarity approximation
features
Item Item
Item
(X)
similarities YY features
(S) (X)
YY KxK symmetric 10/19
Eigendecomposition
11. SIMFACTOR ALGORITHM
YY = U 了 U
了 diagonal 了 = SQRT(了) * SQRT(了)
features
Item Item
Item
SQRT SQRT
(X)
similarities U (了) (了) U features
(S) (X)
X*U*SQRT(了) = (SQRT(了)*U*X)=F
F is MxK matrix
S F * F F used for initialization
Item
similarities F 11/19
F
(S)
12. CREATING THE DESCRIPTION MATRIX
Any data about the entity
Vector-space reprezentation
For Items:
Metadata vector (title, category, description, etc)
Event vector (who bought the item)
Context-state vector (in which context state was it
bought)
Context-event (in which context state who bought it)
For Users:
All above except metadata
Currently: Choose one source for D matrix
12/19
Context used: seasonality
13. EXPERIMENTS: SIMILARITY PRESERVATION
Real life dataset: online grocery shopping events
SimFactor RMSE improvement over naive in similarity
approximation
52.36%
48.70%
26.22%
16.86%
13.39% 12.38%
10.81%
Item context state User context state Item context- User context- Item event data User event data Item metadata
event event
13/19
SimFactor approximates similarities better
14. EXPERIMENTS: INITIALIZATION
Using different description matrices
And both naive and SimFactor initialization
Baseline: random init
Evaluation metric: recall@50
14/19
15. EXPERIMENTS: GROCERY DB
Up to 6% improvement
Best methods use SimFactor and user context data
Top5 methods on Grocery DB
5.71%
4.88%
4.30%
4.12% 4.04%
15/19
User context state User context state User context event User event data User context event
(SimFactor) (Naive) (SimFactor) (SimFactor) (Naive)
16. EXPERIMENTS: IMPLICITIZED MOVIELENS
Keeping 5 star ratings implicit events
Up to 10% improvement
Best methods use SimFactor and item context data
Top5 methods on MovieLens DB
10%
9.17% 9.17% 9.17% 9.17%
16/19
Item context state User context state Item context event Item context event Item context state
(SimFactor) (SimFactor) (SimFactor) (Naive) (Naive)
17. DISCUSSION OF RESULTS
SimFactor yields better results than naive
Context information yields better results than other
descriptions
Context information separates well between entities
Grocery: User context
Peoples routines
Different types of shoppings in different times
MovieLens: Item context
Different types of movies watched on different hours
Context-based similarity
17/19
18. WHY CONTEXT?
Grocery example
Correlation between context states by users low
ITEM USER
Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun
Mon 1,00 0,79 0,79 0,78 0,76 0,70 0,74 Mon 1,00 0,36 0,34 0,34 0,35 0,29 0,19
Tue 0,79 1,00 0,79 0,78 0,76 0,69 0,73 Tue 0,36 1,00 0,34 0,34 0,33 0,29 0,19
Wed 0,79 0,79 1,00 0,79 0,76 0,70 0,74 Wed 0,34 0,34 1,00 0,36 0,35 0,27 0,17
Thu 0,78 0,78 0,79 1,00 0,76 0,71 0,74 Thu 0,34 0,34 0,36 1,00 0,39 0,30 0,16
Fri 0,76 0,76 0,76 0,76 1,00 0,71 0,72 Fri 0,35 0,33 0,35 0,39 1,00 0,32 0,16
Sat 0,70 0,69 0,70 0,71 0,71 1,00 0,71 Sat 0,29 0,29 0,27 0,30 0,32 1,00 0,33
Sun 0,74 0,73 0,74 0,74 0,72 0,71 1,00 Sun 0,19 0,19 0,17 0,16 0,16 0,33 1,00
Why can context aware algorithms be efficient?
Different recommendations in different context states
18/19
Context differentiates well between entities
Easier subtasks
19. CONCLUSION & FUTURE WORK
SimFactor Similarity preserving compression
Similarity based MF initialization:
Description matrix from any data
Apply SimFactor
Use output as initial features for MF
Context differentitates between entities well
Future work:
Mixed description matrix (multiple data sources)
Multiple description matrix
Using different context information
Using different similarity metrics 19/19
20. THANKS FOR YOUR ATTENTION!
For more of my recommender systems related research visit my website:
http://www.hidasi.eu