際際滷

際際滷Share a Scribd company logo
ENHANCING MATRIX FACTORIZATION
THROUGH INITIALIZATION FOR
IMPLICIT FEEDBACK DATABASES


                  Bal叩zs Hidasi
                 Domonkos Tikk

                    Gravity R&D Ltd.
     Budapest University of Technology and Economics

    CARR WORKSHOP, 14TH FEBRUARY 2012, LISBON
OUTLINE
 Matrix factoriztion
 Initialization concept

 Methods
     Naive
     SimFactor

 Results
 Discussion




                           2/19
MATRIX FACTORIZATION
 Collaborative Filtering
 One of the most common approaches

 Approximates the rating matrix as product of low-
  rank matrices
                Items


                                           Q
        Users




                R             P




                                                      3/19
MATRIX FACTORIZATION
 Initialize P and Q with small random numbers
 Teach P and Q
     Alternating Least Squares
     Gradient Descent
     Etc.

   Transforms the data to a feature space
     Separately for users and items
     Noise reduction
     Compression
     Generalization

                                                 4/19
IMPLICIT FEEDBACK
 No ratings
 User-item interactions (events)

 Much noisier
     Presence of an event  might not be positive feedback
     Absence of an event  does not mean negative
      feedback
     No negative feedback is available!

 More common problem
 MF for implicit feedback
       Less accurate results due to noise
       Mostly ALS is used                                    5/19
       Scalability problems (rating matrix is dense)
CONCEPT
   Good MF model
     The feature vectors of similar entities are similar
     If data is too noisy  similar entities wont be similar by
      their features
   Start MF from a good point
       Feature vector similarities are OK
   Data is more than just events
       Metadata
           Info about items/users
       Contextual data
           In what context did the event occured
       Can we incorporate those to help implicit MF?               6/19
NAIVE APPROACH
   Describe items using any data we have (detailed
    later)
       Long, sparse vectors for item description
   Compress these vectors to dense feature vectors
     PCA, MLP, MF, 
     Length of desired vectors = Number of features in MF

   Use these features as starting points




                                                             7/19
NAIVE APPROACH
 Compression and also noise reduction
 Does not really care about similarities

 But often feature similarities are not that bad

 If MF is used
           Half of the results is thrown out


                   Descriptors
                                         features   Descriptor features
                                    
    Items




                                           Item


             Description of items

                                                                          8/19
SIMFACTOR ALGORITHM
 Try to preserve similarities better
 Starting from an MF of item description
                  Descriptors

                                                        Descriptors features




                                             features
            Description of items
                                        
    Items




                                               Item
                    (D)


    Similarities of items: DD




                                                                      Description of items
           Some metrics require transformation on D




                                                                              (D)
                            Item
                                            Description of items
                         similarities
                             (S)
                                        =           (D)                                      9/19
SIMFACTOR ALGORITHM




                                                         Descriptors features
                   features
   Item                           Descriptors features                            Item
                    Item

                      (X)
similarities                              (Y)                                  features
    (S)                                                                            (X)




                                                                 (Y)
   Similarity approximation

                                       features

                      Item                                    Item
                                         Item



                                   
                                          (X)

                   similarities                   YY       features
                       (S)                                     (X)


   YY  KxK symmetric                                                                    10/19
       Eigendecomposition
SIMFACTOR ALGORITHM

YY     =      U          了        U

   了 diagonal  了 = SQRT(了) * SQRT(了)
                   features


   Item                                                      Item
               
                     Item



                                        SQRT   SQRT
                      (X)


similarities                  U          (了)    (了)   U   features
    (S)                                                       (X)

 X*U*SQRT(了) = (SQRT(了)*U*X)=F
 F is MxK matrix

 S F * F  F used for initialization

   Item
similarities                     F                                  11/19
                     F




    (S)
CREATING THE DESCRIPTION MATRIX
   Any data about the entity
       Vector-space reprezentation
   For Items:
     Metadata vector (title, category, description, etc)
     Event vector (who bought the item)
     Context-state vector (in which context state was it
      bought)
     Context-event (in which context state who bought it)

   For Users:
       All above except metadata
 Currently: Choose one source for D matrix
                                                             12/19
 Context used: seasonality
EXPERIMENTS: SIMILARITY PRESERVATION
   Real life dataset: online grocery shopping events
            SimFactor RMSE improvement over naive in similarity
                             approximation

     52.36%
                                                                                             48.70%




                                                          26.22%


                                          16.86%
                                                                           13.39%                              12.38%
                       10.81%




Item context state User context state   Item context-   User context-   Item event data   User event data   Item metadata
                                            event          event
                                                                                                                            13/19
   SimFactor approximates similarities better
EXPERIMENTS: INITIALIZATION
 Using different description matrices
 And both naive and SimFactor initialization

 Baseline: random init

 Evaluation metric: recall@50




                                                14/19
EXPERIMENTS: GROCERY DB
 Up to 6% improvement
 Best methods use SimFactor and user context data

                              Top5 methods on Grocery DB
         5.71%


                              4.88%

                                                   4.30%
                                                                       4.12%              4.04%




                                                                                                          15/19
    User context state   User context state   User context event   User event data   User context event
      (SimFactor)             (Naive)            (SimFactor)        (SimFactor)           (Naive)
EXPERIMENTS: IMPLICITIZED MOVIELENS
 Keeping 5 star ratings  implicit events
 Up to 10% improvement

 Best methods use SimFactor and item context data
                            Top5 methods on MovieLens DB

          10%




                              9.17%                9.17%                9.17%                9.17%




                                                                                                             16/19
    Item context state   User context state   Item context event   Item context event   Item context state
       (SimFactor)         (SimFactor)           (SimFactor)             (Naive)             (Naive)
DISCUSSION OF RESULTS
 SimFactor yields better results than naive
 Context information yields better results than other
  descriptions
 Context information separates well between entities
       Grocery: User context
         Peoples routines
         Different types of shoppings in different times

       MovieLens: Item context
           Different types of movies watched on different hours
   Context-based similarity

                                                                   17/19
WHY CONTEXT?
   Grocery example
       Correlation between context states by users low
                   ITEM                                        USER
        Mon Tue Wed Thu Fri Sat Sun                 Mon Tue Wed Thu Fri Sat Sun
Mon      1,00 0,79 0,79 0,78 0,76 0,70 0,74   Mon    1,00 0,36 0,34 0,34 0,35 0,29 0,19
Tue      0,79 1,00 0,79 0,78 0,76 0,69 0,73   Tue    0,36 1,00 0,34 0,34 0,33 0,29 0,19
Wed      0,79 0,79 1,00 0,79 0,76 0,70 0,74   Wed    0,34 0,34 1,00 0,36 0,35 0,27 0,17
Thu      0,78 0,78 0,79 1,00 0,76 0,71 0,74   Thu    0,34 0,34 0,36 1,00 0,39 0,30 0,16
Fri      0,76 0,76 0,76 0,76 1,00 0,71 0,72   Fri    0,35 0,33 0,35 0,39 1,00 0,32 0,16
Sat      0,70 0,69 0,70 0,71 0,71 1,00 0,71   Sat    0,29 0,29 0,27 0,30 0,32 1,00 0,33
Sun      0,74 0,73 0,74 0,74 0,72 0,71 1,00   Sun    0,19 0,19 0,17 0,16 0,16 0,33 1,00
   Why can context aware algorithms be efficient?
     Different recommendations in different context states
                                                                                     18/19
     Context differentiates well between entities
            Easier subtasks
CONCLUSION & FUTURE WORK
 SimFactor  Similarity preserving compression
 Similarity based MF initialization:
     Description matrix from any data
     Apply SimFactor
     Use output as initial features for MF

 Context differentitates between entities well
 Future work:
     Mixed description matrix (multiple data sources)
     Multiple description matrix
     Using different context information
     Using different similarity metrics                 19/19
THANKS FOR YOUR ATTENTION!




For more of my recommender systems related research visit my website:
http://www.hidasi.eu

More Related Content

Initialization of matrix factorization (CaRR 2012 presentation)

  • 1. ENHANCING MATRIX FACTORIZATION THROUGH INITIALIZATION FOR IMPLICIT FEEDBACK DATABASES Bal叩zs Hidasi Domonkos Tikk Gravity R&D Ltd. Budapest University of Technology and Economics CARR WORKSHOP, 14TH FEBRUARY 2012, LISBON
  • 2. OUTLINE Matrix factoriztion Initialization concept Methods Naive SimFactor Results Discussion 2/19
  • 3. MATRIX FACTORIZATION Collaborative Filtering One of the most common approaches Approximates the rating matrix as product of low- rank matrices Items Q Users R P 3/19
  • 4. MATRIX FACTORIZATION Initialize P and Q with small random numbers Teach P and Q Alternating Least Squares Gradient Descent Etc. Transforms the data to a feature space Separately for users and items Noise reduction Compression Generalization 4/19
  • 5. IMPLICIT FEEDBACK No ratings User-item interactions (events) Much noisier Presence of an event might not be positive feedback Absence of an event does not mean negative feedback No negative feedback is available! More common problem MF for implicit feedback Less accurate results due to noise Mostly ALS is used 5/19 Scalability problems (rating matrix is dense)
  • 6. CONCEPT Good MF model The feature vectors of similar entities are similar If data is too noisy similar entities wont be similar by their features Start MF from a good point Feature vector similarities are OK Data is more than just events Metadata Info about items/users Contextual data In what context did the event occured Can we incorporate those to help implicit MF? 6/19
  • 7. NAIVE APPROACH Describe items using any data we have (detailed later) Long, sparse vectors for item description Compress these vectors to dense feature vectors PCA, MLP, MF, Length of desired vectors = Number of features in MF Use these features as starting points 7/19
  • 8. NAIVE APPROACH Compression and also noise reduction Does not really care about similarities But often feature similarities are not that bad If MF is used Half of the results is thrown out Descriptors features Descriptor features Items Item Description of items 8/19
  • 9. SIMFACTOR ALGORITHM Try to preserve similarities better Starting from an MF of item description Descriptors Descriptors features features Description of items Items Item (D) Similarities of items: DD Description of items Some metrics require transformation on D (D) Item Description of items similarities (S) = (D) 9/19
  • 10. SIMFACTOR ALGORITHM Descriptors features features Item Descriptors features Item Item (X) similarities (Y) features (S) (X) (Y) Similarity approximation features Item Item Item (X) similarities YY features (S) (X) YY KxK symmetric 10/19 Eigendecomposition
  • 11. SIMFACTOR ALGORITHM YY = U 了 U 了 diagonal 了 = SQRT(了) * SQRT(了) features Item Item Item SQRT SQRT (X) similarities U (了) (了) U features (S) (X) X*U*SQRT(了) = (SQRT(了)*U*X)=F F is MxK matrix S F * F F used for initialization Item similarities F 11/19 F (S)
  • 12. CREATING THE DESCRIPTION MATRIX Any data about the entity Vector-space reprezentation For Items: Metadata vector (title, category, description, etc) Event vector (who bought the item) Context-state vector (in which context state was it bought) Context-event (in which context state who bought it) For Users: All above except metadata Currently: Choose one source for D matrix 12/19 Context used: seasonality
  • 13. EXPERIMENTS: SIMILARITY PRESERVATION Real life dataset: online grocery shopping events SimFactor RMSE improvement over naive in similarity approximation 52.36% 48.70% 26.22% 16.86% 13.39% 12.38% 10.81% Item context state User context state Item context- User context- Item event data User event data Item metadata event event 13/19 SimFactor approximates similarities better
  • 14. EXPERIMENTS: INITIALIZATION Using different description matrices And both naive and SimFactor initialization Baseline: random init Evaluation metric: recall@50 14/19
  • 15. EXPERIMENTS: GROCERY DB Up to 6% improvement Best methods use SimFactor and user context data Top5 methods on Grocery DB 5.71% 4.88% 4.30% 4.12% 4.04% 15/19 User context state User context state User context event User event data User context event (SimFactor) (Naive) (SimFactor) (SimFactor) (Naive)
  • 16. EXPERIMENTS: IMPLICITIZED MOVIELENS Keeping 5 star ratings implicit events Up to 10% improvement Best methods use SimFactor and item context data Top5 methods on MovieLens DB 10% 9.17% 9.17% 9.17% 9.17% 16/19 Item context state User context state Item context event Item context event Item context state (SimFactor) (SimFactor) (SimFactor) (Naive) (Naive)
  • 17. DISCUSSION OF RESULTS SimFactor yields better results than naive Context information yields better results than other descriptions Context information separates well between entities Grocery: User context Peoples routines Different types of shoppings in different times MovieLens: Item context Different types of movies watched on different hours Context-based similarity 17/19
  • 18. WHY CONTEXT? Grocery example Correlation between context states by users low ITEM USER Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon 1,00 0,79 0,79 0,78 0,76 0,70 0,74 Mon 1,00 0,36 0,34 0,34 0,35 0,29 0,19 Tue 0,79 1,00 0,79 0,78 0,76 0,69 0,73 Tue 0,36 1,00 0,34 0,34 0,33 0,29 0,19 Wed 0,79 0,79 1,00 0,79 0,76 0,70 0,74 Wed 0,34 0,34 1,00 0,36 0,35 0,27 0,17 Thu 0,78 0,78 0,79 1,00 0,76 0,71 0,74 Thu 0,34 0,34 0,36 1,00 0,39 0,30 0,16 Fri 0,76 0,76 0,76 0,76 1,00 0,71 0,72 Fri 0,35 0,33 0,35 0,39 1,00 0,32 0,16 Sat 0,70 0,69 0,70 0,71 0,71 1,00 0,71 Sat 0,29 0,29 0,27 0,30 0,32 1,00 0,33 Sun 0,74 0,73 0,74 0,74 0,72 0,71 1,00 Sun 0,19 0,19 0,17 0,16 0,16 0,33 1,00 Why can context aware algorithms be efficient? Different recommendations in different context states 18/19 Context differentiates well between entities Easier subtasks
  • 19. CONCLUSION & FUTURE WORK SimFactor Similarity preserving compression Similarity based MF initialization: Description matrix from any data Apply SimFactor Use output as initial features for MF Context differentitates between entities well Future work: Mixed description matrix (multiple data sources) Multiple description matrix Using different context information Using different similarity metrics 19/19
  • 20. THANKS FOR YOUR ATTENTION! For more of my recommender systems related research visit my website: http://www.hidasi.eu