際際滷

際際滷Share a Scribd company logo
RECOMMENDER SYSTEMS
NAVULE PAVAN KUMAR RAO
CONTENTS
 Dealing with Data
 Recommendation Techniques Overview
 Content Based Filtering
 Collaborative Filtering
 Similarity Measures
Recommender Systems by Navule Pavan Kumar Rao
DEALING WITH DATA
Implicit Data Explicit Data
Data Gathered from User Behavior Well appreciated Data by User
Ex: Frequency of a product bought by user.
List of products bought by user.
Ratings, Reviews, Comments
Ex: user rate product on scale of 1 to 5
Advantage:
Lot of such data can be captured
Unbiased and versatile data
Advantage: More detailed preference towards
products by users.
More structured Data.
Disadvantage:
Data is more noisy, unstructured.
No evidence what user feels about the
product he uses
Disadvantage:
All users not interested in giving ratings to
products they purchase (sparsity).
Possibility of Bias(from user), Fraud (from
seller)
Recommender Systems by Navule Pavan Kumar Rao
RECOMMENDATION TECHNIQUES
Recommender
System
Content based
Filtering
Collaborative
Filtering
Model based
Filtering
Clustering,
Association Rules
Mining, Bayesian
Networks, Neural
Networks
Memory based
Filtering
User based Item based
Hybrid Filtering
Recommender Systems by Navule Pavan Kumar Rao
CONTENT BASED FILTERING
 Content Based Filtering (CBF) emphasizes more on attributes of product being
recommended.
 CBF recommends the products based on the most similar features in other
products.
 Only related or similar products will be recommended to the user based on
similarities.
 Independent of other users preferences.
 Similarity Measure employed in CBF:
 Cosine Similarity, Euclidian Distance, Pearson Correlation, TF-IDF
Recommender Systems by Navule Pavan Kumar Rao
CONTENT BASED FILTERING
Examples:
Genre based Recommendations
 News: Political, Business, Economies, Cryptocurrencies etc.
 Movies: Fantasy, Action, Comedy, Horror etc.
 Cuisine: Indian, Continental, Italian etc.
Recommender Systems by Navule Pavan Kumar Rao
COLLABORATIVE FILTERING
 Collaborative Filtering (CF) recommends products based on users preferences.
 CF depends on user-item relationship.
 Based on this, it calculates the similarities between their most relevant majority
user profiles and recommends items.
 Most relevant?
 Similarity Measures!
 Majority?
 Weighted Average of their ratings.
U  I I1 I2 I3 I4
U1 1 2 1 1
U2 2 3 5 1
U3 3 4 2 5
U4 5 3 5 4
Recommender Systems by Navule Pavan Kumar Rao
COLLABORATIVE FILTERING
User
Based
Filtering:
Computes the
similar users
Calculate the
Weighted
average across
items*
Predict
Item
Based
Filtering
Compute Similar
Items based on
users
preference
history
Calculate
Weighted
Average
across items*
Predict
* Can use other approaches based on the CBF technique employed
Recommender Systems by Navule Pavan Kumar Rao
Memory Based CBF Techniques:
SIMILARITY MEASURES
 Cosine Similarity
 Euclidian Distance
 TF-IDF
 Spearmans Rank Correlation
 Pearson Correlation
Recommender Systems by Navule Pavan Kumar Rao
SIMILARITY MEASURES
Cosine Similarity
The resultant of Cosine Similarity is the Amplitude which is on y axis.
Higher the result lesser the angle between A and B and more similar they are.
Recommender Systems by Navule Pavan Kumar Rao
SIMILARITY MEASURES
Cosine Similarity
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
駒 $(, ) =
44 + 34 + 21
42 + 32 + 22  42 + 42 + 12
= 0.969
UA and UB are similar users
Recommender Systems by Navule Pavan Kumar Rao
SINGULAR VALUE DECOMPOSITION
NAVULE PAVAN KUMAR RAO
CONTENTS
 Introduction
 SVD in Detail
 SVD in Recommender Systems
 Baseline Estimation Techniques
 Other applications
Recommender Systems by Navule Pavan Kumar Rao
INTRODUCTION
 Data like user preferences, ratings etc,. are represented in the form of matrix
 Represent this matrix into a product(factors) of many matrices and each matrix
holding some significance about the original matrix.
 This technique is referred as Matrix Factorization or Matrix Decomposition.
Recommender Systems by Navule Pavan Kumar Rao
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
MATRIX FACTORIZATION TECHNIQUES
 Based on Eigen Values
 Eigen decomposition
 Jordan decomposition
 Schur decomposition
 Real Schur decomposition
 QZ decomposition
 Takagi's factorization
 Singular value decomposition
 Solve system of Linear Equations
 LU decomposition
 LU reduction
 Block LU decomposition
 Rank factorization
 Cholesky decomposition
 QR decomposition
 RRQR factorization
 Interpolative decomposition
And Many other..!
Recommender Systems by Navule Pavan Kumar Rao
SVD IN DETAIL
SVD (Singular Value Decomposition) is a representation of rectangular matrix of
Gene-Expression Data,
Anxp= Unxn Snxp VT
pxp
For Ex:
n rows  Gene  assumed as users.
p columns  Expression  assumed as ratings to a particular product.
Recommender Systems by Navule Pavan Kumar Rao
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
SVD IN DETAIL
Anxp= Unxn Snxp VT
pxp
Anxp Unxn (Gene) Snxp VT
pxp (Expression)=
Recommender Systems by Navule Pavan Kumar Rao
SVD IN DETAIL
Anxp= Unxn Snxp VT
pxp
Where Us Columns are Eigen Vectors of AAT
and Vs Columns are Eigen Vectors of ATA
S is a diagonal matrix whose elements are singular values* of A in
the decreasing order.
U and V and are Orthogonal, i.e.
UTU = Inxn
VTV = Ipxp
* Singular Values are Square roots of Eigen Values
Recommender Systems by Navule Pavan Kumar Rao
SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
SVD is used in Collaborative Filtering approach in the field of Recommender
Systems.
Reverse approach of SVD to find U or V by using Stochastic Gradient Descent
Optimization technique where Anxp is sparse.
Find pg and qe such that
 pg makes the row of matrix U where pg is the gth Gene vector of U
 qe makes the column of matrix VT where qe is the eth Expression vector of VT
age = pg . qe
All the vectors pg are mutually orthogonal, as well as the vectors qe
4 1 ?
? 2 5
3 ? ?
Recommender Systems by Navule Pavan Kumar Rao
SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn VT
pxp
aeg
Anxp Unxn
qe
VT
pxp
=
pg
S being a diagonal matrix it just acts as a scaler on U or VT and is ignor
Recommender Systems by Navule Pavan Kumar Rao
SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
 The problem here is we dont know how to build pg and qe , so we take
any random values* for both pg and qe
 Perform baseline estimate either via
 SGD (Stochastic Gradient Descent)
 ALS (Alternative Least Squares)
 Once convergence is achieved, we obtain optimal values of pg and qe
We can then estimate any gene expression as age(estimated) = pg . qe
4 1 ?
? 2 5
3 ? ?
* Random values form normal distribution
Recommender Systems by Navule Pavan Kumar Rao
SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
As we have few age(actual) available in matrix A we calculate the age and try to reduce the
difference between the actual and calculated value (Minimization of error)
Errormin= (age(actual)  pg . qe)2
We thus find for vectors that makes the sum of errors minimal.
Then we construct U and V with
 pg as rows of U and
 qe as columns of VT
Finally we have, A = UVT
S being a diagonal matrix it just acts as a scaler on U or VT and is ignor
Recommender Systems by Navule Pavan Kumar Rao
SVD IN RECOMMENDER SYSTEMS
Accuracy Measures
RMSE (Root Mean Square Error)
MAP (Mean Average Precision)
Recommender Systems by Navule Pavan Kumar Rao
SVD IN RECOMMENDER SYSTEMS
There are variantions in SVD with this approach. The one explained is Sparse
SVD for High Dimensional Data.
 PMF (Probabilistic Matrix Factorization)
 SVD++
 NMF (Non-Negative Matrix Factorization)
4 1 ?
? 2 5
3 ? ?
Recommender Systems by Navule Pavan Kumar Rao
OTHER APPLICATIONS OF SVD
Dimensionality Reduction (PCA)
Image Compression
Natural Language Processing (LSI)
And many more..
Recommender Systems by Navule Pavan Kumar Rao
REFERENCES
 http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
 https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-
2011/positive-definite-matrices-and-applications/singular-value-
decomposition/MIT18_06SCF11_Ses3.5sum.pdf
 http://nicolas-hug.com/blog/matrix_facto_2
Recommender Systems by Navule Pavan Kumar Rao
Recommender Systems by Navule Pavan Kumar Rao
Thank you!

More Related Content

Recommender systems

  • 2. CONTENTS Dealing with Data Recommendation Techniques Overview Content Based Filtering Collaborative Filtering Similarity Measures Recommender Systems by Navule Pavan Kumar Rao
  • 3. DEALING WITH DATA Implicit Data Explicit Data Data Gathered from User Behavior Well appreciated Data by User Ex: Frequency of a product bought by user. List of products bought by user. Ratings, Reviews, Comments Ex: user rate product on scale of 1 to 5 Advantage: Lot of such data can be captured Unbiased and versatile data Advantage: More detailed preference towards products by users. More structured Data. Disadvantage: Data is more noisy, unstructured. No evidence what user feels about the product he uses Disadvantage: All users not interested in giving ratings to products they purchase (sparsity). Possibility of Bias(from user), Fraud (from seller) Recommender Systems by Navule Pavan Kumar Rao
  • 4. RECOMMENDATION TECHNIQUES Recommender System Content based Filtering Collaborative Filtering Model based Filtering Clustering, Association Rules Mining, Bayesian Networks, Neural Networks Memory based Filtering User based Item based Hybrid Filtering Recommender Systems by Navule Pavan Kumar Rao
  • 5. CONTENT BASED FILTERING Content Based Filtering (CBF) emphasizes more on attributes of product being recommended. CBF recommends the products based on the most similar features in other products. Only related or similar products will be recommended to the user based on similarities. Independent of other users preferences. Similarity Measure employed in CBF: Cosine Similarity, Euclidian Distance, Pearson Correlation, TF-IDF Recommender Systems by Navule Pavan Kumar Rao
  • 6. CONTENT BASED FILTERING Examples: Genre based Recommendations News: Political, Business, Economies, Cryptocurrencies etc. Movies: Fantasy, Action, Comedy, Horror etc. Cuisine: Indian, Continental, Italian etc. Recommender Systems by Navule Pavan Kumar Rao
  • 7. COLLABORATIVE FILTERING Collaborative Filtering (CF) recommends products based on users preferences. CF depends on user-item relationship. Based on this, it calculates the similarities between their most relevant majority user profiles and recommends items. Most relevant? Similarity Measures! Majority? Weighted Average of their ratings. U I I1 I2 I3 I4 U1 1 2 1 1 U2 2 3 5 1 U3 3 4 2 5 U4 5 3 5 4 Recommender Systems by Navule Pavan Kumar Rao
  • 8. COLLABORATIVE FILTERING User Based Filtering: Computes the similar users Calculate the Weighted average across items* Predict Item Based Filtering Compute Similar Items based on users preference history Calculate Weighted Average across items* Predict * Can use other approaches based on the CBF technique employed Recommender Systems by Navule Pavan Kumar Rao Memory Based CBF Techniques:
  • 9. SIMILARITY MEASURES Cosine Similarity Euclidian Distance TF-IDF Spearmans Rank Correlation Pearson Correlation Recommender Systems by Navule Pavan Kumar Rao
  • 10. SIMILARITY MEASURES Cosine Similarity The resultant of Cosine Similarity is the Amplitude which is on y axis. Higher the result lesser the angle between A and B and more similar they are. Recommender Systems by Navule Pavan Kumar Rao
  • 11. SIMILARITY MEASURES Cosine Similarity UserMovie Race 1 Race 2 Race 3 UA 4 3 2 UB 4 4 1 UC 2 1 5 駒 $(, ) = 44 + 34 + 21 42 + 32 + 22 42 + 42 + 12 = 0.969 UA and UB are similar users Recommender Systems by Navule Pavan Kumar Rao
  • 13. CONTENTS Introduction SVD in Detail SVD in Recommender Systems Baseline Estimation Techniques Other applications Recommender Systems by Navule Pavan Kumar Rao
  • 14. INTRODUCTION Data like user preferences, ratings etc,. are represented in the form of matrix Represent this matrix into a product(factors) of many matrices and each matrix holding some significance about the original matrix. This technique is referred as Matrix Factorization or Matrix Decomposition. Recommender Systems by Navule Pavan Kumar Rao UserMovie Race 1 Race 2 Race 3 UA 4 3 2 UB 4 4 1 UC 2 1 5
  • 15. MATRIX FACTORIZATION TECHNIQUES Based on Eigen Values Eigen decomposition Jordan decomposition Schur decomposition Real Schur decomposition QZ decomposition Takagi's factorization Singular value decomposition Solve system of Linear Equations LU decomposition LU reduction Block LU decomposition Rank factorization Cholesky decomposition QR decomposition RRQR factorization Interpolative decomposition And Many other..! Recommender Systems by Navule Pavan Kumar Rao
  • 16. SVD IN DETAIL SVD (Singular Value Decomposition) is a representation of rectangular matrix of Gene-Expression Data, Anxp= Unxn Snxp VT pxp For Ex: n rows Gene assumed as users. p columns Expression assumed as ratings to a particular product. Recommender Systems by Navule Pavan Kumar Rao UserMovie Race 1 Race 2 Race 3 UA 4 3 2 UB 4 4 1 UC 2 1 5
  • 17. SVD IN DETAIL Anxp= Unxn Snxp VT pxp Anxp Unxn (Gene) Snxp VT pxp (Expression)= Recommender Systems by Navule Pavan Kumar Rao
  • 18. SVD IN DETAIL Anxp= Unxn Snxp VT pxp Where Us Columns are Eigen Vectors of AAT and Vs Columns are Eigen Vectors of ATA S is a diagonal matrix whose elements are singular values* of A in the decreasing order. U and V and are Orthogonal, i.e. UTU = Inxn VTV = Ipxp * Singular Values are Square roots of Eigen Values Recommender Systems by Navule Pavan Kumar Rao
  • 19. SVD IN RECOMMENDER SYSTEMS Anxp= Unxn Snxp VT pxp SVD is used in Collaborative Filtering approach in the field of Recommender Systems. Reverse approach of SVD to find U or V by using Stochastic Gradient Descent Optimization technique where Anxp is sparse. Find pg and qe such that pg makes the row of matrix U where pg is the gth Gene vector of U qe makes the column of matrix VT where qe is the eth Expression vector of VT age = pg . qe All the vectors pg are mutually orthogonal, as well as the vectors qe 4 1 ? ? 2 5 3 ? ? Recommender Systems by Navule Pavan Kumar Rao
  • 20. SVD IN RECOMMENDER SYSTEMS Anxp= Unxn VT pxp aeg Anxp Unxn qe VT pxp = pg S being a diagonal matrix it just acts as a scaler on U or VT and is ignor Recommender Systems by Navule Pavan Kumar Rao
  • 21. SVD IN RECOMMENDER SYSTEMS Anxp= Unxn Snxp VT pxp The problem here is we dont know how to build pg and qe , so we take any random values* for both pg and qe Perform baseline estimate either via SGD (Stochastic Gradient Descent) ALS (Alternative Least Squares) Once convergence is achieved, we obtain optimal values of pg and qe We can then estimate any gene expression as age(estimated) = pg . qe 4 1 ? ? 2 5 3 ? ? * Random values form normal distribution Recommender Systems by Navule Pavan Kumar Rao
  • 22. SVD IN RECOMMENDER SYSTEMS Anxp= Unxn Snxp VT pxp As we have few age(actual) available in matrix A we calculate the age and try to reduce the difference between the actual and calculated value (Minimization of error) Errormin= (age(actual) pg . qe)2 We thus find for vectors that makes the sum of errors minimal. Then we construct U and V with pg as rows of U and qe as columns of VT Finally we have, A = UVT S being a diagonal matrix it just acts as a scaler on U or VT and is ignor Recommender Systems by Navule Pavan Kumar Rao
  • 23. SVD IN RECOMMENDER SYSTEMS Accuracy Measures RMSE (Root Mean Square Error) MAP (Mean Average Precision) Recommender Systems by Navule Pavan Kumar Rao
  • 24. SVD IN RECOMMENDER SYSTEMS There are variantions in SVD with this approach. The one explained is Sparse SVD for High Dimensional Data. PMF (Probabilistic Matrix Factorization) SVD++ NMF (Non-Negative Matrix Factorization) 4 1 ? ? 2 5 3 ? ? Recommender Systems by Navule Pavan Kumar Rao
  • 25. OTHER APPLICATIONS OF SVD Dimensionality Reduction (PCA) Image Compression Natural Language Processing (LSI) And many more.. Recommender Systems by Navule Pavan Kumar Rao
  • 27. Recommender Systems by Navule Pavan Kumar Rao Thank you!

Editor's Notes

  • #17: http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
  • #18: http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
  • #19: https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/positive-definite-matrices-and-applications/singular-value-decomposition/MIT18_06SCF11_Ses3.5sum.pdf
  • #20: http://nicolas-hug.com/blog/matrix_facto_2
  • #21: http://nicolas-hug.com/blog/matrix_facto_2
  • #22: http://nicolas-hug.com/blog/matrix_facto_2
  • #23: http://nicolas-hug.com/blog/matrix_facto_2
  • #24: http://nicolas-hug.com/blog/matrix_facto_2
  • #25: http://nicolas-hug.com/blog/matrix_facto_2