This document proposes extending recommendation systems with semantics and context-awareness. It discusses the limitations of traditional recommendation models and how semantics and context could help overcome them. The authors propose a model that uses domain concepts with implicit semantic relationships and contextual concepts without semantics. An offline experiment on a pruned MovieLens dataset compares the proposed model with baselines. Results show that the proposed contextual-semantic model improves prediction accuracy, both overall and for cold-start users, compared with static and non-semantic models.
Extending Recommendation Systems With Semantics And Context Awareness
1. Extending recommendation systems
with semantics and context-awareness
CCIA 2011
Victor Codina (vcodina@lsi.upc.edu)
Departament de Llenguatges i Sistemes Informàtics, Knowledge Engineering and Machine Learning Group
Luigi Ceccaroni (lceccaroni@BDigital.org)
Health Informatics, Personalized Computational Medicine
2. Outline
Traditional vs. Contextual recommendation
State-of-the-art & Current limitations
Research question
Semantics acquisition & exploitation
Proposed model
Experimental evaluation
Conclusions & Future work
Extending Recommendation Systems with Semantics and Context-Awareness 2
3. Traditional recommendation problem
Regression problem:
o Given a pair (u ∈ U, i ∈ I), predict the item's degree of utility
Estimation based only on user and item information
o Inputs: preference matrix, user preferences (u), item attributes (i)
o Recommendation model: Collaborative filtering (CF), Content-based (CB), Hybrid
4. Context-aware recommendation problem
Context as additional dimension for estimation
o Given a tuple (u, i, c), predict the item's degree of utility in context c
o Context = situated action
Representational view:
o Example: c = (winter, cold), with c1 = Season and c2 = Temperature
Ways of incorporating context c:
o Pre-filtering: c is applied to the training data
o Multi-Dimensional (MD): c is used inside the recommendation model
o Post-filtering: c is applied to the model output
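As a rough illustration of the pre-filtering paradigm, the sketch below keeps only the ratings given in the target context and then applies an ordinary context-free predictor (here a plain item average). The toy ratings and the function names are assumptions made for this example, not part of the original slides.

```python
from collections import defaultdict

# Hypothetical toy ratings: (user, item, rating, context),
# where context = (season, temperature) as in the example above.
RATINGS = [
    ("u1", "i1", 5.0, ("winter", "cold")),
    ("u2", "i1", 4.0, ("winter", "cold")),
    ("u3", "i1", 1.0, ("summer", "hot")),
    ("u1", "i2", 2.0, ("winter", "cold")),
    ("u2", "i2", 5.0, ("summer", "hot")),
]


def prefilter(ratings, context):
    """Keep only the ratings given in exactly the target context."""
    return [row for row in ratings if row[3] == context]


def item_averages(ratings):
    """Context-free predictor: mean rating per item."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _user, item, rating, _ctx in ratings:
        sums[item] += rating
        counts[item] += 1
    return {item: sums[item] / counts[item] for item in sums}


# Pre-filtering: select the contextual slice, then train as usual.
contextual_avg = item_averages(prefilter(RATINGS, ("winter", "cold")))
global_avg = item_averages(RATINGS)
```

With this toy data, item i1 looks better in the (winter, cold) slice than in the full data set. Exact matching on the context tuple is the rigid variant; a semantic matching of context states would relax the equality test in `prefilter`.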
5. State-of-the-art & limitations
Adaptations of latent-factor models (MD paradigm)
Examples:
o N-dimensional Tensor Factorization
o Bias-based Matrix Factorization with temporal dynamics
Best prediction accuracy results in recent competitions
o E.g.: Netflix challenge (2009), Yahoo! Labs KDD Cup (2011)
Main limitations of latent-factor models:
o Lack of transparency in explaining recommendations
o Low cold-start performance (users and items with few ratings)
o Lack of novelty and diversity of recommendations
6. Research questions & main assumptions
Research questions
o Q1. Can we overcome the limitations and improve global
recommendation quality (not only prediction accuracy) by
exploiting domain and context knowledge?
o Q2. Under which conditions is this improvement maximized?
Main assumptions
o There exist semantic relationships among entities of the
recommendation space (users, items, contexts)
o The adequate exploitation of these semantic relationships is
useful to overcome current limitations
7. Knowledge acquisition and representation
How to obtain the similarity S(x, y) between two domain/context concepts x and y?
Explicit similarity (ontology-based similarity measures)
o Edge-based (LCA)
o Node-based (MICA)
o Logic-based
o Knowledge source: ontologies
  - Taxonomies (ODP)
  - Thesauri (WordNet)
Implicit similarity (statistics-based similarity measures)
o Probabilistic measures (PMI)
o Dimensionality reduction (LSA)
o Graph-based (SimRank)
o Knowledge source: data collections
  - Folksonomies
  - Item descriptions
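As an illustration of an implicit, statistics-based measure, the sketch below computes pointwise mutual information (PMI) between tags from their co-occurrence on items in a folksonomy. The toy tag assignments and the function name `pmi_table` are assumptions for this example; the slides do not state which estimator the authors used.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical folksonomy: item -> set of tags assigned by users.
ITEM_TAGS = {
    "m1": {"space", "aliens", "future"},
    "m2": {"space", "aliens"},
    "m3": {"romance", "paris"},
    "m4": {"space", "future"},
}


def pmi_table(item_tags):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), estimated over items."""
    n = len(item_tags)
    tag_count = Counter(t for tags in item_tags.values() for t in tags)
    pair_count = Counter()
    for tags in item_tags.values():
        for x, y in combinations(sorted(tags), 2):
            pair_count[(x, y)] += 1
    pmi = {}
    for (x, y), c in pair_count.items():
        value = math.log2((c / n) / ((tag_count[x] / n) * (tag_count[y] / n)))
        pmi[(x, y)] = pmi[(y, x)] = value  # store both orders (symmetric)
    return pmi


PMI = pmi_table(ITEM_TAGS)
```

Pairs of tags that never co-occur are simply absent from the table; a positive value means the two tags appear together more often than independence would predict.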
8. User/Item representation
Concept-based modeling (weighted overlay approach)
o Domain knowledge: concepts (d1, d2, d3, d4) = item attributes
o User u has profile Pu: one weight per concept (e.g. degree of interest in d1)
o Item i has profile Pi: one weight per concept (e.g. relevance of d3)
Interest inferring method (user profile)
- Explicit feedback (rating avg)
- Implicit feedback (seen frequency)
Attribute weighting method (item profile)
- Structured content (IDF)
- Unstructured content (TF-IDF, tagshare)
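The weighted overlay idea can be sketched as two dictionaries that map concepts to weights. The example below assumes explicit feedback (rating average per concept) for the user profile and IDF weighting for the item profile; the catalogue, the ratings, and the function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical catalogue: item -> concepts (item attributes).
ITEM_CONCEPTS = {
    "i1": {"comedy", "romance"},
    "i2": {"comedy"},
    "i3": {"thriller"},
    "i4": {"comedy", "thriller"},
}
# Hypothetical explicit feedback of one user: item -> rating.
USER_RATINGS = {"i1": 5.0, "i2": 3.0, "i3": 1.0}


def item_profile(item, item_concepts):
    """Relevance of each concept of the item, weighted by IDF over the catalogue."""
    n_items = len(item_concepts)
    profile = {}
    for d in item_concepts[item]:
        n_d = sum(1 for concepts in item_concepts.values() if d in concepts)
        profile[d] = math.log(n_items / n_d)
    return profile


def user_profile(user_ratings, item_concepts):
    """Degree of interest in each concept: average rating of items carrying it."""
    by_concept = defaultdict(list)
    for item, rating in user_ratings.items():
        for d in item_concepts[item]:
            by_concept[d].append(rating)
    return {d: sum(rs) / len(rs) for d, rs in by_concept.items()}


P_u = user_profile(USER_RATINGS, ITEM_CONCEPTS)
P_i = item_profile("i1", ITEM_CONCEPTS)
```

Rare concepts such as "romance" receive a higher IDF weight in the item profile than common ones such as "comedy".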
9. Knowledge exploitation
Contextual knowledge
o Can be used for: measuring the semantic matching among different context states
o Possible benefit: less rigid contextual filtering than using exact matching
Domain-based knowledge
o Can be used for: applying semantic inference methods over user/item concept-based profiles
  - Spreading activation
  - Reasoning based on DLs
o Possible benefit: enrich item/user profiles with new semantically related concepts
o Can be used for: measuring the matching between two users/items using various semantic matching strategies
  - Pairwise (Best-pairs or All-pairs)
  - Groupwise (set-, vector- or graph-based)
o Possible benefit: more precise similarity measurements than using traditional measures
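The difference between traditional and pairwise semantic matching can be sketched as follows. This is one possible formulation, assuming concept similarities in [0, 1] and user weights in [0, 1]; the similarity values, profiles, and normalisations are assumptions for illustration and may differ from the authors' definitions.

```python
# Hypothetical concept-to-concept similarities in [0, 1].
SIM = {("space", "aliens"): 0.8, ("space", "romance"): 0.1}


def sim(x, y):
    """Symmetric lookup; identical concepts score 1, unknown pairs 0."""
    if x == y:
        return 1.0
    return SIM.get((x, y), SIM.get((y, x), 0.0))


def exact_match(user_profile, item_profile):
    """Traditional matching: only identical concepts contribute."""
    total = sum(item_profile.values())
    shared = set(user_profile) & set(item_profile)
    score = sum(user_profile[d] * item_profile[d] for d in shared)
    return score / total if total else 0.0


def all_pairs(user_profile, item_profile):
    """Weighted average of sim over every (user concept, item concept) pair."""
    num = den = 0.0
    for du, wu in user_profile.items():
        for di, wi in item_profile.items():
            num += wu * wi * sim(du, di)
            den += wu * wi
    return num / den if den else 0.0


def best_pairs(user_profile, item_profile):
    """Each item concept is matched only with its best user concept."""
    if not user_profile:
        return 0.0
    num = den = 0.0
    for di, wi in item_profile.items():
        num += wi * max(wu * sim(du, di) for du, wu in user_profile.items())
        den += wi
    return num / den if den else 0.0


user = {"space": 1.0, "romance": 0.5}
item = {"aliens": 1.0}
scores = (exact_match(user, item), all_pairs(user, item), best_pairs(user, item))
```

Here the user has never expressed interest in "aliens", so exact matching yields zero, while both pairwise strategies give a non-zero score through the related concept "space".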
10. Case study: an MD semantically-enhanced CB
Contextual prediction model (bias-based), combining four terms:
o Overall rating avg
o Contextual user bias
o Contextual item bias
o All-pairs Item-User semantic matching
where the terms are defined using:
o Session bias of (u, d)
o Contextual bias of (u, d)
Stochastic gradient descent is used for model training
[Equations not preserved in this transcript]
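Since the slide's equations are not available here, the following is only a generic sketch of a bias-based contextual predictor trained with stochastic gradient descent. It assumes an additive form (overall average plus static and context-dependent user and item biases), a single contextual factor, squared-error loss with L2 regularisation, and it omits the semantic matching term entirely. All names, data, and hyperparameters are hypothetical and should not be read as the authors' model.

```python
import random
from collections import defaultdict

# Hypothetical training tuples: (user, item, context, rating).
TRAIN = [
    ("u1", "i1", "winter", 5.0),
    ("u1", "i1", "summer", 3.0),
    ("u2", "i1", "winter", 4.0),
    ("u2", "i2", "summer", 2.0),
    ("u1", "i2", "summer", 1.0),
]


def bias_keys(u, i, c):
    """Static and context-dependent bias parameters used by one prediction."""
    return [("u", u), ("i", i), ("uc", u, c), ("ic", i, c)]


def train(data, epochs=200, lr=0.05, reg=0.02, seed=0):
    """SGD on squared error with L2 regularisation of the biases."""
    rng = random.Random(seed)
    mu = sum(r for *_, r in data) / len(data)  # overall rating average
    bias = defaultdict(float)
    rows = list(data)
    for _ in range(epochs):
        rng.shuffle(rows)
        for u, i, c, r in rows:
            keys = bias_keys(u, i, c)
            err = r - (mu + sum(bias[k] for k in keys))
            for k in keys:
                bias[k] += lr * (err - reg * bias[k])
    return mu, bias


def predict(mu, bias, u, i, c):
    """Unknown users, items, or contexts fall back to zero bias."""
    return mu + sum(bias.get(k, 0.0) for k in bias_keys(u, i, c))


mu, bias = train(TRAIN)
```

After training on the toy data, `predict(mu, bias, "u1", "i1", "winter")` is higher than the same pair in "summer", which is the kind of context effect the static variants cannot express.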
11. MovieLens Dataset
Contextual concepts without semantics
o 3 contextual factors (season, time of the day, weekend?)
Domain concepts with implicit semantics
o Set of pre-selected tags + set of genres
o Semantic relationships among tags acquired from folksonomy
Original dataset pruned by selecting only items with a
certain number of pre-selected tags
12. Offline experiment
Last ratings (according to timestamp) used for testing
o In this way we simulate future predictions for each user
5-fold cross validation
Two recommendation tasks evaluated
o Rating prediction (RMSE) and Top-10 recommendation (Recall)
Threshold-based cold-start performance evaluation
o User profile size < 25 ratings: 10% of users
Performance comparison of the proposed model with:
o 3 model variants
o 5 baseline models
o 1 model based on matrix factorization
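The evaluation protocol above can be sketched with a few helpers: a per-user split that holds out the most recent ratings, the two metrics (RMSE and Recall at 10), and the threshold-based selection of cold-start users. The function names and the choice of one held-out rating per user are assumptions; the slides do not say how many last ratings were held out or how the split interacts with the 5-fold cross validation.

```python
import math
from collections import defaultdict


def split_last_ratings(ratings, n_test=1):
    """ratings: list of (user, item, rating, timestamp).
    The n_test most recent ratings of each user go to the test set."""
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)
    train, test = [], []
    for rows in by_user.values():
        rows.sort(key=lambda r: r[3])
        train.extend(rows[:-n_test])
        test.extend(rows[-n_test:])
    return train, test


def rmse(pairs):
    """pairs: iterable of (predicted, actual) ratings."""
    pairs = list(pairs)
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def recall_at_k(ranked_items, relevant_items, k=10):
    """Fraction of the relevant items that appear in the top-k ranking."""
    if not relevant_items:
        return 0.0
    return len(set(ranked_items[:k]) & set(relevant_items)) / len(relevant_items)


def cold_start_users(train, threshold=25):
    """Users whose training profile has fewer than `threshold` ratings."""
    counts = defaultdict(int)
    for user, *_ in train:
        counts[user] += 1
    return {u for u, n in counts.items() if n < threshold}
```

Cold-start RMSE is then simply `rmse` restricted to the test ratings of the users returned by `cold_start_users`.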
13. Results
Paired t-test significance among 4 model variants:
o Model 1: Static-CB (static bias + traditional Item-User matching)
o Model 2: Static-SemCB (static bias + All-pairs matching)
o Model 3: Contextual-CB (contextual bias + traditional matching)
o Model 4: Contextual-SemCB (contextual bias + All-pairs matching)
[Figure: Global RMSE (axis range 0.830 to 0.851) and Cold-Start RMSE (axis range 0.916 to 0.919) for Models 1 to 4. P-values shown in red, in extraction order: 0.05, 0.001, 0.17, 0.05, 0.62, 0.01; their assignment to model pairs is not recoverable.]
E.g. P-value = 0.05 means that, if the two models actually performed equally, a difference at least this large would be observed only 5% of the time
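A paired t-test compares two models on the same evaluation units. The sketch below assumes the units are the 5 cross-validation folds and uses made-up per-fold RMSE values (they are not the results reported in this deck); it computes the t statistic by hand and compares it with the standard two-sided critical value for 4 degrees of freedom at the 0.05 level.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """t = mean(d) / (sd(d) / sqrt(n)) over the paired differences d."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean / (sd / math.sqrt(len(diffs)))


# Hypothetical per-fold RMSE of two models (illustrative values only).
rmse_model_a = [0.852, 0.849, 0.855, 0.850, 0.848]
rmse_model_b = [0.840, 0.842, 0.845, 0.838, 0.841]

t_stat = paired_t_statistic(rmse_model_a, rmse_model_b)
T_CRIT_DF4 = 2.776  # two-sided critical value, alpha = 0.05, 4 degrees of freedom
significant = abs(t_stat) > T_CRIT_DF4
```

Pairing matters because fold-to-fold variation affects both models alike; the test looks only at the per-fold differences.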
14. Conclusions
Context-awareness improves prediction accuracy for
users with a certain number of ratings (non cold-start)
o 25+ ratings: 90% of users
Semantics slightly improves cold-start performance
The knowledge acquisition method for the MovieLens
folksonomy may not be adequate: limited domain
knowledge
MovieLens users rate several movies at once and not just
after seeing the movie
o Rating-session-specific effects have a major influence on the
user ratings: distorted contextual information
15. Future work
Extending evaluation of the proposed CB model:
o Using datasets from other domains (e.g. music, tourism, health)
o Experimenting with other sources of knowledge (e.g. Amazon
movie taxonomy)
o Experimenting with other methods for semantics exploitation
o Evaluating other properties (e.g. diversity, novelty, coverage)
Extending CF models with the proposed semantic
approach:
o Neighborhood-based
o Matrix Factorization
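18. Prediction models of all variants
Model 1 (Static-CB): static bias + traditional Item-User matching
Model 2 (Static-SemCB): static bias + All-pairs matching
Model 3 (Contextual-CB): contextual bias + traditional matching
Model 4 (Contextual-SemCB): contextual bias + All-pairs matching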