The document proposes a recommendation system that incorporates semantics to address limitations of traditional recommenders. It uses ontologies to represent user interests and item annotations, and employs semantic inference and similarity methods. An evaluation on movie ratings shows that the semantic approach improves accuracy, especially for cold-start users with small profiles. A further experiment analyzes how the structure of different taxonomies affects the performance of the semantic methods.
DCAI 2010 Presentation
1. A Recommendation System for the
Semantic Web
Victor Codina and Luigi Ceccaroni
vcodina@lsi.upc.edu
Departament de Llenguatges i Sistemes Informàtics (LSI)
Universitat Politècnica de Catalunya (UPC)
DCAI 2010, September 7-10 2010, Valencia
3. Introduction Our semantic approach Evaluation Conclusions
The general personalization process
[Diagram: the personalization loop, built from these labelled parts]
o ITEMS: content adaptation produces the item representation
o USERS: user behavior yields implicit feedback; user satisfaction yields explicit feedback
o USER MODELING: a learning algorithm builds the user profile from the feedback
o Recommendation strategy: produces the personalized recommendation
4. Potential benefits of using semantics
Using semantics helps reduce some limitations of current recommenders
o Cold-start problem
    By inferring missing information from the relationships defined in domain ontologies
o Domain-dependency
    By employing standard ontology-based languages to represent information uniformly
5. Service-oriented architecture design
7. How do we take advantage of semantics?
We incorporate semantics in both stages of the personalization process to reduce the cold-start problem
o The user-profile learning algorithm employs a domain-based inference method
    It expands and enriches the user profiles with interests that cannot be directly inferred from the user feedback
o The content-based recommendation algorithm employs a taxonomy-based similarity method
    It uses the user's interests in more general concepts related to the item's annotations to refine the matching calculation
8. Semantically-enhanced learning algorithm
START. The user provides some feedback about an item (e.g. a purchase or a rating)
Step 1. The interest weights of the concepts related to the item are calculated or updated
Step 2. A domain-based inference method infers new interests from the families of concepts with updated interests
[Diagram: concepts linking the user to the item, marked as learnt, updated, or inferred]
9. The domain-based inference method
Based on the minimum percentage of direct subconcepts
Two types of propagation
o Upward-based (propagation to the parent concept)
o Sideward-based (propagation to the siblings)
Thresholds
o Upward-based threshold (UIT) = 0.6
o Sideward-based threshold (SIT) = 0.9
Example: concept Sport with five direct subconcepts
Baseball [-0.5], Basketball [0.5], Football [1.0], Tennis [1.0], Golf [?]
o Upward-based? Pct(subconcepts) = 4/5 = 0.8; 0.8 > UIT = 0.6 => propagation, Sport [0.5]
o Sideward-based? Pct(subconcepts) = 4/5 = 0.8; 0.8 < SIT = 0.9 => no propagation, Golf stays [?]
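The Sport example can be replayed with a minimal Python sketch. It assumes that the propagated value is the mean of the known subconcept weights, which matches the [0.5] shown for Sport but is not stated on the slide. The name `infer_family` and the use of `None` for an unknown interest are illustrative choices, not part of the original system.

```python
from statistics import mean

UIT = 0.6  # upward-based threshold, as on the slide
SIT = 0.9  # sideward-based threshold, as on the slide


def infer_family(children, uit=UIT, sit=SIT):
    """Apply upward and sideward propagation to one family of concepts.

    children maps each direct subconcept to its interest weight, or to
    None when the interest is unknown. Returns (parent_weight, children),
    where parent_weight is None if no upward propagation happens.
    """
    known = [w for w in children.values() if w is not None]
    if not known:
        return None, dict(children)

    pct = len(known) / len(children)  # share of subconcepts with a known weight
    estimate = mean(known)            # ASSUMPTION: propagate the mean weight

    parent = estimate if pct > uit else None
    updated = dict(children)
    if pct > sit:
        # Sideward propagation: fill in the siblings with unknown interest.
        for name, weight in updated.items():
            if weight is None:
                updated[name] = estimate
    return parent, updated


sport = {"Baseball": -0.5, "Basketball": 0.5, "Football": 1.0,
         "Tennis": 1.0, "Golf": None}
parent, family = infer_family(sport)
print(parent)          # 0.5, since 0.8 exceeds UIT
print(family["Golf"])  # None, since 0.8 does not exceed SIT
```

Lowering `sit` below 0.8 in the call would also fill in Golf, which shows how the two thresholds control how aggressively a profile is expanded.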
10. Semantically-enhanced content-based filtering
START. The system has to predict whether the user will like or dislike an item
FOR EACH item annotation DO:
STEP 1. The conceptScore is calculated based on:
    The interest degree of the user's interests that match the item's annotation
    The semantic similarity of the matches (perfect or partial match)
END FOR
STEP 2. The itemScore is calculated as the weighted average of the conceptScore values according to their relevance
[Diagram: user interests linked to the item annotations C1 and C2 through perfect and partial matches]
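The two steps can be sketched in Python as follows. The slide does not give the formulas, so this sketch assumes that conceptScore is the interest weight multiplied by the semantic similarity of the match, and that relevance is a non-negative weight per annotation. `Match`, `concept_score`, and `item_score` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Match:
    interest: float    # user's interest weight in the matched concept
    similarity: float  # 1.0 for a perfect match, lower for a partial match
    relevance: float   # weight of the annotation within the item


def concept_score(match):
    # ASSUMPTION: interest scaled by the similarity of the match.
    return match.interest * match.similarity


def item_score(matches):
    """Weighted average of the concept scores, weighted by relevance."""
    total = sum(m.relevance for m in matches)
    if total == 0:
        return 0.0  # no usable match, so no evidence either way
    return sum(m.relevance * concept_score(m) for m in matches) / total


matches = [
    Match(interest=1.0, similarity=1.0, relevance=2.0),  # perfect match
    Match(interest=0.5, similarity=0.6, relevance=1.0),  # partial match
]
print(round(item_score(matches), 3))  # 0.767
```

A partial match contributes less than a perfect one with the same interest weight, which is the refinement the semantic similarity is meant to provide.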
11. The taxonomy-based similarity method
Based on the distance in terms of taxonomy levels between
o The item's annotation
o The user's interest (an ancestor of the item's annotation)
Semantic distance among levels is weighted with a K factor
Example taxonomy
Level 1: Source, Genre
Level 2: Sport, Romance
Level 3: Extreme, Steamy Romance (K3 = 0.4)
Level 4: Climbing (K4 = 0.3)
o User interest Romance (level 2), item annotation Steamy Romance (level 3): distance = 1, SIM = 0.6
o User interest Extreme (level 3), item annotation Climbing (level 4): distance = 1, SIM = 0.7
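A small Python sketch reproduces the two SIM values. It assumes that SIM is 1 minus the sum of the K factors of the levels crossed when stepping down from the user's interest to the item's annotation. The slide only shows distance 1, so the behaviour for longer distances is a guess, and `taxonomy_similarity` is a hypothetical name.

```python
def taxonomy_similarity(interest_level, annotation_level, k):
    """Similarity between a user's interest and a descendant item annotation.

    k maps a taxonomy level to the K factor charged for stepping down
    into that level. ASSUMPTION: SIM = 1 - sum of the K factors crossed.
    """
    if interest_level > annotation_level:
        raise ValueError("the interest must not lie below the annotation")
    levels_crossed = range(interest_level + 1, annotation_level + 1)
    penalty = sum(k[level] for level in levels_crossed)
    return max(0.0, 1.0 - penalty)


K = {3: 0.4, 4: 0.3}  # K factors shown on the slide

print(taxonomy_similarity(2, 3, K))  # interest at level 2, annotation at level 3: 0.6
print(taxonomy_similarity(3, 4, K))  # interest at level 3, annotation at level 4: 0.7
print(taxonomy_similarity(3, 3, K))  # same level, perfect match: 1.0
```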
12. Experimental dataset
Netflix-prize movie dataset
o 480,000 users
o 17,700 movies
o 100M user ratings ranging between 1 and 5
Movie taxonomy used by Netflix for annotating movies
o 1 global hierarchy of concepts describing the movies
o 3 levels of depth
o 550 nodes (item annotations)
RMSE metric
o Measures the rating-prediction error over a set of users
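For reference, RMSE is the square root of the mean squared difference between predicted and actual ratings. A minimal Python sketch with made-up ratings (not taken from the Netflix data):

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values only: three predictions against three true ratings.
print(rmse([4.0, 3.5, 5.0], [5, 3, 5]))
```

Lower is better; on a 1 to 5 rating scale an RMSE of 1.0 means predictions are off by about one star in the root-mean-square sense.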
13. Experimental evaluation
Exp. 1: Traditional vs semantic approach
o GOAL. To evaluate the improvement in accuracy when the semantics-based methods are employed
    Is the cold-start problem reduced?
Exp. 2: Semantic approach on two different taxonomies
o GOAL. To analyze whether the hierarchical structure of the taxonomy affects the effectiveness of the semantics-based methods
    How does the taxonomy structure affect their performance?
14. Exp.1: Traditional vs Semantic approach
Experiment setup
o The error of two algorithm configurations is compared
    CB configuration (traditional CB approach)
    SEM-CB configuration (semantically-enhanced CB approach)
Config. | User profile representation | Interest-prediction method | Item-User matching
CB | Keyword-based profile | Rating-based | Perfect matches
SEM-CB | Ontology-based profile | Rating-based + Domain inference | Perfect + Partial matches (semantic similarity)
16. Exp.1: Traditional vs Semantic approach
Prediction results grouped by user-profile size (no. of ratings)
Each interval contains nearly 2% of the predictions in the Netflix test set
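One way to obtain such intervals is to sort the test predictions by the user's profile size and cut them into equally populated groups, then compute the RMSE of each group. The sketch below assumes this equal-frequency scheme and 50 groups (2% each). The slides do not describe the actual binning procedure, and `rmse_by_profile_size` is a hypothetical name.

```python
import math


def rmse_by_profile_size(records, n_bins=50):
    """Group predictions into equally populated intervals by profile size.

    records is a list of (profile_size, predicted, actual) tuples.
    Returns one (min_size, max_size, rmse) tuple per interval.
    """
    ordered = sorted(records, key=lambda r: r[0])
    n_bins = min(n_bins, len(ordered))  # never create an empty interval
    results = []
    for i in range(n_bins):
        start = i * len(ordered) // n_bins
        end = (i + 1) * len(ordered) // n_bins
        chunk = ordered[start:end]
        mse = sum((p - a) ** 2 for _, p, a in chunk) / len(chunk)
        results.append((chunk[0][0], chunk[-1][0], math.sqrt(mse)))
    return results


# Illustrative records only: (no. of ratings in the profile, predicted, actual).
records = [(1, 3, 4), (2, 4, 4), (10, 5, 5), (20, 2, 4)]
for low, high, error in rmse_by_profile_size(records, n_bins=2):
    print(low, high, round(error, 3))
```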
17. Exp.1: Traditional vs Semantic approach
Comparison of RMSE based on user-profile size
The improvement is larger for users with small profiles (the cold-start users)
18. Exp.2: Semantic approach on different taxonomies
Experiment setup
o Two semantics-based configurations are compared on different versions of the movie taxonomy:
    SEM-CB configuration (employs the original taxonomy)
    SEM-CB+ configuration (employs an alternative version)
Taxonomy properties
Config. | No. of nodes | No. of levels | No. of hierarchies | Avg. nodes per family
SEM-CB | 550 | 3 | 1 | 14
SEM-CB+ | 550 | 4 | 4 | 7
19. Exp.2: Semantic approach on different taxonomies
Results: Parameter settings of semantics-based algorithms
[Figure labels: Optimal execution, Same accuracy]
20. Conclusions and Future work
Main conclusions
o The cold-start problem is reduced by exploiting semantics
o The incorporation of semantics in a traditional CB approach
o The recommender is made domain-independent by combining
    A service-oriented architecture design
    Standard ontology-based languages (FOAF, OWL)
Future work
o Further experimentation
    In richer domains and with other semantic methods
o The incorporation of semantics into other approaches
    e.g. Collaborative Filtering and Hybrid systems
22. Exp.1: Traditional vs Semantically-enhanced
Comparison of overall accuracy results:
[Chart: RMSE, vertical axis ranging from 0.88 to 1.08]