ITWP2011, Barcelona

Using Social- and Pseudo-Social Networks to Improve Recommendation Quality
Alan Said, Ernesto W. De Luca, Sahin Albayrak


Abstract
• The accumulated amount of data in the digital universe reached 1.2 zettabytes (1.2 billion terabytes) in 2010.
• A 50% increase since 2008.
• Websites increasingly accumulate a wider variety of data on their users
• Without necessarily using it

• This paper: how can this data be used to improve recommendation?


Outline
• Introduction
• Recommender Systems
• Problem statement
• Dataset
• Statistics
• Social and Pseudo-Social networks
• Approach
• Results


Introduction
• IMDb, one of the first online recommender systems, turned 20 on October 17th, 2010.

• Since their beginning, recommender systems have produced recommendations for their users through relatively simple techniques.

• Today’s online systems contain more information about their users; we should use that information.
• Which information is important?


The Problem
• What to do with the heaps of information available?
• What can be used, and how, to improve recommendations, or to learn how to improve them?

• How should we treat
• Friendships?
• Comments?
• Idols?
• Common interests?
• How important are these in terms of recommendation
quality?


Dataset
• From the movie domain – Moviepilot.de
• Germany’s largest movie recommendation community
• 1M+ users
• 13M ratings
• 50K movies

• Subset used here
• 10,000 randomly selected users with a minimum of 30 ratings
• 1.5M ratings
• 50,000 comments
• 4,000 friendships
• 170,000 idols
• 25,000 "diggs"


Social- and Pseudo-Social Networks
• Social networks
• Explicit statements of friendship between users

• Pseudo-social networks
• Users commenting on the same movie
• Users being fans of the same people
• Users "digging" the same news articles, trailers, etc.

• 38% of ratings performed by users with friends
• 45% of ratings performed by users with comments
• 77% of ratings performed by users who are fans
• 29% of ratings performed by users who "digg"
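
As a rough illustration of how a pseudo-social network could be derived from interaction data, the sketch below links every pair of users who interacted with the same item (for example, commented on the same movie). The function name, the input layout, and the toy data are assumptions made for this example; the slides do not describe the actual construction used.

```python
from collections import defaultdict
from itertools import combinations


def build_pseudo_social_network(interactions):
    """Link every pair of users who interacted with the same item.

    `interactions` is an iterable of (user, item) pairs, e.g. (user, movie)
    for comments. Returns a dict mapping each linked user to the set of
    users they share at least one item with.
    """
    users_by_item = defaultdict(set)
    for user, item in interactions:
        users_by_item[item].add(user)

    network = defaultdict(set)
    for users in users_by_item.values():
        # Items touched by a single user produce no pairs and no links.
        for u, v in combinations(sorted(users), 2):
            network[u].add(v)
            network[v].add(u)
    return dict(network)


# Toy data, invented for illustration (not taken from the Moviepilot dataset).
comments = [
    ("ann", "m1"), ("bob", "m1"),
    ("bob", "m2"), ("cat", "m2"),
    ("dan", "m3"),
]
network = build_pseudo_social_network(comments)
```

Here `ann` and `bob` are linked through `m1`, `bob` and `cat` through `m2`, and `dan` has no links because nobody else commented on `m3`. The same function would apply to fan or "digg" data by passing (user, person) or (user, article) pairs.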


The Approach
• Augmenting k-Nearest Neighbor neighborhoods by using information from (pseudo-)social networks

• Using standard Pearson similarity
• Increasing the similarity of users in the same networks in order to add them to the neighborhood
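
The idea above can be sketched as follows: compute Pearson similarity over co-rated items, add a bonus for users who share a network link with the target user, and keep the top k. This is a minimal sketch under stated assumptions. The slides do not say how much the similarity is increased, so the additive `boost` parameter, the function names, and the toy ratings are all hypothetical.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    a = [ratings_a[i] for i in common]
    b = [ratings_b[i] for i in common]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for constant ratings
    return cov / sqrt(var_a * var_b)


def neighborhood(user, ratings, k, network=None, boost=0.0):
    """Return the k most similar users, optionally boosting linked users.

    `network` maps a user to the set of users linked to them; `boost` is an
    assumed additive bonus, not a value taken from the slides.
    """
    linked = network.get(user, set()) if network else set()
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = pearson(ratings[user], ratings[other])
        if other in linked:
            sim += boost
        scores[other] = sim
    # Sort by descending similarity, breaking ties by user name.
    return sorted(scores, key=lambda u: (-scores[u], u))[:k]


# Toy ratings, invented for illustration.
ratings = {
    "u": {"m1": 5, "m2": 3, "m3": 4},
    "a": {"m1": 5, "m2": 3, "m3": 4},  # Pearson with u: 1.0
    "b": {"m1": 4, "m2": 3, "m3": 5},  # Pearson with u: 0.5
    "c": {"m1": 3, "m2": 5, "m3": 4},  # Pearson with u: -1.0
}
standard = neighborhood("u", ratings, k=1)
augmented = neighborhood("u", ratings, k=1, network={"u": {"b"}}, boost=1.0)
```

With k = 1, the standard neighborhood of `u` contains only `a`. Once `b` is linked to `u` and boosted, `b` takes that place instead, which is the effect the approach relies on.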


The Approach

[Figure: two panels, "Standard neighborhood" and "Augmented neighborhood"]


Motivation
• Similarity metrics (Pearson, Jaccard, etc.) are based on co-ratings
• Popular items often heighten similarities without adding "value", e.g. movies like "The Matrix" and "The Lord of the Rings" often have similar (high) ratings, even if users do not share taste
• Adding importance to users who share other interests filters out some of the effects of popular items.
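
A small worked example may make the popular-item effect concrete. The ratings below are invented for illustration: two users both give top marks to two popular films but disagree on three niche ones. Pearson similarity over all five co-ratings comes out positive, while the same measure over the niche films alone is negative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length rating lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented ratings: the first two items are popular, the last three are niche.
items = ["The Matrix", "The Lord of the Rings", "niche1", "niche2", "niche3"]
user_x = [5, 5, 2, 4, 1]
user_y = [5, 5, 4, 2, 3]

sim_all = pearson(user_x, user_y)            # all co-rated items
sim_niche = pearson(user_x[2:], user_y[2:])  # niche items only
```

In this toy case the two shared high ratings are enough to turn a negative correlation on the niche items into a positive overall similarity.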


Results
[Figure: bar chart of MAP and P@10 for the Friendships, Comments, Fans, and Diggs networks; vertical axis from 0 to 10]
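
For readers unfamiliar with the two measures named in the chart, the sketch below shows one common way to compute precision at k (P@10 when k = 10) and mean average precision (MAP). The slides do not state the exact definitions or normalization used in the evaluation, so this is a textbook formulation under that assumption, with hypothetical function names and toy data.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k


def average_precision(recommended, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(recs_by_user, relevant_by_user):
    """Average precision, averaged over all users with recommendations."""
    scores = [
        average_precision(recs, relevant_by_user.get(user, set()))
        for user, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example, invented for illustration.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c"}
p_at_2 = precision_at_k(recommended, relevant, k=2)
ap = average_precision(recommended, relevant)
map_score = mean_average_precision({"u1": recommended}, {"u1": relevant})
```

In the toy example, one of the top two items is relevant, and the relevant items appear at ranks 1 and 3, so average precision is the mean of 1/1 and 2/3.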


Conclusion
• Social and interaction (co-commenting, etc.) networks seem to hold more information than standard CF is able to identify
• Similarity metrics do not always tell the complete truth

• To do:
• Find items that are important for establishing similarity between users
• Investigate what other information can be used for measuring similarities


Questions?

Thank you!
