Multimedia content is uploaded, tagged and recommended by users of collaborative systems such as YouTube
and Flickr. These systems can be represented as tagged-graphs, where nodes correspond to users and tagged-
links to recommendations. In this paper we analyze the online computation of user-rankings associated to
a set of tags, called a facet. A simple approach to faceted ranking is to apply an algorithm that calculates
a measure of node centrality, say, PageRank, to a subgraph associated with the given facet. This solution,
however, is not feasible for online computation. We propose an alternative solution: (i) first, a ranking for
each tag is computed offline on the basis of tag-related subgraphs; (ii) then, a faceted order is generated online
by merging rankings corresponding to all the tags in the facet. Based on empirical observations, we show that
step (i) is scalable. We also present efficient algorithms for step (ii), which are evaluated by comparing their
results to those produced by the direct calculation of node centrality based on the facet-dependent graph.
1 of 21
Download to read offline
More Related Content
WEBIST 2009
1. Faceted Ranking In Collaborative Tagging Systems
J. I. Orlicki12 P. Fierens2 J. I. Alvarez-Hamelin23
1 Core Security Technologies
2 ITBA
3 CONICET
WEBIST 2009, Lisbon, Portugal
2. The Problem (Faceted Reputation)
Which ickr photographers are the best regarding a facet, i.e.
tag set, { sea, portugal }?
Nodes are users/channels, edges are favorites and tags are
associated to the favorited content.
3. Single Ranking (1/3)
Basic approach, single rank and ltering. Scales well.
Everything is biased to the richer nodes, tags don't inuence
the ranking.
G goes out, but why is D worstly ranked than A regarding
{sea, portugal}? Is D better than C?
4. Single Ranking (2/3)
Basic approach, single rank and ltering. Scales well.
Everything is biased to the richer nodes, tags don't inuence
the ranking.
G goes out, but why is D worstly ranked than A regarding
{sea, portugal}? Is D better than C?
5. Single Ranking (3/3)
Basic approach, single rank and ltering. Scales well.
Everything is biased to the richer nodes, tags don't inuence
the ranking.
G goes out, but why is D worstly ranked than A regarding
{sea, portugal}? Is D better than C?
6. Edge-intersection, 1st gold standard (1/3)
Filtering edges including the conjunction of tags.
Adequate tag bias, slightly restrictive.
7. Edge-intersection, 1st gold standard (2/3)
Filtering edges including the conjunction of tags.
Adequate tag bias, slightly restrictive.
8. Edge-intersection, 1st gold standard (3/3)
Filtering edges including the conjunction of tags.
Adequate tag bias, slightly restrictive.
9. Node-intersection, 2nd gold standard (1/2)
Filtering edges including the disjunction of tags to rank.
Plus ltering conjuntion of nodes involved in every tag edge
after ranking.
Adequate tag bias, slightly irrestrictive, possibly one tag
prevails over the other.
c
10. Node-intersection, 2nd gold standard (2/2)
Filtering edges including the disjunction of tags to rank.
Plus ltering conjuntion of nodes involved in every tag edge
after ranking.
Adequate tag bias, slightly irrestrictive, possibly one tag
prevails over the other.
11. The Scalability Problem
The previous two algorithms don't scale for online queries.
Another possibility is computing singleton facets oine, and
later merge the results online.
Oine time and spatial complexity will grow linearly on
#edges #tags per edge. Scaling nicely.
100000
YouTube
Flickr
10000
1000
# edges
100
10
1
0.1
1 10 100 1000
# tags
12. Singleton facets, computed oine (1/2)
Singleton facet subgraphs used in ranking, after that only best
K users stored, where K is small.
13. Singleton facets, computed oine (2/2)
Singleton facet subgraphs used in ranking, after that only best
K users stored, where K is small.
14. Probability-product
Inspired by the probability independence rule, multiply
PageRank probability of single tags.
sea portugal rank!
A 0.09 0.02 0.0018 #6
B 0.14 0.04 0.0056 #4
C 0.14 0.40 = 0.0560 #2
D 0.38 0.39 0.1482 #1
E 0.14 0.07 0.0098 #3
F 0.09 0.05 0.0045 #5
Possible bias towards the heaviest tag, eclipsing the others.
15. Rank-sum
Lowest accumulated ordinal/position sum gets the best ranks.
sea portugal rank!
A #3 #6 9 #5
B #2 #5 7 #4
C #2 + #2 = 4 #2
D #1 #1 2 #1
E #2 #3 5 #3
F #3 #4 7 #4
Avoids this kind of topic drift towards one of the tags.
16. Winners-intersection
Top W (small) nodes per singleton facet are used to build a
new small graph.
W = 500 in experiments (W = 3 in example).
sea portugal
A #3
B #2
C #2 #2 = C
D #1 #1 D
E #2 #3 E
F #3