
Faceted Ranking In Collaborative Tagging Systems

J. I. Orlicki (1,2), P. Fierens (2), J. I. Alvarez-Hamelin (2,3)
(1) Core Security Technologies
(2) ITBA
(3) CONICET

WEBIST 2009, Lisbon, Portugal

The Problem (Faceted Reputation)
Which Flickr photographers are the best regarding a facet, i.e. a
tag set such as {sea, portugal}?
Nodes are users/channels, edges are favorites, and tags are
associated with the favorited content.

Single Ranking (1/3)

Basic approach: a single ranking followed by filtering. Scales well.
Everything is biased towards the richer nodes; tags don't influence
the ranking.
G is filtered out, but why is D ranked worse than A regarding
{sea, portugal}? Is D better than C?
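
The single-ranking approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy graph, the function names, and the filtering rule (keep users who received at least one favorite for every tag of the facet) are assumptions made for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def single_ranking(tagged_edges, facet):
    """Rank once on the whole graph, ignoring tags, then filter by facet."""
    scores = pagerank([(src, dst) for src, dst, _ in tagged_edges])
    received = {}
    for _, dst, tags in tagged_edges:
        received.setdefault(dst, set()).update(tags)
    # Assumed filter: the user received favorites covering every facet tag.
    kept = [u for u in scores if facet <= received.get(u, set())]
    return sorted(kept, key=lambda u: scores[u], reverse=True)


# Hypothetical toy graph: (favoriting user, favorited user, tags of the content).
toy_edges = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("A", "G", {"cars"}),
    ("B", "G", {"cars"}),
    ("C", "G", {"cars"}),
    ("D", "G", {"cars"}),
    ("G", "C", {"sea", "portugal"}),
]
ranking = single_ranking(toy_edges, {"sea", "portugal"})
```

In this toy graph G is filtered out, yet C ends up above D only because its single favorite comes from the rich node G, which is the bias the slide points at.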


Edge-intersection, 1st gold standard (1/3)

Rank using only the edges whose tags include the conjunction of
the facet's tags.
Adequate tag bias, slightly restrictive.
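
The edge filtering step could look like the sketch below. The edge representation and the function name are assumptions for illustration; the ranking that would then run on the resulting subgraph is left out.

```python
def edge_intersection_subgraph(tagged_edges, facet):
    """Keep only the edges whose tag set contains every tag of the facet."""
    return [(src, dst) for src, dst, tags in tagged_edges if facet <= tags]


# Hypothetical toy graph: (favoriting user, favorited user, tags of the content).
toy_edges = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "G", {"cars"}),
    ("G", "C", {"sea", "portugal", "beach"}),
]
subgraph = edge_intersection_subgraph(toy_edges, {"sea", "portugal"})
```

The edge B -> D is dropped because it carries only one of the two tags, which is why this filter is on the restrictive side.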

Node-intersection, 2nd gold standard (1/2)
Rank using the edges that include the disjunction of the tags.
After ranking, keep only the nodes involved in edges of every tag
(the conjunction).
Adequate tag bias, slightly permissive; one tag may prevail over
the other.
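
A sketch of the two filters involved, under stated assumptions: a node counts as "involved" in a tag when it is the target of at least one edge carrying that tag, and the ranking step between the two filters is omitted. Names and data are hypothetical.

```python
def node_intersection_filters(tagged_edges, facet):
    """Return the union subgraph to rank and the nodes to keep afterwards."""
    # Edges carrying at least one tag of the facet (disjunction).
    union_edges = [(src, dst) for src, dst, tags in tagged_edges if facet & tags]
    # Assumption: a node is involved in a tag if it receives an edge with it.
    nodes_per_tag = {tag: set() for tag in facet}
    for _, dst, tags in tagged_edges:
        for tag in facet & tags:
            nodes_per_tag[tag].add(dst)
    # Nodes present for every tag of the facet (conjunction).
    kept_nodes = set.intersection(*nodes_per_tag.values())
    return union_edges, kept_nodes


# Hypothetical toy graph: (favoriting user, favorited user, tags of the content).
toy_edges = [
    ("A", "D", {"sea"}),
    ("B", "D", {"portugal"}),
    ("C", "E", {"sea"}),
    ("A", "G", {"cars"}),
]
union_edges, kept_nodes = node_intersection_filters(toy_edges, {"sea", "portugal"})
```

Here D is kept although no single edge carries both tags, so an edge-level conjunction would have returned nothing for this facet.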

The Scalability Problem
The previous two algorithms don't scale for online queries.
Another possibility is to compute singleton facets offline and
merge the results online later.
Offline time and space complexity grow linearly in
#edges × #tags per edge, which scales nicely.

[Figure: # edges (vertical axis) versus # tags (horizontal axis), both on logarithmic scales, for YouTube and Flickr.]

Singleton facets, computed offline (1/2)

Each singleton-facet subgraph is ranked on its own; after that only
the best K users are stored, where K is small.
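
The offline step can be sketched as one ranking per tag followed by truncation to the best K users. This is an illustration under assumptions: the PageRank below is a plain power iteration, and the function names and toy graph are hypothetical.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def singleton_facets(tagged_edges, k):
    """For every tag, rank its subgraph and store only the best k users."""
    edges_by_tag = defaultdict(list)
    for src, dst, tags in tagged_edges:
        for tag in tags:
            # Each edge is visited once per tag: #edges x #tags per edge.
            edges_by_tag[tag].append((src, dst))
    top = {}
    for tag, edges in edges_by_tag.items():
        scores = pagerank(edges)
        best = sorted(scores, key=scores.get, reverse=True)[:k]
        top[tag] = {user: scores[user] for user in best}
    return top


# Hypothetical toy graph: (favoriting user, favorited user, tags of the content).
toy_edges = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("A", "G", {"cars"}),
    ("B", "G", {"cars"}),
    ("C", "G", {"cars"}),
    ("D", "G", {"cars"}),
    ("G", "C", {"sea", "portugal"}),
]
top_users = singleton_facets(toy_edges, k=2)
```

Only the per-tag top-K tables need to be kept for the online merge, which is what keeps the stored data small.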

Probability-product

Inspired by the independence rule for probabilities: multiply the
PageRank probabilities of the single tags.

     sea      portugal     product   rank
A    0.09  ×  0.02      =  0.0018    #6
B    0.14  ×  0.04      =  0.0056    #4
C    0.14  ×  0.40      =  0.0560    #2
D    0.38  ×  0.39      =  0.1482    #1
E    0.14  ×  0.07      =  0.0098    #3
F    0.09  ×  0.05      =  0.0045    #5

Possible bias towards the heaviest tag, eclipsing the others.
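
The online merge for this method can be reproduced with the numbers from the table above. A minimal sketch: the function name is hypothetical, and scoring only the users present in every per-tag list is an assumption.

```python
import math

# Per-tag PageRank probabilities, copied from the table above.
sea = {"A": 0.09, "B": 0.14, "C": 0.14, "D": 0.38, "E": 0.14, "F": 0.09}
portugal = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}


def probability_product(*tag_scores):
    """Multiply per-tag probabilities and sort users by the product."""
    # Assumption: only users present in every per-tag list are scored.
    users = set.intersection(*(set(scores) for scores in tag_scores))
    product = {u: math.prod(scores[u] for scores in tag_scores) for u in users}
    return sorted(product.items(), key=lambda item: item[1], reverse=True)


merged = probability_product(sea, portugal)
order = [user for user, _ in merged]
```

The resulting order is D, C, E, B, F, A, matching the rank column of the table.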

Rank-sum
The lowest accumulated sum of ordinal positions gets the best rank.

     sea     portugal     sum   rank
A    #3   +  #6        =  9     #5
B    #2   +  #5        =  7     #4
C    #2   +  #2        =  4     #2
D    #1   +  #1        =  2     #1
E    #2   +  #3        =  5     #3
F    #3   +  #4        =  7     #4

Avoids this kind of topic drift towards one of the tags.
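
The rank-sum merge can be checked against the table above. A minimal sketch: the per-tag positions are taken directly from the slide, the function name is hypothetical, and giving tied sums the same final rank is inferred from the table (B and F both get #4).

```python
# Per-tag ordinal positions, copied from the table above.
sea = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}


def rank_sum(*tag_positions):
    """Sum per-tag positions; the lowest sum gets the best final rank."""
    users = set.intersection(*(set(pos) for pos in tag_positions))
    sums = {u: sum(pos[u] for pos in tag_positions) for u in users}
    # Tied sums share a final rank, as B and F do in the table.
    distinct = sorted(set(sums.values()))
    final = {u: distinct.index(total) + 1 for u, total in sums.items()}
    return sums, final


sums, final = rank_sum(sea, portugal)
```

Because only positions are added, a tag with very large probabilities cannot dominate the merged result the way it can in a product of probabilities.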

Winners-intersection

Top W (small) nodes per singleton facet are used to build a
new small graph.
W = 500 in experiments (W = 3 in example).

     sea   portugal   intersection
A    #3
B    #2
C    #2    #2         C
D    #1    #1         D
E    #2    #3         E
F    #3
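
The selection of winners in the example (W = 3) can be sketched as below. The positions are the ones shown on the slides; the function name is hypothetical, and how the new small graph is then built and ranked is not spelled out here, so that part is left out.

```python
# Per-tag ordinal positions as shown on the slides.
sea = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}


def winners(positions, w):
    """Users whose position for a singleton facet is within the top w."""
    return {user for user, position in positions.items() if position <= w}


W = 3  # value used in the example; the experiments use W = 500
common = winners(sea, W) & winners(portugal, W)
```

With these positions the intersection is {C, D, E}, the three users listed in the last column of the table.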

Experiments: comparison with Edge-intersection (OSim)
Darker means better results.

Conclusions

Approximate and scalable methods exist for faceted ranking in
collaborative tagging systems.
Functional web prototype: Egg-O-Matic

http://egg-o-matic.itba.edu.ar

Loose Ends
Using weighted graphs.
Scientific citations dataset (real egos!).
Industrial-sized dataset (10^7 instead of 10^5 edges).

Prototype (2/2, last slide, thanks!)

