
CERTH @ MediaEval 2012 Social Event Detection Task
Manos Schinas, Georgios Petkos, Symeon Papadopoulos,
Yiannis Kompatsiaris

Pisa, 4-5 October 2012

The problem
• Identify social events in tagged photo collections:
– Challenge 1: Technical Events @ Germany
– Challenge 2: Soccer matches @ Madrid, Hamburg
– Challenge 3: Indignados protest @ Madrid

• Alternative formulation:
– Represent a collection of photos as a graph, where items
with a high probability of belonging to the same event are
connected.
– Each event forms a dense sub-graph of that graph.
– This points to community detection as a method to address
the problem.


Approach

[Figure: overview of the approach in three steps (Step 1, Step 2, Step 3)]


Graph Creation (1)

• Graph creation is based on a “Same
Class” model
– A classifier which predicts whether two images
belong to the same event or not
– Support Vector Machine classifier trained with the
data of the 2011 challenge
– Input features: dissimilarities across user, title, tags,
description, time taken, GIST, SURF/VLAD
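A minimal sketch of how the per-pair dissimilarity features might be assembled before they are passed to the classifier. The `Photo` record, its field names and the choice of Jaccard distance for text fields are assumptions for illustration, not the authors' implementation. The description, GIST and SURF/VLAD dissimilarities are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    # Hypothetical minimal record; field names are illustrative only.
    user: str
    title: str
    tags: set = field(default_factory=set)
    taken: float = 0.0  # capture time, in seconds


def jaccard_distance(a: set, b: set) -> float:
    """1 - |intersection| / |union|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pair_features(p: Photo, q: Photo) -> list:
    """Dissimilarity vector for one image pair (input to a same-class classifier)."""
    return [
        0.0 if p.user == q.user else 1.0,                 # user
        jaccard_distance(set(p.title.lower().split()),
                         set(q.title.lower().split())),   # title words
        jaccard_distance(p.tags, q.tags),                 # tags
        abs(p.taken - q.taken) / 3600.0,                  # time gap in hours
    ]
```

Each labelled pair (same event / different event) would then contribute one such vector to the training set of the SVM.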


Graph Creation (2)

• Use the same class model to connect the items
of the collection that belong to the same event
• Retrieve candidate neighbours (~350) to
reduce computational cost
– 50 with respect to textual features
– 150 with respect to time
– 50 with respect to location (when it exists)
– 100 with respect to visual features
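The candidate retrieval above can be sketched as a union of per-modality nearest-neighbour lists. This is an assumption-laden illustration: the function name, the `(k, distance)` pairing and the convention that a distance function returns `None` when a modality is missing (for example no geotag) are all hypothetical, and a real system would use an index rather than a linear scan.

```python
import heapq


def candidate_neighbours(query_id, photos, modalities):
    """Union of the k nearest items per modality.

    photos:     dict mapping photo id -> record
    modalities: list of (k, distance_fn); distance_fn(a, b) returns a float,
                or None when the modality is unavailable for that pair.
    """
    query = photos[query_id]
    candidates = set()
    for k, distance in modalities:
        scored = []
        for pid, photo in photos.items():
            if pid == query_id:
                continue
            d = distance(query, photo)
            if d is not None:  # skip pairs where the modality does not exist
                scored.append((d, pid))
        candidates.update(pid for _, pid in heapq.nsmallest(k, scored))
    return candidates
```

With the budgets from the slide, `modalities` would hold four entries with k = 50, 150, 50 and 100, and the same class model is then evaluated only on the returned candidates.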


Event Partitioning and Expansion (1)
• Event partitioning
– The nodes of the graph are clustered into
candidate events by using the Structural Clustering
Algorithm for Networks (SCAN).
– The items clustered together by SCAN are used to
obtain an aggregate representation of each
candidate social event.
– Split the candidate events that exceed a
predefined time range into shorter events.
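For reference, a simplified sketch of SCAN-style clustering on an undirected graph. It uses the usual structural similarity (shared closed neighbourhood over the geometric mean of neighbourhood sizes) and grows clusters from core nodes. The parameter values `eps` and `mu` are placeholders, the adjacency format is an assumption, and the sketch does not distinguish hubs from outliers as full SCAN does.

```python
import math
from collections import deque


def scan(adj, eps=0.7, mu=3):
    """adj: dict node -> set of neighbours (symmetric; every node is a key).

    Returns (clusters, unclustered): a list of node sets and the set of
    nodes not assigned to any cluster.
    """
    # Closed neighbourhood: the node itself plus its neighbours.
    gamma = {v: set(nbrs) | {v} for v, nbrs in adj.items()}

    def sigma(v, w):
        return len(gamma[v] & gamma[w]) / math.sqrt(len(gamma[v]) * len(gamma[w]))

    # Neighbours that are structurally similar enough.
    eps_nbrs = {v: {w for w in gamma[v] if sigma(v, w) >= eps} for v in gamma}

    label = {}
    clusters = []
    for v in gamma:
        if v in label or len(eps_nbrs[v]) < mu:
            continue  # already clustered, or not a core node
        cid = len(clusters)
        members = set()
        queue = deque([v])
        while queue:
            u = queue.popleft()
            if len(eps_nbrs[u]) < mu:
                continue  # border node: belongs to the cluster, does not expand it
            for w in eps_nbrs[u]:
                if w not in label:
                    label[w] = cid
                    members.add(w)
                    queue.append(w)
        clusters.append(members)
    return clusters, set(gamma) - set(label)
```

Each returned cluster corresponds to one candidate event. The unclustered nodes are the items that the expansion step later treats as single-item events.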


Event Partitioning and Expansion (2)
• Expansion of the candidate events set
– Each image that does not belong to any event
forms a single-item event.
– Merge these single-item events into larger clusters
by checking location and time.
– Add the new events to the set of candidate
events.
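One possible reading of the merging step, sketched with a union-find over all pairs of single-item events. The thresholds `max_km` and `max_hours` are hypothetical, the slides do not state the actual criteria, and the sketch assumes that every item carries a geotag.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def merge_singletons(items, max_km=1.0, max_hours=12.0):
    """items: list of (photo_id, (lat, lon), time_in_hours).

    Links two items when they are close in both space and time, then
    returns the connected groups as a list of id sets.
    """
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            _, loc_i, t_i = items[i]
            _, loc_j, t_j = items[j]
            if abs(t_i - t_j) <= max_hours and haversine_km(loc_i, loc_j) <= max_km:
                parent[find(i)] = find(j)

    groups = {}
    for i, (pid, _, _) in enumerate(items):
        groups.setdefault(find(i), set()).add(pid)
    return list(groups.values())
```

Groups that still contain a single item after merging simply remain single-item events.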


Event Filtering (1)
• Filter in two ways:
– By using geo-location (if it exists)
– By using tag-based models
• Geo-location Filtering
– Discard events that are not contained in the
bounding box of the specific challenge
– 30% of candidate events are discarded
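A small sketch of the bounding-box check. It assumes an event is a list of per-photo `(lat, lon)` tuples, with `None` for photos without a geotag, and that events with no geotagged photo pass through to the tag-based filter. Both conventions are assumptions, as is the box layout.

```python
def in_box(point, box):
    """box is (min_lat, min_lon, max_lat, max_lon) in degrees."""
    lat, lon = point
    min_lat, min_lon, max_lat, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def geo_filter(events, box):
    """Keep events whose geotagged photos all fall inside the box.

    Events without any geotagged photo are kept, since geo-location
    filtering only applies when a location exists.
    """
    kept = []
    for event in events:
        points = [p for p in event if p is not None]
        if all(in_box(p, box) for p in points):
            kept.append(event)
    return kept
```

For example, a rough illustrative box around Madrid could be `(40.3, -3.9, 40.6, -3.5)`. That is not necessarily the box used in the runs.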


Event Filtering (2)
• Tag-based filtering
– Build term models by finding the 500 dominant
terms for the specific locations and event types.
– We collect images from Flickr that are relevant to
the location or the type of event of interest.
– Images for Madrid, Hamburg and Germany
– Images for indignados, soccer and technical
events


Event Filtering (3)
• Tag-based filtering
– Probability of appearance

– We compute the ratio of the probability of
appearance in the focus set over the probability of
appearance in the reference set.
– Keep the 500 terms with the highest ratio
– Jaccard similarity between a tag model and an event's
terms
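The tag-model construction can be sketched as follows. Two points are assumptions because the slide's formula is not preserved: the probability of appearance is taken to be the fraction of images whose tags contain the term, and a small `smoothing` constant is added so that terms absent from the reference set do not cause a division by zero.

```python
from collections import Counter


def term_probabilities(tag_sets):
    """Fraction of images in which each term appears (assumed definition)."""
    counts = Counter(t for tags in tag_sets for t in set(tags))
    n = len(tag_sets)
    return {t: c / n for t, c in counts.items()}


def build_tag_model(focus, reference, k=500, smoothing=1e-6):
    """Keep the k terms with the highest focus/reference probability ratio."""
    p_focus = term_probabilities(focus)
    p_ref = term_probabilities(reference)
    ratio = {t: p / (p_ref.get(t, 0.0) + smoothing) for t, p in p_focus.items()}
    ranked = sorted(ratio, key=lambda t: (-ratio[t], t))  # ties broken alphabetically
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity between two term sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

An event would then be scored with `jaccard(model, event_terms)` against each location or event-type model and discarded when the similarity is too low. The cut-off value is not given in the slides.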


Evaluation

Notation
Run 1: Same class model trained with 10000 pairs of images.
Run 2: Same class model trained with 30000 pairs of images.
Run 3: Same class model of run 1 with a post-processing step.


Discussion (1)
• Moving from a smaller (run 1) to a larger (run
2) training dataset does not seem to improve
most of the performance measures, which suggests overfitting
• The method fails in challenge 1 because these
events are different from those of the training
dataset
• A good tag model has to be used for
classification in the post-filtering step


Discussion (2)
• Future actions:
– train the same class model with a richer set of
data
– explore different graph construction strategies
and community detection algorithms.
• Ways to improve:
– better topic classification methods
– more sophisticated methods for location
estimation


