This document describes CERTH's approach to the MediaEval 2012 Social Event Detection task. The approach builds a graph of images from visual, textual and temporal similarities, clusters the graph to detect candidate events, and filters those events by geolocation and tags. CERTH evaluated three runs of the system and found that a larger training dataset did not improve performance, and that the method failed on challenge 1 because its events differ from those in the training data. Future work could involve training on more data and exploring different graph construction and clustering methods.
CERTH @ MediaEval 2012 Social Event Detection Task
2. The problem
Identify social events in tagged photo collections:
Challenge 1: Technical Events @ Germany
Challenge 2: Soccer matches @ Madrid, Hamburg
Challenge 3: Indignados protest @ Madrid
Alternative formulation:
Represent a collection of photos as a graph, where items
with high probability to belong to the same event are
connected.
Each event forms a dense sub-graph in it.
This points to community detection as a method to address the
problem.
4. Graph Creation (1)
Graph creation is based on a Same
Class model
A classifier which predicts whether two images
belong to the same event or not
Support Vector Machine classifier trained with the
data of the 2011 challenge
Input features: dissimilarities across user, title, tags,
description, time taken, GIST, SURF/VLAD
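The input to the same class model can be illustrated with a minimal sketch that turns a pair of photos into a vector of dissimilarities. The slides do not say which dissimilarity measure is used per feature, so the Jaccard distance for title and tags, the time difference in hours, and the dictionary schema (`user`, `title`, `tags`, `taken`) are all assumptions made for illustration. The description, GIST and SURF/VLAD dimensions are left out. A vector like this would then be passed to the SVM classifier.

```python
from datetime import datetime

def jaccard_distance(a, b):
    """1 - |intersection| / |union|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def pair_features(p, q):
    """Dissimilarity vector for a pair of photos (hypothetical dict schema)."""
    hours_apart = abs((p["taken"] - q["taken"]).total_seconds()) / 3600.0
    return [
        0.0 if p["user"] == q["user"] else 1.0,               # user
        jaccard_distance(p["title"].lower().split(),
                         q["title"].lower().split()),         # title (assumed measure)
        jaccard_distance(p["tags"], q["tags"]),               # tags (assumed measure)
        hours_apart,                                          # time taken
    ]

photo_a = {"user": "u1", "title": "Real Madrid match",
           "tags": ["soccer", "madrid"], "taken": datetime(2012, 5, 1, 20, 0)}
photo_b = {"user": "u2", "title": "Madrid match night",
           "tags": ["soccer", "bernabeu"], "taken": datetime(2012, 5, 1, 22, 0)}

features = pair_features(photo_a, photo_b)
print(features)
```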
5. Graph Creation (2)
Use the same class model to connect the items
of the collection that belong to the same event
Retrieve candidate neighbours (~350) to
reduce computational cost
50 with respect to textual features
150 with respect to time
50 with respect to location (when it exists)
100 with respect to visual features
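The candidate retrieval step can be sketched as a union of per-modality nearest-neighbour lists. The budgets below come from the slide (50 + 150 + 50 + 100 = 350). The item schema, the distance functions and the brute-force search are assumptions made for illustration, since the slides do not describe how neighbours are actually retrieved.

```python
import heapq

# Per-modality neighbour budgets as listed on the slide.
BUDGETS = {"text": 50, "time": 150, "location": 50, "visual": 100}

def candidate_neighbours(query_id, items, distances, budgets=BUDGETS):
    """Union of the k nearest items per modality (brute force, for illustration)."""
    query = items[query_id]
    candidates = set()
    for modality, k in budgets.items():
        if query.get(modality) is None:
            continue  # e.g. a photo without a geotag has no location neighbours
        dist = distances[modality]
        pool = [i for i, item in items.items()
                if i != query_id and item.get(modality) is not None]
        nearest = heapq.nsmallest(
            k, pool, key=lambda i: dist(query[modality], items[i][modality]))
        candidates.update(nearest)
    return candidates

# Toy data: only timestamps are available, locations are missing.
items = {
    "a": {"time": 0, "location": None},
    "b": {"time": 5, "location": None},
    "c": {"time": 100, "location": None},
    "d": {"time": 7, "location": None},
}
distances = {"time": lambda x, y: abs(x - y)}
near = candidate_neighbours("a", items, distances,
                            budgets={"time": 2, "location": 1})
print(sorted(near))
```

Only the retrieved candidates are then scored with the same class model, which avoids classifying every pair of photos in the collection.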
6. Event Partitioning and Expansion (1)
Event partitioning
The nodes of the graph are clustered into
candidate events by using the Structural Clustering
Algorithm for Networks (SCAN).
The items clustered together by SCAN are used to
obtain an aggregate representation of each
candidate social event.
Split the candidate events that exceed a
predefined time range into shorter events.
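Two pieces of this step can be sketched briefly. The first function computes the structural similarity that SCAN uses to decide whether two adjacent nodes belong together: the overlap of their closed neighbourhoods divided by the geometric mean of the neighbourhood sizes. The full SCAN procedure, with its similarity and core-size thresholds, is not shown. The second function is only one possible way to split an over-long event, assuming a greedy cut on sorted timestamps. The slides do not say how the split is performed, and `max_range` is a hypothetical parameter.

```python
from math import sqrt

def structural_similarity(adj, u, v):
    """SCAN similarity between u and v using closed neighbourhoods."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))

def split_by_time(timestamps, max_range):
    """Greedily cut sorted timestamps into chunks spanning at most max_range."""
    chunks = []
    for t in sorted(timestamps):
        if chunks and t - chunks[-1][0] <= max_range:
            chunks[-1].append(t)
        else:
            chunks.append([t])
    return chunks

# Toy graph: nodes 1, 2, 3 form a triangle and node 4 hangs off node 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
sim_12 = structural_similarity(adj, 1, 2)
sim_34 = structural_similarity(adj, 3, 4)

chunks = split_by_time([0, 2, 5, 30, 31, 70], max_range=10)
print(sim_12, sim_34, chunks)
```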
7. Event Partitioning and Expansion (2)
Expansion of the candidate events set
Each image that does not belong to any event
forms a single-item event.
Merge these single-item events into larger clusters
by checking location and time.
Add the new events to the set of candidate
events
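The merging of single-item events could look like the sketch below. It assumes a greedy single pass in which a photo joins the first cluster whose seed photo is within a distance and time threshold. The thresholds `max_km` and `max_hours`, the photo schema and the greedy strategy are hypothetical, as the slides only state that location and time are checked.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

def merge_singletons(photos, max_km, max_hours):
    """Greedy pass: join the first cluster whose seed is close in space and time."""
    clusters = []
    for p in sorted(photos, key=lambda p: p["hour"]):
        for cluster in clusters:
            seed = cluster[0]
            close_in_time = p["hour"] - seed["hour"] <= max_hours
            close_in_space = haversine_km(
                p["lat"], p["lon"], seed["lat"], seed["lon"]) <= max_km
            if close_in_time and close_in_space:
                cluster.append(p)
                break
        else:
            clusters.append([p])  # no cluster matched: start a new one
    return clusters

photos = [
    {"id": "m1", "lat": 40.4168, "lon": -3.7038, "hour": 0.0},    # Madrid
    {"id": "h1", "lat": 53.5511, "lon": 9.9937, "hour": 0.5},     # Hamburg
    {"id": "m2", "lat": 40.4170, "lon": -3.7040, "hour": 1.0},    # Madrid
    {"id": "m3", "lat": 40.4168, "lon": -3.7038, "hour": 100.0},  # Madrid, days later
]
clusters = merge_singletons(photos, max_km=1.0, max_hours=12.0)
print([[p["id"] for p in c] for c in clusters])
```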
8. Event Filtering (1)
Filter in two ways:
By using geo-location (if it exists)
By using tag-based models
Geo-location Filtering
Discard events that are not contained in the
bounding box of the specific challenge
30% of candidate events are discarded
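A bounding-box filter of this kind can be sketched as follows. Two choices here are assumptions rather than details from the slides: an event with no geotagged photos is kept, so that the tag-based filter can still judge it, and an event counts as contained only if every geotagged photo lies inside the box. The box coordinates are rough illustrative values, not the ones used in the challenge.

```python
def in_box(lat, lon, box):
    """True if the point lies inside (south, west, north, east)."""
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east

def keep_event(event_coords, box):
    """Keep events without geotags; otherwise require all geotags inside the box."""
    tagged = [c for c in event_coords if c is not None]
    return all(in_box(lat, lon, box) for lat, lon in tagged)

# Hypothetical box roughly around Madrid: (south, west, north, east).
MADRID_BOX = (40.3, -3.9, 40.6, -3.5)

madrid_event = [(40.4168, -3.7038), None]   # one geotagged photo, one without
hamburg_event = [(53.5511, 9.9937)]
untagged_event = [None, None]

kept = [keep_event(e, MADRID_BOX)
        for e in (madrid_event, hamburg_event, untagged_event)]
print(kept)
```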
9. Event Filtering (2)
Tag-based filtering
Build term models by finding the 500 dominant
terms for the specific locations and event types.
We collect images from Flickr that are relevant to
the location or the type of event of interest.
Images for Madrid, Hamburg and Germany
Images for indignados, soccer and technical
events
10. Event Filtering (3)
Tag-based filtering
Probability of appearance
We compute the ratio of the probability of
appearance in the focus set over the probability of
appearance in the reference set.
Keep the 500 terms with the highest ratio
Jaccard similarity between a tag model and an event's
terms
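The tag model construction and the matching step can be sketched as below. The probability of appearance is taken here as the fraction of photos in a set that carry a term, and a small `floor` stands in for terms that never occur in the reference set. Both are assumptions, since the slides do not define the estimate or any smoothing. The toy data and the `size=2` cut-off are for illustration only, where the slides keep 500 terms.

```python
from collections import Counter

def appearance_prob(tag_lists):
    """Fraction of photos in which each term appears."""
    counts = Counter(term for tags in tag_lists for term in set(tags))
    n = len(tag_lists)
    return {term: c / n for term, c in counts.items()}

def tag_model(focus, reference, size=500, floor=1e-6):
    """Terms with the highest focus/reference probability ratio."""
    p_focus = appearance_prob(focus)
    p_ref = appearance_prob(reference)
    ratio = {t: p / max(p_ref.get(t, 0.0), floor) for t, p in p_focus.items()}
    ranked = sorted(ratio, key=lambda t: (-ratio[t], t))  # ties broken by term
    return set(ranked[:size])

def jaccard(a, b):
    """|intersection| / |union|, defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

focus = [["soccer", "madrid", "photo"], ["soccer", "bernabeu"], ["soccer", "photo"]]
reference = [["photo", "sunset"], ["photo", "madrid"],
             ["cat", "photo"], ["soccer", "photo"]]

model = tag_model(focus, reference, size=2)
score = jaccard(model, ["soccer", "bernabeu", "night"])
print(sorted(model), score)
```

A candidate event would then be kept or discarded by comparing its score against a threshold, which the slides do not specify.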
11. Evaluation
Notation
Run 1: Same class model trained with 10000 pairs of images.
Run 2: Same class model trained with 30000 pairs of images.
Run 3: Same class model of run 1 with a post-processing step
12. Discussion (1)
Moving from a smaller (run 1) to a larger (run
2) training dataset does not seem to improve
performance for the most part, possibly due to overfitting
The method fails in challenge 1 because its
events are different from those of the training
dataset
A good tag model has to be used for
classification in the post-filtering step
13. Discussion (2)
Future actions:
train the same class model with a richer set of
data
explore different graph construction strategies
and community detection algorithms.
Ways to improve:
better topic classification methods
more sophisticated methods for location
estimation