2. Social media plays an important role during disasters
• Realtime, popular, free
• Accessible
• Available
CSIRO: positive impact | Classifying Microblogs 2 | for Disasters | Sarvnaz Karimi
3. During disasters people share useful information
• lyttelton tunnel had reopened last night #eqnz
Or ask for help or information
• Kindercare in Fendalton, Christchurch - all okay? We are trying to get through with no luck. #eqnz
• Need help. Any donors of medicines for diarrhea cases in Baganga, Davao Oriental pls? #reliefPH #PabloPH pls tweet @KarloPuerto
Or even offer help
• I hv final yr medstudents in parade rd addington! They cn help. Bruce n boys #eqnz
And sometimes not so useful
• Someone just wondered aloud if the #eqnz was just another sign from God that he doesn't want The Hobbit to get made. #maybe?
4. Challenges of Working with Twitter Data
• Many tweets are useless babble
• Tweets are very short (at most 140 characters)
• People often use informal language
• Even serious messages are often abbreviated to fit the length limit:
I hv final yr medstudents in parade rd addington! They cn help. Bruce n boys #eqnz
Finding useful content can be like looking for a needle in a haystack!
5. How can we filter massive amounts of Twitter messages to identify high-value tweets related to natural or man-made disasters, or even to specific types of disaster?
6. Keyword search to find disaster-related tweets
• Many false positives, because keywords such as “fire”, or even “earthquake”, are ambiguous or have multiple senses
She’s a natural disaster: a tsunami in her eyes an earthquake in her chest a hurricane flooding her mind she’s a traveling catastrophe
In a pool of over 5,700 tweets retrieved using keyword search, over 50% were false positives.
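The false-positive problem can be reproduced with a few lines of code. The sketch below is only an illustration, not the retrieval system used in the talk: the keyword set is the one listed on the data slide, while `keyword_match` and the whole-word matching rule are assumptions made for this example.

```python
import re

# Keywords as listed on the "Twitter Data" slide.
DISASTER_KEYWORDS = {"fire", "flooding", "storm", "tornado",
                     "hurricane", "cyclone", "earthquake", "accident"}

def keyword_match(tweet: str) -> set:
    """Return the disaster keywords that occur as whole words in the tweet.

    Hypothetical helper: lower-cases the text and matches alphabetic tokens.
    """
    tokens = set(re.findall(r"[a-z]+", tweet.lower()))
    return tokens & DISASTER_KEYWORDS

# A metaphorical tweet (false positive) and a genuine report (missed).
metaphor = ("She's a natural disaster: a tsunami in her eyes an earthquake in "
            "her chest a hurricane flooding her mind she's a traveling catastrophe")
report = "lyttelton tunnel had reopened last night #eqnz"

print(sorted(keyword_match(metaphor)))  # ['earthquake', 'flooding', 'hurricane']
print(sorted(keyword_match(report)))    # []
```

The metaphorical tweet matches three keywords, while the genuine earthquake report matches none, which is why keyword matching alone is not enough.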
7. Our work: Classify Twitter Stream for Disasters
• Classify tweets as Disaster or Non-disaster
Binary classification problem
• Classify tweets into disaster types:
  – Earthquake
  – Storm (hurricane, tornado, cyclone)
  – Fire
  – Flooding
  – Other (e.g., civil disorder, traffic accident)
Multi-class classification problem
8. Related Studies
• Tweet classification:
  – Papers that used classifiers for categories such as news and junk, or opinion and private messages.
  – Papers that relied heavily on hashtags.
  – Adding context to short tweets by aggregating those that share the same hashtags, or by adding URL contents.
• Twitter during disasters:
  – Qualitative analysis of tweets published during a specific event to study microblogger behaviour.
  – One of the most cited works is by Sakaki et al. (2010), who built an earthquake classifier to alert people. Their classifier was based on tweet length, the position of the query term (earthquake or shaking) in the tweet, n-grams, and the context of the query terms.
We do not focus on specific incidents, and do not assume the hashtags are known.
We study different types of disasters, not just one.
9. Twitter Data
• Sampled a total of 6,500 tweets published over two years, from December 2010 to November 2012
• Data was gathered using keyword search (fire, flooding, storm, tornado, hurricane, cyclone, earthquake, and accident)
• No retweets
• A number of disasters were included, among others: the earthquake in Christchurch, New Zealand, 2011; Cyclone Yasi, QLD, 2011; the QLD floods, 2010-2011; bushfires in VIC, 2011; and Hurricane Sandy, US, 2012
10. Annotations
• Two-stage annotation
• Crowd-sourced the annotations using Crowdflower
• Annotators were asked:
1. Is this tweet talking about a disaster? (Yes or No);
2. What type of disaster is it talking about? (multiple choice)
• Each tweet was annotated by three annotators
• 5,747 tweets had full agreement
• 2,850 tweets were identified as disaster-related and 2,897 as non-disaster
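The agreement step can be sketched as follows. This is an illustration only: it assumes that tweets without full agreement among the three annotators were set aside, as the counts above suggest, and the tweet ids, the `judgments` dictionary, and `unanimous_label` are all hypothetical.

```python
from collections import Counter

def unanimous_label(votes):
    """Return the shared label if every annotator agrees, otherwise None."""
    return votes[0] if len(set(votes)) == 1 else None

# Hypothetical answers to question 1 from three annotators per tweet.
judgments = {
    "t1": ["yes", "yes", "yes"],
    "t2": ["no", "no", "no"],
    "t3": ["yes", "no", "yes"],  # disagreement: not kept
}

# Keep only tweets on which all annotators agree.
gold = {}
for tweet_id, votes in judgments.items():
    label = unanimous_label(votes)
    if label is not None:
        gold[tweet_id] = label

print(gold)                    # {'t1': 'yes', 't2': 'no'}
print(Counter(gold.values()))  # class balance of the agreed subset
```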
11. Classifiers
• SVM classifier
• Multinomial Naive Bayes classifier
• We report only the SVM results; Naive Bayes performed consistently worse in all experiments.
C. Chang and C. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology
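For readers unfamiliar with the second classifier, here is a minimal pure-Python sketch of multinomial Naive Bayes with add-one smoothing. It is not the implementation used in the experiments (the talk does not say which one was used for Naive Bayes); the class name, the toy training tweets, and the choice to ignore unseen tokens are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

class MultinomialNB:
    """Multinomial Naive Bayes over token counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.token_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for t in tokens:
                if t not in self.vocab:
                    continue  # assumption: ignore tokens unseen in training
                count = self.token_counts[label][t]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy, made-up training data (token lists and labels).
train_docs = [["earthquake", "damage", "christchurch"],
              ["flood", "warning", "evacuate"],
              ["love", "this", "song"],
              ["great", "coffee", "today"]]
train_labels = ["disaster", "disaster", "other", "other"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict(["earthquake", "warning"]))  # disaster
print(model.predict(["coffee", "song"]))         # other
```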
12. Classification Features
Specific Features:
• N-grams
• Hashtags
• Mentions
Generic Features:
• Mention count
• Hashtag count
• Links
• Tweet length
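The two feature groups can be sketched as a small extraction function. This is an illustrative reading of the slide, not the authors' code: the regular expressions, the tokeniser, treating "Links" as a presence flag, measuring length in characters, and the name `extract_features` are all assumptions.

```python
import re

HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
LINK = re.compile(r"https?://\S+")

def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(tweet):
    """Split a tweet into specific (content) and generic (shape) features."""
    hashtags = [h.lower() for h in HASHTAG.findall(tweet)]
    mentions = [m.lower() for m in MENTION.findall(tweet)]
    links = LINK.findall(tweet)

    # Assumption: links, hashtags and mentions are removed before tokenising.
    text = LINK.sub(" ", tweet)
    text = HASHTAG.sub(" ", text)
    text = MENTION.sub(" ", text)
    tokens = re.findall(r"[a-z0-9']+", text.lower())

    specific = {
        "unigrams": ngrams(tokens, 1),
        "bigrams": ngrams(tokens, 2),
        "hashtags": hashtags,
        "mentions": mentions,
    }
    generic = {
        "mention_count": len(mentions),
        "hashtag_count": len(hashtags),
        "has_link": bool(links),   # assumption: presence, not count
        "length": len(tweet),      # assumption: length in characters
    }
    return specific, generic

tweet = ("A massive cloud of smoke can be seen in south-west Lake Macquarie "
         "from the Wyee bushfire #nswfires #wyeefire @NewcastleHerald")
specific, generic = extract_features(tweet)
print(generic)
print(specific["hashtags"], specific["mentions"])
```

Generic features carry no event names, which is what makes them reusable for incidents whose hashtags are not yet known.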
What is the effect of using incident-specific features, compared to generic features, on classification accuracy? What are the best features to use for disaster classifiers?
13. Evaluation: Cross-validation vs. Time-Split
• K-fold cross-validation (e.g., 10-fold) is used in most similar studies (Sriram et al., 2010; Takemura and Tajima, 2012; Vosecky et al., 2012)
Problem:
• It overlooks the time dependency in microblog data and uses future evidence, such as hashtags and disaster names
Alternative:
• Time-split evaluation: sort the data by time, then use the latest chunk for testing and the rest for training.
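A time-split can be sketched in a few lines. The size of the test chunk is not stated in the talk, so `test_fraction` below is an assumed parameter, and the example tweets and dates are invented placeholders.

```python
from datetime import date

def time_split(items, test_fraction=0.1):
    """Sort (timestamp, text, label) items by time; the latest chunk is the test set.

    `test_fraction` is an assumed parameter, not a value from the talk.
    """
    ordered = sorted(items, key=lambda item: item[0])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]

# Invented placeholder tweets within the collection period.
tweets = [
    (date(2012, 10, 29), "tweet E", "disaster"),
    (date(2010, 12, 28), "tweet A", "disaster"),
    (date(2011, 2, 22), "tweet C", "disaster"),
    (date(2011, 2, 3), "tweet B", "non-disaster"),
    (date(2012, 3, 15), "tweet D", "non-disaster"),
]

train, test = time_split(tweets, test_fraction=0.2)
print([t[1] for t in train])  # ['tweet A', 'tweet B', 'tweet C', 'tweet D']
print([t[1] for t in test])   # ['tweet E']
```

Unlike shuffled k-fold cross-validation, no training tweet here is newer than any test tweet, so hashtags or disaster names coined later cannot leak into training.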
14. Disaster or Non-Disaster
16. What features worked
• When training data was small, count features worked better.
  – Disaster-related tweets had 1.2 hashtags on average, versus 0.4 for non-disaster tweets
• When our knowledge of an event is limited, hashtags and mentions are not very useful.
• In our experiments, classification accuracy with bigram features was worse than with unigram features.
17. Generic Features vs. Event-specific Features
• We need to learn the patterns that imply a type of natural or man-made disaster:
A massive cloud of smoke can be seen in south-west Lake Macquarie from the Wyee bushfire #nswfires #wyeefire @NewcastleHerald
Same location, no disaster:
Lake Macquarie is big & beautiful http://lockerz.com/s/257143427
18. Can we cross-train for disaster types?
[Figure: disaster types used for training vs. testing]
Application:
- Compromise for disaster types with little training data.
- Reduce ambiguity
19. Cross-Disaster Classification
How well can our classifiers be generalised to identify previously unseen disaster types?
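One way to set up such an experiment is to hold out one disaster type entirely. The sketch below is an assumption about the setup, not the authors' protocol: disaster tweets of the held-out type go to the test set, all other disaster tweets go to training, and non-disaster tweets are divided by simple alternation so that none appears on both sides. The function name and toy data are hypothetical.

```python
def cross_disaster_split(tweets, held_out_type):
    """Split (text, disaster_type) pairs for a cross-disaster experiment.

    disaster_type is None for non-disaster tweets. Labels are binary:
    1 = disaster, 0 = non-disaster.
    """
    train, test, negatives = [], [], []
    for text, dtype in tweets:
        if dtype is None:
            negatives.append(text)
        elif dtype == held_out_type:
            test.append((text, 1))   # unseen disaster type: test only
        else:
            train.append((text, 1))  # other disaster types: train only
    # Assumption: alternate non-disaster tweets between the two sets.
    for i, text in enumerate(negatives):
        (test if i % 2 else train).append((text, 0))
    return train, test

# Toy placeholder data.
tweets = [("quake a", "earthquake"), ("fire a", "fire"), ("storm a", "storm"),
          ("chat a", None), ("chat b", None), ("fire b", "fire")]

train, test = cross_disaster_split(tweets, held_out_type="fire")
print(train)  # [('quake a', 1), ('storm a', 1), ('chat a', 0)]
print(test)   # [('fire a', 1), ('fire b', 1), ('chat b', 0)]
```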
[Figure: cross-disaster classification results; panels: specific features, generic features]
• We used under-sampling to create training and testing data
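Random under-sampling can be sketched as below. The talk does not give the sampling details, so balancing every class down to the size of the smallest one, the fixed seed, and the name `undersample` are assumptions.

```python
import random
from collections import Counter

def undersample(examples, seed=0):
    """Randomly drop examples so every class matches the smallest class size."""
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    smallest = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # fixed seed for reproducibility
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Toy placeholder data: five disaster tweets, two non-disaster tweets.
examples = [(f"disaster {i}", 1) for i in range(5)] + \
           [(f"chat {i}", 0) for i in range(2)]

balanced = undersample(examples)
print(Counter(label for _, label in balanced))  # two examples of each class
```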
20. Can we cross-train for disaster types?
• Yes! Our results showed promise, especially for fire.
• “Language of disaster”
• Using generic features was more effective.
21. What’s Next
Events are often associated with a location
1. Better classifiers: We can use the presence of location information as a feature to strengthen our classifiers.
2. Help act on the information: Once we know a tweet is about a disaster, we can extract location information from it. This could help emergency responders allocate resources.
• We have already established that traditional Named Entity Recognisers can identify locations in tweets with high accuracy*. Now we need to pinpoint them on the map!
* J. Lingad, S. Karimi, J. Yin, Location Extraction From Disaster-Related Microblogs, Proceedings of the 22nd International Conference on World Wide Web Companion, 2013
Editor's Notes
What is happening in real time (official news comes later)
How to get help
How we all can help
Let others know you’re fine
In this talk, we focus on Twitter, but the general findings may apply to other social media.
People share useful information during an incident, e.g., that a tunnel has opened.
Many useless tweets drown out the ones with useful information and add noise.
In terms of language processing, we have to deal with short text that has no obvious context, is often informal, and contains arbitrarily shortened words.
Because we’re dealing with a medium that is not supervised, we have to accept that a large volume of the data is useless for many applications.
If one were to monitor social media for events such as fire or earthquake, plain keyword search would be pretty much useless.
We decided to go with a classification approach.
Tweet classification is not new.
Another avenue is adding context to the classifiers.