REVEAL Project: Co-funded by the EU FP7 Programme Nr.: 610928 | www.revealproject.eu | © 2015 REVEAL consortium
Stuart E. Middleton
University of Southampton IT Innovation Centre
sem@it-innovation.soton.ac.uk @stuart_e_middle @IT_Innov @RevealEU
www.it-innovation.soton.ac.uk
Extracting Attributed Verification and Debunking Reports from Social Media:
MediaEval-2015 Trust and Credibility Analysis of Image and Video
Overview
• Problem Statement
• Approach
• Results
• Discussion
• Suggestions for Verification Challenge 2016
UoS-ITI Team
Verification of Images and Videos for Breaking News
• Breaking News Timescales
  – Minutes, not hours: it's old news after a couple of hours
  – Journalists need to verify copy and get it published before their rivals do
• Journalistic Manual Verification Procedures for User Generated Content (UGC)
  – Check content provenance: original post? location? timestamp? similar posts? website? ...
  – Check the author / source: attributed source or original author? known (un)reliable? popular? reputation? post history? ...
  – Check content credibility: right image metadata? right location? right people? right weather? ...
  – Phone the author: triangulate facts, quiz the author to check they are genuine, get authorization to publish
• Automate the Simpler Verification Steps
  – Empowering journalists increases the volume of contextual content that can be considered
  – Frees humans to focus on the more complex & subjective cross-checking tasks:
    – Contact content authors via phone and ask them difficult questions
    – Does human behaviour 'look right' in a video?
    – Cross-reference buildings / landmarks in image backgrounds against Google StreetView / image databases
  – ... see the Verification Handbook » http://verificationhandbook.com/
Problem Statement
Attribute evidence to trusted or untrusted sources
• Hypothesis
  – The 'wisdom of the crowd' is not really wisdom at all when it comes to verifying suspicious content
  – It is better to rank evidence according to the most trusted & credible sources, as journalists do
• Semi-automated approach
  – Manually create a list of trusted sources
  – Tweets » NLP » extract fake & genuine claims & their attribution to sources » evidence
  – Evidence » cross-check all content for an image / video » fake/real decision based on the best evidence
• Trustworthiness hierarchy for tweeted claims about images & videos, listed from most to least trusted (a decision-rule sketch follows below)
  – Claim = a statement that an image / video is fake or that it is genuine
  – Claim authored by a trusted source
  – Claim authored by an untrusted source
  – Claim attributed to a trusted source
  – Claim attributed to an untrusted source
  – Unattributed claim
Approach
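To make the hierarchy concrete, here is a minimal sketch of the best-evidence decision rule, assuming each extracted claim carries flags saying whether its author or attributed source is on the trusted list. The `Evidence` record layout, the numeric rank values, and the first-seen tie-breaking are illustrative assumptions, not the exact UoS-ITI implementation; only the ordering of the ranks follows the slide.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Evidence:
    """One extracted claim about an image / video (hypothetical record layout)."""
    claim: str                           # "fake" or "genuine"
    author_trusted: Optional[bool]       # None = author not on any source list
    attributed_trusted: Optional[bool]   # None = no attribution extracted

def trust_rank(ev: Evidence) -> int:
    """Rank evidence per the slide's hierarchy; higher is more trusted.
    The numeric values are assumed; only their order matters."""
    if ev.author_trusted is True:
        return 5
    if ev.author_trusted is False:
        return 4
    if ev.attributed_trusted is True:
        return 3
    if ev.attributed_trusted is False:
        return 2
    return 1  # unattributed claim

def classify_content(evidence: List[Evidence]) -> str:
    """Fake/real decision for one image / video from its best-ranked evidence."""
    if not evidence:
        return "unverified"
    best = max(evidence, key=trust_rank)  # ties: first-seen evidence wins
    return best.claim

# Example: a trusted-source 'fake' claim outranks an unattributed 'genuine' one.
reports = [Evidence("genuine", None, None), Evidence("fake", True, None)]
print(classify_content(reports))  # -> fake
```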
Regex patterns
Named Entity Patterns
@ (NNP|NN)
# (NNP|NN)
(NNP|NN) (NNP|NN)
(NNP|NN)
e.g. CNN / BBC News / @bbcnews

Attribution Patterns
<NE> *{0,3} <IMAGE> ...
<NE> *{0,2} <RELEASE> *{0,4} <IMAGE> ...
... <IMAGE> *{0,6} <FROM> *{0,1} <NE>
... <FROM> *{0,1} <NE>
... <IMAGE> *{0,1} <NE>
... <RT> <SEP>{0,1} <NE>
e.g. "FBI has released prime suspect photos ..." / "... pic - BBC News" / "... image released via CNN" / "... RT: BBC News"

Faked Patterns
... *{0,2} <FAKED> ...
... <REAL> ? ...
... <NEGATIVE> *{0,1} <REAL> ...
e.g. "... what a fake! ..." / "... is it real? ..." / "... thats not real ..."

Genuine Patterns
... <IMAGE> *{0,2} <REAL> ...
... <REAL> *{0,2} <IMAGE> ...
... <IS> *{0,1} <REAL> ...
... <NEGATIVE> *{0,1} <FAKE> ...
e.g. "... this image is totally genuine ..." / "... its real ..."

Key
NNP | NN = Penn Treebank part-of-speech tags (proper noun / noun)
* = any token
<NE> = named entity (e.g. trusted source)
<IMAGE> = image variants (e.g. pic, image, video)
<RELEASE> = release variants (e.g. released)
<FROM> = from variants (e.g. via, from, attributed)
<REAL> = real variants (e.g. real, genuine)
<FAKE> / <FAKED> = fake variants (e.g. fake, faked)
<NEGATIVE> = negative variants (e.g. not, isn't)
<RT> = retweet variants (e.g. RT, MT)
<SEP> = separator variants (e.g. : - =)
<IS> = is | its | thats
Approach
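As a rough, self-contained illustration of how templates like these can be executed, the sketch below compiles a handful of the faked/genuine patterns into plain Python regexes over raw tweet text. The variant lexicons and the first-match-wins rule ordering are assumptions made for the example; the full pipeline also uses POS tagging and named-entity/attribution matching, omitted here.

```python
import re

# Hypothetical variant lexicons standing in for the slide's token classes.
IMAGE = r"(?:pic|picture|image|photo|video)"
REAL  = r"(?:real|genuine)"
FAKE  = r"(?:faked?|hoax)"
NEG   = r"(?:not|isn't|isnt)"
IS    = r"(?:is|it's|its|that's|thats)"

def gap(n: int) -> str:
    """The slide's *{0,n} wildcard: up to n arbitrary tokens."""
    return rf"(?:\S+\s+){{0,{n}}}"

# Ordered (pattern, label) rules; negated forms come first so that
# e.g. "not fake" reads as a genuine claim rather than a fake claim.
RULES = [
    (re.compile(rf"\b{NEG}\s+{gap(1)}{FAKE}\b", re.I), "genuine"),   # <NEGATIVE> * <FAKE>
    (re.compile(rf"\b{NEG}\s+{gap(1)}{REAL}\b", re.I), "fake"),      # <NEGATIVE> * <REAL>
    (re.compile(rf"\b{REAL}\s*\?", re.I),              "fake"),      # <REAL> ?
    (re.compile(rf"\b{IMAGE}\s+{gap(2)}{REAL}\b", re.I), "genuine"), # <IMAGE> * <REAL>
    (re.compile(rf"\b{IS}\s+{gap(1)}{REAL}\b", re.I),  "genuine"),   # <IS> * <REAL>
    (re.compile(rf"\b{FAKE}\b", re.I),                 "fake"),      # bare <FAKED>
]

def classify_claim(tweet: str) -> str:
    """Label a tweet's claim about an image / video; first matching rule wins."""
    for pattern, label in RULES:
        if pattern.search(tweet):
            return label
    return "no-claim"

for t in ["what a fake!", "is it real?", "thats not real",
          "this image is totally genuine", "its real", "nice weather today"]:
    print(t, "->", classify_claim(t))
```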
Fake & Real Tweet Classifier

                                                        fake classification     real classification
                                                        P     R     F1          P     R      F1
faked & genuine patterns                                1.0   0.03  0.06        0.75  0.001  0.003
faked & genuine & attribution patterns                  1.0   0.03  0.06        0.43  0.03   0.06
faked & genuine & attribution patterns & cross-check    1.0   0.72  0.83        0.74  0.74   0.74

Fake & Real Image Classifier

                                                        fake classification     real classification
                                                        P     R     F1          P     R      F1
faked & genuine & attribution patterns & cross-check    1.0   0.04  0.09        0.62  0.23   0.33
Notes
• No mistakes were made classifying fakes in the testset: low false positives are important for end users like journalists.
• Image-classifier performance looks good when averaged over the whole dataset, but not for every image: the approach is better at classifying real images than fake ones.
Results
Application to our journalism use case
• Classifying tweets in isolation (fake and real) is of limited value
  – High precision (89%+) but low recall (1%)
• Cross-checking tweets, then ranking by trustworthiness, is far more useful
  – No false positives for fake classification on the testset
  – High precision (94%+) with average recall (43%+), looking across events in the devset and testset
  – Viral images & videos typically attract hundreds of tweets before journalists become aware of them, so a recall of 20% is probably OK in this context
• Image classifiers
  – Fake image classifier » high precision (96-100%) but low recall (4-10%)
  – Real image classifier » high precision (62-95%) but low recall (19-23%)
• Classifications are explained in ways journalists understand & therefore trust
  – e.g. image X is claimed verified by tweet Y, which attributes it to trusted entity Z
  – We can alert journalists to trustworthy reports of verification and/or debunking
• Our approach does not replace manual verification techniques
  – Someone still needs to actually verify the content!
Discussion
Focus on image classification, not tweet classification
• The long-term aim is to classify the images & videos, NOT the tweets about them
  – Suggestion » score image classification results as well as tweet classification results
• End users usually want to know if content is real, not if it is fake
  – Classifying something as fake is usually a means to an end (e.g. to allow filtering)
  – Suggestion » score results for fake classification & real classification
Suggestions for Verification Challenge 2016
Improve the tweet datasets to avoid bias towards a single event
• Suggest using leave-one-event-out cross-validation when computing P/R/F1 (see the sketch below)
• Suggest removing tweet repetition
  – Some events (e.g. Syrian Boy) contain many duplicate tweets posted by different authors
  – A classifier might only work well on 1 or 2 text styles BUT still score highly because those styles are repeated so often
• Suggest evenly balancing the number of tweets per event to avoid bias
  – Devset » the Hurricane Sandy event has about 84% of the tweets
  – Testset » the Syrian Boy event has about 47% of the tweets
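A minimal sketch of what leave-one-event-out scoring could look like, assuming each tweet record carries an event label. The toy data, the TF-IDF features, and the logistic-regression classifier are placeholders for illustration, not the UoS-ITI system; the point is that every event is scored while held out of training.

```python
# Leave-one-event-out evaluation sketch using scikit-learn's group splitter.
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support

# Toy (tweet, label, event) triples; a real run would load the MediaEval
# devset/testset here.
tweets = ["what a fake!", "this image is totally genuine",
          "thats not real", "its real", "is it real?", "looks genuine to me"]
labels = ["fake", "real", "fake", "real", "fake", "real"]
events = ["sandy", "sandy", "syrian_boy", "syrian_boy", "boston", "boston"]

X = TfidfVectorizer().fit_transform(tweets)
logo = LeaveOneGroupOut()

# Each fold trains on all events but one, then scores the held-out event,
# so duplicated text styles within an event cannot inflate its own score.
for train_idx, test_idx in logo.split(X, labels, groups=events):
    clf = LogisticRegression().fit(X[train_idx], [labels[i] for i in train_idx])
    pred = clf.predict(X[test_idx])
    gold = [labels[i] for i in test_idx]
    p, r, f1, _ = precision_recall_fscore_support(
        gold, pred, labels=["fake", "real"], zero_division=0)
    print(f"held-out event={events[test_idx[0]]}: P={p}, R={r}, F1={f1}")
```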
Any questions?
Stuart E. Middleton
University of Southampton IT Innovation Centre
email: sem@it-innovation.soton.ac.uk
web: www.it-innovation.soton.ac.uk
twitter: @stuart_e_middle, @IT_Innov, @RevealEU
Many thanks for your attention!