The document discusses a project called REVEAL that aims to analyze social media streams to extract modalities like reputation and credibility of information. It outlines approaches used like geosemantic classification of evidence and knowledge-based trust modelling. Initial results show potential for identifying trusted sources and classifying evidence to help verify news stories. Future work includes user trials to test the automated analysis against journalist fact checking.
REVEAL Project - Trust and Credibility Analysis
REVEAL Project: Co-funded by the EU FP7 Programme Nr.: 610928 www.revealproject.eu © 2015 REVEAL consortium
Stuart E. Middleton
University of Southampton IT Innovation Centre
sem@it-innovation.soton.ac.uk
www.it-innovation.soton.ac.uk
RDSM 2015 Invited Talk Overview
REVEAL Project
Modality Analysis of Social Media Streams
Geosemantics and Spatio-Temporal Grounding of Rumours
Knowledge-based Approach to Trust and Credibility Modelling
Future Work
Overview
Objectives
Enable users to reveal hidden modalities such as reputation, influence or credibility of information
Approach - Modality Extraction and Analysis
Real-time modality extraction
On-demand analytics capabilities
Event-driven architecture with components communicating over RabbitMQ
Processing runs on a scalable Apache Storm cluster (real-time) and standalone HTTP services (on-demand)
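The event-driven flow can be pictured with a small in-process sketch. Python's standard `queue.Queue` stands in for RabbitMQ here, and `annotate` is a hypothetical placeholder for a real-time modality extractor; neither reflects the project's actual code.

```python
import json
import queue


def annotate(item):
    """Hypothetical extractor: attach a trivial annotation to a content item."""
    annotated = dict(item)
    annotated["annotations"] = {"token_count": len(item.get("text", "").split())}
    return annotated


def run_pipeline(messages):
    """Consume JSON messages from an inbound queue and publish annotated ones."""
    inbound, outbound = queue.Queue(), queue.Queue()  # stand-ins for broker queues
    for message in messages:
        inbound.put(message)

    # Each message is handled as it is taken off the queue.
    while not inbound.empty():
        item = json.loads(inbound.get())
        outbound.put(json.dumps(annotate(item)))

    results = []
    while not outbound.empty():
        results.append(json.loads(outbound.get()))
    return results


results = run_pipeline(
    [json.dumps({"id": 1, "text": "flooding reported near the exchange"})]
)
```

In a deployment along the lines described on the slide, the two queues would presumably be broker queues and the loop body would run inside the real-time cluster; the sketch only shows the message-in, annotated-message-out shape.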
Journalism Use Case
Newsgathering - Find newsworthy content and evidence to help verify this content
Enterprise Use Case
Forums - Identify and help newbies, track product feedback & sentiment & emerging trends
Modality Extraction
Social Media Streams
Twitter, YouTube, Instagram, Facebook, Foursquare
Search (historical) and Stream (real-time)
Social Network Analysis
Community Detection, Community Graph Extraction, Community Classification (e.g. topics), Role Analysis (e.g. popular participant), Influence Models ...
Content Analysis
Image Feature Extraction (e.g. sky, city), Image Similarity Clustering, Multimedia Indexing, Image Manipulation Detection, Topic Models, Original Content Detection, Text Stylometry ...
Geospatial Analysis
Geoparsing, Geosemantic Classification, Image-based Geolocation, Geospatial Topic Model ...
Semantic Analysis
Directed Linked-Data Crawler, Semantic Context (e.g. context summaries based on linked data) ...
Real-Time Annotation of Social Media
[Figure: timeline of content creation (11:30 to 11:50). JSON items arrive from several streams: a Twitter tweet (@stuart_e_middle), a Facebook post (BBCNews), a YouTube video (CNN) and an Instagram image (bbcnews). Each item carries fields such as author, timestamp, URIs and hashtags, and is annotated with a geoparsed location, influence score, topic model and text stylometry.]
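As a rough illustration of what an annotated item might look like, the sketch below merges annotation fields into a tweet-like JSON object. The field names (`geoparsed_location`, `influence_score`, `topic_model`, `text_stylometry`) and every value are assumptions derived from the slide labels, not the project's schema.

```python
import json


def add_annotations(item, **annotations):
    """Return a copy of the item with annotations stored under one key."""
    annotated = dict(item)
    annotated["annotations"] = dict(annotations)
    return annotated


# Illustrative item only; the timestamp and text are made up.
tweet = {
    "source": "twitter",
    "author": "@stuart_e_middle",
    "timestamp": "2015-01-20T11:30:00",
    "uris": [],
    "hashtags": [],
    "text": "example text",
}

annotated = add_annotations(
    tweet,
    geoparsed_location=["London"],            # hypothetical geoparser output
    influence_score=0.5,                      # hypothetical score
    topic_model=["news"],                     # hypothetical topic labels
    text_stylometry={"avg_word_length": 5.5}, # hypothetical stylometric feature
)

print(json.dumps(annotated, indent=2))
```

Keeping the annotations under a single key leaves the original item untouched, which makes it easy to add further modalities later without changing the source fields.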
Incremental Aggregation of Annotated Social Media
[Figure: annotated JSON items (Facebook post from BBCNews, YouTube video from CNN, Instagram image from bbcnews, Twitter tweet from @stuart_e_middle, Facebook profile @gadgetshow, Instagram image from gadgetshow) are aggregated incrementally.]
Cross Check: timestamps, locations, authors ...
Trust Analysis: trusted sources, reputations, correlation to known facts ...
Outputs: trustworthy evidence and credible evidence, with user feedback
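A cross-check of timestamps, locations and authors could be sketched as below. The item layout (`author`, `timestamp`, `locations`) and the 10-minute window are assumptions chosen for illustration; the slides do not specify how the project performs this step.

```python
from datetime import datetime, timedelta


def corroborating_items(target, items, window=timedelta(minutes=10)):
    """Items by other authors that share a location with the target
    and were created within the given time window."""
    target_time = datetime.fromisoformat(target["timestamp"])
    matches = []
    for item in items:
        if item["author"] == target["author"]:
            continue  # an author cannot corroborate their own report
        item_time = datetime.fromisoformat(item["timestamp"])
        close_in_time = abs(item_time - target_time) <= window
        shares_location = bool(set(item["locations"]) & set(target["locations"]))
        if close_in_time and shares_location:
            matches.append(item)
    return matches


# Made-up items for illustration.
items = [
    {"author": "a", "timestamp": "2015-01-20T11:30:00", "locations": ["Donetsk Airport"]},
    {"author": "b", "timestamp": "2015-01-20T11:35:00", "locations": ["Donetsk Airport"]},
    {"author": "c", "timestamp": "2015-01-20T11:50:00", "locations": ["Donetsk Airport"]},
    {"author": "d", "timestamp": "2015-01-20T11:32:00", "locations": ["London"]},
]

matches = corroborating_items(items[0], items)
```

Here only the item from author `b` corroborates the first item: `c` falls outside the time window and `d` mentions a different location.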
Geosemantic Classification of Evidence
What is geosemantics?
Study of the context of spatial data; in our case, the contextual text relating to mentioned locations
Our Approach: Geosemantic feature classification
Geoparsing [Planet OpenStreetMap database] → LOC (high precision, native language)
Text + POS + LOC + training set → classifier → context of how a location is talked about
Classes → timeliness (past | future | present), situatedness (insitu | remote), confirmation (confirm | deny)
Eyewitness reports → insitu
Breaking content → present
Denial of rumours → deny
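To make the three class dimensions concrete, here is a deliberately simplified sketch. It swaps the trained classifier over Text + POS + LOC features for a cue-word lookup in a window around the location mention. The cue lists, the window size and the single-token location handling are all assumptions for illustration.

```python
# Hypothetical cue words per class; the approach on the slide trains a
# classifier from labelled data rather than using fixed word lists.
CUES = {
    "timeliness": {
        "past": {"was", "yesterday", "earlier"},
        "future": {"will", "tomorrow"},
        "present": {"now", "is", "currently"},
    },
    "situatedness": {
        "insitu": {"here", "i", "my"},
        "remote": {"reports", "reportedly", "via"},
    },
    "confirmation": {
        "confirm": {"confirmed", "confirm"},
        "deny": {"false", "not", "fake", "denied"},
    },
}


def context_window(tokens, loc_index, size=4):
    """Tokens within `size` positions either side of the location mention."""
    start = max(0, loc_index - size)
    return tokens[start:loc_index] + tokens[loc_index + 1:loc_index + 1 + size]


def classify(text, location):
    """Label how a single-token location is talked about in the text.

    Raises ValueError if the location token does not occur in the text.
    """
    tokens = text.lower().split()  # whitespace tokenisation only
    loc_index = tokens.index(location.lower())
    window = set(context_window(tokens, loc_index))
    labels = {}
    for dimension, classes in CUES.items():
        for label, cues in classes.items():
            if window & cues:
                labels[dimension] = label
                break  # first matching class wins in this sketch
    return labels


labels = classify("reports that nyse is flooded are false", "NYSE")
```

For the example sentence the sketch yields present, remote and deny, the kind of labelling that would mark the text as a denial of a rumour.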
State of the Art: Geosemantic text analysis
Text + POS + training set → classifier → event type
Location text → NLP grammar → direction & distance
e.g. trouble spotted 5 miles north of London
Location text → sentiment analysis → good / bad opinion of text
Resilience of approaches across event types and languages is an issue
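The direction-and-distance example can be illustrated with a single regular expression. This is a simplification standing in for a full NLP grammar: the pattern, the supported units and the four compass directions are assumptions, and real text would need far broader coverage.

```python
import re

# Matches phrases like "5 miles north of London".
OFFSET_PATTERN = re.compile(
    r"(?P<distance>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>miles?|km|kilometres?|kilometers?)\s+"
    r"(?P<direction>north|south|east|west)\s+of\s+"
    r"(?P<place>[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"
)


def parse_offset(text):
    """Extract distance, unit, direction and place name, or None if absent."""
    match = OFFSET_PATTERN.search(text)
    if match is None:
        return None
    return {
        "distance": float(match.group("distance")),
        "unit": match.group("unit"),
        "direction": match.group("direction"),
        "place": match.group("place"),
    }


parsed = parse_offset("trouble spotted 5 miles north of London")
```

A fixed pattern like this also illustrates the resilience problem noted on the slide: it breaks as soon as the phrasing or the language changes.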
Geosemantics and Spatio-Temporal Grounding of Rumours
[Figure: system architecture. Inputs: keywords and focus area(s) for a social media crawler using the Twitter Search & Streaming API, YouTube Search API and Instagram Search API; locations and geometry from the OpenStreetMap Planet database via geospatial pre-processing. Processing: a Storm topology with parallel geoparse nodes followed by a geoclassify node, connected through RabbitMQ. Data flow: JSON tweets, then JSON tweets + locations, then JSON tweets + locations + class labels, stored as SQL content + location + class. Outputs: situation assessment and visualization services, accessed over HTTP, SQL and SPARQL. Key: Storm topology, service.]
CITATION geosemantics: Middleton, S.E., Krivcovs, V. "Geoparsing and Geosemantic Analysis of Social Media for Spatio-temporal Grounding of Rumours during Breaking News", submitted to TOIS, 2015
CITATION geoparsing: Middleton, S.E., Middleton, L., Modafferi, S. "Real-time Crisis Mapping of Natural Disasters using Social Media", IEEE Intelligent Systems, vol. 29, no. 2, pp. 9-17, Mar.-Apr. 2014
CITATION tech blog: Middleton, S.E. "From Twitter-based Crisis Mapping to Large-scale Real-Time Situation Assessment with Trust and Credibility Analysis", 2014, http://revealproject.eu/
Case Study - NYSE flooding in 2012 (false rumour)
Geosemantic filtering of evidence
7000+ tweets in a 5-minute analysis window
114 ground truth tweets - WeatherChannel & CNN
Geosemantic filter reduced content volume by 95%
100% ground truth recall for CONFIRM class
85% ground truth recall for DENY class
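The figures above rest on two simple measures, sketched here. The function names are hypothetical, and the only numbers taken from the slide are the 7000 tweet lower bound and the 95% reduction; the per-class split of the 114 ground truth tweets is not given, so the recall helper is shown without case-study data.

```python
def recall(retrieved_ids, ground_truth_ids):
    """Fraction of ground truth items that survive the filter."""
    truth = set(ground_truth_ids)
    if not truth:
        raise ValueError("ground truth must not be empty")
    return len(truth & set(retrieved_ids)) / len(truth)


def volume_reduction(total, retained):
    """Fraction of content removed by the filter."""
    return 1 - retained / total


# A 95% reduction of at least 7000 tweets leaves roughly 350 to review.
retained_estimate = round(7000 * (1 - 0.95))
```

Read together, the measures say that a journalist would review a few hundred tweets instead of several thousand while losing little or none of the ground truth evidence.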
[Figure: locations matched in the case study: New York Stock Exchange (building) and Battery Park (park)]
Case Study - Donetsk Airport 2015 (conflicting claims)
Spatio-temporal mapping of evidence
300,000 tweets, YouTube & Instagram posts over 24 hours of the Ukraine crisis on 20th Jan 2015
4 ground truth YouTube videos used by LifeNews reports on 20th Jan 2015
Focus: Donetsk Airport cluster
Ground truth URIs ranked 10, 14 & 28 out of 30
[Figure: locations matched in the Donetsk Airport cluster: a suburb, Donetsk Airport (building) and a road]
Definitions
What are credibility and trust?
Trust and credibility are not well defined; below is our interpretation
Credibility - consistency with other content (e.g. similar reports) and contextual information (e.g. local geography)
Trust - subjective assessment of likelihood of content being false
A credible news report might still be false!
Approach
Our Approach: Knowledge-based Trust Modelling
Journalists already have personal sets of trusted sources they have come to rely upon
Knowledge-based approach
Journalists assert a priori known facts (e.g. trusted sources, known locations)
Evidence from streams is asserted incrementally into a triple store (i.e. uSeekM + OWLIM)
Simple inference → classify evidence → interactive exploration of evidence with the journalist
OWL classes & individuals, owl:Restriction, owl:intersectionOf, SPARQL, GeoSPARQL ...
Not a black box - end users explore the evidence and we help them make a verification decision
Scalable approach able to represent the different viewpoints of journalists
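The assert-then-infer loop can be sketched with an in-memory set of triples. This is a toy stand-in for a real triple store with OWL reasoning and SPARQL queries. The `ex:` names, the single inference rule and the `TripleStore` class are hypothetical and chosen only to show the idea.

```python
class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]


def classify_evidence(store):
    """Hypothetical rule: evidence authored by a trusted source is trusted evidence."""
    trusted = {s for s, _, _ in store.match(predicate="rdf:type", obj="ex:TrustedSource")}
    inferred = []
    for evidence, _, author in store.match(predicate="ex:authoredBy"):
        if author in trusted:
            store.add(evidence, "rdf:type", "ex:TrustedEvidence")
            inferred.append(evidence)
    return sorted(inferred)


store = TripleStore()
# A priori fact asserted by a journalist.
store.add("ex:BBCNews", "rdf:type", "ex:TrustedSource")
# Evidence asserted incrementally from the streams.
store.add("ex:post1", "ex:authoredBy", "ex:BBCNews")
store.add("ex:tweet1", "ex:authoredBy", "ex:unknownUser")

trusted_evidence = classify_evidence(store)
```

Because each journalist could hold a separate store with their own trusted sources, the same evidence can be classified differently per viewpoint, and the asserted triples remain available to explain why.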
State of the Art: Trust and Credibility Modelling
Unsupervised learning (e.g. Bayesian Network, Dempster-Shafer) → trust prediction without explanation
Supervised reputation models → trust prediction with explanation
Heuristics & activity metrics → trust prediction with explanation
Early Results - Work in Progress
Roadmap for REVEAL Trust and Credibility Analysis
REVEAL project runs until Sept 2016
Ethnographic studies with Journalists
Crawl content in parallel to journalists searching User Generated Content (UGC)
Record ground truth by observing Journalists verifying news for real & explaining decisions
Empirical analysis - compare automated decisions with ground truth
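One way to compare automated decisions with journalist ground truth is a simple agreement table, sketched below. The boolean "verified" labels, the item ids and the choice of accuracy, precision and recall as measures are assumptions; the slides do not state which metrics the project will use.

```python
def agreement(automated, ground_truth):
    """Compare two dicts mapping item id -> bool (True means verified)."""
    shared = sorted(set(automated) & set(ground_truth))
    if not shared:
        raise ValueError("no items in common")

    tp = sum(1 for i in shared if automated[i] and ground_truth[i])
    fp = sum(1 for i in shared if automated[i] and not ground_truth[i])
    fn = sum(1 for i in shared if not automated[i] and ground_truth[i])
    tn = sum(1 for i in shared if not automated[i] and not ground_truth[i])

    return {
        "accuracy": (tp + tn) / len(shared),
        "precision": tp / (tp + fp) if tp + fp else None,
        "recall": tp / (tp + fn) if tp + fn else None,
    }


# Made-up decisions for four items.
automated = {"a": True, "b": True, "c": False, "d": False}
journalist = {"a": True, "b": False, "c": True, "d": False}

scores = agreement(automated, journalist)
```

Only items judged by both the system and the journalist are scored, so content that was crawled but never examined does not distort the comparison.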
MediaEval 2015 verification challenge
Competition verifying images using a common Twitter dataset (10 different news events)
User trials 2015 - 2016
Any questions?
Stuart E. Middleton
University of Southampton IT Innovation Centre
email: sem@it-innovation.soton.ac.uk
web: www.it-innovation.soton.ac.uk
twitter:@stuart_e_middle, @IT_Innov, @RevealEU
Many thanks for your attention!