REVEAL Project: Co-funded by the EU FP7 Programme Nr.: 610928 www.revealproject.eu © 2015 REVEAL consortium
Stuart E. Middleton
University of Southampton IT Innovation Centre
sem@it-innovation.soton.ac.uk
www.it-innovation.soton.ac.uk
REVEAL Project - Trust and Credibility Analysis
RDSM 2015 Invited Talk Overview
 REVEAL Project
 Modality Analysis of Social Media Streams
 Geosemantics and Spatio-Temporal Grounding of Rumours
 Knowledge-based Approach to Trust and Credibility Modelling
 Future Work
Overview
Overview
 Objectives
 Enable users to reveal hidden modalities such as reputation, influence or credibility of information
 Approach - Modality Extraction and Analysis
 Real-time modality extraction
 On-demand analytics capabilities
 Event-driven architecture using RabbitMQ to communicate
 Processing based on a scalable Storm cluster (real-time) & standalone HTTP services (on-demand)
 Journalism Use Case
 Newsgathering - Find newsworthy content and evidence to help verify this content
 Enterprise Use Case
 Forums - Identify and help newbies, track product feedback & sentiment & emerging trends
REVEAL Project
Modality Extraction
 Social Media Streams
 Twitter, YouTube, Instagram, Facebook, Foursquare
 Search (historical) and Stream (real-time)
 Social Network Analysis
 Community Detection, Community Graph Extraction, Community Classification (e.g. topics), Role Analysis (e.g. popular participant), Influence Models ...
 Content Analysis
 Image Feature Extraction (e.g. sky, city), Image Similarity Clustering, Multimedia Indexing, Image Manipulation Detection, Topic Models, Original Content Detection, Text Stylometry ...
 Geospatial Analysis
 Geoparsing, Geosemantic Classification, Image-based Geolocation, Geospatial Topic Model ...
 Semantic Analysis
 Directed Linked-Data Crawler, Semantic Context (e.g. context summaries based on linked data) ...
Modality Analysis of Social Media Streams
Real-Time Annotation of Social Media
Modality Analysis of Social Media Streams
[Figure: timeline of content creation times (11:30 to 11:50) showing JSON items from Twitter (@stuart_e_middle <tweet>), Facebook (BBCNews <post>), YouTube (CNN <video>) and Instagram (bbcnews <image>). Each item carries author, timestamp, URIs, hashtags, ... and is annotated with geoparsed location, influence score, topic model and text stylometry.]
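The annotation step in the figure can be pictured as attaching modality fields to each incoming JSON item. The sketch below is illustrative only: the `annotate` helper, the field names (`geoparsed_location`, `influence_score`, `topic`) and all values are assumptions made for this example, not the REVEAL schema.

```python
import json
from datetime import datetime, timezone


def annotate(item, **modalities):
    """Return a copy of a social media item with extra modality annotations."""
    annotated = dict(item)
    annotated["annotations"] = {**item.get("annotations", {}), **modalities}
    return annotated


# Hypothetical tweet: only the handle appears on the slide, the rest is made up.
tweet = {
    "source": "twitter",
    "author": "@stuart_e_middle",
    "created_at": datetime(2015, 1, 20, 11, 30, tzinfo=timezone.utc).isoformat(),
    "text": "example text",
    "hashtags": [],
}

# Each analysis component adds its own annotation as the item flows past.
step1 = annotate(tweet, geoparsed_location="Donetsk Airport")
step2 = annotate(step1, influence_score=0.4, topic="conflict")

print(json.dumps(step2, indent=2))
```

Returning a copy at each step keeps the original item untouched, so components can annotate in any order without interfering with each other.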
Incremental Aggregation of Annotated Social Media
Modality Analysis of Social Media Streams
[Figure: annotated JSON items (Facebook BBCNews <post>, YouTube CNN <video>, Instagram bbcnews <image>, Twitter @stuart_e_middle <tweet>, Facebook profile @gadgetshow, Instagram gadgetshow <image>) are aggregated incrementally. A cross-check step (timestamps, locations, authors, ...) and a trust analysis step (trusted sources, reputations, correlation to known facts, ...) yield credible evidence and trustworthy evidence, each step taking user feedback.]
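One simple way to picture a cross-check over timestamps, locations and authors is to count, for each item, how many items from other authors mention the same location at about the same time. This is a minimal sketch of that idea, not the REVEAL implementation: the `corroboration` function, the 10-minute window, the record fields and the sample records are all assumptions.

```python
from datetime import datetime, timedelta


def corroboration(items, window=timedelta(minutes=10)):
    """Count, per item, the other items by a different author that mention
    the same location within the given time window."""
    counts = {}
    for item in items:
        counts[item["id"]] = sum(
            1
            for other in items
            if other["id"] != item["id"]
            and other["author"] != item["author"]
            and other["location"] == item["location"]
            and abs(other["time"] - item["time"]) <= window
        )
    return counts


# Hypothetical records; the date is arbitrary, only the clock times matter.
items = [
    {"id": "a", "author": "BBCNews", "location": "Battery Park",
     "time": datetime(2015, 1, 1, 11, 30)},
    {"id": "b", "author": "CNN", "location": "Battery Park",
     "time": datetime(2015, 1, 1, 11, 35)},
    {"id": "c", "author": "gadgetshow", "location": "New York Stock Exchange",
     "time": datetime(2015, 1, 1, 11, 40)},
    {"id": "d", "author": "BBCNews", "location": "Battery Park",
     "time": datetime(2015, 1, 1, 11, 50)},
]

counts = corroboration(items)
print(counts)
```

Items "a" and "b" corroborate each other, while "c" (different location) and "d" (same author as "a", too late for "b") get no support.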
Geosemantic Classification of Evidence
 What is geosemantics?
 Study of the context of spatial data - in our case, contextual text relating to mentioned locations
 Our Approach - Geosemantic feature classification
 Geoparsing [Planet OpenStreetMap database] » LOC (high precision, native language)
 Text + POS + LOC + training set » classifier » context of how the location is talked about
 Classes » timeliness (past | future | present), situatedness (insitu | remote), confirmation (confirm | deny)
 Eyewitness reports » insitu
 Breaking content » present
 Denial of rumours » deny
 State of the Art - Geosemantic text analysis
 Text + POS + training set » classifier » event type
 Location text » NLP grammar » direction & distance
 e.g. "trouble spotted 5 miles north of London"
 Location text » sentiment analysis » good / bad opinion of text
 Resilience of approaches across event types and languages is an issue
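The "Text + LOC + training set » classifier" step can be sketched with a toy example. The slides do not name the classifier, so a small multinomial Naive Bayes is swapped in here purely for illustration; POS features are omitted, and the training sentences, the `mask_locations` helper and the location list are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def mask_locations(text, locations):
    """Replace each known location name with the placeholder token LOC."""
    for name in locations:
        text = re.sub(re.escape(name), " LOC ", text, flags=re.IGNORECASE)
    return text


def tokenize(text):
    """Lower-case word tokens, keeping the LOC placeholder as-is."""
    return [t if t == "LOC" else t.lower() for t in re.findall(r"[A-Za-z]+", text)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            total = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training set for the confirmation class (confirm | deny).
TRAINING = [
    ("flooding confirmed at LOC", "confirm"),
    ("photo shows water inside LOC", "confirm"),
    ("reports of flooding at LOC are false", "deny"),
    ("LOC is not flooded, rumour is fake", "deny"),
]

model = NaiveBayes().fit(TRAINING)

# The location list stands in for the output of a geoparser.
tweet = "Rumour of flooding at New York Stock Exchange is false"
label = model.predict(mask_locations(tweet, ["New York Stock Exchange"]))
print(label)
```

Masking the place name with LOC lets the classifier learn how a location is talked about rather than which location it is, which is the point of the LOC feature on the slide.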
Geosemantics and Spatio-Temporal Grounding of Rumours
Geosemantics and Spatio-Temporal Grounding of Rumours
[Figure: pipeline architecture. A social media crawler (Twitter Search & Streaming API, YouTube Search API, Instagram Search API), driven by keywords, publishes JSON tweets via RabbitMQ. A Storm topology of parallel geoparse nodes adds locations (JSON tweets + locations) and a geoclassify node adds class labels (JSON tweets + locations + class labels). Geospatial pre-processing of the OpenStreetMap planet database for the focus area(s) supplies locations & geometry. Results are stored as SQL content + loc + class and served to situation assessment and visualization over HTTP, SQL and SPARQL. Key: Storm topology, service.]
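The flow in the figure (crawler » geoparse » geoclassify, with JSON messages passed between stages) can be mimicked in a few lines. In this sketch in-process queues stand in for RabbitMQ and plain functions stand in for Storm nodes; the gazetteer, the keyword rule in `geoclassify` and the sample messages are placeholders invented for illustration.

```python
import json
import queue

# Hypothetical gazetteer standing in for locations pre-processed from OpenStreetMap.
GAZETTEER = ["Donetsk Airport", "Battery Park"]


def geoparse(item):
    """Add the gazetteer locations mentioned in the item's text."""
    text = item["text"].lower()
    return {**item, "locations": [n for n in GAZETTEER if n.lower() in text]}


def geoclassify(item):
    """Add class labels; a keyword rule stands in for a trained classifier."""
    is_denial = item["locations"] and "false" in item["text"].lower()
    return {**item, "class_labels": ["deny"] if is_denial else []}


def run_stage(stage, inbox, outbox):
    """Drain the inbox, apply the stage to each JSON message, publish the result."""
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            break
        outbox.put(json.dumps(stage(json.loads(message))))


crawled, located, classified = queue.Queue(), queue.Queue(), queue.Queue()
crawled.put(json.dumps({"text": "Smoke seen over Donetsk Airport"}))
crawled.put(json.dumps({"text": "Flooding at Battery Park is false"}))

run_stage(geoparse, crawled, located)
run_stage(geoclassify, located, classified)

results = [json.loads(classified.get()) for _ in range(classified.qsize())]
print(results)
```

Because each stage only reads and writes JSON messages, stages can be scaled out or replaced independently, which is the property the event-driven design relies on.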
CITATION geosemantics: Middleton, S.E., Krivcovs, V. "Geoparsing and Geosemantic Analysis of Social Media for Spatio-temporal Grounding of Rumours during Breaking News", submitted to TOIS, 2015
CITATION geoparsing: Middleton, S.E., Middleton, L., Modafferi, S. "Real-time Crisis Mapping of Natural Disasters using Social Media", IEEE Intelligent Systems, vol. 29, no. 2, pp. 9-17, Mar.-Apr. 2014
CITATION tech blog: Middleton, S.E. "From Twitter-based Crisis Mapping to Large-scale Real-Time Situation Assessment with Trust and Credibility Analysis", 2014, http://revealproject.eu/
Case Study - NYSE flooding in 2012 (false rumour)
 Geosemantic filtering of evidence
 7000+ tweets in a 5-minute analysis window
 114 ground truth tweets - WeatherChannel & CNN
 Geosemantic filter reduced content volume by 95%
 100% ground truth recall for CONFIRM class
 85% ground truth recall for DENY class
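For readers unfamiliar with the two kinds of figure quoted above, the sketch below shows how volume reduction and ground truth recall are conventionally computed. It assumes the slide's percentages follow these standard definitions; the item IDs are toy values, not the case study data.

```python
def volume_reduction(total_items, kept_items):
    """Fraction of the original content removed by the filter."""
    return 1 - kept_items / total_items


def recall(ground_truth_ids, kept_ids):
    """Fraction of ground truth items that survive the filter."""
    ground_truth = set(ground_truth_ids)
    return len(ground_truth & set(kept_ids)) / len(ground_truth)


# Toy data: 16 items in the window, the filter keeps 4 of them.
all_ids = set(range(1, 17))
kept_ids = {1, 2, 9, 12}
ground_truth_ids = {1, 2, 3, 4}

reduction = volume_reduction(len(all_ids), len(kept_ids))
gt_recall = recall(ground_truth_ids, kept_ids)
print(reduction, gt_recall)
```

The two numbers pull against each other: a stricter filter raises the reduction but risks dropping ground truth items and lowering recall.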
Geosemantics and Spatio-Temporal Grounding of Rumours
[Map labels: New York Stock Exchange (building), Battery Park (park)]
Case Study - Donetsk Airport 2015 (conflicting claims)
 Spatio-temporal mapping of evidence
 300,000 Tweets, YouTube & Instagram posts over 24 hours of Ukraine Crisis for 20th Jan 2015
 4 ground truth YouTube videos used by LifeNews reports on 20th Jan 2015
 Focus: Donetsk Airport cluster
 Ground truth URIs ranked 10, 14 & 28 out of 30
Geosemantics and Spatio-Temporal Grounding of Rumours
[Map labels: Donetsk Airport (building), plus one suburb and one road with Cyrillic names]
Definitions
 What are credibility and trust?
 Trust and credibility are not well defined - below is our interpretation
 Credibility - consistency with other content (e.g. similar reports) and contextual information (e.g. local geography)
 Trust - subjective assessment of likelihood of content being false
 A credible news report might still be false!
Knowledge-based Approach to Trust and Credibility Modelling
Approach
 Our Approach - Knowledge-based Trust Modelling
 Journalists already have personal sets of trusted sources they have come to rely upon
 Knowledge-based approach
 Journalists assert a priori known facts (e.g. trusted sources, known locations)
 Evidence from streams asserted incrementally into a triple store (i.e. uSeekM + OWLIM)
 Simple inference » classify evidence » interactive exploration of evidence with journalist
 OWL classes & individuals, owl:Restriction, owl:intersectionOf, SPARQL, GeoSPARQL ...
 Not a black box - End users explore the evidence and we help them make a verification decision
 Scalable approach able to represent different viewpoints of Journalists
 State of the Art - Trust and Credibility Modelling
 Unsupervised learning (e.g. Bayesian Network, Dempster-Shafer) » trust prediction without explanation
 Supervised reputation models » trust prediction with explanation
 Heuristics & activity metrics » trust prediction with explanation
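The assert-then-infer loop described above can be illustrated without a real triple store. In this sketch a Python set of (subject, predicate, object) tuples stands in for the store, and a hand-written rule stands in for OWL inference over an intersection of two restrictions. The predicate names and the classes `TrustedSourceEvidence` and `TrustedGroundedEvidence` are hypothetical, not the REVEAL ontology.

```python
def objects(store, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in store if s == subject and p == predicate}


def has_type(store, subject, cls):
    return (subject, "type", cls) in store


def assert_evidence(store, evidence_id, author, location):
    """Incrementally assert one piece of evidence from a stream."""
    store.add((evidence_id, "type", "Evidence"))
    store.add((evidence_id, "hasAuthor", author))
    store.add((evidence_id, "mentions", location))


def classify(store):
    """Infer class membership for evidence; returns the new triples."""
    inferred = set()
    evidence = [s for s, p, o in store if p == "type" and o == "Evidence"]
    for ev in evidence:
        from_trusted = any(
            has_type(store, a, "TrustedSource") for a in objects(store, ev, "hasAuthor")
        )
        at_known = any(
            has_type(store, loc, "KnownLocation") for loc in objects(store, ev, "mentions")
        )
        if from_trusted:
            inferred.add((ev, "type", "TrustedSourceEvidence"))
        if from_trusted and at_known:  # intersection of the two conditions
            inferred.add((ev, "type", "TrustedGroundedEvidence"))
    return inferred


# A-priori facts asserted by a (hypothetical) journalist.
store = {
    ("BBCNews", "type", "TrustedSource"),
    ("Battery Park", "type", "KnownLocation"),
}

assert_evidence(store, "tweet1", "BBCNews", "Battery Park")
assert_evidence(store, "tweet2", "unknown_user", "Battery Park")

inferred = classify(store)
print(sorted(inferred))
```

Because every inferred class traces back to explicit facts in the store, a journalist can see why an item was classified as it was, and a second journalist with a different set of trusted sources simply starts from different a-priori triples.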
Knowledge-based Approach to Trust and Credibility Modelling
Early Results - Work in Progress
Roadmap for REVEAL Trust and Credibility Analysis
 REVEAL project runs until Sept 2016
 Ethnographic studies with Journalists
 Crawl content in parallel to journalists searching User Generated Content (UGC)
 Record ground truth by observing Journalists verifying news for real & explaining decisions
 Empirical analysis - compare automated decisions with ground truth
 MediaEval 2015 verification challenge
 Competition verifying images using a common Twitter dataset (10 different news events)
 User trials 2015 - 2016
Future Work
Any questions?
Stuart E. Middleton
University of Southampton IT Innovation Centre
email: sem@it-innovation.soton.ac.uk
web: www.it-innovation.soton.ac.uk
twitter:@stuart_e_middle, @IT_Innov, @RevealEU
Many thanks for your attention!
