The document discusses the PHEME project, which focuses on the veracity (truthfulness) of information on social media as the "fourth challenge of big data", beyond volume, velocity, and variety. The project aims to support the analysis of rumours and their trustworthiness by developing techniques for rumour detection and classification, modelling how claims spread across networks over time, and building visualisation tools. The intended application areas are journalism, through open-source social intelligence tools, and healthcare, through help in evaluating unreliable health-related claims that patients may encounter online.
2. PHEME http://www.pheme.eu
Phemes & social media
Memes are thematic motifs that spread through social media in ways analogous to genetic traits
We coined the term phemes to add truthfulness and deception to the mix
http://en.wikipedia.org/wiki/Pheme
PHEME focuses on a fourth crucial, but hitherto largely
unstudied, challenge: Veracity
3. PHEME http://www.pheme.eu
Rumour analysis: The Problem
Now mostly manual
Rumours are challenging
Some rumours could take hours, days, weeks or even months to die out
Ill-meaning humans can currently outsmart computers (and humans)
and appear genuine
4. PHEME http://www.pheme.eu
Rumour analysis: The Problem
Mike Brown shot by police in Ferguson
We have different rumours emerging from the topic
We don't know if they are true.
We see the spikes and sometimes they come back
(different temporal dynamics)
We need to understand the overall conversation to see the
different points of view and how the rumours go forward
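The "spikes that sometimes come back" can be illustrated with a minimal sketch. It assumes we already have hourly post counts for one rumour (the numbers below are made up) and flags an hour as a spike when its count exceeds the trailing-window mean by k standard deviations. The function name, window size, and threshold are illustrative assumptions, not the project's method.

```python
from statistics import mean, pstdev


def find_spikes(counts, window=6, k=2.0):
    """Return indices whose count exceeds the trailing-window mean by k std devs."""
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]          # the `window` hours before hour i
        threshold = mean(history) + k * pstdev(history)
        if counts[i] > threshold:
            spikes.append(i)
    return spikes


# Hypothetical hourly post counts: an initial burst, a lull, then a resurgence.
hourly_counts = [3, 4, 2, 3, 4, 3, 40, 5, 4, 3, 2, 3, 4, 3, 2, 30, 4]
print(find_spikes(hourly_counts))  # [6, 15]
```

A trailing window is used so that a rumour returning after a quiet period is flagged again, rather than being compared against its own earlier peak.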
7. PHEME http://www.pheme.eu
From manual to automatic
We are investigating...
Ontologies for modelling phemes
Use a priori knowledge (LOD) and reasoning to
detect contradictions
Model phemes spread across media, social
networks, and time
Conversational analysis
Real-time rumour classification
Pheme visualisation to support veracity checking:
media maps, impact maps, geographical maps
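As a rough illustration of using a priori knowledge to detect contradictions, the sketch below checks a claim against a tiny in-memory fact table. It is a toy stand-in for a Linked Open Data store with reasoning: the fact table, the predicate names, and the assumption that each predicate is single-valued are all hypothetical.

```python
# Toy knowledge base mapping (subject, predicate) -> object.
# Entries and predicate names are illustrative, not taken from a real LOD dataset.
KNOWN_FACTS = {
    ("Ferguson", "locatedIn"): "Missouri",
    ("Paris", "capitalOf"): "France",
}


def check_claim(subject, predicate, obj, facts=KNOWN_FACTS):
    """Compare a claimed triple with the knowledge base.

    Assumes each predicate is single-valued, so a different known object
    for the same subject and predicate counts as a contradiction.
    """
    known = facts.get((subject, predicate))
    if known is None:
        return "unknown"          # no prior knowledge, so the claim cannot be checked
    return "consistent" if known == obj else "contradiction"


print(check_claim("Ferguson", "locatedIn", "Missouri"))
print(check_claim("Ferguson", "locatedIn", "Texas"))
print(check_claim("Ferguson", "population", "21000"))
```

The "unknown" outcome matters in practice: most claims in an emerging rumour have no matching prior knowledge, which is why the list above also includes conversational analysis and classification.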
8. PHEME http://www.pheme.eu
[Figure: PHEME Veracity Intelligence Framework]
Components: Cross-Media Content Linking, Spatio-Temporal Grounding; Multilingual LOD-Based IE and Opinion Mining; Rumour Detection and Veracity Classification
Knowledge and data: Linked Open Data; Rumour Ontologies & Reasoning (GraphDB); Historical Data Archive
Social Context Models: Trust, Authority, Implicit Networks
PHEME Visual Analytics Dashboard
Use cases: Veracity Intelligence in Patient Care; Digital Journalism
Technology Outcome: Open Source Computational Framework
Other labels: PatientsLikeMe
9. PHEME http://www.pheme.eu
[Figure: PHEME Big Data Architecture for veracity analysis]
Data Value Chain: Data Collection, Rumour Classification, Curation, Usage
IT Value Chain (IT Big Data Layer): Physical Infrastructure and Virtualization; Storage Infrastructure; Processing (Stream Processing, Batch Processing); Knowledge Base; Messaging/Comms; Resource Management
Veracity and Language Value Chain: System Workflow Orchestration
LT Processing & Analytics: Lang Detection, Event Detection, NLP Processing, Annotation & Training, Cross-media linking, Cross-lingual analysis
Data stores: Raw data Repository; OntoText GraphDB
Inputs: Multilingual Data from Social Media
End Users: Pheme Dashboard, Journalist Dashboard
10. PHEME http://www.pheme.eu
Application areas
Open-source social intelligence tools for data journalism
Involves journalists from SwissInfo.ch, the Guardian, New York Times, and other media
Improving healthcare
What health-related rumours are discussed in patient-clinician consultations
Preventative medical advice, e.g. warn patients not to trust certain rumours when researching their disease online
13. PHEME http://www.pheme.eu
Acknowledgement
The PHEME research project has received funding from the
European Union's Seventh Framework Programme for research,
technological development and demonstration under grant
agreement No. 611233.
This document does not represent the opinion of the European Community, and the European Community is not responsible for any use that might be made of its content.
Thanks!