The document describes a test collection for research on depression and language use created by analyzing Reddit posts. It includes a depression group and control group extracted from Reddit histories using pattern matching. The collection aims to enable early prediction of depression by sequentially processing language evidence. Several baseline methods are evaluated, with the dynamic approach achieving the best balance between predictive accuracy and early detection.
A Test Collection for Research on Depression and Language Use
1. David E. Losada
Fabio Crestani
A Test Collection for Research on
Depression and Language Use
CLEF 2016, Évora (Portugal)
9. linguistic markers
use of personal pronouns
statistical properties of text
topic models
psychometrics
content vs style
social words
verb tense positive/negative emotions
psychological processes
cognitive processes
10. Lack of data on depression & language
few collections available
focus on 2-class categorisation
no temporal dimension, no early risk analysis
11. little context about the tweet writer
difficult to assess whether a mention of
depression is genuine
no way to extract a long history of
tweets (e.g. several years)
14. A Thin Line
no way to extract any history
short messages, little context
17. large history for each redditor (several years)
many subreddits (communities) about different
medical conditions (e.g. depression or
anorexia)
long messages
terms & conditions allow use
for research purposes
20. depression group vs control group
"I am depressed" / "I think I have depression"
Adopted extraction method from
Coppersmith et al. 2014:
pattern matching search
search for explicit mentions of diagnosis
(e.g. I was diagnosed with depression)
manual inspection of the results
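The pattern-matching step above can be sketched as follows. This is a minimal illustration only: the regular expression and the function name `find_candidates` are assumptions, since the slides do not give the actual patterns used to build the collection. Matches are candidates for the manual inspection step, not final labels.

```python
import re

# Hypothetical pattern for explicit self-reports of a diagnosis.
# The real patterns used for the collection are not shown in the slides.
DIAGNOSIS_PATTERN = re.compile(
    r"\bI\s+(?:was|have\s+been|got)\s+(?:\w+\s+)?diagnosed\s+with\s+(?:\w+\s+)?depression\b",
    re.IGNORECASE,
)


def find_candidates(posts):
    """Return the posts that contain an explicit mention of a diagnosis.

    The result is only a candidate list; each hit still needs manual review.
    """
    return [post for post in posts if DIAGNOSIS_PATTERN.search(post)]


posts = [
    "I was diagnosed with depression two years ago.",
    "I am depressed",
    "I think I have depression",
    "My wife has depression",
]
candidates = find_candidates(posts)
print(candidates)
```

Only the first post matches this pattern; the other three mention depression without an explicit statement of diagnosis.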
21. depression group vs control group
large set of random redditors
from a wide range of subreddits
(news, media, ...)
also included some false positives
from the depression subreddit
(e.g. "My wife has depression", "I am a
student interested in depression")
22. retrieved all history
from any subreddit
his/her posts +
his/her comments to other posts
often several years of text
removed the post/comment with
the explicit mention of the
diagnosis (depression group)
redditor profile
23. pre- & post-diagnosis text
organised the writings in
chronological order
XML archives
redditor profile
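A minimal sketch of how one redditor's writings could be sorted chronologically and stored as an XML archive. The element names (`INDIVIDUAL`, `ID`, `WRITING`, `DATE`, `TEXT`) and the function `build_archive` are assumptions made for illustration; the slides do not show the collection's actual schema.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def build_archive(subject_id, writings):
    """Serialise (datetime, text) pairs as XML, oldest writing first.

    Element names are hypothetical, not the collection's real schema.
    """
    root = ET.Element("INDIVIDUAL")
    ET.SubElement(root, "ID").text = subject_id
    for when, text in sorted(writings, key=lambda item: item[0]):
        writing = ET.SubElement(root, "WRITING")
        ET.SubElement(writing, "DATE").text = when.isoformat()
        ET.SubElement(writing, "TEXT").text = text
    return ET.tostring(root, encoding="unicode")


xml_text = build_archive(
    "subject1",
    [
        (datetime(2013, 2, 15), "second writing"),
        (datetime(2013, 2, 13), "first writing"),
    ],
)
print(xml_text)
```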
25. early prediction task
detect early traces of depression
for each subject, sequentially process
pieces of evidence...
32. early prediction task
detect early traces of depression
for each subject, sequentially process
pieces of evidence...
[Figure: John Doe's writings (posts or comments) in chronological order: 2/13/13, 2/15/13, 3/1/13, ..., 12/9/16]
tradeoff: early decision vs more informed decision
when should I fire an alarm?
33. early prediction task: performance metric
After seeing k texts a system makes a binary
decision d about John Doe:
d=1 => possible risk of depression
d=0 => non-risk case
34. early prediction task: performance metric
After seeing k texts a system makes a binary
decision d about John Doe:
[Figure: John Doe's writings (posts or comments) in chronological order, numbered (1) 2/13/13, (2) 2/15/13, ..., (k) 3/10/14, followed by the decision (d)]
d=1 => possible risk of depression
d=0 => non-risk case
43. Test
352 54
first n
[Figure: the depression language classifier reads the first n writings (1, 2, ..., n; dated 2/13/13, 2/15/13, 3/1/13, ...) and then outputs a decision]
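The "first n" baseline can be sketched roughly as below. The function `first_n_decision` and the keyword-based `toy_classify` are hypothetical stand-ins; the slides only say that a depression language classifier sees the first n writings and then emits a decision.

```python
def first_n_decision(writings, n, classify):
    """Decide using only the first n writings.

    Returns (decision, k), where k is the number of writings actually seen.
    `classify` is any callable mapping text to 0 or 1 (an assumption here).
    """
    seen = writings[:n]
    decision = classify(" ".join(seen))
    return decision, len(seen)


def toy_classify(text):
    """Toy stand-in for a trained classifier: flags one hypothetical cue word."""
    return 1 if "hopeless" in text.lower() else 0


writings = ["nice day today", "I feel hopeless", "watched a film"]
early = first_n_decision(writings, 1, toy_classify)
later = first_n_decision(writings, 2, toy_classify)
print(early, later)
```

With a small n the decision is early but can miss evidence that appears later; with a large n every decision is delayed, which is the tradeoff the task is built around.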
44. Test
352 54
dynamic
[Figure: the depression language classifier reads writings 1, 2, ..., n one at a time (2/13/13, 2/15/13, 3/1/13, ...)]
confident about risk? yes: we finish and predict 1 (risk case)
45. Test
352 54
dynamic
[Figure: the depression language classifier reads writings 1, 2, ..., n one at a time (2/13/13, 2/15/13, 3/1/13, ...)]
confident about risk? no: we wait and see more evidence...
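A minimal sketch of the dynamic strategy: after each new writing the classifier is asked how confident it is, and processing stops with a risk decision as soon as the confidence reaches a threshold. The names `dynamic_decision` and `toy_score`, the threshold value, and the choice to predict 0 when the writings run out are all assumptions; the slides do not specify them.

```python
def dynamic_decision(writings, score, threshold=0.9):
    """Process writings one by one and stop as soon as the score is confident.

    `score` maps the text seen so far to a confidence in [0, 1].
    Returns (decision, k), where k is the number of writings seen.
    """
    seen = []
    for k, text in enumerate(writings, start=1):
        seen.append(text)
        if score(" ".join(seen)) >= threshold:
            return 1, k  # confident about risk: finish and predict 1
        # not confident: wait and see more evidence
    # Assumption: with no more evidence left, fall back to the non-risk decision.
    return 0, len(writings)


def toy_score(text):
    """Toy confidence: saturates after two hypothetical cue words."""
    cues = {"hopeless", "worthless"}
    hits = sum(word.strip(".,!?") in cues for word in text.lower().split())
    return min(1.0, hits / 2)


risk_case = dynamic_decision(
    ["nice day", "I feel hopeless", "hopeless and worthless again"], toy_score
)
non_risk_case = dynamic_decision(["nice day", "ok"], toy_score)
print(risk_case, non_risk_case)
```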
48. results
random/minority: poor F1 & ERDE
first n: good F1 but slow at detecting risk cases
dynamic: best balance between correctness & time
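The slides that define ERDE are not part of this transcript. The sketch below assumes the early risk detection error as defined in the accompanying CLEF 2016 paper: false positives and false negatives incur fixed costs, true negatives cost nothing, and a true positive is penalised by a latency factor that grows with the number k of writings seen before deciding. The default cost values here are illustrative placeholders, not the values used in the reported experiments.

```python
import math


def latency_cost(k, o):
    """lc_o(k) = 1 - 1 / (1 + e^(k - o)), computed in an overflow-safe way."""
    x = k - o
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def erde(decision, truth, k, o, c_fp=0.1, c_fn=1.0, c_tp=1.0):
    """Early risk detection error for one subject (cost defaults are assumptions)."""
    if decision == 1 and truth == 0:
        return c_fp  # false positive
    if decision == 0 and truth == 1:
        return c_fn  # false negative
    if decision == 1 and truth == 1:
        return latency_cost(k, o) * c_tp  # late true positives cost more
    return 0.0  # true negative


# A correct alarm is cheap when raised early and expensive when raised late.
early_alarm = erde(decision=1, truth=1, k=2, o=5)
late_alarm = erde(decision=1, truth=1, k=50, o=5)
print(early_alarm, late_alarm)
```

Under this definition a system that always waits for all the evidence can reach a good F1 and still score a poor ERDE, which matches the contrast between "first n" and "dynamic" in the results.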
49. conclusions
new collection on depression & language
early risk detection algorithms (preliminary baselines)
methodology for benchmark construction
temporal dimension
50. David E. Losada
Fabio Crestani
A Test Collection for Research on Depression and
Language Use
This research was funded by the Swiss
National Science Foundation
(project Early risk prediction on the Internet:
an evaluation corpus, 2015)
We also thank the
Ministerio de Economía y Competitividad
of the Government of Spain &
FEDER Funds (ref. TIN2015-64282-R)
51. Acknowledgements:
Ehnero. picture pg 1.CC BY NC 2.0.
Gerald Gabernig. picture pg 2.CC BY 2.0.
ankxt. picture pg 3.CC BY 2.0.
NEC Corporation of America. picture pg 4.CC BY 2.0.
Jordi Borràs i Vivó. picture pgs 5-6 .CC BY NC ND 2.0.
Helen Harrop. picture pg 7.CC BY SA 2.0.
Nilufer Gadgieva. picture pg 8.CC BY NC 2.0.
Alix May. picture pg 9.CC BY NC 2.0.
Justin Lincoln. picture pg 10.CC BY SA 2.0.
Grace McDunnough. picture pgs 11-18 (top).CC BY NC ND 2.0.
Andy Kennelly. picture pgs 19-21.CC BY NC 2.0.
Joel Olives. picture pgs 22-23 (left).CC BY 2.0.
Tim Morgan. picture pg 23 (right).CC BY 2.0.
Conor Lawless. picture pg 24.CC BY 2.0.
Oscar Rethwill. picture pgs 25-32.CC BY 2.0.
Emily. picture pgs 33-37.CC BY NC 2.0.
Tiberiu Ana. picture pg 38.CC BY 2.0.
woodleywonderworks. picture pg 39 (left), 40 (left).CC BY 2.0.
Niko Kaiser. picture pg 39 (right), 41-47.CC BY 2.0.
John Sheets. picture pg 48.CC BY NC 2.0.
Anders Sandberg. picture pg 49.CC BY NC 2.0.
See-ming Lee. picture pg 51.CC BY NC 2.0.