Authors: María Herrero-Zazo, Isabel Segura-Bedmar, Paloma Martínez
CILC 2013, 5th International Conference on Corpus Linguistics, Alicante, Spain (March 16, 2013)
The DDI (Drug-Drug Interaction) Corpus
2. MultiMedica Project
• A coordinated research project supported by
the Spanish Government under the Plan
Nacional de I+D.
• Participating groups:
– GSI (Universidad Politécnica de Madrid)
– LLI (Universidad Autónoma de Madrid)
– LaBDA (Universidad Carlos III de Madrid)
• Goal: to define and develop information extraction
and retrieval techniques based on texts from
the medical domain.
– Processing informative texts about health
topics in different languages: Spanish, Arabic
and Japanese.
– Processing scientific documents in English
about pharmacology.
http://labda.inf.uc3m.es/multimedica/index.html
4. What is a Drug-Drug Interaction
• A DDI occurs when a drug influences
the level or the activity of another
drug.
• A DDI can be beneficial, but DDIs are
most often dangerous for patients.
• DDIs increase healthcare costs.
5. How do healthcare
professionals avoid DDIs?
• Different data sources exist for the study of
DDIs: compendia, databases, DDI
checkers...
• But they are not comprehensive:
↑ polytherapy → ↑ DDI incidence
• Many interactions are reported only in
medical journals or in drug safety
reports.
6. How does Information
Extraction help?
A possible interaction resulting in
acute renal failure has been reported
in a few subjects when indomethacin
was given with triamterene.
DDI (INDOMETHACIN, TRIAMTERENE)
11. The DDI corpus:
main contributions
• A total of 1,025 annotated documents,
18,502 entities and 5,028 DDIs.
• Two different types of texts:
– MedLine abstracts
– DrugBank
• Annotation Guidelines and Inter-Annotator
Agreement.
• A comprehensive classification of drugs
and DDIs.
12. The DDI corpus: entities
drug
Ibuprofen enhanced the toxicity of
methotrexate.
brand
Espidifen enhanced the toxicity of
methotrexate.
group
Analgesics are used in the
treatment of pain.
drug_n
Picrotoxin is a substance used as a
research tool.
13. The DDI corpus: ddis
mechanism
Lansoprazole may decrease the absorption
of enoxacin.
effect
Additive CNS depression may occur when
antihistamines are administered with
barbiturates.
advice
Patients taking isoniazid and disulfiram
concomitantly should be closely monitored.
int
Clopidogrel interacts with omeprazole.
14. The DDI corpus:
Annotation of DDIs
- Binary relationship annotated at the sentence
level.
- Each DDI carries exactly one type attribute
(effect, advice, mechanism or int).
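The annotation scheme on slides 12 to 14 can be pictured as a small data model. The following Python sketch is only an illustration: the class and field names (`Entity`, `DDI`, `Sentence`, `add_ddi`) are hypothetical and are not taken from the corpus distribution; only the four entity types and the four DDI types come from the slides.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    DRUG = "drug"
    BRAND = "brand"
    GROUP = "group"
    DRUG_N = "drug_n"


class DDIType(Enum):
    MECHANISM = "mechanism"
    EFFECT = "effect"
    ADVICE = "advice"
    INT = "int"


@dataclass(frozen=True)
class Entity:
    text: str
    type: EntityType


@dataclass(frozen=True)
class DDI:
    # A DDI is a binary relation: exactly two entities and exactly one type.
    first: Entity
    second: Entity
    type: DDIType


@dataclass
class Sentence:
    text: str
    entities: list = field(default_factory=list)
    ddis: list = field(default_factory=list)

    def add_ddi(self, first, second, ddi_type):
        # Relations are annotated at the sentence level, so both
        # arguments must be entities of this sentence.
        if first not in self.entities or second not in self.entities:
            raise ValueError("both entities must belong to the sentence")
        ddi = DDI(first, second, ddi_type)
        self.ddis.append(ddi)
        return ddi


# Hypothetical usage with the "mechanism" example from slide 13.
sentence = Sentence("Lansoprazole may decrease the absorption of enoxacin.")
lansoprazole = Entity("Lansoprazole", EntityType.DRUG)
enoxacin = Entity("enoxacin", EntityType.DRUG)
sentence.entities += [lansoprazole, enoxacin]
ddi = sentence.add_ddi(lansoprazole, enoxacin, DDIType.MECHANISM)
print(ddi.type.value)  # mechanism
```

Keeping the type on the relation rather than on the entities mirrors the slide: the same pair of drugs could in principle be described as a mechanism in one sentence and as advice in another.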
16. Main sources of Annotation
Problems
• Tokenization errors.
• Complexity of biomedical named
entities.
• Complexity of the biomedical texts.
• Lack of standard or reference works in
the specific domain.
17. Annotation Problems
What terms should be annotated?
Synonyms and term variants
– Different Nomenclatures
– Abbreviations
– Multi-word terms
– Nested terms
– Discontinuous names
18. Annotation Problems
• Different nomenclatures:
Acetaminophen may increase
the anticoagulant effect of
acenocoumarol.
Paracetamol may increase the
anticoagulant effect of
acenocoumarol.
20. Annotation Problems
• Multi-word terms:
The administration of an
analgesic agent can reduce
effects of these drugs.
The administration of an
analgesic can reduce effects
of these drugs.
21. Annotation Problems
• Nested terms:
The concomitant use of
allopurinol and
thiazide diuretics may
contribute to the enhancement
of allopurinol toxicity.
thiazide diuretics
thiazide
diuretics
22. Annotation Problems
• Discontinuous names:
[…] can reduce the effects of
loop, potassium-sparing and
thiazide diuretics.
loop diuretics
potassium-sparing diuretics
thiazide diuretics
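One way to handle a discontinuous name is to store it as several character ranges instead of one. The sketch below assumes half-open `(start, end)` offsets into the sentence string; the helpers `mention_text` and `find_span` and this offset convention are illustrative assumptions, not a description of the corpus format.

```python
def mention_text(sentence, spans):
    """Join the fragments of a possibly discontinuous mention.

    `spans` is a list of half-open (start, end) character ranges.
    """
    return " ".join(sentence[start:end] for start, end in spans)


def find_span(sentence, fragment):
    """Return the (start, end) range of the first occurrence of `fragment`."""
    start = sentence.index(fragment)
    return (start, start + len(fragment))


sentence = ("can reduce the effects of loop, potassium-sparing "
            "and thiazide diuretics.")
head = find_span(sentence, "diuretics")

# Each drug group shares the head noun "diuretics".
mentions = [
    [find_span(sentence, modifier), head]
    for modifier in ("loop", "potassium-sparing", "thiazide")
]
names = [mention_text(sentence, spans) for spans in mentions]
print(names)
# ['loop diuretics', 'potassium-sparing diuretics', 'thiazide diuretics']
```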
23. Annotation Problems
• Ambiguity: drug names can have different
meanings
‘Insulin’ is a drug:
Therefore, in patients taking insulin,
regular monitoring of blood glucose is
recommended.
‘Insulin’ is an endogenous substance:
There is no evidence that EPA
supplements have detrimental effects
on glucose tolerance, insulin
secretion or insulin resistance in
non-diabetic subjects.
24. Annotation Problems
Concomitant aspirin may decrease the
metabolic clearance of nicotinic acid.
Two drugs and one DDI: one interacting
pair.
25. Annotation Problems
Multiple mentions of the same drug and only
one DDI: one interacting pair.
The concomitant use of
nitrofurantoin is not
recommended since
nitrofurantoin may
antagonize the effect of
norfloxacin.
26. Annotation Problems
Multiple mentions of the same drug and only one
DDI: one interacting pair.
The concomitant use of
nitrofurantoin and
norfloxacin is not
recommended since
nitrofurantoin may
antagonize the effect of
norfloxacin.
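The two slides above suggest why relation annotation is naturally framed over pairs of mentions: a sentence with several mentions yields several candidate pairs, but only one is annotated as interacting. A minimal sketch, assuming mentions are identified by their position in the sentence; the function name `candidate_pairs` and the rule of setting aside same-drug pairs are assumptions made for illustration, not the authors' procedure.

```python
from itertools import combinations


def candidate_pairs(mentions):
    """Return every unordered pair of mention positions in one sentence."""
    return list(combinations(range(len(mentions)), 2))


# Drug mentions in the slide 26 example, in order of appearance.
mentions = ["nitrofurantoin", "norfloxacin", "nitrofurantoin", "norfloxacin"]

pairs = candidate_pairs(mentions)
# Assumption of this sketch: pairs linking two mentions of the same
# drug are set aside before looking for the interacting pair.
cross_drug = [(i, j) for i, j in pairs if mentions[i] != mentions[j]]

print(len(pairs))       # 6
print(len(cross_drug))  # 4
```

Of the four remaining candidates, the slide marks a single interacting pair, so an annotation guideline has to say which mentions carry the relation.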
27. What have we done?
Annotation Guidelines:
– Clear and accurate definitions of
entities and relationships.
– Rules and conventions (about how
the annotation task should be carried
out)
– Examples clarifying their use.
http://www.cs.york.ac.uk/semeval-2013/task9/
29. Conclusions
• The DDI corpus:
– The most richly semantically annotated
resource for pharmacological text
processing built to date.
– Corpus and annotation guidelines are
publicly available.
– Can encourage the NLP community to
develop automatic tools for DDI
extraction.
30. Conclusions
– Challenge DDIExtraction2013
– Part of the 7th International Workshop
SemEval 2013.
http://www.cs.york.ac.uk/semeval-2013/task9/
– Co-located with the NAACL HLT 2013 conference
(Atlanta, June 9-14)
http://naacl2013.naacl.org/
– 15 participating teams.
– Each system is trained and tested on the DDI
corpus.
– Participants will be ranked using the F1
measure.
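For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives and false negatives; the function name `f1_score` is chosen here for illustration, and the counts in the usage line are made-up numbers, not results from the challenge.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up counts for illustration only.
print(round(f1_score(tp=60, fp=20, fn=40), 3))  # 0.667
```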
31. Future work
• To enrich the current version of the DDI corpus:
– Annotation of other important aspects of drug
information, for instance, adverse effects.
– Annotation of linguistic phenomena required for a
better understanding of the text: negation,
modality and anaphora.
• To represent the acquired knowledge (concepts,
attributes and relationships) in an ontology.
#7: For this reason, we think that Information Extraction can help to improve the early detection of drug interactions
The final goal of our method is to identify the drugs (indomethacin and triamterene) and to detect the interaction between them.