The DDI (Drug-Drug Interaction) Corpus
MultiMedica Project
• A coordinated research project supported by the Spanish Government under the Plan Nacional de I+D.
• Participating groups:
– GSI (Universidad Politécnica de Madrid)
– LLI (Universidad Autónoma de Madrid)
– LaBDA (Universidad Carlos III de Madrid)
• Goal: to define and develop information extraction and retrieval techniques based on texts from the medical domain.
– Processing informative texts about health topics in different languages: Spanish, Arabic and Japanese.
– Processing scientific documents in English about pharmacology.
http://labda.inf.uc3m.es/multimedica/index.html
What is a Drug-Drug Interaction?
• A DDI occurs when one drug influences the level or the activity of another drug.
• A DDI can be beneficial, but most DDIs are dangerous for patients.
• DDIs increase healthcare costs.
How do healthcare professionals avoid DDIs?
• Different data sources are available for the study of DDIs: compendia, databases, DDI checkers...
• But they are not comprehensive: ↑ polytherapy → ↑ DDI incidence
• Many interactions are reported only in medical journals or in drug safety reports.
How does Information Extraction help?
A possible interaction resulting in acute renal failure has been reported in a few subjects when indomethacin was given with triamterene.
DDI(INDOMETHACIN, TRIAMTERENE)
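The step from the sentence to DDI(INDOMETHACIN, TRIAMTERENE) can be sketched roughly as follows. This is only an illustration and not the authors' method: the toy lexicon `DRUG_LEXICON`, the `DDI` record and both helper functions are assumptions introduced here, and co-occurrence in a sentence only yields candidate pairs, which a real system would still have to classify as interacting or not.

```python
import re
from dataclasses import dataclass
from itertools import combinations

# Hypothetical toy lexicon; a real system would use a trained drug recognizer.
DRUG_LEXICON = {"indomethacin", "triamterene"}


@dataclass(frozen=True)
class DDI:
    """A candidate interaction between two drugs (illustrative structure)."""
    drug_a: str
    drug_b: str


def find_drugs(sentence: str) -> list[str]:
    """Return lexicon drugs in order of first appearance, case-insensitively."""
    found: list[str] = []
    for token in re.findall(r"[A-Za-z][A-Za-z0-9-]*", sentence):
        name = token.lower()
        if name in DRUG_LEXICON and name not in found:
            found.append(name)
    return found


def candidate_ddis(sentence: str) -> list[DDI]:
    """Pair every two distinct drugs found in the same sentence."""
    return [DDI(a.upper(), b.upper())
            for a, b in combinations(find_drugs(sentence), 2)]


sentence = ("A possible interaction resulting in acute renal failure has been "
            "reported in a few subjects when indomethacin was given with "
            "triamterene.")
print(candidate_ddis(sentence))
```

Pairing by co-occurrence over-generates on purpose: it lists every pair a classifier would need to examine.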
The DDI corpus for SemEval task 9
Related work
The DDI corpus: main contributions
• A total of 1,025 annotated documents, 18,502 entities and 5,028 DDIs.
• Two different types of texts: MedLine abstracts and DrugBank texts.
• Annotation guidelines and inter-annotator agreement.
• A comprehensive classification of drugs and DDIs.
The DDI corpus: entities
– drug: Ibuprofen enhanced the toxicity of methotrexate.
– brand: Espidifen enhanced the toxicity of methotrexate.
– group: Analgesics are used in the treatment of pain.
– drug_n: Picrotoxin is a substance used as a research tool.
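One way to picture the four entity types is as typed spans over the sentence text. The sketch below is an assumption made for illustration: the `Entity` record, the `annotate` helper and the half-open character offsets are not taken from the corpus release, and the comments on each type only repeat the slide's examples.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    DRUG = "drug"      # e.g. ibuprofen, methotrexate
    BRAND = "brand"    # e.g. Espidifen
    GROUP = "group"    # e.g. analgesics
    DRUG_N = "drug_n"  # e.g. picrotoxin (a substance used as a research tool)


@dataclass(frozen=True)
class Entity:
    text: str
    type: EntityType
    start: int  # inclusive character offset (illustrative convention)
    end: int    # exclusive character offset


def annotate(sentence: str, mention: str, etype: EntityType) -> Entity:
    """Locate the first occurrence of `mention` and wrap it as an Entity."""
    start = sentence.index(mention)  # raises ValueError if the mention is absent
    return Entity(mention, etype, start, start + len(mention))


sentence = "Espidifen enhanced the toxicity of methotrexate."
entities = [
    annotate(sentence, "Espidifen", EntityType.BRAND),
    annotate(sentence, "methotrexate", EntityType.DRUG),
]
for entity in entities:
    # Each stored span must reproduce the mention exactly.
    assert sentence[entity.start:entity.end] == entity.text
    print(entity.type.value, entity.text, entity.start, entity.end)
```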
The DDI corpus: DDIs
– mechanism: Lansoprazole may decrease the absorption of enoxacin.
– effect: Additive CNS depression may occur when antihistamines are administered with barbiturates.
– advice: Patients taking isoniazid and disulfiram concomitantly should be closely monitored.
– int: Clopidogrel interacts with omeprazole.
The DDI corpus: Annotation of DDIs
- Each DDI is a binary relationship annotated at the sentence level.
- Each DDI carries exactly one type attribute: effect, advice, mechanism or int.
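Because a DDI is a binary relation inside a single sentence, a sentence with n drug mentions gives n(n-1)/2 candidate pairs, and each interacting pair receives exactly one type. A minimal sketch, assuming a dictionary `gold` as a hypothetical stand-in for the annotations (all names here are illustrative, not the corpus format):

```python
from enum import Enum
from itertools import combinations
from typing import Optional


class DDIType(Enum):
    MECHANISM = "mechanism"
    EFFECT = "effect"
    ADVICE = "advice"
    INT = "int"


def candidate_pairs(entities: list[str]) -> list[tuple[str, str]]:
    """All unordered pairs of entities from one sentence: n * (n - 1) / 2."""
    return list(combinations(entities, 2))


# Hypothetical gold annotation for the slide's "effect" example sentence:
# "Additive CNS depression may occur when antihistamines are administered
# with barbiturates."
gold: dict[tuple[str, str], DDIType] = {
    ("antihistamines", "barbiturates"): DDIType.EFFECT,
}


def label(pair: tuple[str, str]) -> Optional[DDIType]:
    """Return the single DDI type of a pair, or None if it does not interact."""
    return gold.get(pair)


for pair in candidate_pairs(["antihistamines", "barbiturates"]):
    print(pair, label(pair))
```

Storing one value per pair makes the "one attribute type" rule structural: a pair cannot hold two types at once.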
Main sources of Annotation Problems
• Tokenization errors.
• Complexity of biomedical named entities.
• Complexity of the biomedical texts.
• Lack of standards or reference works in the specific domain.
Annotation Problems
What terms should be annotated? Synonyms and term variants:
– Different nomenclatures
– Abbreviations
– Multi-word terms
– Nested terms
– Discontinuous names
Annotation Problems
• Different nomenclatures:
Acetaminophen may increase the anticoagulant effect of acenocoumarol.
Paracetamol may increase the anticoagulant effect of acenocoumarol.
Annotation Problems
• Abbreviations:
Risk of 5-FU toxicity when associated with metronidazole.
5-FU = Fluorouracil
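Both name variants (acetaminophen and paracetamol) and abbreviations (5-FU) can be handled by mapping every surface form to one canonical name before comparing mentions. The table below is a hypothetical two-entry example built only from the slides' cases; a real table would come from a curated resource.

```python
# Hypothetical synonym table: surface form (lower-cased) -> canonical name.
CANONICAL = {
    "paracetamol": "acetaminophen",
    "5-fu": "fluorouracil",
}


def normalize(name: str) -> str:
    """Map a drug mention to its canonical lower-case name if one is known."""
    key = name.strip().lower()
    return CANONICAL.get(key, key)


print(normalize("Paracetamol"))  # same canonical form as "Acetaminophen"
print(normalize("5-FU"))
```

Unknown names fall through unchanged (lower-cased), so the function is safe to apply to every mention.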
Annotation Problems
• Multi-word terms:
The administration of an analgesic agent can reduce effects of these drugs.
The administration of an analgesic can reduce effects of these drugs.
Annotation Problems
• Nested terms:
The concomitant use of allopurinol and thiazide diuretics may contribute to the enhancement of allopurinol toxicity.
Nested candidates: "thiazide diuretics" contains "thiazide" and "diuretics".
Annotation Problems
• Discontinuous names:
[…] can reduce the effects of loop, potassium-sparing and thiazide diuretics.
Entities: loop diuretics, potassium-sparing diuretics, thiazide diuretics.
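A discontinuous name such as "loop diuretics" cannot be stored as a single start and end offset, because the modifier and the shared head noun are separated by other words. One possible representation, assumed here purely for illustration, is a list of character spans per entity:

```python
Span = tuple[int, int]  # (start, end) character offsets, end exclusive


def find_span(text: str, fragment: str) -> Span:
    """Return the offsets of the first occurrence of `fragment` in `text`."""
    start = text.index(fragment)  # raises ValueError if the fragment is absent
    return (start, start + len(fragment))


def discontinuous_entity(text: str, modifier: str, head: str) -> list[Span]:
    """Build an entity from a modifier and a shared head noun."""
    return sorted([find_span(text, modifier), find_span(text, head)])


def surface_form(text: str, spans: list[Span]) -> str:
    """Join the text fragments covered by the spans into one name."""
    return " ".join(text[start:end] for start, end in spans)


text = ("can reduce the effects of loop, potassium-sparing and "
        "thiazide diuretics.")
for modifier in ("loop", "potassium-sparing", "thiazide"):
    spans = discontinuous_entity(text, modifier, "diuretics")
    print(spans, surface_form(text, spans))
```

The three entities share the span of "diuretics", which is exactly what a single-offset scheme cannot express.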
Annotation Problems
• Ambiguity: drug names can have different meanings.
"Insulin" as a drug:
Therefore, in patients taking insulin, regular monitoring of blood glucose is recommended.
"Insulin" as an endogenous substance:
There is no evidence that EPA supplements have detrimental effects on glucose tolerance, insulin secretion or insulin resistance in non-diabetic subjects.
Annotation Problems
Two drugs and one DDI: one interacting pair.
Concomitant aspirin may decrease the metabolic clearance of nicotinic acid.
Annotation Problems
Multiple mentions of the same drug and only one DDI: one interacting pair.
The concomitant use of nitrofurantoin is not recommended since nitrofurantoin may antagonize the effect of norfloxacin.
[Figure: the nitrofurantoin mentions and norfloxacin, linked by "may antagonize the effect of"]
Annotation Problems
Multiple mentions of the same drug and only one DDI: one interacting pair.
The concomitant use of nitrofurantoin and norfloxacin is not recommended since nitrofurantoin may antagonize the effect of norfloxacin.
[Figure: the nitrofurantoin and norfloxacin mentions, linked by "is not recommended" and "may antagonize the effect of"]
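The difference between mention pairs and interacting drug pairs can be made concrete with a short sketch. It collapses mention-level pairs to distinct drug pairs by name; this is a simplification assumed here for illustration, and it does not decide which specific mentions the guidelines would link.

```python
from itertools import combinations


def mention_pairs(mentions: list[str]) -> list[tuple[int, int]]:
    """All pairs of mention positions within one sentence."""
    return list(combinations(range(len(mentions)), 2))


def distinct_drug_pairs(mentions: list[str]) -> set[frozenset[str]]:
    """Collapse mention pairs to unordered pairs of different drug names."""
    pairs: set[frozenset[str]] = set()
    for i, j in mention_pairs(mentions):
        first, second = mentions[i].lower(), mentions[j].lower()
        if first != second:  # a drug is never paired with itself
            pairs.add(frozenset((first, second)))
    return pairs


# Mentions, in order, from the second example sentence above.
mentions = ["nitrofurantoin", "norfloxacin", "nitrofurantoin", "norfloxacin"]
print(len(mention_pairs(mentions)))        # number of mention-level pairs
print(len(distinct_drug_pairs(mentions)))  # number of distinct drug pairs
```

Four mentions give six mention-level pairs but only one distinct pair of drugs, which matches the slide's "one interacting pair".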
What have we done?
Annotation Guidelines:
– Clear and accurate definitions of entities and relationships.
– Rules and conventions (about how the annotation task should be carried out).
– Examples clarifying their use.
http://www.cs.york.ac.uk/semeval-2013/task9/
Conclusions
• The DDI corpus:
– The most richly semantically annotated resource for pharmacological text processing built to date.
– Corpus and annotation guidelines are publicly available.
– Can encourage the NLP community to develop automatic tools for DDI extraction.
Conclusions
• The DDIExtraction2013 challenge:
– Part of SemEval 2013, the 7th International Workshop on Semantic Evaluation.
http://www.cs.york.ac.uk/semeval-2013/task9/
– Co-located with the NAACL HLT 2013 conference (Atlanta, June 9-14).
http://naacl2013.naacl.org/
– 15 participating teams.
– Each system is trained and tested on the DDI corpus.
– Participants will be ranked by F1 measure.
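The F1 measure used for the ranking is the harmonic mean of precision and recall. The sketch below computes it over sets of predicted and gold interacting pairs. The pair identifiers are made up for illustration, and the slides do not say which scoring variant (for example, with or without DDI types) the official evaluation uses.

```python
def precision_recall_f1(gold: set, predicted: set) -> tuple[float, float, float]:
    """Precision, recall and F1 of predicted items against gold items."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical pair identifiers of the form (sentence, entity, entity).
gold = {("s1", "e0", "e1"), ("s2", "e0", "e1"),
        ("s3", "e0", "e2"), ("s4", "e1", "e2")}
predicted = {("s1", "e0", "e1"), ("s2", "e0", "e1")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Here both predictions are correct but half of the gold pairs are missed, so precision is 1.0, recall is 0.5 and F1 is 2/3.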
Future work
• To enrich the current version of the DDI corpus:
– Annotation of other important aspects of drug information, for instance, adverse effects.
– Annotation of linguistic phenomena required for a better understanding of the text: negation, modality and anaphora.
• To represent the acquired knowledge (concepts, attributes and relationships) in an ontology.
Thank you for your attention!
mhzazo@pa.uc3m.es
The DDI corpus: results
Annotation Process
Manual annotation of entities and DDIs

Editor's Notes

  • #7: For this reason, we think that Information Extraction can help to improve the early detection of drug interactions. The final goal of our method is to identify the drugs (indomethacin and triamterene) and to detect the interaction between them.