Abstract: A large percentage of relevant radiologic patient information is currently only available in unstructured formats such as free-text reports. Measurements are particularly relevant, since they are comparable and thus provide insight into how a patient's health status changes over time, for example in response to a treatment. In radiology, most of the measurements in reports describe the size of anatomical entities. Even though it is possible to extract measurements and anatomical entities from text using standard information extraction techniques, it is difficult to extract the relation between a measurement and the corresponding anatomical entity. Here we present a knowledge-based approach to extract this relation using a model of typical size descriptions of anatomical entities in combination with hierarchical knowledge from existing medical ontologies. We evaluate our approach on two data sets of German radiology reports, reaching an F1-measure of 0.85 and 0.79, respectively.
Knowledge-based Extraction of Measurement-Entity Relations from German Radiology Reports
1. IEEE International Conference on Healthcare Informatics / September 2014
Knowledge-based Extraction of Measurement-Entity
Relations from German Radiology Reports
Heiner Oberkampf (1,2), Claudia Bretschneider (1), Sonja Zillner (1), Bernhard Bauer (2) and Matthias Hammon (3)
(1) Siemens AG, Corporate Technology
(2) University of Augsburg, Software Methodologies for Distributed Systems
(3) University Hospital Erlangen, Department of Radiology
Unrestricted © Siemens AG 2014. All rights reserved
2. Agenda
Measurements in Radiology
Knowledge Model
Semantic Annotation of Radiology Reports
Extraction Algorithm
Evaluation
Outlook
Page 2 September 2014 Corporate Technology Restricted © Siemens AG 2014. All rights reserved
3. Measurements in Radiology
Non-exhaustive list
Size
• length: 1D, 2D, 3D
• area, volume
• index (e.g. spleen index = width * height * depth)
Density, measured on the Hounsfield scale (HU)
• mainly in CT images
• minimal, maximal and mean density values for regions of interest (ROIs)
Angle
• e.g. bone configurations or fractures
Blood flow
• e.g. PET: myocardial blood flow and blood flow in the brain
…
1) Source: http://www.recist.com/recist-in-practice/19.html
4. Size Measurements in Radiology Reports
Example Sentences
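Leber mit kranio-kaudalem Durchmesser von 15,5 cm.
(Liver with a craniocaudal diameter of 15.5 cm.)
Größenprogrediente, unscharf abgrenzbare Hypodensität links temporal nach kranial bis nach parietobasal reichend (IMA 7-22; aktuell etwa 8 x 7 x 6 cm - Voruntersuchung etwa 4,5 x 3,5 cm) mit einzelnen, neuaufgetretenen, stippchenförmigen Hyperdensitäten (IMA 11-14).
(Poorly demarcated left temporal hypodensity, increasing in size, extending cranially as far as parietobasal (IMA 7-22; currently about 8 x 7 x 6 cm, previous examination about 4.5 x 3.5 cm) with isolated, newly appeared, punctate hyperdensities (IMA 11-14).)
Etwas kaudal hiervon im Unterlappen am Lappenspalt zentral ein 1.3 cm (VU 1.3 cm) großer Rundherd mit weiterhin deutlich vermehrtem FDG-Uptake (SUV max. 3.9; VU 5.7; IMA 182) im Oberlappen lappenspaltnah ein 1.0 cm (VU 1.0 cm) großer Rundherd mit vermehrtem FDG-Uptake (SUV max. 0.8; VU 1.5; IMA 199) sowie auf gleicher Höhe im Unterlappen dorsal paravertebral zwei Rundherde mit Ausläufern von 1.5 cm (VU 1.3 cm) und lateral hiervon zwei verschmolzene Lymphknoten von zusammen 1.7 cm Durchmesser (VU 1.5 cm + Satellit von 0.9 cm) mit deutlich vermehrtem FDG-Uptake (SUV max. 4.0; VU 3.2 bzw. SUV max. 6.6; VU 4.8; IMA 207) und im costophrenischen Winkel dorsal ein 0.9 cm (VU 0.5 cm) großer Rundherd mit vermehrtem FDG-Uptake (SUV max. 1.7; VU 1.7; IMA 234).
(Slightly caudal to this, centrally in the lower lobe at the lobar fissure, a round lesion of 1.3 cm (previous examination, VU, 1.3 cm) with still markedly increased FDG uptake (SUV max. 3.9; VU 5.7; IMA 182); in the upper lobe near the fissure, a round lesion of 1.0 cm (VU 1.0 cm) with increased FDG uptake (SUV max. 0.8; VU 1.5; IMA 199); at the same level in the lower lobe, dorsal paravertebral, two round lesions with extensions of 1.5 cm (VU 1.3 cm); lateral to these, two fused lymph nodes with a combined diameter of 1.7 cm (VU 1.5 cm + satellite of 0.9 cm) with markedly increased FDG uptake (SUV max. 4.0; VU 3.2 and SUV max. 6.6; VU 4.8, respectively; IMA 207); and dorsally in the costophrenic angle, a round lesion of 0.9 cm (VU 0.5 cm) with increased FDG uptake (SUV max. 1.7; VU 1.7; IMA 234).)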
5. Longitudinal Integration
Image source: “Automated Detection and Volumetric Segmentation of the Spleen in CT Scans”, M. Hammon, P. Dankerl, M. Kramer, S. Seifert, A. Tsymbal, M. J. Costa, R. Janka, M. Uder, A. Cavallaro
6. Two Data Sets
382 Lymphoma Patients
• 2584 reports
• imaging modality: CT, MRI, US, Radiography, …
Diverse Internistic Patients
• 6007 reports
• imaging modality: CT
7. Size Measurements in Radiology Reports 1)
Mostly 1- and 2-dimensional and one or two per sentence.
[Pie chart: type of measurements; slices labelled 1-dim, 2-dim and 3-dim; percentages shown: 40%, 58%, 2%]
[Bar chart: type of sentences, by number of measurements contained in a sentence]
# measurements in sentence    # sentences
1                             13109
2                             4820
3                             538
4                             668
>4                            290
1) Based on a data set of 2854 German radiology reports of 377 lymphoma patients and one of 6007 of diverse internistic patients
9. Size Specifications
Commonly used types to describe the size of anatomical entities.
Interval
• Anterior-posterior diameter of liver normally 10-13 cm
• Thickness of wall of gallbladder normally 0.1-0.3 cm
Normal value with deviation
• Truncus pulmonalis: 1.4 cm +/- 0.4 cm
Upper bound
• Normal lymph node < 1 cm
Lower bound
• Normal aorta diameter > 4 cm at root
• Enlarged lymph node > 1 cm
Basic form: anatomical entity, quality, value specification
Note: Specifications might be age or gender specific
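The four specification types above can all be expressed as an optional lower and upper bound. The following is a minimal sketch under that assumption; the class `SizeSpec`, the helper functions and the quality labels are hypothetical names chosen for illustration, bounds are treated as inclusive for simplicity, and age- or gender-specific variants are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SizeSpec:
    """Basic form from the slide: anatomical entity, quality, value specification."""
    entity: str
    quality: str
    lower_cm: Optional[float] = None  # None means "no lower bound"
    upper_cm: Optional[float] = None  # None means "no upper bound"

    def contains(self, value_cm: float) -> bool:
        # Bounds are inclusive here; this is a simplification.
        above_lower = self.lower_cm is None or value_cm >= self.lower_cm
        below_upper = self.upper_cm is None or value_cm <= self.upper_cm
        return above_lower and below_upper


def interval(entity, quality, lower_cm, upper_cm):
    return SizeSpec(entity, quality, lower_cm, upper_cm)


def normal_with_deviation(entity, quality, normal_cm, deviation_cm):
    # Rounded to avoid floating-point noise such as 1.7999999999999998.
    return SizeSpec(entity, quality,
                    round(normal_cm - deviation_cm, 6),
                    round(normal_cm + deviation_cm, 6))


def upper_bound(entity, quality, upper_cm):
    return SizeSpec(entity, quality, upper_cm=upper_cm)


def lower_bound(entity, quality, lower_cm):
    return SizeSpec(entity, quality, lower_cm=lower_cm)


# Examples taken from the slide; the quality labels are illustrative.
SPECS = [
    interval("liver", "normal anterior-posterior diameter", 10.0, 13.0),
    interval("wall of gallbladder", "normal thickness", 0.1, 0.3),
    normal_with_deviation("truncus pulmonalis", "normal diameter", 1.4, 0.4),
    upper_bound("lymph node", "normal diameter", 1.0),
    lower_bound("lymph node", "enlarged diameter", 1.0),
]
```

With these definitions, a 2.3 cm measurement falls outside the normal lymph node specification and inside the enlarged one.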
10. The Knowledge Model is based on existing biomedical ontologies.
Reused Ontologies
Anatomical Entities
• Radiological Lexicon (RadLex)
• Foundational Model of Anatomy (FMA) 1)
Qualities
• Ontology for Phenotypic Qualities (PATO) 1)
Value Specifications
• Ontology for Biomedical Investigations (OBI) 1)
• Information Artifact Ontology (IAO) 1)
• Units Ontology (UO) 1)
• Model for Clinical Information (MCI)
Knowledge Model
Coverage
• 50 size specifications
• 38 different anatomical entities
Representation
• OWL
[Diagram labels: Knowledge Resources, Knowledge Representation]
1) Part of the Open Biological and Biomedical Ontologies Foundry library http://www.obofoundry.org/
11. Normal Upper Bound Specification
Example: Lymph nodes are normally < 1 cm.
[Diagram: RDF graph of the specification, in three groups]
Quality: :normalDiameterOfLymphNode, with classes pato:normal, pato:size, pato:length, pato:diameter
Anatomical Entity: _:ln, typed radlex:lymph node
Value Specification: _:usp (mci:upper bound specification) and _:vs1 (obi:scalar value specification) with value “1.0”^^xsd:float and unit uo:centimeter (uo:length unit)
Properties shown: bfo:inheres in, iao:is quality specification of, obi:has value specification, obi:has specified value, iao:has measurement unit label
12. Normal Interval Specification
Example: Normal diameter of the pulmonary artery is between 1.6 and 2.6 cm.
[Diagram: RDF graph of the specification, in three groups]
Quality: :normalDiameterOfPulmonaryAtery, with classes pato:normal, pato:size, pato:length, pato:diameter
Anatomical Entity: _:pulmAtery, typed radlex:pulmonary artery
Value Specification: _:isp (mci:interval specification), _:lbsp (mci:lower bound specification), _:ubsp (mci:upper bound specification), and _:vs1 and _:vs2 (obi:scalar value specification) with values “1.6”^^xsd:float and “2.6”^^xsd:float and unit uo:centimeter (uo:length unit)
Properties shown: bfo:inheres in, iao:is quality specification of, ro:has part, obi:has value specification, obi:has specified value, iao:has measurement unit label
13. Normal Interval Specification
Example: Normal length of kidney along craniocaudal axis: 8.0 – 13.0 cm.
[Diagram: RDF graph of the specification, in three groups]
Quality: :normalLengthKidneyCraniocaudal, with classes pato:normal, pato:size, pato:length, and bspo:orthogonal_to _:tp (bspo:transverse plane)
Anatomical Entity: _:kidney, typed radlex:kidney
Value Specification: _:isp (mci:interval specification), _:lbsp (mci:lower bound specification), _:ubsp (mci:upper bound specification), and _:vs1 and _:vs2 (obi:scalar value specification) with values “8.0”^^xsd:float and “13.0”^^xsd:float and unit uo:centimeter (uo:length unit)
Properties shown: bfo:inheres in, iao:is quality specification of, ro:has part, obi:has value specification, obi:has specified value, iao:has measurement unit label
15. Semantic Annotation of Radiology Reports
Recognition of entities from ontologies and measurements
“Enlarged lymph node right paraaortal below the renal pedicle now 23 mm.”
Annotations: radlex:lymph node; measurement (value: 23, unit: mm)
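Recognising the measurement itself can be approximated with a regular expression. The sketch below is an illustration only, not the annotator used in the talk: the pattern, the unit table and the function name `extract_measurements` are assumptions. It accepts one to three values separated by "x", a decimal point or a German decimal comma, and normalises everything to centimetres.

```python
import re

# Conversion factors to centimetres (assumed, minimal set of length units).
UNIT_TO_CM = {"mm": 0.1, "cm": 1.0, "m": 100.0}

NUMBER = r"\d+(?:[.,]\d+)?"
# One to three values separated by "x", followed by a length unit.
MEASUREMENT = re.compile(
    rf"({NUMBER}(?:\s*x\s*{NUMBER}){{0,2}})\s*(mm|cm|m)\b"
)


def extract_measurements(sentence):
    """Return a list of (values_in_cm, matched_text) pairs."""
    results = []
    for match in MEASUREMENT.finditer(sentence):
        values_text, unit = match.group(1), match.group(2)
        values = [
            # Accept the German decimal comma, then convert to centimetres.
            round(float(value.replace(",", ".")) * UNIT_TO_CM[unit], 4)
            for value in re.split(r"\s*x\s*", values_text)
        ]
        results.append((values, match.group(0)))
    return results


example = "Enlarged lymph node right paraaortal below the renal pedicle now 23 mm."
print(extract_measurements(example))
```

For the example sentence this yields a single one-dimensional measurement of 2.3 cm.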
16. Semantic Annotation of Radiology Reports
Functional Scope
• Detection of multiword terms independent of the ordering of the individual tokens.
• Respect sentence boundaries and map multiword terms only when they occur within these boundaries.
• Recognition of inflected forms of ontological concepts in the text, such as plural forms or other grammatical inflections, based on stemmed forms.
Technical Realization
• Builds on top of the UIMA framework
• Adapted form of the UIMA Concept Mapper
• Outputs annotations in RDF
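The three functional requirements can be illustrated with a toy dictionary matcher. This is not the UIMA Concept Mapper; it is a minimal stand-in in which `crude_stem`, `stems` and `annotate` are hypothetical helpers, the stemmer only strips a trailing "s", and the sentence splitter is deliberately naive (it would, for instance, break after "approx.").

```python
import re


def crude_stem(token):
    """Toy stemmer: lowercase and strip a trailing plural 's'."""
    token = token.lower()
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


def stems(text):
    """Set of stemmed word tokens; a set makes the match order-independent."""
    return {crude_stem(t) for t in re.findall(r"[A-Za-zÄÖÜäöüß-]+", text)}


def annotate(report, lexicon):
    """Return (sentence, concept) pairs for every concept whose label tokens
    all occur, in any order, within a single sentence."""
    annotations = []
    for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
        sentence_stems = stems(sentence)
        for concept, label in lexicon.items():
            if stems(label) <= sentence_stems:
                annotations.append((sentence, concept))
    return annotations


lexicon = {
    "radlex:lymph node": "lymph node",
    "radlex:renal pedicle": "renal pedicle",
}
report = "Lymph nodes enlarged below the pedicle. Renal vessels normal."
print(annotate(report, lexicon))
```

In the example only radlex:lymph node is annotated: "renal" and "pedicle" occur in different sentences, so the multiword term is not mapped across the sentence boundary.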
17. Running Example
The running example used during the description of the resolution algorithm
“Enlarged lymph node right paraaortal below the renal pedicle now 23 mm.”
Annotations:
radlex:enlarged radlex:lymphadenopathy
radlex:lymph node
radlex:right
radlex:paraaortic radlex:inferior
radlex:inferior para-aortic lymph node
radlex:kidney radlex:renal pedicle
radlex:lateral aortic lymph node
2.3 uo:centimeter
19. Overview of Algorithm
1. Use the ontology structure of RadLex to create a spanning tree for the annotations.
2. Compare the measurement values with the knowledge model.
3. Compute a ranking and select the best entity.
20. Filter and Expand the Set of Annotations
Use knowledge from the RadLex ontology
[Diagram: upper levels of the RadLex hierarchy (RadLex entity; anatomical entity, clinical finding, imaging observation, imaging modality, descriptor, …; Anatomical_Site) with the annotations of the running example sorted beneath them: enlarged, lymphadenopathy, lymph node, right, paraaortic, inferior, inferior para-aortic lymph node, kidney, renal pedicle, lateral aortic lymph node]
21. Minimal Spanning Tree
Based on the set of relevant annotations we create a tree along the RadLex subclass hierarchy
Sentence: “Enlarged lymph node right paraaortal below the renal pedicle now 23 mm.”
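One simple way to build such a tree is to take the union of the subclass paths from every annotated entity up to the root. The sketch below assumes exactly that; the `PARENT` map is a made-up toy hierarchy for illustration and does not reproduce the actual RadLex structure, and the function names are hypothetical.

```python
# Toy child -> parent map (hypothetical, NOT the real RadLex hierarchy).
PARENT = {
    "anatomical entity": "RadLex entity",
    "lymph node": "anatomical entity",
    "para-aortic lymph node": "lymph node",
    "inferior para-aortic lymph node": "para-aortic lymph node",
    "lateral aortic lymph node": "para-aortic lymph node",
    "organ": "anatomical entity",
    "kidney": "organ",
}


def path_to_root(entity, parent):
    """Follow subclass links upwards until an entity without a parent is reached."""
    path = [entity]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def spanning_tree(annotations, parent):
    """Return the child -> parent edges of the subtree that connects all
    annotated entities to the root."""
    edges = {}
    for entity in annotations:
        path = path_to_root(entity, parent)
        for child, par in zip(path, path[1:]):
            edges[child] = par
    return edges


tree = spanning_tree(
    ["lymph node", "inferior para-aortic lymph node", "kidney"], PARENT
)
for child, par in tree.items():
    print(f"{child} -> {par}")
```

Entities that are in the hierarchy but neither annotated nor on a path to an annotated entity (here the lateral aortic lymph node) do not appear in the tree.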
22. Attach Normal Size Specifications
For each entity of the spanning tree we retrieve available size specifications from the knowledge model.
[Diagram: spanning tree with attached size specifications and comparison values (compValue)]
Size specifications shown: normal: 0-1 cm; enlarged: 1-5 cm; craniocaudal extension: 8-13 cm; anterior posterior diameter: 4 cm
compValue labels next to the specifications: 1.3, 2.48, 0.0, 0.73
compValue labels on tree nodes: 0.73, 0.0
Sentence: “Enlarged lymph node right paraaortal below the renal pedicle now 23 mm.”
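The slides do not define how compValue is computed. The sketch below uses an assumed definition, chosen only because it reproduces the numbers shown for the 2.3 cm measurement: 0.0 inside the specified interval, otherwise the distance to the nearest bound divided by the smaller of the measurement and that bound. The pairing of values with specifications (1.3 for normal 0-1 cm, 0.0 for enlarged 1-5 cm, 2.48 for 8-13 cm, 0.73 for 4 cm, the last matching only up to rounding) and the use of the minimum per entity are likewise assumptions.

```python
def comp_value(measurement_cm, lower_cm, upper_cm):
    """Assumed comparison value between a measurement and a size interval.

    0.0 if the measurement lies inside [lower_cm, upper_cm]; otherwise the
    distance to the nearest bound, relative to min(measurement, bound).
    """
    if measurement_cm <= 0:
        raise ValueError("measurement must be positive")
    if lower_cm <= measurement_cm <= upper_cm:
        return 0.0
    bound = lower_cm if measurement_cm < lower_cm else upper_cm
    denominator = min(measurement_cm, bound)
    if denominator <= 0:
        return float("inf")
    return abs(measurement_cm - bound) / denominator


def entity_comp_value(measurement_cm, intervals):
    """Assumption: an entity takes the best (smallest) value of its specifications."""
    return min(comp_value(measurement_cm, lo, hi) for lo, hi in intervals)


measurement = 2.3  # the 23 mm of the running example, in cm

normal_ln = comp_value(measurement, 0.0, 1.0)      # normal: 0-1 cm
enlarged_ln = comp_value(measurement, 1.0, 5.0)    # enlarged: 1-5 cm
craniocaudal = comp_value(measurement, 8.0, 13.0)  # craniocaudal extension: 8-13 cm
ap_diameter = comp_value(measurement, 4.0, 4.0)    # anterior posterior diameter: 4 cm

print(round(normal_ln, 2), enlarged_ln, round(craniocaudal, 2), round(ap_diameter, 3))
```

Under this definition the lymph node specifications give a best value of 0.0, while the two other specifications give a best value of about 0.74, close to the 0.73 shown on the slide.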
23. Propagate Comparison Value
[Diagram: spanning tree after propagation; compValue labels on tree nodes: 0.73, 0.0, 0.0, 0.0, 0.0, 0.0]
Sentence: “Enlarged lymph node right paraaortal below the renal pedicle now 23 mm.”
24. Ranking and Selection of Best Entity
The ranking takes the position in the RadLex hierarchy into account
• Include position in RadLex hierarchy → more specific entities are preferred
• Use threshold criteria to select best entity
“Enlarged lymph node right paraaortal below the renal pedicle now 23 mm.”
Structured Representation:
radlex:inferior para-aortic lymph node 2.3 uo:centimeter
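The ranking step can be sketched as follows, assuming each candidate carries a comparison value and its depth in the hierarchy. The threshold of 0.5, the depth numbers and the function name `select_entity` are hypothetical; the slides only state that more specific entities are preferred and that a threshold criterion is applied.

```python
def select_entity(candidates, threshold=0.5):
    """Pick the best entity from (entity, comp_value, depth) tuples.

    Candidates above the threshold are discarded. Among the rest, a lower
    comparison value wins; ties go to the deeper (more specific) entity.
    Returns None if no candidate passes the threshold.
    """
    eligible = [c for c in candidates if c[1] <= threshold]
    if not eligible:
        return None
    best = min(eligible, key=lambda c: (c[1], -c[2]))
    return best[0]


# Illustrative candidates for the running example (depths are made up).
candidates = [
    ("radlex:lymph node", 0.0, 2),
    ("radlex:inferior para-aortic lymph node", 0.0, 4),
    ("radlex:kidney", 0.73, 3),
]
print(select_entity(candidates))
```

With these inputs the more specific inferior para-aortic lymph node is selected over the generic lymph node, matching the structured representation above.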
26. Scope of the Algorithm
The described algorithm resolves only one measurement-entity relation per sentence.
In Scope
• Sentences with one measurement
• Sentences with two measurements where both measurements are about the same entity. E.g. “Spleen now with 10.5 x 4.5 cm slightly smaller than in previous examination with 13.3 x 6.7 cm.”
Out of Scope
• Sentences with two measurements about different entities. E.g. “Splenomegaly with 23.0 x 14.5 x 8.5 cm and approx. 1.0 cm lesion.”
• Sentences with more than two measurements
27. Scope of Algorithm
Analysis of sentences in- and out-of-scope
[Bar charts: number of sentences by number of measurements contained in a sentence, one chart per data set; the bar for two measurements is split into two parts]
# measurements in sentence                Reports on Lymphoma Patients    Reports on Internistic Patients
1                                         3980                            9129
2, same entity (in scope)                 791                             2798
2, different entities (out of scope)      249                             982
3                                         71                              467
4                                         78                              590
>4                                        31                              259
Sentences out of scope                    8.25%                           16.15%
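The out-of-scope percentages can be checked from the bar values. The check below assumes that the bar for two measurements is split into an in-scope part (791 and 2798 sentences) and an out-of-scope part (249 and 982 sentences); this reading of the chart is an assumption, adopted because it reproduces the quoted percentages.

```python
def out_of_scope_share(one, two_in_scope, two_out_of_scope, three, four, more):
    """Percentage of sentences the algorithm does not handle."""
    total = one + two_in_scope + two_out_of_scope + three + four + more
    out_of_scope = two_out_of_scope + three + four + more
    return 100.0 * out_of_scope / total


lymphoma = out_of_scope_share(3980, 791, 249, 71, 78, 31)
internistic = out_of_scope_share(9129, 2798, 982, 467, 590, 259)
print(f"{lymphoma:.2f}% {internistic:.2f}%")  # prints "8.25% 16.15%"
```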
28. Evaluation Schema
correct
• The entity resolved is exactly what the measurement is about
• The radiologist cannot name a better entity
Example: “Lymph node in mediastinum 1.8 cm” → mediastinal lymph node
(correct)
• The entity resolved is correct, but it could be more specific
• The radiologist can name a better entity
Example: “Lymph node in jaw angle 1 cm” → lymph node; radiologist: jugular lymph node
unresolvable
• The sentence does not allow a resolution
• The algorithm did not resolve to a false entity
Examples: “The biggest is now 2.7 cm.” “Previously 53x18 mm.” “Craniocaudal diameter now 10.8 cm.”
false
• The resolved entity is false or no entity was resolved
• The radiologist can find the correct entity
Example: “Large metastasis in liver with a size of 12.3 x 7.0 cm.” → liver; radiologist: metastasis
29. Evaluation Results
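Evaluation results for 500 randomly selected sentences for each data set.
[Pie charts: share of sentences per evaluation category, one chart per data set]
Lymphoma
• unresolvable 5%, false 24%; the two remaining slices, correct and (correct), are 50% and 21%
• resolved 84%, unresolved 16%
• recall: 0.8698, precision: 0.8389, F-measure: 0.8540
Internistic
• unresolvable 4%, false 34%; the two remaining slices, correct and (correct), are 44% and 19%
• resolved 80%, unresolved 20%
• recall: 0.7904, precision: 0.7864, F-measure: 0.7884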
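The reported F-measures are consistent with the harmonic mean of precision and recall. A quick check, assuming the standard balanced F1 definition and using the rounded values from the slide (so agreement is only expected to about three decimal places):

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


lymphoma_f = f_measure(precision=0.8389, recall=0.8698)
internistic_f = f_measure(precision=0.7864, recall=0.7904)
print(f"{lymphoma_f:.3f} {internistic_f:.3f}")
```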
30. Evaluation by Resolved Anatomical Entity
[Chart: evaluation results by resolved anatomical entity]
31. Evaluation of Annotator
Using RadLex for German text brings the following two problems:
1. Missing annotations
• Only about 25% of all RadLex concepts have German labels
• 6.59% of all sentences get no relevant annotations
• In 50% of the false resolutions, the correct entity was not annotated
2. Wrong annotations due to unspecific synonyms
• ‘radlex:breast mass’ has the synonyms ‘mass’, ‘nodule’, ‘lesion’, ‘nodular enhancement’ and ‘area of enhancement’
• ‘mass’ or ‘lesion’ is annotated with ‘radlex:breast mass’, and the resolution algorithm then often falsely resolves to ‘breast mass’.
32. Limitations of a Pure Knowledge-based Approach
We need to use the sentence context to better resolve more complex sentences.
• normal size specifications overlap
• measured entities are often not within the normal range
• annotation quality
  - coverage
  - level of detail of RadLex concepts
  - wrong annotations due to synonyms
• restriction to sentence boundaries
• multiple measurements in one sentence
• one measurement about multiple entities
34. Outlook
Adaptations of the algorithm already made:
• Use an adapted version of RadLex
• Use statistics from the evaluated data set
• Use distance within the sentence
• All sentences are now in scope
Ongoing:
• Include context information about the quality: normal, enlarged, …
• Include annotations from the previous sentence for unresolved sentences
• Density measurements
35. Application
Longitudinal view on reports from consecutive examinations