Copyright © 2019, Oracle and/or its affiliates. All rights reserved.
Graph-Based Machine Learning for
Automated Health Care Services
Rhicheek Patra, Oracle Labs Zürich
Email: rhicheek.patra@oracle.com
Twitter: @rhpatra, LinkedIn: @rhicheek-patra
Swiss Conference on Data Science 2019
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
Program Agenda
1. Benefits of graph models
2. HealthCare: Background & Problem Statement
3. KG-RNN: Overview
4. Empirical Validation
5. Concluding Remarks
Your Data is a Graph!
• Represent your dataset as a graph
– Entities are vertices
– Relationships are edges
• Annotate your graph
– Labels identify vertices and edges
– Properties describe vertices and edges
[Figure: example property graph with vertices Patient (name = 'Jerald'), Admission (id = 1234, venue = 'Triemli'), Diagnosis and Procedure, and edges was_admitted (date = 01-23) and diagnosed_with]
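The property-graph model described on this slide (labeled vertices and edges, each carrying properties) can be sketched in a few lines of Python. This is only an illustration, not PGX's data model or API: the class names, the vertex ids "p1", "a1", "d1", and the edge directions (Patient to Admission, Admission to Diagnosis) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: str
    label: str                      # e.g. "Patient", "Admission"
    props: dict = field(default_factory=dict)


@dataclass
class Edge:
    src: str
    dst: str
    label: str                      # e.g. "was_admitted"
    props: dict = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory property graph (hypothetical helper, not the PGX API)."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = Vertex(vid, label, props)

    def add_edge(self, src, dst, label, **props):
        self.edges.append(Edge(src, dst, label, props))

    def neighbors(self, vid, edge_label=None):
        """Vertices reachable over one outgoing edge, optionally filtered by label."""
        return [
            self.vertices[e.dst]
            for e in self.edges
            if e.src == vid and (edge_label is None or e.label == edge_label)
        ]


g = PropertyGraph()
g.add_vertex("p1", "Patient", name="Jerald")
g.add_vertex("a1", "Admission", id=1234, venue="Triemli")
g.add_vertex("d1", "Diagnosis")
# Edge directions below are assumed for illustration.
g.add_edge("p1", "a1", "was_admitted", date="01-23")
g.add_edge("a1", "d1", "diagnosed_with")

admissions = g.neighbors("p1", "was_admitted")
```

Here `admissions` holds the single Admission vertex for the patient; following `diagnosed_with` from it is the two-hop traversal that a graph query language expresses as a path pattern.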
Benefits of Graph Models
• Some benefits of graphs
– Intuitive data model
– Aggregation of information across heterogeneous data sources
– Fast queries over multi-hop relationships
– Data visualization and interactive exploration
Parallel Graph AnalytiX (PGX)
• A fast, parallel, in-memory graph analytics framework
• Offers 35+ built-in, native graph analytics algorithms
• Provides a graph-specific query language (PGQL)
SELECT v, e, v2
FROM graph
MATCH (v)-[e]->(v2)
WHERE v.first_name = 'Jerald'
AND v.last_name = 'Hilpert'
GROUP BY … ORDER BY … LIMIT …
Graph in Various Domains
• Health Science: Patient, Admissions, Diagnosis, Procedures
• Financial Services: Customer, Account, Address, Alert, Institution, ICIJ, Worldcheck
Knowledge Graphs
• Represent information as entities, properties, and relations between entities
• Allow information from different sources to be combined naturally
• Much current activity and research on leveraging knowledge graphs in machine learning
Sample: Healthcare Knowledge Graph
[Figure: sample healthcare knowledge graph]
Sample: Retail Knowledge Graph
[Figure: sample retail knowledge graph]
Features of the HealthCare Knowledge Graph
• New relations are added to the Knowledge Graph (KG) over time
– Due to previously unseen disease-symptom analyses from new medical studies
– Due to incoming Electronic Medical Records (EMRs) for new medical admissions
• Dynamic properties of entities (here, the “admissions” in the KG)
– Patient admissions vary widely in duration (from a few hours to a few weeks)
– During an admission, the patient goes through many different events
  • Medications: e.g., Aspirin (500 mg/day)
  • Lab tests: e.g., pH = 7.4 at 15:35
• How do we construct such KGs?
– Internal information (e.g., admissions) aggregated over hospital databases
– External information (e.g., disease-symptoms) collected from public resources
What is the Learning Objective?
• Given an admission with all its dynamic medical inputs (e.g., medications, lab results), how can we predict the diagnoses related to this admission?
Example output:
1. Hypertension
2. Heart disease
How do we achieve this Learning Objective?
• Model the problem as a temporal sequence problem: split each admission timeline into N-hour intervals
Now we can train a Recurrent Neural Network for this objective!
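Splitting an admission into N-hour intervals can be sketched as follows. The event format (hours since an arbitrary origin, an event kind, a value), the function name `chunk_events`, and the sample events are assumptions for illustration; the slides do not specify how events are stored.

```python
from collections import defaultdict


def chunk_events(events, admit_time_h, chunk_hours):
    """Group (time_h, kind, value) events into consecutive N-hour chunks.

    Chunk i covers [admit + i*N, admit + (i+1)*N). Empty intervals are kept
    as empty lists so that chunk positions stay aligned with time.
    Events recorded before the admission time are ignored.
    """
    if chunk_hours <= 0:
        raise ValueError("chunk_hours must be positive")
    buckets = defaultdict(list)
    for time_h, kind, value in events:
        if time_h < admit_time_h:
            continue
        index = int((time_h - admit_time_h) // chunk_hours)
        buckets[index].append((kind, value))
    n_chunks = max(buckets) + 1 if buckets else 0
    return [buckets.get(i, []) for i in range(n_chunks)]


# Hypothetical events for one admission, times in hours since admission.
events = [
    (0.5, "lab:pH", 7.4),
    (2.0, "med:aspirin_mg", 500),
    (7.25, "lab:pH", 7.38),
]
chunks = chunk_events(events, admit_time_h=0.0, chunk_hours=3)
```

With 3-hour chunks the first two events fall into chunk 0, chunk 1 stays empty, and the last event lands in chunk 2; the resulting list is the sequence a recurrent model would consume step by step.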
Recurrent Neural Networks: how do we employ them?
• Used in many sequence-based applications
– Stock market prediction
– Machine translation
• A recurrent neural network consumes its input x as a sequence of temporal states (here, the successive states of an admission)
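To make "temporal states of the input" concrete, here is a minimal forward pass of a plain (Elman) recurrent cell in standard-library Python. The slides do not say which recurrent cell the authors use (it could be an LSTM or GRU), so the cell type, the dimensions, and the random weights below are all assumptions; the point is only that one hidden state is carried from chunk to chunk and the last state serves as the encoding.

```python
import math
import random


def rnn_encode(sequence, W_xh, W_hh, b_h):
    """Run h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h) and return the final h."""
    hidden_dim = len(b_h)
    h = [0.0] * hidden_dim
    for x in sequence:
        # The comprehension reads the previous h; the name is rebound afterwards.
        h = [
            math.tanh(
                sum(W_xh[i][j] * x[j] for j in range(len(x)))
                + sum(W_hh[i][k] * h[k] for k in range(hidden_dim))
                + b_h[i]
            )
            for i in range(hidden_dim)
        ]
    return h


rng = random.Random(0)
input_dim, hidden_dim, n_chunks = 3, 4, 5  # illustrative sizes only
W_xh = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)] for _ in range(hidden_dim)]
W_hh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
b_h = [0.0] * hidden_dim

# One feature vector per time chunk of the admission (random stand-ins here).
chunk_vectors = [[rng.random() for _ in range(input_dim)] for _ in range(n_chunks)]
encoding = rnn_encode(chunk_vectors, W_xh, W_hh, b_h)
```

In a trained model the weights would be learned by backpropagation through time; this sketch only shows the data flow.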
Input to the RNN: Admission Timeline Chunks
Admission Training (Admission Encoder Module)
[Figure: admission encoder with an Aggregator for each of Chunk 1 through Chunk N]
Where can the Knowledge Graph help with our objective?
• It supplies the relations between previous “relevant” admissions (selected by admission properties, e.g., pre-diagnosis) and their diagnoses.
Relevant Admission Extraction
• Nearest-neighbor problem
– Given an input admission, find the top-K nearest admissions in the healthcare knowledge graph
• Approach: Weighted Personalized PageRank (WPPR)
– Weights
  • Disease – Symptom: importance of the symptom for the specific disease
  • Admission – Symptom/Disease: cosine similarity between the pre-diagnosis and the symptom/disease
– Take the top-K admissions by WPPR score as the most relevant ones
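A weighted personalized PageRank can be sketched with plain power iteration: a random walk follows edges in proportion to their weights and restarts at the input admission with a fixed probability. The slides give no implementation details, so the restart probability, the iteration count, the undirected treatment of edges, and the toy graph with its weights are all assumptions, and the function names are hypothetical rather than PGX calls.

```python
def undirected(weighted_pairs):
    """Build a symmetric adjacency map from (u, v, weight) triples."""
    adj = {}
    for u, v, w in weighted_pairs:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def weighted_ppr(adj, source, restart=0.15, iters=100):
    """Power iteration for personalized PageRank with weighted transitions."""
    score = {n: 0.0 for n in adj}
    score[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        nxt[source] = restart            # total mass is 1, so restart mass is constant
        for u, mass in score.items():
            total = sum(adj[u].values())
            if total == 0:               # isolated node: return its mass to the source
                nxt[source] += (1 - restart) * mass
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * mass * w / total
        score = nxt
    return score


def top_k(score, candidates, k):
    """Names of the k highest-scoring candidate nodes."""
    return sorted(candidates, key=lambda n: score[n], reverse=True)[:k]


# Toy graph: weights stand in for the similarity/importance values on the slide.
adj = undirected([
    ("adm_input", "fever", 0.9), ("adm_input", "cough", 0.8),
    ("adm_1", "fever", 0.9), ("adm_1", "cough", 0.7),
    ("adm_2", "cough", 0.1), ("adm_2", "rash", 0.9),
    ("flu", "fever", 0.6), ("flu", "cough", 0.5),
])
scores = weighted_ppr(adj, "adm_input")
nearest = top_k(scores, ["adm_1", "adm_2"], k=1)
```

In this toy graph `adm_1` shares two strongly weighted symptoms with the input admission, so it outranks `adm_2`, which is connected only through one weak edge.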
How do we employ the neighbors' information? – KG-RNN
• Encoding of the input admission
– Obtain the admission encoding for the input admission (similar to the baseline)
• Aggregation of the neighbors' diagnosis information
– Aggregate the diagnosis information of the neighbors and concatenate it with the encoding of the input admission
• Diagnosis prediction
– Use the concatenated vector to predict the diagnoses
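The three steps above (encode, aggregate neighbors, predict from the concatenation) can be sketched as below. The slides do not state the aggregation function or the prediction layer, so mean pooling over the neighbors' multi-hot diagnosis vectors and a single sigmoid layer are assumptions, as are all names and numbers in the example.

```python
import math


def aggregate_neighbors(neighbor_diagnoses):
    """Mean of the neighbors' multi-hot diagnosis vectors (assumed aggregator)."""
    if not neighbor_diagnoses:
        raise ValueError("at least one neighbor is required")
    m = len(neighbor_diagnoses)
    n_labels = len(neighbor_diagnoses[0])
    return [sum(vec[j] for vec in neighbor_diagnoses) / m for j in range(n_labels)]


def predict_diagnoses(encoding, neighbor_diagnoses, W, b):
    """Concatenate encoding and neighbor summary, then apply one sigmoid layer."""
    z = list(encoding) + aggregate_neighbors(neighbor_diagnoses)
    logits = [sum(w * x for w, x in zip(row, z)) + bias for row, bias in zip(W, b)]
    return [1.0 / (1.0 + math.exp(-logit)) for logit in logits]


encoding = [0.2, -0.1]                    # stand-in for the RNN's final state
neighbors = [[1, 0, 1], [1, 0, 0]]        # two neighbors, three diagnosis labels
summary = aggregate_neighbors(neighbors)

# Hand-picked weights that read only the neighbor part of the vector,
# to make the neighbors' influence visible; real weights would be learned.
W = [
    [0.0, 0.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 4.0],
]
b = [-2.0, -2.0, -2.0]
probs = predict_diagnoses(encoding, neighbors, W, b)
```

With these weights the first diagnosis, present in both neighbors, gets a probability above 0.5, the second, present in none, falls below 0.5, and the third, present in one of two, sits at 0.5.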
KG-RNN Prediction Module
[Figure: the final encoding of the input admission and the final encodings of Neighbor 1 through Neighbor M feed the Prediction Module, which outputs the diagnoses]
Dataset: MIMIC-III
• Available at https://mimic.physionet.org/
• MIMIC is a relational database containing data on patients who stayed in the intensive care units at Beth Israel Deaconess Medical Center
• Input: admissions
• Output: diagnoses (a multi-label, multi-class classification task)
– We focus on the 50 most frequent ones

Type  | Patients | Admissions | Input events | Output events | Lab tests  | Prescriptions
Count | 46,520   | 58,976     | 21,000,000   | 4,500,000     | 28,000,000 | 4,000,000
Results
• Incorporating neighbors' information improves KG-RNN over the baseline on MIMIC-III

Metric | Average | Model    | Score
AUROC  | Macro   | Baseline | 85.24%
AUROC  | Macro   | KG-RNN   | 86.29% (+1.05%)
AUROC  | Micro   | Baseline | 90.55%
AUROC  | Micro   | KG-RNN   | 91.03% (+0.48%)

Properties
– 3 hours per chunk
– 200 chunks per admission
– 25 events per type and per chunk
– 10 neighbors sampled per input admission
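The table reports AUROC under two averaging schemes. As a reminder of the difference: macro averaging computes one AUROC per diagnosis label and averages them, while micro averaging pools every (admission, label) pair into a single ranking. The sketch below uses the pairwise-comparison definition of AUROC in standard-library Python; the labels and scores are made up for illustration and are not from MIMIC-III.

```python
def auroc(y_true, y_score):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(Y, S):
    """Average of per-label AUROCs (every label counts equally)."""
    n_labels = len(Y[0])
    per_label = [
        auroc([row[j] for row in Y], [row[j] for row in S]) for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


def micro_auroc(Y, S):
    """AUROC over all (sample, label) pairs pooled together."""
    return auroc([t for row in Y for t in row], [s for row in S for s in row])


# Rows are admissions, columns are diagnosis labels (toy data).
Y = [[1, 0], [0, 1], [1, 0], [0, 0]]
S = [[0.9, 0.2], [0.3, 0.6], [0.4, 0.1], [0.5, 0.3]]
macro = macro_auroc(Y, S)
micro = micro_auroc(Y, S)
```

Because frequent labels contribute more pairs, micro AUROC is dominated by common diagnoses, whereas macro AUROC weights rare and common diagnoses alike.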
Example of Neighbor Boosting
• The baseline does not detect any diagnosis
• Neighbor information helped push KG-RNN's confidence upward for the 2nd diagnosis
• Shows how much KG-RNN relies on its neighbors
Prediction Timeline
• Our approach provides real-time diagnosis probabilities during an ongoing diagnosis, which can lead to better medications and treatments.
Summary
• Novel approach for processing evolving Knowledge Graphs with GraphML
• Multiple applications in HealthCare, Retail, Finance, Security
• Online articles for further reading
– https://blogs.oracle.com/ai/introduction-to-knowledge-graphs-in-healthcare
– https://blogs.oracle.com/ai/graph-machine-learning-for-enhanced-healthcare-services
• Tools
– PGX: https://docs.oracle.com/cd/E56133_01/latest/index.html
– PGQL: http://pgql-lang.org/
– GitHub repository: https://github.com/oracle/pgx-samples