This document summarizes a thesis research project on classifying chronic obstructive pulmonary disease (COPD) phenotypes with a Bayesian network model. The study used data from 100 COPD patients and 100 asthma patients to develop a probabilistic model. The Bayesian network achieved 98.75% accuracy on the test data set, classifying cases into phenotypes such as emphysema, chronic bronchitis, and asthmatic COPD. A neural network trained with the Levenberg-Marquardt algorithm, used for validation, reached a slightly lower accuracy of 96.25%. The results suggest that COPD phenotypes can be classified with high accuracy using a Bayesian network model without relying on pulmonary function tests.
v2 3rd (11-13 June 2015) KNH and UON Conference: Research as a Driver for Science & Technology Innovation for Health
2. 1.1 INTRODUCTION
WHO lists COPD as the 4th leading cause of death worldwide
90% of COPD deaths occur in middle- and low-income countries
COPD is prone to under-diagnosis and misdiagnosis (reported in the UK, Australia, Canada, etc.)
Known major risk factors:
Tobacco smoking,
alpha-1 antitrypsin (A1AT) deficiency, and
air pollution
3. 1.2 Previous Studies?
Previous studies have focused on using PFTs
Identify COPD phenotypes using variables
Classify COPD cases by severity (Stage 1 to 4), with Stage 1 being Mild and Stage 4 Very Severe (Figure 2.1)
Use data-driven phenotyping techniques to define COPD phenotypes, such as:
Cluster analysis
PCA
Factor analysis
Discriminant analysis
06/10/15 Amos Otieno Olwendo 3
4. 1.3 What's new?
Our model attempts NOT to use PFTs
Phenotyping is related to disease classification
It classifies COPD phenotypes based on
Morphology (appearance)
Function
Behavior
We use a Bayesian network
We achieved a classification accuracy of 98.75% on the test data set
7. 1.6 Research Scope
1. Determine the essential variables and parameters
2. Design the probabilistic model used in this research
3. Determine whether a given patient case has COPD
4. Identify the consequent COPD Phenotype
4.1 Emphysema
4.2 Chronic bronchitis
4.3 General COPD (amalgamation of bronchitis and emphysema)
4.4 Asthmatic COPD (amalgamation of asthma and any other phenotype(s))
5. Ascertain whether the given patient has Asthma
6. Severity (next step; staging as in Figure 2.1)
7. Determine cause-effect relationships among variables
11. 2.4 Spirometry & Barriers
FEV1 decline predicts the patient's future course
Equipment and training costs
Low confidence in the use and interpretation of the results
Perceived lack of utility
Quality assurance issues
Physical demand on the patient to use the spirometer, especially for the elderly and those experiencing respiratory difficulty
12. 2.5 What is Modeling?
A research technique that connects
empiricism to theory, and
experiments to theory construction and validation
Here: the BN is used as the knowledge base, with the laws of probability theory as the reasoning engine
13. 2.6 Why PGMs?
1. Ability to handle vagueness that results from:
i. Biased or incomplete understanding of the event at hand
ii. Noisy observations
iii. Phenomena not represented in the model
iv. Randomness of real-life events
2. Human reasoning is based on facts and assumptions
3. Degrees of belief are probabilistic and adjustable based on evidence
4. Intuitive, with a compact data structure
14. 2.7 Why BNs?
Representation: a directed acyclic graph (DAG)
Composed of random variables X1, ..., Xn organized as
Query,
Non-query, and
Evidence variables
Each variable/factor has a corresponding CPT
Parent-child relationships among variables are represented as CPDs, which together define P(X1, ..., Xn)
Inference: exact and approximate
Learning: both parameters and structure, with complete or incomplete data (through MLE)
Employs the chain rule for BNs
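The chain rule for BNs states that P(X1, ..., Xn) is the product, over all variables, of P(Xi | Parents(Xi)). The sketch below illustrates this on a hypothetical three-node network (Smoking -> COPD -> Dyspnea) and performs exact inference by enumeration. The structure, the variable names, and every probability are invented for illustration; they are not the model or the parameters used in this study.

```python
# Hypothetical network: Smoking -> COPD -> Dyspnea.
# All probabilities are made-up illustration values, not study results.
P_SMOKING = {True: 0.3, False: 0.7}
P_COPD_GIVEN_SMOKING = {
    True: {True: 0.4, False: 0.6},
    False: {True: 0.05, False: 0.95},
}
P_DYSPNEA_GIVEN_COPD = {
    True: {True: 0.8, False: 0.2},
    False: {True: 0.1, False: 0.9},
}


def joint(smoking, copd, dyspnea):
    """Chain rule for BNs: multiply each node's CPT entry given its parents."""
    return (
        P_SMOKING[smoking]
        * P_COPD_GIVEN_SMOKING[smoking][copd]
        * P_DYSPNEA_GIVEN_COPD[copd][dyspnea]
    )


def posterior_copd(dyspnea):
    """Exact inference by enumeration: P(COPD=True | Dyspnea=dyspnea)."""
    values = (True, False)
    numerator = sum(joint(s, True, dyspnea) for s in values)
    evidence = sum(joint(s, c, dyspnea) for s in values for c in values)
    return numerator / evidence


if __name__ == "__main__":
    print(posterior_copd(True))
```

Here COPD is the query variable, Dyspnea the evidence variable, and Smoking a non-query variable that is summed out.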
15. Knowledge Engineering
Knowledge acquisition
- elicitation,
- collection,
- analysis,
- modeling, and
- validation
Knowledge representation and reasoning
Types of CPDs
Noisy-OR CPD: common in medical diagnosis applications
Sigmoid CPD: best design approach (personal opinion)
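As a rough illustration of the noisy-OR CPD mentioned above: each present cause independently fails to trigger the effect with some inhibition probability, and the effect is absent only if every present cause fails. The cause names and inhibition values below are hypothetical and are not taken from the study.

```python
def noisy_or(active_causes, inhibit, leak=0.0):
    """Return P(effect = True | active causes) under a noisy-OR CPD.

    inhibit[c] is the probability that cause c, when present, fails to
    produce the effect. leak is the probability that the effect occurs
    even though no modelled cause is present.
    """
    p_no_effect = 1.0 - leak
    for cause in active_causes:
        p_no_effect *= inhibit[cause]
    return 1.0 - p_no_effect


# Hypothetical inhibition probabilities, for illustration only.
INHIBIT = {"smoking": 0.4, "air_pollution": 0.7}

if __name__ == "__main__":
    print(noisy_or(["smoking"], INHIBIT))
    print(noisy_or(["smoking", "air_pollution"], INHIBIT))
```

The appeal for diagnosis models is that a node with k parents needs only k inhibition parameters (plus an optional leak) rather than a full table with 2^k rows.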
17. 3.1 METHODOLOGY
Non-interventional (observational) retrospective study [Experimental study]
Conducted at Loghman Hakim, Heart & Lung Division
Tehran: a city with high levels of air pollution (especially during winter)
18. 3.2 Methodology
This study was conducted from August 2013 to January 2014
This unit receives approximately 420 COPD patients and 4200 Asthma patients monthly
19. 3.3 Methodology
Sample size: 100 COPD + 100 Asthma
The study environment was composed of
Dr. Agin,
2 resident physicians (worked with Dr. Agin), and
2 nurses (1 translator + Dr. Agin-patient contact)
Amos conducted structured interviews, and data were recorded using a structured checklist
21. 3.5 Checklist
The checklist had 10 questions
Parameters were measured through
self-report and/or
observation
Checklist design:
patients had to commit to their parameter choices by choosing a number between 0 and 10
Each interview result was cross-checked against the reference standards
22. 3.6 COPD & Asthma Diagnosis
Patient History
History of present illness
Past medical history
Family history of COPD and Asthma
Social history of the patient (exposure to irritants)
Physical Exam
Review of systems
Visual examination (including palpation and percussion)
Listening to the lungs (stethoscope)
Physical activity
23. 3.7 COPD & Asthma Diagnosis
Pulmonary Function Tests (PFTs)
e.g. spirometry / bronchodilators (nebulizer)
X-ray if necessary
Vital signs
Examine O2 and CO2 in the blood (pulse oximeter)
Blood pressure (sphygmomanometer exam)
27. 3.11 Data Analysis
Primary tool: the Bayesian network
Model validation: NN based on the LM algorithm
The dataset was divided into
60% for training
40% for testing
To ensure an even distribution and representation, we
grouped cases by phenotype (per group: target and control), then
assigned an identifier to each case, then
used simple random sampling to determine which cases were used for training and which for testing
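The grouping and sampling steps above amount to a stratified 60/40 split. A minimal sketch follows, assuming two groups (COPD and Asthma) with 100 cases each; the identifiers, the group labels, the seed, and the function name `stratified_split` are all hypothetical, and the study itself grouped cases by phenotype rather than by these two labels.

```python
import random
from collections import defaultdict


def stratified_split(cases, train_fraction=0.6, seed=0):
    """Split (case_id, group) pairs into train and test lists.

    Simple random sampling is applied inside each group so that every
    group keeps the same train/test proportion.
    """
    by_group = defaultdict(list)
    for case_id, group in cases:
        by_group[group].append(case_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for group in sorted(by_group):
        ids = sorted(by_group[group])
        n_train = round(train_fraction * len(ids))
        chosen = set(rng.sample(ids, n_train))
        train += [i for i in ids if i in chosen]
        test += [i for i in ids if i not in chosen]
    return train, test


# Hypothetical identifiers standing in for the 100 + 100 patient cases.
cases = [(f"COPD-{i:03d}", "COPD") for i in range(100)]
cases += [(f"ASTHMA-{i:03d}", "Asthma") for i in range(100)]
train_ids, test_ids = stratified_split(cases)
```

With these assumptions the split yields 120 training cases and 80 test cases, 40 from each group.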
28. 3.12 Data Analysis
Developed a C++ application that,
through case analysis,
assigns a real number between negative and positive infinity to each patient case (using MLE)
Loaded these results into SQL Server and R statistical software to obtain graphical outputs
35. 4.5 RESULTS: Summary
Classification of the test data set, percentage (%)
Category    Bayesian Network    Levenberg-Marquardt Algorithm
COPD        97.50               92.50
Asthma      100                 100
Overall     98.75               96.25
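The overall figures are consistent with a case-weighted average of the per-category figures. The check below assumes the test set held 40 COPD and 40 asthma cases, which is what a 40% share of 100 + 100 cases gives if the split was made per group; the slides do not state the test-set counts explicitly, so treat them as an assumption.

```python
def overall_accuracy(per_class_pct, per_class_n):
    """Case-weighted average of per-class accuracies, in percent."""
    total = sum(per_class_n.values())
    weighted = sum(per_class_pct[k] * per_class_n[k] for k in per_class_n)
    return weighted / total


# Assumed test-set sizes: 40% of 100 COPD and 40% of 100 asthma cases.
TEST_COUNTS = {"COPD": 40, "Asthma": 40}

bayesian_network = overall_accuracy({"COPD": 97.50, "Asthma": 100.0}, TEST_COUNTS)
levenberg_marquardt = overall_accuracy({"COPD": 92.50, "Asthma": 100.0}, TEST_COUNTS)

if __name__ == "__main__":
    print(bayesian_network, levenberg_marquardt)
```

Under that assumption, 97.50% on COPD corresponds to 39 of 40 cases and 92.50% to 37 of 40, giving 79/80 and 77/80 overall.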
39. 5.1 DISCUSSION
1. COPD burden worldwide is underestimated (could be worse than reported)
2. COPD under-diagnosis and/or misdiagnosis should not pose the challenges it currently does to clinicians
3. Increasing cases of COPD could be a result of changes in some social behaviors that affect COPD development and progression
3.1 Such behaviors may include:
3.1.1 increasing number of female smokers and
3.1.2 increasing number of teenage smokers
4. Worst-hit populations are in middle- to low-income countries (inadequate healthcare services)
40. 5.2 SUGGESTIONS
1. Increased COPD awareness at the community level
1.1 Anti-smoking campaigns
1.2 Reduced exposure to second-hand cigarette smoke (creation of designated smoking areas)
1.3 Cooking using firewood/cow dung in poorly ventilated environments
1.4 Airflow obstruction symptoms
1.5 Legal measures: who can smoke and/or buy cigarettes
41. 5.3 SUGGESTIONS
2. Population-based screening (targeted case finding)
2.1 whenever an individual presents to a health care worker
2.2 May be useful in identifying those at risk
3. Need for screening devices since
3.1 certain localities lack specialists and/or
3.2 equipment (PFT devices, other test materials like bronchodilators, X-ray machines, and possibly computers and/or internet)
#15: 1. Knowledge engineering (representation approach) is paramount
#23, #24: Sign: objective evidence of a disease that can be observed or tested
Symptom: evidence of a disease; sometimes limited to the subjective evidence of the disease as experienced by the individual, such as pain, dizziness, or weakness