2. Fateme Pourghasem
Founder at BaMaMouz
Health-IT Student at Tehran University of Medical Sciences
Research Assistant at Non-Communicable Diseases Research Center (NCDRC)
Entrepreneurship Mentor at National Scientific Olympiad for Medical Sciences
40. Input (A) | Output (B) | Application
Email | Spam? (0/1) | Spam filtering
Audio | Text transcript | Speech recognition
English | Persian | Machine translation
Image, radar info | Position of other cars | Self-driving car
Image of phone | Defect? (0/1) | Visual inspection
50. Sahand Samiei
Medical Advisor at Nabz Group, AI Team
Medical Extern at Tehran University of Medical Sciences
Innovation Administrator at TUMS Exceptional Talents Development Center
51. Where are we?!
Job Positions Domain Expert or Interdisciplinary Programmer
Common Literature & Coordination
Problem-Solution Fit Assessment
Generalism
52. A.I. top benefits in health industry
Systematic collection of big data & Information accessibility
Careful decisions with high accuracy
Streamlining processes with high speed
Scalability of health services
Reduce costs
58. Impact of Class Imbalance on Loss Calculation
Examples | Prediction Probabilities | Loss | Weighted Loss
P1 Normal | 0.5 | 0.3 | 2/8 x 0.3 = 0.075
P2 Normal | 0.5 | 0.3 | 2/8 x 0.3 = 0.075
P3 Normal | 0.5 | 0.3 | 2/8 x 0.3 = 0.075
P4 Mass | 0.5 | 0.3 | 6/8 x 0.3 = 0.225
P5 Normal | 0.5 | 0.3 | 2/8 x 0.3 = 0.075
P6 Normal | 0.5 | 0.3 | 2/8 x 0.3 = 0.075
P7 Mass | 0.5 | 0.3 | 6/8 x 0.3 = 0.225
P8 Normal | 0.5 | 0.3 | 2/8 x 0.3 = 0.075
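The weighting in the table above can be reproduced with a short sketch. It assumes the per-example loss is a base-10 log loss (which gives about 0.3 for a probability of 0.5, matching the slide) and that each class is weighted by the frequency of the other class; the function name `weighted_losses` is hypothetical.

```python
import math

# P1..P8 from the slide: 1 = Mass, 0 = Normal
labels = [0, 0, 0, 1, 0, 0, 1, 0]
# Predicted probability of Mass for every example (0.5 on the slide)
probs = [0.5] * 8


def weighted_losses(labels, probs):
    """Per-example log loss, weighted by the frequency of the opposite class."""
    n = len(labels)
    w_pos = labels.count(0) / n  # weight for Mass examples: 6/8
    w_neg = labels.count(1) / n  # weight for Normal examples: 2/8
    losses = []
    for y, p in zip(labels, probs):
        if y == 1:
            losses.append(w_pos * -math.log10(p))
        else:
            losses.append(w_neg * -math.log10(1 - p))
    return losses


losses = weighted_losses(labels, probs)
mass_total = sum(l for l, y in zip(losses, labels) if y == 1)
normal_total = sum(l for l, y in zip(losses, labels) if y == 0)
print(round(mass_total, 3), round(normal_total, 3))
```

With these weights, the two Mass examples and the six Normal examples contribute the same total to the loss, which is the point of the weighting.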
59. Resampling to Achieve Balanced Classes
Examples: P1 Normal, P2 Normal, P3 Normal, P4 Mass, P5 Normal, P6 Normal, P7 Mass, P8 Normal
Re-Sampled | Prediction Probabilities | Loss
P3 Normal | 0.5 | 0.3
P6 Normal | 0.5 | 0.3
P1 Normal | 0.5 | 0.3
P8 Normal | 0.5 | 0.3
P7 Mass | 0.5 | 0.3
P4 Mass | 0.5 | 0.3
P7 Mass | 0.5 | 0.3
P4 Mass | 0.5 | 0.3
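A minimal sketch of the resampling idea: draw the same number of examples from each class, sampling the minority class with replacement when it has too few members. The helper `resample_balanced` and the fixed seed are assumptions for illustration, not part of the deck.

```python
import random

examples = [
    ("P1", "Normal"), ("P2", "Normal"), ("P3", "Normal"), ("P4", "Mass"),
    ("P5", "Normal"), ("P6", "Normal"), ("P7", "Mass"), ("P8", "Normal"),
]


def resample_balanced(examples, per_class, seed=0):
    """Return a batch with `per_class` examples drawn from every class."""
    rng = random.Random(seed)
    by_class = {}
    for pid, label in examples:
        by_class.setdefault(label, []).append((pid, label))
    batch = []
    for items in by_class.values():
        if len(items) >= per_class:
            # Enough examples: sample without replacement
            batch.extend(rng.sample(items, per_class))
        else:
            # Too few examples: sample with replacement (duplicates expected)
            batch.extend(rng.choices(items, k=per_class))
    return batch


batch = resample_balanced(examples, per_class=4)
n_mass = sum(1 for _, label in batch if label == "Mass")
n_normal = sum(1 for _, label in batch if label == "Normal")
print(n_mass, n_normal)
```

Because both classes are equally represented in the resampled batch, the unweighted loss can be used directly.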
62. Dataset Size
~ 10K Samples
~ 100K Samples
...
Solutions
1. Pretraining + Fine Tuning (Transfer Learning)
2. Data Augmentation
CNN Penguin / Cat / Dog
CNN Mass / Pneumonia / Edema
63. Key Challenges
Multi-Task: Multi-Label Loss
Class Imbalance: Weighted Loss / Resampling
Dataset Size: Transfer Learning / Data Augmentation
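For the multi-task case, one common form of a multi-label loss is the sum of independent binary cross-entropy terms, one per finding. The sketch below assumes that form, uses the natural logarithm, and borrows the three labels Mass / Pneumonia / Edema from the earlier slide; `multi_label_loss` is a hypothetical name.

```python
import math


def multi_label_loss(y_true, y_prob):
    """Sum of binary cross-entropy terms, one per label."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


# One chest X-ray, labels in the order (Mass, Pneumonia, Edema)
y_true = [1, 0, 0]
y_prob = [0.5, 0.5, 0.5]
loss = multi_label_loss(y_true, y_prob)
print(round(loss, 4))
```

Each label is scored on its own, so a single image can be positive for several findings at once.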
64. Model Testing
Dataset
Training Set: Development of models
Validation Set: Tuning and selection of models
Test Set: Reporting of results
Cross validation
65. Patient Overlap
Dataset, split by image
Training Set | Validation Set | Test Set

Mass | P. ID | Image | ...
1 | 20 | CXR1.JPG | ...
0 | 17 | CXR4.JPG | ...
0 | 32 | CXR7.JPG | ...
...

Mass | P. ID | Image | ...
1 | 20 | CXR2.JPG | ...
0 | 11 | CXR5.JPG | ...
0 | 32 | CXR9.JPG | ...
...

Mass | P. ID | Image | ...
1 | 20 | CXR0.JPG | ...
0 | 38 | CXR3.JPG | ...
0 | 32 | CXR8.JPG | ...
...
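The problem with splitting by image can be checked mechanically: any patient ID that appears in more than one set is a leak. The sketch below assumes the three tables on this slide correspond, in order, to the training, validation, and test sets; `overlapping_patients` is a hypothetical helper.

```python
# (image, patient ID) pairs; the assignment of tables to sets is an assumption
splits = {
    "train": [("CXR1.JPG", 20), ("CXR4.JPG", 17), ("CXR7.JPG", 32)],
    "validation": [("CXR2.JPG", 20), ("CXR5.JPG", 11), ("CXR9.JPG", 32)],
    "test": [("CXR0.JPG", 20), ("CXR3.JPG", 38), ("CXR8.JPG", 32)],
}


def overlapping_patients(splits):
    """Return the patient IDs that occur in more than one split."""
    seen = {}
    for name, rows in splits.items():
        for _, pid in rows:
            seen.setdefault(pid, set()).add(name)
    return {pid for pid, names in seen.items() if len(names) > 1}


leaked = overlapping_patients(splits)
print(sorted(leaked))
```

Here patients 20 and 32 show up in every set, so a model could score well on the test set by recognizing the patient rather than the finding.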
66. Patient Overlap
Dataset, split by patient
Training Set | Validation Set | Test Set

Mass | P. ID | Image | ...
1 | 20 | CXR1.JPG | ...
1 | 20 | CXR2.JPG | ...
1 | 20 | CXR0.JPG | ...
...

Mass | P. ID | Image | ...
0 | 32 | CXR7.JPG | ...
0 | 32 | CXR8.JPG | ...
0 | 32 | CXR9.JPG | ...
...

Mass | P. ID | Image | ...
0 | 17 | CXR4.JPG | ...
0 | 38 | CXR3.JPG | ...
0 | 11 | CXR5.JPG | ...
...
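Splitting by patient can be sketched by assigning patient IDs, not images, to the sets, so every image of a given patient lands in exactly one set. The function `split_by_patient`, its parameters, and the seed are assumptions for illustration; the records are the nine images shown on the slide.

```python
import random

# (image, patient ID, mass label) for the images shown on the slide
records = [
    ("CXR0.JPG", 20, 1), ("CXR1.JPG", 20, 1), ("CXR2.JPG", 20, 1),
    ("CXR3.JPG", 38, 0), ("CXR4.JPG", 17, 0), ("CXR5.JPG", 11, 0),
    ("CXR7.JPG", 32, 0), ("CXR8.JPG", 32, 0), ("CXR9.JPG", 32, 0),
]


def split_by_patient(records, n_val_patients=1, n_test_patients=1, seed=0):
    """Assign whole patients to train / validation / test."""
    patient_ids = sorted({pid for _, pid, _ in records})
    random.Random(seed).shuffle(patient_ids)
    test_ids = set(patient_ids[:n_test_patients])
    val_ids = set(patient_ids[n_test_patients:n_test_patients + n_val_patients])
    splits = {"train": [], "validation": [], "test": []}
    for rec in records:
        pid = rec[1]
        if pid in test_ids:
            splits["test"].append(rec)
        elif pid in val_ids:
            splits["validation"].append(rec)
        else:
            splits["train"].append(rec)
    return splits


splits = split_by_patient(records)
ids = {name: {pid for _, pid, _ in rows} for name, rows in splits.items()}
print({name: sorted(s) for name, s in ids.items()})
```

Whatever the shuffle produces, no patient ID can appear in two sets, which removes the overlap seen when splitting by image.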
67. Set Sampling
10% of data
----------------
120 CT Scans
400-500 X-Rays
130 Whole Slides
X% of minority class!
68. Ground Truth (Reference Standard)
Mass ??
1) Consensus voting: Yes + Yes + No = Yes
2) More definitive test
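Consensus voting can be sketched as a strict majority vote over the readers' answers. The function name `consensus` and the choice to return `None` on a tie are assumptions; the deck only shows the three-reader case.

```python
from collections import Counter


def consensus(votes):
    """Return the label chosen by a strict majority of readers, else None."""
    label, count = Counter(votes).most_common(1)[0]
    if count * 2 > len(votes):
        return label
    return None  # no majority, e.g. a tie between two readers


result = consensus(["Yes", "Yes", "No"])
print(result)
```

When there is no majority, the case could be referred to a more definitive test, the second option on the slide.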