Dark Side and Bright Side of AI in Medicine
(and how to deal with it)
Parisa Rashidi, PhD
University of Florida
AI: Good Force?
Source: Google Image Search, AI Quotes
AI: Dark Side?
Adversarial Attacks
• A small, carefully designed change to an input or to a model, made to intentionally force the model into a mistake (a minimal sketch of such a perturbation follows this slide).
• Why they are a real risk in clinical applications:
  • Claim reimbursement is handled by algorithms.
  • Digital surrogates of patient response are used in drug trials and approval decisions.
  • Health care is full of competing interests, e.g. insurance claims: providers seek to maximize reimbursement while payers seek to minimize it.
  • Billions of dollars are at stake in these systems' outputs.
Finlayson, Samuel G., John D. Bowers, Joichi Ito, Jonathan L. Zittrain, Andrew L. Beam, and Isaac S. Kohane. "Adversarial attacks on medical machine learning." Science 363, no. 6433 (2019): 1287-1289.
[Figure: a benign input plus a small adversarial perturbation is classified as malignant]
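To make the "small, carefully designed change" concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one standard way such perturbations are crafted. The model, loss function, and epsilon budget are illustrative assumptions, not details from the cited paper.

```python
import torch

def fgsm_perturbation(model, x, y, loss_fn, epsilon=0.01):
    """Craft an adversarial example with the fast gradient sign method.

    x: input batch (e.g., a dermoscopy image), y: correct labels,
    epsilon: per-pixel perturbation budget. The change is visually
    negligible but is chosen to push the model toward a wrong answer.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)          # loss with respect to the true label
    loss.backward()                      # gradient of the loss w.r.t. the input
    x_adv = x + epsilon * x.grad.sign()  # step in the direction that increases the loss
    return x_adv.clamp(0, 1).detach()    # keep pixel values in a valid range
```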
Adversarial Attack Examples
Ethical gray zone (a dermatologist could hold the camera at any angle)
Suggested by the Endocrine Society
Finlayson, Samuel G., John D. Bowers, Joichi Ito, Jonathan L. Zittrain, Andrew L. Beam, and Isaac S. Kohane. "Adversarial attacks on medical machine learning." Science 363, no. 6433 (2019): 1287-1289.
Clinical Systems' Vulnerability to Adversarial Attacks
• Ground truth is often ambiguous.
• Medical imaging is highly standardized, so attacks do not need to meet the same standards of invariance (to angle, lighting, etc.) required in natural-image settings.
• Commodity network architectures are often used (think ImageNet).
• Medical data interchange is limited, with no universal mechanism for authentication.
• Hospital infrastructure is hard to update.
• There are many potential adversaries.
Solutions
• Active engagement and dialogue among medical, technical, legal, and ethical experts.
• Leave defenses to end users rather than imposing preemptive solutions? (the procrastination principle)
  • Preemptive regulation risks a rigid regulatory structure that stalls development.
• Resilience is difficult: breaking systems is easier than protecting them.
• Incremental, defensive, short-term steps, e.g. a fingerprint hash of the data (see the sketch after this slide).
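As one way to read the "fingerprint hash of data" suggestion, here is a minimal sketch that hashes an image at acquisition time so later tampering, including adversarial perturbation, can be detected; the file names and workflow are hypothetical.

```python
import hashlib

def fingerprint(path: str) -> str:
    """Return a SHA-256 fingerprint of a file's raw bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical workflow: record the hash at acquisition time, then verify it
# before the image is fed to a model or attached to a claim.
# original = fingerprint("scan_at_acquisition.dcm")
# assert fingerprint("scan_submitted_for_claim.dcm") == original, "data was modified"
```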
Reidentification of Study Participants
Schwarz, Christopher G., Walter K. Kremers, Terry M. Therneau, Richard R. Sharp, Jeffrey L. Gunter,
Prashanthi Vemuri, Arvin Arani et al. "Identification of Anonymous MRI Research Participants with Face-
Recognition Software." New England Journal of Medicine 381, no. 17 (2019): 1684-1686.
[Figure: face-recognition software correctly matched 83% of 84 research volunteers to their MRI scans]
Bias in AI
• Bias was found in an algorithm sold by a leading health services company, Optum, and used to guide care decisions for millions of people.
• Correcting the bias would more than double the number of Black patients flagged as at risk of complicated medical needs, collectively accounting for 48,772 additional chronic conditions (a minimal audit sketch follows this slide).
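To illustrate the kind of audit behind this finding, here is a minimal sketch that compares flag rates and the actual burden of chronic conditions across groups at a fixed risk-score threshold. The function, inputs, and threshold are made up for illustration; this is not the Optum algorithm or the study's data.

```python
import numpy as np

def audit_flag_rates(scores, group, n_chronic, threshold):
    """At a fixed risk-score threshold, compare groups on two quantities:
    how often they are flagged, and how sick the flagged patients actually are
    (number of chronic conditions as a rough proxy for true need)."""
    flagged = scores >= threshold
    report = {}
    for g in np.unique(group):
        in_g = group == g
        sel = in_g & flagged
        report[g] = {
            "flag_rate": flagged[in_g].mean(),
            "mean_chronic_conditions_when_flagged": n_chronic[sel].mean() if sel.any() else float("nan"),
        }
    return report

# If one group carries substantially more chronic conditions at the same score,
# the score is understating that group's need, which is the bias described above.
```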
Safety: Design & Evaluation Decisions
Spiegelhalter, David. "Should We Trust Algorithms?." Harvard Data Science Review 2, no. 1 (2020).
Safety: Hacking the Performance Metric
• In practice, human doctors remain hyper-vigilant about a high-risk subtype even when it is rare.
• Existing AI models do show concerning error rates on clinically important subsets despite encouraging aggregate performance metrics (a minimal subgroup-evaluation sketch follows this slide).
[Figure: human vs. AI system compared on mistaken cases that are not critical and mistaken cases that are critical]
Oakden-Rayner, Luke, Jared Dunnmon, Gustavo Carneiro, and Christopher R辿. "Hidden stratification causes
clinically meaningful failures in machine learning for medical imaging." arXiv preprint arXiv:1909.12475 (2019).
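A minimal sketch (not from the cited paper) of reporting error rates on a clinically critical subgroup alongside the aggregate metric, which is how this kind of hidden stratification is usually surfaced; the arrays below are made-up data.

```python
import numpy as np

def subgroup_error_rates(y_true, y_pred, critical_mask):
    """Compare the aggregate error rate with the error rate on a critical subgroup."""
    errors = y_true != y_pred
    return {
        "aggregate_error": errors.mean(),
        "critical_subgroup_error": errors[critical_mask].mean(),
        "subgroup_prevalence": critical_mask.mean(),
    }

# Made-up example: a 2% overall error rate hides a 40% error rate on a rare,
# clinically critical subtype (5% of cases).
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.array([0] * 95 + [1, 1, 1, 0, 0])
print(subgroup_error_rates(y_true, y_pred, y_true == 1))
```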
Deep Learning's Black Box
• Deep learning benefits:
  • No need for manual feature engineering,
  • Improved performance,
  • Ability to capitalize on large amounts of data.
• Significant drawback for clinical tasks: models are not interpretable.
• A model's decisions and the features it relies on cannot be inherently determined, resulting in:
  • Lack of trust,
  • Inability to diagnose problems,
  • Undetected issues.
Interpretability
Shickel, Benjamin, Tyler J. Loftus, Lasith Adhikari, Tezcan Ozrazgat-Baslanti, Azra Bihorac, and Parisa Rashidi.
"DeepSOFA: A Continuous Acuity Score for Critically Ill Patients using Clinically Interpretable Deep Learning."
Scientific reports 9, no. 1 (2019): 1879.
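As a generic example of post-hoc interpretability, here is a minimal gradient saliency sketch (this is not the specific method used in DeepSOFA); the model and inputs are assumed placeholders.

```python
import torch

def saliency_map(model, x):
    """Gradient saliency: how much does each input feature influence the
    model's top prediction? Large absolute gradients mark influential
    pixels or clinical variables."""
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    top_scores = model(x).max(dim=1).values  # score of the predicted class per example
    top_scores.sum().backward()
    return x.grad.abs()
```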
Sustainability: Red AI vs. Green AI
• Expensive processing of a single example: the best version of AlphaGo required 1,920 CPUs and 280 GPUs to play a single game of Go, at a cost of over $1,000 per hour.
• Massive numbers of experiments: Google researchers trained over 12,800 neural networks during a neural architecture search to improve performance on object detection and language modeling.
• The amount of compute used to train deep learning models increased 300,000x in 6 years (a rough cost sketch follows this slide).
Schwartz, R., J. Dodge, and N. A. Smith. "Green AI." arXiv preprint arXiv:1907.10597 (2019).
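A back-of-the-envelope sketch of how such training costs add up; the price per GPU-hour and power draw are assumptions for illustration, not figures from the Green AI paper.

```python
def training_cost(num_gpus, hours, usd_per_gpu_hour=3.0, kw_per_gpu=0.3):
    """Rough cost of a training run or experiment sweep in GPU-hours, dollars, and kWh."""
    gpu_hours = num_gpus * hours
    return {
        "gpu_hours": gpu_hours,
        "dollars": gpu_hours * usd_per_gpu_hour,
        "kWh": gpu_hours * kw_per_gpu,
    }

# Hypothetical sweep in the spirit of the example above: 12,800 models,
# each trained for 2 hours on a single GPU.
print(training_cost(num_gpus=1, hours=2 * 12_800))
```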
A Framework for AI Quality and Safety in Medicine
Challen, Robert, Joshua Denny, Martin Pitt, Luke Gompels, Tom Edwards, and Krasimira Tsaneva-
Atanasova. "Artificial intelligence, bias and clinical safety." BMJ Qual Saf 28, no. 3 (2019): 231-237.
A General Framework for AI Quality and Safety in Medicine
Challen, Robert, Joshua Denny, Martin Pitt, Luke Gompels, Tom Edwards, and Krasimira Tsaneva-Atanasova.
"Artificial intelligence, bias and clinical safety." BMJ Qual Saf 28, no. 3 (2019): 231-237.
Dos and Don'ts
Beil, Michael, Ingo Proft, Daniel van Heerden, Sigal Sviri, and Peter Vernon van Heerden. "Ethical considerations about artificial intelligence for prognostication in intensive
care." Intensive Care Medicine Experimental 7, no. 1 (2019): 70.
Fair, Accountable, and Transparent (FAT) algorithms

Editor's Notes

  1. As recently as 2013, most hospitals were operating using the ninth edition of the ICD coding scheme (ICD-9), published in 1978, despite the fact that a revised version (ICD-10) was published in 1990.
  2. Ethics in Human-AI Interactions: (1) It shouldn't violate others' freedom. (2) The benefit created should outweigh the risk. (3) Benefit and risk should be distributed fairly to everyone.