AI systems offer substantial benefits in clinical applications, but also carry risks. Adversarial attacks can intentionally force models to make mistakes, and medical data is vulnerable because authentication mechanisms are limited. Algorithmic bias can negatively affect patient care. Interpretability matters for trust, for diagnosing problems, and for safety. Frameworks are needed to develop AI with quality, safety, and accountability.
Dark Side and Bright Side of AI in Medicine
1. Dark Side and Bright Side of AI in Medicine
(and how to deal with it)
Parisa Rashidi, PhD
University of Florida
3. Adversarial Attacks
A small, carefully designed change to an input or a model, intended to force the model to make a mistake.
Why adversarial attacks matter in clinical applications:
Claim reimbursement is handled by algorithms.
Digital surrogates of patient response are used in drug trials and approval decisions.
Health care is full of competing interests, e.g. insurance claims: providers seek to maximize reimbursement while payers seek to minimize it.
Billions of dollars are at stake in these systems' outputs.
Finlayson, Samuel G., John D. Bowers, Joichi Ito, Jonathan L. Zittrain, Andrew L. Beam, and Isaac S. Kohane. "Adversarial attacks on medical machine learning." Science 363, no. 6433 (2019): 1287-1289.
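The canonical white-box attack is the fast gradient sign method (FGSM): nudge each input feature by a small epsilon in the direction that increases the model's loss. The sketch below is a generic NumPy illustration on a toy logistic-regression "diagnosis" model, not code from the Finlayson et al. paper; the weights, input, and epsilon are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y_true, epsilon=0.05):
    """Shift x by epsilon in the sign of the loss gradient w.r.t. x.

    For logistic regression with cross-entropy loss, the gradient of
    the loss with respect to the input is (p - y) * w, where p is the
    predicted probability.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y_true) * w          # d(loss)/dx for cross-entropy
    return x + epsilon * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=100)               # toy model weights
b = 0.0
x = rng.normal(size=100)               # toy "benign" input
y = 0.0                                # true label: benign

x_adv = fgsm_perturb(x, w, b, y)
print("clean score:", sigmoid(w @ x + b))
print("adversarial score:", sigmoid(w @ x_adv + b))  # pushed toward "malignant"
```

Each feature moves by at most epsilon, so the perturbed input can remain visually indistinguishable from the original while the prediction flips.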
[Figure: a benign input plus a small adversarial perturbation is classified as malignant.]
4. Adversarial Attack Examples
[Figure: examples from Finlayson et al. One shows an ethical gray zone in dermatology: a dermatologist could hold the camera at any angle, so an adversarial rotation of a lesion image resembles legitimate variation. Another shows a billing-code combination suggested by the Endocrine Society.]
Finlayson, Samuel G., John D. Bowers, Joichi Ito, Jonathan L. Zittrain, Andrew L. Beam, and Isaac S. Kohane. "Adversarial attacks on medical machine learning." Science 363, no. 6433 (2019): 1287-1289.
5. Clinical Systems' Vulnerability to Adversarial Attacks
Ground truth is often ambiguous.
Medical imaging is highly standardized, so attacks do not need to meet the same standards of invariance required in natural-image settings.
Commodity network architectures are often used (think ImageNet models), so attackers know what they are targeting.
Medical data interchange is limited, with no universal mechanism for authentication.
Hospital infrastructure is hard to update.
There are many potential adversaries.
6. Solutions
Active engagement and dialogue between medical, technical, legal, and ethical experts.
Leave defenses to endpoint users rather than imposing preemptive solutions (the "procrastination principle")? Otherwise we risk a rigid regulatory structure that stalls development.
Resilience is difficult: breaking systems is easier than protecting them.
Incremental, defensive, short-term steps, e.g. a fingerprint hash of the data (see the sketch below).
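One such short-term step is recording a cryptographic fingerprint of each image or record at acquisition time, so any later tampering, including adversarial perturbation of pixel values, becomes detectable. A minimal sketch using Python's standard hashlib; the manifest format and file paths are hypothetical.

```python
import hashlib
import json
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def record_fingerprint(path: Path, manifest: Path) -> None:
    """Append the file's digest to a manifest at acquisition time."""
    entry = {"file": str(path), "sha256": fingerprint(path)}
    with manifest.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def verify(path: Path, expected_sha256: str) -> bool:
    """True if the file still matches its recorded fingerprint.

    Any post-acquisition modification, even an imperceptible one,
    changes the digest.
    """
    return fingerprint(path) == expected_sha256
```

A hash detects modification after acquisition but not an input that was adversarial from the start, so it complements rather than replaces model-side defenses.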
7. Reidentification of Study Participants
Face-recognition software matched facial reconstructions from research MRIs to photographs for 83% of 84 volunteers.
Schwarz, Christopher G., Walter K. Kremers, Terry M. Therneau, Richard R. Sharp, Jeffrey L. Gunter, Prashanthi Vemuri, Arvin Arani et al. "Identification of Anonymous MRI Research Participants with Face-Recognition Software." New England Journal of Medicine 381, no. 17 (2019): 1684-1686.
8. Bias in AI
A problem was found in an algorithm sold by Optum, a leading health services company, used to guide care decisions for millions of people.
Correcting the bias would more than double the number of Black patients flagged as at risk of complicated medical needs, collectively accounting for 48,772 additional chronic diseases.
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. "Dissecting racial bias in an algorithm used to manage the health of populations." Science 366, no. 6464 (2019): 447-453.
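The bias arose because the model was trained on a proxy label (care costs) that understates the needs of Black patients. One way to surface such a problem is a conditional audit: compare flag rates across groups among patients with the same measured health burden. A toy sketch on synthetic data; every number, threshold, and variable name here is illustrative, not from the study.

```python
import numpy as np

# Hypothetical audit data: per-patient risk score from a vendor model,
# number of active chronic conditions, and self-reported race.
rng = np.random.default_rng(1)
n = 10_000
race = rng.choice(["black", "white"], size=n)
chronic = rng.poisson(3.0, size=n)
# Toy scores that, like a cost-based proxy, understate risk for one group.
score = chronic + rng.normal(0, 1, n) - 0.8 * (race == "black")

flag = score > np.quantile(score, 0.97)   # top 3% referred to a care program

# Audit: at the same level of health burden, are flag rates equal?
for burden in (4, 6, 8):
    mask = chronic == burden
    for group in ("black", "white"):
        g = mask & (race == group)
        if g.sum():
            print(burden, group, round(flag[g].mean(), 3))
```

On this synthetic data the flag rate for equally sick patients differs sharply by group, which is exactly the signature the audit is designed to expose.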
10. Safety: Hacking the Performance Metric
In practice, human doctors are hyper-vigilant about a high-risk subtype even when it is rare.
Existing AI models do indeed show concerning error rates on clinically important subsets, despite encouraging aggregate performance metrics.
[Figure: mistaken cases, split into critical and not critical, for a human versus an AI system.]
Oakden-Rayner, Luke, Jared Dunnmon, Gustavo Carneiro, and Christopher Ré. "Hidden stratification causes clinically meaningful failures in machine learning for medical imaging." arXiv preprint arXiv:1909.12475 (2019).
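The practical countermeasure is to report performance stratified by clinically meaningful subsets rather than a single aggregate number. A minimal sketch on synthetic data; the labels, subtype, and error rates are hypothetical placeholders.

```python
import numpy as np

# y_true / y_pred: 1 = disease present; "critical" marks a rare,
# clinically important variant that the toy model systematically misses.
rng = np.random.default_rng(2)
n = 5_000
y_true = rng.binomial(1, 0.10, n)
critical = (y_true == 1) & (rng.random(n) < 0.05)   # rare critical subtype
y_pred = np.where(critical, 0, y_true)              # blind to the subtype
noise = rng.random(n) < 0.02                        # 2% random errors
y_pred = np.where(noise, 1 - y_pred, y_pred)

def sensitivity(t, p):
    """Fraction of true positives the model catches."""
    return (p[t == 1] == 1).mean()

print("aggregate accuracy:", (y_pred == y_true).mean())   # looks fine
print("overall sensitivity:", sensitivity(y_true, y_pred))
print("critical-subtype sensitivity:",
      sensitivity(y_true[critical], y_pred[critical]))    # near zero
```

Aggregate accuracy stays high because the critical subtype is rare, yet sensitivity on exactly the cases doctors care most about collapses: the hidden stratification the paper describes.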
11. Deep Learning Benefits
No need for manual feature engineering.
Improved performance.
Ability to capitalize on large amounts of data.
Significant drawback for clinical tasks: models are not interpretable.
A model's decisions and the features driving them are not inherently explained, resulting in:
Lack of trust
Inability to diagnose problems
Undetected issues
[Figure: deep learning's black box.]
12. Interpretability
Shickel, Benjamin, Tyler J. Loftus, Lasith Adhikari, Tezcan Ozrazgat-Baslanti, Azra Bihorac, and Parisa Rashidi. "DeepSOFA: A Continuous Acuity Score for Critically Ill Patients using Clinically Interpretable Deep Learning." Scientific Reports 9, no. 1 (2019): 1879.
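As one generic illustration of post-hoc interpretability (not the DeepSOFA method itself), a gradient-based saliency score measures how much each input feature influences a prediction. A minimal sketch on a toy logistic regression; the feature names and weights are hypothetical, and for a deep network the gradient would come from autodiff instead of a closed form.

```python
import numpy as np

# Toy "model": logistic regression over hypothetical vital-sign features.
feature_names = ["heart_rate", "map", "creatinine", "lactate"]
w = np.array([0.8, -1.2, 0.5, 1.5])    # illustrative weights
b = -0.3

def predict(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def saliency(x):
    """Gradient of the predicted probability w.r.t. each input feature.

    For logistic regression this has the closed form p * (1 - p) * w.
    """
    p = predict(x)
    return p * (1.0 - p) * w

x = np.array([1.2, -0.5, 0.3, 2.0])    # one standardized patient vector
for name, s in zip(feature_names, saliency(x)):
    print(f"{name:>11}: {s:+.3f}")
```

Surfacing per-feature influence like this is what lets clinicians build trust, diagnose model problems, and catch issues that aggregate metrics hide.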
13. Sustainability: Red AI vs. Green AI
Expensive processing of a single example: the best version of AlphaGo required 1,920 CPUs and 280 GPUs to play a single game of Go, at a cost of over $1,000 per hour.
Massive numbers of experiments: researchers from Google trained over 12,800 neural networks in their neural architecture search to improve performance on object detection and language modeling.
The amount of compute used to train deep learning models has increased 300,000x in 6 years.
Schwartz, Roy, Jesse Dodge, Noah A. Smith, and Oren Etzioni. "Green AI." arXiv preprint arXiv:1907.10597 (2019).
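The 300,000x figure implies a strikingly short doubling time, which can be backed out directly from the slide's own numbers:

```python
import math

growth = 300_000          # compute growth factor cited above
years = 6
doublings = math.log2(growth)                 # ~18.2 doublings
months_per_doubling = years * 12 / doublings
print(f"{doublings:.1f} doublings -> one every {months_per_doubling:.1f} months")
# ~18.2 doublings -> one every ~4.0 months, far faster than Moore's law.
```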
14. A Framework for AI Quality and Safety in Medicine
Challen, Robert, Joshua Denny, Martin Pitt, Luke Gompels, Tom Edwards, and Krasimira Tsaneva-Atanasova. "Artificial intelligence, bias and clinical safety." BMJ Qual Saf 28, no. 3 (2019): 231-237.
15. A General Framework for AI Quality and Safety in Medicine
Challen, Robert, Joshua Denny, Martin Pitt, Luke Gompels, Tom Edwards, and Krasimira Tsaneva-Atanasova. "Artificial intelligence, bias and clinical safety." BMJ Qual Saf 28, no. 3 (2019): 231-237.
16. Dos and Don'ts
Fair, Accountable, and Transparent (FAT) algorithms.
Beil, Michael, Ingo Proft, Daniel van Heerden, Sigal Sviri, and Peter Vernon van Heerden. "Ethical considerations about artificial intelligence for prognostication in intensive care." Intensive Care Medicine Experimental 7, no. 1 (2019): 70.
As recently as 2013, most hospitals were still operating on the ninth edition of the International Classification of Diseases (ICD-9) coding scheme, published in 1978, even though the revised ICD-10 was published in 1990.
Ethics in Human-AI Interactions
1. Should not violate others' freedom.
2. The benefit created should outweigh the risk.
3. Benefits and risks should be distributed fairly to everyone.