1) Researchers have discovered ways to generate "fooling images" that can trick artificial intelligence systems like facial recognition and image classifiers.
2) These fooling images involve subtle patterns or perturbations invisible to humans that cause AI systems to misclassify images, for example making a classifier think a random pattern is a picture of a person.
3) While fooling images have limitations, they pose security risks if exploited and demonstrate weaknesses in current AI systems. Researchers are working to develop defenses but adversarial attacks continue to evolve and improve.
Illustration by William Joel / The Verge
MAGIC AI: THESE ARE THE OPTICAL ILLUSIONS THAT TRICK, FOOL, AND FLUMMOX COMPUTERS
by James Vincent @jjvincent Apr 12, 2017, 12:04pm EDT
Illustrations by William Joel
There's a scene in William Gibson's 2010 novel Zero History in which a character embarking on a high-stakes raid dons what the narrator refers to as "the ugliest T-shirt in existence," a garment which renders him invisible to CCTV. In Neal Stephenson's Snow Crash, a bitmap image is used to transmit a virus that scrambles the brains of hackers, leaping through computer-augmented optic nerves to rot the target's mind. These stories, and many others, tap into a recurring sci-fi trope: that a simple image has the power to crash computers.
But the concept isn't fiction, not completely anyway. Last year, researchers were able to fool a commercial facial recognition system into thinking they were someone else just by wearing a pair of patterned glasses. A sticker overlay with a hallucinogenic print was stuck onto the frames of the specs. The twists and curves of the pattern look random to humans, but to a computer designed to pick out noses, mouths, eyes, and ears, they resembled the contours of someone's face: any face the researchers chose, in fact. These glasses won't delete your presence from CCTV like Gibson's ugly T-shirt, but they can trick an AI into thinking you're the Pope. Or anyone you like.
Researchers wearing simulated pairs of fooling glasses, and the people the facial recognition system thought they were.
These types of attacks are bracketed within a broad category of AI cybersecurity known as adversarial machine learning, so called because it presupposes the existence of an adversary of some sort; in this case, a hacker. Within this field, the sci-fi tropes of ugly T-shirts and brain-rotting bitmaps manifest as "adversarial images" or "fooling images," but adversarial attacks can take other forms, including audio and perhaps even text. These phenomena were discovered independently by a number of teams in the early 2010s. They usually target a type of machine learning system known as a classifier: something that sorts data into different categories, like the algorithms in Google Photos that tag pictures on your phone as "food," "holiday," and "pets."
Image by Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K. Reiter
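For readers who want to see what a classifier actually does, here is a minimal sketch of the kind of pipeline these attacks target, written with PyTorch and torchvision. The model choice, the preprocessing numbers, and the photo.jpg filename are illustrative assumptions, not details from the article.

```python
# A minimal image-classification sketch: load a pretrained network, preprocess
# a photo, and print the category the network is most confident about.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # standard ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.eval()

img = Image.open("photo.jpg").convert("RGB")   # hypothetical input image
x = preprocess(img).unsqueeze(0)               # shape (1, 3, 224, 224)

with torch.no_grad():
    logits = model(x)                          # one raw score per ImageNet category
    probs = torch.softmax(logits, dim=1)       # turn scores into probabilities

confidence, class_index = probs.max(dim=1)
print(f"predicted class {class_index.item()} with confidence {confidence.item():.2f}")
```

Everything the attacks in this article exploit lives inside that single model(x) call: the network maps raw pixels to category scores, and anything that nudges the scores can nudge the label.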
To a human, a fooling image might look like a random tie-dye pattern or a burst of TV static, but show it to an AI image classifier and it'll say with confidence: "Yep, that's a gibbon," or "My, what a shiny red motorbike." Just as with the facial recognition system that was fooled by the psychedelic glasses, the classifier picks up visual features of the image that are so distorted a human would never recognize them.

These patterns can be used in all sorts of ways to bypass AI systems, and have substantial implications for future security systems, factory robots, and self-driving cars: all places where AI's ability to identify objects is crucial. "Imagine you're in the military and you're using a system that autonomously decides what to target," Jeff Clune, co-author of a 2015 paper on fooling images, tells The Verge. "What you don't want is your enemy putting an adversarial image on top of a hospital so that you strike that hospital. Or if you are using the same system to track your enemies; you don't want to be easily fooled [and] start following the wrong car with your drone."
A selection of fooling images, and what an AI sees when it looks at them.
Image by Jeff Clune, Jason Yosinski, and Anh Nguyen

These scenarios are hypothetical, but perfectly viable if we continue down our current path of AI development. "It's a big problem, yes," Clune says, "and I think it's a problem the research community needs to solve."

The challenge of defending against adversarial attacks is twofold: not only are we unsure how to effectively counter existing attacks, but we keep discovering more effective attack variations. The fooling images described by Clune and his co-authors, Jason Yosinski and Anh Nguyen, are easily spotted by humans. They look like optical illusions or early web art, all blocky color and overlapping patterns, but there are far more subtle approaches to be used.
One type of adversarial image, referred to by researchers as a "perturbation," is all but invisible to the human eye. It exists as a ripple of pixels on the surface of a photo, and can be applied to an image as easily as an Instagram filter. These perturbations were first described in 2013, and in a 2014 paper titled "Explaining and Harnessing Adversarial Examples," researchers demonstrated how flexible they were. That pixely shimmer is capable of fooling a whole range of different classifiers, even ones it hasn't been trained to counter. A recently revised study named "Universal Adversarial Perturbations" made this feature explicit by successfully testing the perturbations against a number of different neural nets, exciting a lot of researchers last month.
On the left is the original image; in the middle, the perturbation; and on the right, the final, perturbed image.
Image by Ian Goodfellow, Jonathon Shlens, and Christian Szegedy
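The 2014 paper's best-known recipe is the fast gradient sign method, and the hedged sketch below shows the core idea: nudge every pixel one tiny step in the direction that increases the classifier's loss. Here model, image, and label are placeholders for a PyTorch classifier, an input batch, and its true class; the epsilon value and the assumption of pixel values in [0, 1] are illustrative.

```python
# A rough sketch of the fast gradient sign method (FGSM) from "Explaining and
# Harnessing Adversarial Examples": compute the gradient of the loss with
# respect to the input pixels, then step each pixel slightly along its sign.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.007):
    """Return (adversarial image, perturbation). Assumes pixels in [0, 1]."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)   # how wrong is the current prediction?
    loss.backward()                               # gradients now flow back to the pixels
    perturbation = epsilon * image.grad.sign()    # the barely visible "ripple of pixels"
    adversarial = (image + perturbation).clamp(0, 1).detach()
    return adversarial, perturbation
```

Shown the perturbed copy, the same network will often report a completely different label at high confidence, even though the two images look identical to a person.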
Using fooling images to hack AI systems does have its limitations. First, it takes more time to craft scrambled images in such a way that an AI system thinks it's seeing a specific image, rather than making a random mistake. Second, you often, but not always, need access to the internal code of the system you're trying to manipulate in order to generate the perturbation in the first place. And third, attacks aren't consistently effective. As shown in "Universal Adversarial Perturbations," what fools one neural network 90 percent of the time may only have a success rate of 50 or 60 percent on a different network. (That said, even a 50 percent error rate could be catastrophic if the classifier in question is guiding a self-driving semi truck.)
To better defend AI against fooling images, engineers subject them to "adversarial training." This involves feeding a classifier adversarial images so it can identify and ignore them, like a bouncer learning the mugshots of people banned from a bar. Unfortunately, as Nicolas Papernot, a graduate student at Pennsylvania State University who's written a number of papers on adversarial attacks, explains, even this sort of training is weak against computationally intensive strategies (i.e., throw enough images at the system and it'll eventually fail).
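In code, adversarial training usually amounts to generating attacks on the fly and mixing them into every training batch, so the classifier studies its own "mugshots" as it learns. The sketch below reuses the hypothetical fgsm_perturb helper from the earlier example; the model, optimizer, and data are placeholders rather than any specific system described in the article.

```python
# One step of adversarial training: train on each batch twice, once clean and
# once with adversarial perturbations generated against the current model.
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels, epsilon=0.03):
    model.train()
    # Craft adversarial versions of this batch using the current weights.
    adv_images, _ = fgsm_perturb(model, images, labels, epsilon)
    optimizer.zero_grad()
    # Penalize mistakes on both the clean and the perturbed images.
    loss = (F.cross_entropy(model(images), labels) +
            F.cross_entropy(model(adv_images), labels))
    loss.backward()
    optimizer.step()
    return loss.item()
```

As Papernot's point suggests, this only hardens the network against the kinds of attacks it has already seen; an attacker willing to spend enough compute can usually search their way past the defense.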
A number of images with perturbations applied to them, captioned with what the AI sees.
Image by Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard

To add to the difficulty, it's not always clear why certain attacks work or fail. One explanation is that adversarial images take advantage of a feature found in many AI systems known as "decision boundaries." These boundaries are the invisible rules that dictate how a system can tell the difference between, say, a lion and a leopard. A very simple AI program that spends all its time identifying just these two animals would eventually create a mental map. Think of it as an X-Y plane: in the top right it puts all the leopards it's ever seen, and in the bottom left, the lions. The line dividing these two sectors, the border at which a lion becomes a leopard or a leopard a lion, is known as the decision boundary.
The problem with the decision boundary approach to classification, says Clune, is that it's too absolute, too arbitrary. "All you're doing with these networks is training them to draw lines between clusters of data rather than deeply modeling what it is to be a leopard or a lion." Systems like these can be manipulated in all sorts of ways by a determined adversary. To fool the lion-leopard analyzer, you could take an image of a lion and push its features to grotesque extremes, but still have it register as a normal lion: give it claws like digging equipment, paws the size of school buses, and a mane that burns like the Sun. To a human it's unrecognizable, but to an AI checking its decision boundary, it's just an extremely liony lion.
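A toy version of that mental map makes the point concrete. The sketch below fits a linear classifier to two invented clusters of 2-D "lion" and "leopard" feature points; the data, the features, and the extreme test point are made up purely to show that anything on one side of the line gets the same label, however exaggerated it is.

```python
# A toy decision boundary: two clusters on an X-Y plane and the line between them.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
lions = rng.normal(loc=[-2.0, -2.0], scale=0.8, size=(50, 2))    # bottom-left cluster
leopards = rng.normal(loc=[2.0, 2.0], scale=0.8, size=(50, 2))   # top-right cluster

X = np.vstack([lions, leopards])
y = np.array([0] * 50 + [1] * 50)   # 0 = lion, 1 = leopard

clf = LogisticRegression().fit(X, y)

# The decision boundary is where the two class scores are equal: w0*x0 + w1*x1 + b = 0.
(w0, w1), b = clf.coef_[0], clf.intercept_[0]
print(f"boundary: {w0:.2f}*x0 + {w1:.2f}*x1 + {b:.2f} = 0")

# An absurdly exaggerated "lion", far outside anything seen in training, still
# lands on the lion side of the line, so the model labels it a lion regardless.
extreme_lion = np.array([[-40.0, -45.0]])
print("classified as:", "leopard" if clf.predict(extreme_lion)[0] == 1 else "lion")
```

Because the model only learned where to draw the line, not what a plausible lion looks like, points far from any real example are still classified without hesitation, which is exactly the weakness Clune describes.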
As far as we know, adversarial images have never been used to cause real-world harm. But Ian Goodfellow, a research scientist at Google Brain who co-authored "Explaining and Harnessing Adversarial Examples," says they're not being ignored. "The research community in general, and especially Google, take this issue seriously," says Goodfellow. "And we're working hard to develop better defenses." A number of groups, like the Elon Musk-funded OpenAI, are currently conducting or soliciting research on adversarial attacks. The conclusion so far is that there is no silver bullet, but researchers disagree on how much of a threat these attacks are in the real world. There are already plenty of ways to hack self-driving cars, for example, that don't rely on calculating complex perturbations.
Papernot says such a widespread weakness in our AI systems isn't a big surprise: classifiers are trained to have "good average performance, but not necessarily worst-case performance, which is typically what is sought after from a security perspective." That is to say, researchers are less worried about the times the system fails catastrophically than how well it performs on average. One way of dealing with dodgy decision boundaries, suggests Clune, is simply to make image classifiers that more readily suggest they don't know what something is, as opposed to always trying to fit data into one category or another.
Meanwhile, adversarial attacks also invite deeper, more conceptual speculation. The fact that the same fooling images can scramble the "minds" of AI systems developed independently by Google, Mobileye, or Facebook reveals weaknesses that are apparently endemic to contemporary AI as a whole.

"It's like all these different networks are sitting around saying why don't these silly humans recognize that this static is actually a starfish," says Clune. "That is profoundly interesting and mysterious; that all these networks are agreeing that these crazy and non-natural images are actually of the same type. That level of convergence is really surprising people."
For Clune's colleague Jason Yosinski, the research on fooling images points to an unlikely similarity between artificial intelligence and intelligence developed by nature. He noted that the same category errors made by AI and their decision boundaries also exist in the world of zoology, where animals are tricked by what scientists call "supernormal stimuli."
These stimuli are artificial, exaggerated versions of qualities found in nature that are so enticing to animals that they override their natural instincts. This behavior was first observed around the 1950s, when researchers used it to make birds ignore their own eggs in favor of fakes with brighter colors, or to get red-bellied stickleback fish to fight pieces of trash as if they were rival males. The fish would fight the trash, so long as it had a big red belly painted on it. Some people have suggested human addictions, like fast food and pornography, are also examples of supernormal stimuli. In that light, one could say that the mistakes AIs are making are only natural. Unfortunately, we need them to be better than that.