You might have heard the buzz phrase "computer vision" and think that it's some futuristic thing, but it's in your life already. Let me show you what computer vision is, how it works (without math), and why it is exploding today.
Cameras and Cats - Why Computer Vision is Changing the Way You See the World
1. Cameras and Cats
Why computer vision
is changing the way
you see the world
!1
@_changxu
Image: Pexels
2. !2
Computer Vision?
You might have heard of this buzz phrase
and think that it's some futuristic thing,
but it's in your life already.
4. !4
Nanit baby camera
Image: Nanit
A daily report on how well
your baby slept last night
Camera
above your
baby's crib
5. !5
Waldo photos
Find your photo needles in album haystacks
Waldo's image recognition
picks out just the photos
including your daughter
Teachers take photos of
campers and upload to Waldo
Image: Pexels
6. !6 Image: Engadget
Osmo game system Hands-on play beyond the screen
Camera sees
what's on the
table
7. !7
Ring video doorbell
Image: Direct Electric Company
Video feed outside your door
Camera sees
what's outside
your door
8. !8
Density people counter
Image: Density
Gather space utilization
data across all rooms
Camera counts
people as they enter
and exit a room
9. !9
Lane departure warnings
Image: Montgomery Rennie Jonson
Beeps if you deviate from your lane
Camera behind the
rear-view mirror
sees lane markings
in front
10. !10 Image: The Spoon
Self-checkout store
Pick up what you want and put it in your bag
11. !11 Image: Arstechnica
Self-checkout store ceiling
Cameras on the ceiling and in the shelves
identify you and the items you pick up
Cameras
12. !12
Camera-based growing system
Image: iUNU
Monitor crops to detect discoloration on the leaves that may
indicate certain diseases and alert the farmer immediately
Cameras
13. !13 Image: DigitalGlobe via Satellite Today
Using satellite images to count cars
Count cars in retailers' parking
lots over time. Hedge funds use
this data to make predictions on
how a retailer is doing.
14. !14 Image: DigitalGlobe via Satellite Today
Observe sponsors' logos on-screen
and quantify their visual impact
GumGum sports sponsorship measurement
15. Everything you are doing uses computer vision.
!15
You just haven't thought about it in any of these ways.
One of the reasons that computer vision is so useful is
that it is pervasive.
16. What is computer vision?
Computer vision is being able to interpret the
physical world through cameras and sensors.
!16
17. !17 Image: Freepik
In the past, we had to manually input instructions to tell machines to do one
action and then another.
19. !19 Image: Freepik
Now, computers can understand what they are seeing and sensing. They take inputs
that are images and videos, which allows them to directly observe the physical world.
So we can have many cameras
that humans don't need to look at.
20. !20 Image: Disney (Big Hero 6)
Using cameras as their eyes,
computers will become more
knowledgeable and aware of
the physical world and be able
to directly interact with us.
Baymax must
have great
computer vision
hookups.
24. !24 Images: Ring, Mobileye, DJI, Drishti Robotics
(1) Cameras and sensors
Cameras are being deployed everywhere because
they are better, smaller, and cheaper. There are more
and more inputs of the physical world to computers.
At home
On the factory floor
On drones
In your car
25. !25 Image: The Verge
(1) Cameras and sensors
We have also benefited hugely from the rise of smartphones
over the past decade, which put a camera in everyone's
pocket. The massive scale also made cameras a lot cheaper.
26. !26
(2) Connectivity
Image: Freepik
These cameras are connected. They take photos at the edge
and then upload to the cloud to aggregate data. This helps
them to constantly improve how they understand the world.
27. !27
(2) Connectivity
Image: inVia Robotics
inVia Robotics makes robots for e-commerce fulfillment
warehouses. The robots navigate autonomously by reading
QR codes along the shelves and on the bins. They function as
a swarm by batching orders and optimizing routes together.
Cameras
28. !28
(3) Data
300+ hours of video uploaded every minute
50+ million photos posted every day
300+ million photos posted every day
Sources: Merchdope, Statistic Brain Research Institute, Gizmodo, Automated Insights
There is more and more visual data being shared over the
web, powered by cameras in everyone's pocket and the many
connected and distributed cameras.
29. !29
(3) Data
Image: Google Driverless Car Project
We have also benefitted from the autonomous car movement,
through which we have gathered a huge amount of data
about road conditions, cars, pedestrians, and road signs.
30. !30 Image: Pexels
But just having many pictures of cats won't necessarily tell a computer what a cat is.
You look at this photo for a fraction
of a second and you know it's a cat.
But how does a computer know?
31. !31 Image: Pexels
How does it know that this is a cat in front of a door,
not a tiger and a sunset? Same colors, same stripes!
32. !32 Image: Pexels
How does it know that this line delineates the chest of this cat against a door,
when the image it sees is completely flat?
33. !33 Image: Todd Peterson
Plus, the computer has never seen this particular cat in this pose with this
background, because you just took this photo.
34. !34 Image: Pexels
That is, if you're lucky enough to get the whole cat. You might just get part of a cat.
How does a computer know that this is a cat?
36. !36 Image: Pexels
(210, 179, 172)
R G B
(127, 0, 0)
(246, 171, 97)
To a computer, an image is a
collection of pixels, where each one
is represented by three numbers.
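Here is a minimal sketch of that idea in Python. The 2x2 `image` and the `flatten` helper are made up for illustration; the first three colors are the ones on the slide, and the fourth pixel is an assumption added to fill out the grid.

```python
# A tiny 2x2 "image": each pixel is an (R, G, B) triple, each value 0-255.
# The first three triples come from the slide; the black pixel is invented.
image = [
    [(210, 179, 172), (127, 0, 0)],
    [(246, 171, 97), (0, 0, 0)],
]


def flatten(img):
    """Return the image as the flat list of numbers a computer stores."""
    return [channel for row in img for pixel in row for channel in pixel]


numbers = flatten(image)
print(len(numbers))  # 4 pixels x 3 channels
print(numbers[:3])   # the red, green, and blue values of the first pixel
```

Nothing in that list says "cat". It is just numbers, which is the whole problem.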
37. !37 Image: Pexels
Each photo on your iPhone X has 12 megapixels, that is, 12 million pixels.
38. !38 Image: Pexels
12 megapixels → 12 million pixels → 36 million numbers
So to look at an image, a computer needs to analyze 36 million numbers.
39. !39 Image: Pexels
12 megapixels → 12 million pixels → 36 million numbers → trillions of relationships
But it's even more complicated. Groups of pixels together give you
the eyes, ears, and whiskers. This means that a computer needs to
analyze trillions of pixel-to-pixel relationships to look at a single image.
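If you want to check those numbers, here is a rough back-of-the-envelope sketch. It assumes that a "relationship" means one unordered pair of pixels, which is only one way to read the slide; the function names are made up for this example.

```python
PIXELS = 12_000_000  # 12 megapixels
CHANNELS = 3         # red, green, blue


def count_numbers(pixels, channels=CHANNELS):
    """Total numbers in the image: one per channel per pixel."""
    return pixels * channels


def count_pixel_pairs(pixels):
    """Unordered pixel pairs (n choose 2), assumed here to be 'relationships'."""
    return pixels * (pixels - 1) // 2


print(f"{count_numbers(PIXELS):,}")      # 36,000,000
print(f"{count_pixel_pairs(PIXELS):,}")  # 71,999,994,000,000
```

Under that assumption a single photo has roughly 72 trillion pixel pairs, so "trillions" is not an exaggeration.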
40. !40
(4) Intelligence
Image: Pexels
Understanding what is in an
image is staggeringly complex.
We have developed algorithms
to simplify the computation so
that it wouldn't take days to tell
you if you took a photo of a cat.
41. !41
I'm going to explain one algorithmic innovation
to you that is critical to understanding images:
Convolution
This is the most technical part of this talk,
but it is foundational to image recognition.
(And I will explain it without math.)
(4) Intelligence
42. !42
(4) Intelligence
Image: Pexels
If you look at an image, you'll notice two things:
1) You look at each area separately. The
door ledge on the bottom right is
not relevant to cat ears on the top.
44. !44
Convolution is a method that allows you to easily and quickly do
both of those things to images.
Otherwise, you might look at how each pixel is related to every
other pixel in the entire image, or decipher each time whether
you're looking at cat ears. Using convolution saves you a lot of work.
(4) Intelligence
45. !45
(4) Intelligence
Image: Pexels
Convolution
I convolve all the pixels in this small box
to arrive at a new number. Now this
number has the information from itself
and the eight pixels surrounding it, but
it doesn't have any information from
the pixels that are far away, which
makes this an efficient operation.
I convolve the pixels and it tells me that
I'm looking at a line at a certain angle.
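For the curious, here is a small sketch of that small-box operation in plain Python. It makes a few simplifying assumptions that are not in the talk: the image is grayscale (one number per pixel instead of three), pixels at the border are skipped, and the `convolve` function, the `vertical_line` kernel, and the 5x5 test image are all invented for illustration.

```python
def convolve(image, kernel):
    """Slide a small square kernel over a grayscale image (a list of rows).

    Each output number combines one pixel with its immediate neighbours
    only, so far-away pixels never enter the sum. Border pixels that lack
    a full neighbourhood are skipped.
    """
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# A hypothetical 3x3 kernel that responds strongly to vertical lines.
vertical_line = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

# A 5x5 test image with a bright vertical line down the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

for row in convolve(image, vertical_line):
    print(row)  # [-3, 6, -3]: the strongest response sits on the line
```

A large positive number means "there is a vertical line here". A different kernel would respond to a different angle, and in a real network the kernel values are learned rather than written by hand.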
47. !47
(4) Intelligence
Image: Pexels
I convolve all over the image and now I
see that there are two cat ears, eyes,
whiskers, and paws. Now I have pretty
good confidence that I'm looking at a cat.
This is why you often hear the words
Convolutional Neural Networks, or CNNs,
or ConvNets, when people talk about
computer vision.
48. !48
Convolution is just one algorithm. Researchers have
developed many other algorithms to reduce the
complexity and help computers recognize what's in an
image accurately and quickly. Most are esoteric and
related to the inner workings of statistics and models.
I'll give two more examples that are easier to grasp.
(4) Intelligence
50. !50
(4) Intelligence
Image: Pexels
Transfer learning is taking a model that has already been trained and
applying it to your problem. You need to tweak it because, say, it was
trained on professional photos whereas you are using your
phone camera. This is much easier than starting from scratch.
51. !51
(5) Computing power
Image: Scio Info Tech
CPUs
Used in traditional computing
Great at taking a sequential list of instructions and executing them quickly
Have multiple cores
GPUs
Have thousands of cores that can operate in parallel and can perform a multitude of identical simple jobs simultaneously
Developed for video games in order to render images on the screen efficiently
Similarly, image recognition calls for applying convolution quickly across the image
GPUs are better suited to computer vision than CPUs, which is why the
GPU market has gotten a lot of attention of late.
52. !52
(5) Computing power
Note: Figures include capital leases; AMZN Capex spend represents total consolidated capex across all businesses
Source: Company data, Goldman Sachs Global Investment Research
Capex spend by public cloud vendor
Major cloud providers have been adding GPUs and making
serious investments in machine-learning specific offerings.
No one publicly discloses
how many Nvidia GPUs they
are buying, but overall capex
spend is a directionally correct proxy.
53. !53
(5) Computing power
Public cloud market size and share
Note: Market size based on Gartner Estimates; Company data based on GS Estimates
Source: Company data, Goldman Sachs Global Investment Research
Why? The public cloud market is growing rapidly. We always
need more computing power to recognize cats more quickly.
54. !54 Image: Pexels
(1) Cameras and sensors
(2) Connectivity
(3) Data
(4) Intelligence
(5) Computing power
This is a virtuous cycle that gives better predictions and makes
computer vision better and better and more pervasive in our world.
55. So what are cat photos to you?
!55
Computer vision is breaking down another wall, letting us
interact both digitally and physically. It fundamentally changes
how businesses operate in the physical world.
What challenges do you have that computer vision can solve?
I have a few ideas.
56. !56 Image: Pexels
Identity and security: Can your product benefit from seeing who it is
interfacing with? It could be a digital product like Face ID or a physical product like a door.
57. !57 Image: Pexels
E-commerce: Why am I still flipping through online catalogs and imagining
clothes on me, only to have them invariably fit terribly when they arrive?
58. !58 Image: Pexels
Change detection: So much of our work is keeping an eye on something.
When you're driving and you want to switch lanes, you need to remember to
check your blind spots. Why do we still have blind spots?
59. !59 Image: Pexels
The police walk around the city all day to give tickets to illegally parked cars.
Why not put cameras around the city and ticket cars automatically?
60. !60
Or when you're in a factory making lots of electronic widgets, a machine
could be out of alignment and start to make defective products. Why not
have a camera stare at it and alert you if it sees something different?
Image: ADDitude
61. !61 Image: Pexels
A lot of diseases start with changes that are imperceptible to the untrained
eye: slight tremors in your fingers, for example, might indicate Parkinson's. Why not
have a camera as an observer in your home with a direct line to your doctor?
62. !62 Image: Pexels
Computer vision is changing the way we interact with the world.
We'd be hard pressed to find a business that this would not be relevant for.
63. Upfront portfolio companies using computer vision
!63
*Ring was a former portfolio company until it was sold to Amazon in 2018