The document provides an introduction to the field of computer vision, including its definition, origins, goals, connections to other disciplines, and applications. Specifically:
- Computer vision is defined as obtaining models, meaning, and control from visual data using machines and is concerned with understanding visual perception computationally and applying it to benefit others.
- It originated in the 1960s and aims to perceive the "world behind the picture" by understanding 3D information and semantics from images.
- It connects to fields like image processing, machine learning, artificial intelligence, robotics, psychology, neuroscience, and computer graphics.
- Applications include industrial inspection, surveillance, autonomous vehicles, assistive technologies, and digital cameras.
2. Contents: Definition; Why study Computer Vision; Origin of Computer Vision; Goal of Computer Vision; Connections to other disciplines; Vision Levels; Applications
9. Definition: Computer vision is the science and technology of machines that obtain models, meaning, and control information from visual data. As a scientific discipline, computer vision is concerned with the theory.
10. The two main fields of computer vision are computational vision and machine vision. Computational vision records and analyzes visual perception and tries to understand it. Machine vision applies the findings of computational vision to benefit people, animals, the environment, etc.
11. Why study computer vision? Images and video are everywhere: personal photo albums; surveillance and security; movies, news, sports; medical and scientific images.
12. Why study computer vision? Vision is useful. Vision is interesting. Vision is difficult: half of primate cerebral cortex is devoted to visual processing, and achieving human-level visual perception is probably AI-complete.
13. Origins of computer vision: L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
15. The goal of computer vision: to perceive the world behind the picture. What exactly does this mean? Vision as a source of metric 3D information; vision as a source of semantic information.
16. Connections to other disciplines: Computer Vision is linked to Image Processing, Machine Learning, Artificial Intelligence, Robotics, Psychology, Neuroscience, and Computer Graphics.
17. Artificial Intelligence ... uses computer vision to recognize handwritten text and drawings. The RoboCup tournament and ASIMO are examples of artificial intelligence using computer vision to its greatest extent.
20. Machine Learning: Machine learning is a scientific discipline that is concerned with the design and development of algorithms that allow computers to learn based on data, such as from sensor data or databases.
21. Psychology, Neuroscience: A brain-computer interface (BCI), sometimes called a direct neural interface or a brain-machine interface, is a direct communication pathway between a brain and an external device.
23. Image Processing: A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence.
24. Image processing brings some new concepts, such as connectivity and rotational invariance, that are meaningful or useful only for two-dimensional signals.
25. Vision as measurement device: real-time stereo (NASA Mars Rover); structure from motion (Pollefeys et al.); multi-view stereo for community photo collections (Goesele et al.).
27. Computer Graphics: Computer graphics are graphics created using computers and, more generally, the representation and manipulation of pictorial data by a computer. The field has revolutionized the animation and video game industries.
35. Vision Levels: Early vision (image formation and processing); Mid-level vision (grouping and fitting); Multi-view geometry; Recognition; Advanced topics.
36. I. Early vision: basic image formation and processing; cameras and sensors; light and color; linear filtering; edge detection; feature extraction (corner and blob detection).
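The linear filtering and edge detection items on this slide can be illustrated with a small sketch. It is not taken from the slides: the function name `correlate2d`, the choice of a Sobel kernel, and the toy image are assumptions picked for illustration, and the filter is applied as cross-correlation (no kernel flip) rather than true convolution.

```python
def correlate2d(image, kernel):
    """Slide `kernel` over `image` (valid region only) and sum the products.

    Both arguments are lists of rows. This is cross-correlation: the kernel
    is not flipped, unlike in true convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Horizontal-gradient Sobel kernel (an illustrative choice of edge filter).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Toy 4x6 image: dark left half (0), bright right half (10).
image = [[0, 0, 0, 10, 10, 10] for _ in range(4)]

response = correlate2d(image, SOBEL_X)
# Non-zero responses appear only where the window straddles the
# dark-to-bright boundary: [[0, 40, 40, 0], [0, 40, 40, 0]]
```

A simple edge detector would then threshold the magnitude of this response; flat regions give zero and the vertical boundary gives a strong positive value.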
37. II. Mid-level vision: fitting and grouping. Fitting: least squares, Hough transform, RANSAC. Alignment.
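Two of the fitting methods named on this slide, least squares and RANSAC, can be contrasted on a toy line-fitting problem. The sketch below is an illustration under assumptions, not the slides' own material: the function names, the vertical-distance residual, the inlier threshold, the iteration count, and the sample data are all hypothetical choices.

```python
import random


def fit_line_least_squares(points):
    """Fit y = slope * x + intercept by minimising squared vertical error."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_line_ransac(points, iterations=200, threshold=0.5, seed=0):
    """Repeatedly fit a line to 2 random points and keep the largest inlier set."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line: not representable as y = a*x + b
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Final estimate: ordinary least squares on the consensus set only.
    return fit_line_least_squares(best_inliers), best_inliers


# Ten points on y = 2x + 1 plus two gross outliers.
points = [(x, 2 * x + 1) for x in range(10)] + [(3, 30), (7, -20)]

slope_all, intercept_all = fit_line_least_squares(points)
(slope_ransac, intercept_ransac), inliers = fit_line_ransac(points)
```

On this data the plain least-squares slope is pulled far from 2 by the two outliers, while the RANSAC consensus set contains only the ten points on the line, so the refit recovers the underlying slope and intercept.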
38. III. Multi-view geometry: projective structure from motion (here be dragons!); stereo; affine structure from motion (Tomasi & Kanade, 1993); epipolar geometry.
39. IV. Recognition: patch description and matching; clustering and visual vocabularies; bag-of-features models; classification.
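The step from a visual vocabulary to a bag-of-features representation can be sketched as follows. This is a minimal illustration, assuming the vocabulary (cluster centres) has already been obtained by clustering; the function names, the 2D toy descriptors, and the three-word vocabulary are hypothetical and do not come from the slides.

```python
def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to `descriptor`."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda k: sq_dist(descriptor, vocabulary[k]))


def bag_of_features(descriptors, vocabulary):
    """Normalised histogram of visual-word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Hypothetical vocabulary of three "visual words" (cluster centres).
vocabulary = [(0, 0), (10, 0), (0, 10)]

# Hypothetical patch descriptors extracted from one image.
descriptors = [(1, 1), (9, 1), (0.5, 0.2), (1, 9)]

histogram = bag_of_features(descriptors, vocabulary)
# Two descriptors fall on word 0, one each on words 1 and 2:
# [0.5, 0.25, 0.25]
```

The resulting fixed-length histogram discards where the patches were located, which is what lets it be passed to a standard classifier.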
40. V. Advanced topics (time permitting): segmentation; articulated models; face detection; motion and tracking.
41. Applications of computer vision: driver assistance (collision warning, lane departure warning, rear object detection); factory inspection; monitoring for safety (Poseidon); reading license plates, checks, ZIP codes; surveillance; autonomous driving, robot navigation.
42. Applications of computer vision: assistive technologies; entertainment (Sony EyeToy); movie special effects; digital cameras (face detection for setting focus, exposure); visual search (MSR Lincoln).
43. Challenges: local ambiguity slide credit: Fei-Fei, Fergus & Torralba