
Contents Definition Why study Computer Vision Origin of Computer Vision Goal of Computer Vision Connections to other disciplines Vision Levels Applications

"To know what is where, by looking." (Marr). Where What VISION

Vision as a source of semantic information

Object categorization sky building flag wall banner bus cars bus face street lamp

Scene and context categorization outdoor city traffic …

Qualitative spatial information slanted rigid moving object horizontal vertical rigid moving object non-rigid moving object

Definition Computer vision is the science and technology of machines that obtain models, meaning, and control information from visual data. As a scientific discipline, computer vision is concerned with the theory behind such systems.

The two main fields of computer vision are computational vision and machine vision. Computational vision is concerned with recording and analyzing visual perception and trying to understand it. Machine vision applies the findings of computational vision to benefit people, animals, the environment, etc.

Why study computer vision? Images and video are everywhere! Personal photo albums Surveillance and security Movies, news, sports Medical and scientific images

Why study computer vision? Vision is useful Vision is interesting Vision is difficult Half of primate cerebral cortex is devoted to visual processing Achieving human-level visual perception is probably “AI-complete”

Origins of computer vision L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

The goal of computer vision To perceive the "world behind the picture"

[Figure: the image as the computer receives it, a grid of grayscale pixel intensities with 32 values per row, ranging from 0 (dark) to about 228 (bright).]
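To make the "grid of numbers" view concrete, here is a minimal Python sketch that treats an image as a nested list of intensities. The small patch is copied from the dark region of the slide's intensity grid; the helper name `summarize` and the darkness threshold of 10 are illustrative assumptions, not part of the slides.

```python
# A grayscale image is just rows of numbers. This 3x7 patch reuses
# values from the dark region of the slide's intensity grid.
patch = [
    [74, 59, 0, 0, 0, 50, 69],
    [84, 25, 0, 0, 0, 50, 181],
    [86, 31, 0, 3, 0, 51, 156],
]


def summarize(image, dark_threshold=10):
    """Return (min, max, number of pixels darker than dark_threshold).

    `dark_threshold` is a hypothetical cut-off chosen for illustration.
    """
    pixels = [value for row in image for value in row]
    dark = sum(1 for value in pixels if value < dark_threshold)
    return min(pixels), max(pixels), dark


lowest, highest, dark_count = summarize(patch)
print(lowest, highest, dark_count)  # prints: 0 181 9
```

Nothing in these numbers says "object" or "shadow"; recovering that meaning from the raw values is the job of the vision system.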

The goal of computer vision To perceive the “world behind the picture” What exactly does this mean? Vision as a source of metric 3D information Vision as a source of semantic information

Connections to other disciplines Computer Vision Image Processing Machine Learning Artificial Intelligence Robotics Psychology Neuroscience Computer Graphics

Artificial Intelligence ... uses computer vision to recognize handwritten text and drawings. The RoboCup tournament and ASIMO are examples of artificial intelligence that relies heavily on computer vision.

Machine Learning Machine learning is a scientific discipline concerned with the design and development of algorithms that allow computers to learn from data, such as sensor data or databases.

Psychology Neuroscience A brain–computer interface (BCI), sometimes called a direct neural interface or a brain–machine interface, is a direct communication pathway between a brain and an external device.

Image Processing A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence.

Image processing brings some new concepts, such as connectivity and rotational invariance, that are meaningful or useful only for two-dimensional signals.

Vision as measurement device Real-time stereo Structure from motion NASA Mars Rover Pollefeys et al. Multi-view stereo for community photo collections Goesele et al.

iPhone solves Rubik's Cube: CubeCheater

Computer Graphics Computer graphics are graphics created using computers and, more generally, the representation and manipulation of pictorial data by a computer. It has revolutionized the animation and video game industry.

A 2D projection of a 3D projection of a 4D

Robotics Mobile machines with power, sensing, and computing on-board. Works on Land (on and under) Water (ditto) Air Space ???

Vision Levels Early vision: Image formation and processing Mid-level vision: Grouping and fitting Multi-view geometry Recognition Advanced topics

I. Early vision Basic image formation and processing Cameras and sensors Light and color Linear filtering Edge detection Feature extraction: corner and blob detection
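As a rough illustration of linear filtering and edge detection, the sketch below slides a 3x3 kernel over a tiny synthetic image containing a vertical step edge. It is a minimal example under stated assumptions: the function name `correlate2d_valid` is hypothetical, the operation is cross-correlation (the kernel is not flipped), only positions where the kernel fits fully inside the image are computed, and the kernel is the standard horizontal Sobel operator.

```python
def correlate2d_valid(image, kernel):
    """Cross-correlate `image` with `kernel`, keeping only positions
    where the kernel lies entirely inside the image ("valid" mode)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the neighbourhood under the kernel.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        output.append(row)
    return output


# Horizontal Sobel kernel: responds to left-to-right intensity changes.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Synthetic 5x6 image: dark left half, bright right half.
step_edge = [[0, 0, 0, 100, 100, 100] for _ in range(5)]

response = correlate2d_valid(step_edge, SOBEL_X)
for row in response:
    print(row)  # each row prints: [0, 400, 400, 0]
```

The response is zero in the flat regions and large where the window straddles the step, which is the basic idea behind gradient-based edge detectors.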

II. “Mid-level vision” Fitting and grouping Fitting: Least squares Hough transform RANSAC Alignment
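The fitting methods listed above can be sketched on the simplest case, a line y = m*x + b. The example below is a minimal, hedged illustration: `fit_line_least_squares` uses the closed-form least-squares solution, and `ransac_line` is a bare-bones RANSAC loop that samples two points, counts inliers by vertical residual, and refits on the best inlier set. The function names, the tolerance, the iteration count, and the fixed random seed are all assumptions made for this example.

```python
import random


def fit_line_least_squares(points):
    """Closed-form least-squares fit of y = m*x + b (vertical residuals)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("points have no spread in x; slope is undefined")
    m = (n * sxy - sx * sy) / denom
    b = (sy - m * sx) / n
    return m, b


def ransac_line(points, n_iters=200, tol=0.5, seed=0):
    """Minimal RANSAC: fit lines to random point pairs, keep the model
    with the most inliers, then refit on those inliers."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_inliers = []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line cannot be written as y = m*x + b
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        inliers = [(x, y) for x, y in points if abs(y - (m * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no usable model found")
    return fit_line_least_squares(best_inliers), best_inliers


# Six points on y = 2x + 1 plus one gross outlier.
data = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (5, 11), (2.5, 20)]

naive_m, naive_b = fit_line_least_squares(data)
(robust_m, robust_b), inliers = ransac_line(data)
print(robust_m, robust_b, len(inliers))  # prints: 2.0 1.0 6
```

Plain least squares is pulled toward the outlier, while the RANSAC variant ignores it; that robustness is the usual reason to prefer RANSAC when matches or edge points are contaminated.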

III. Multi-view geometry Projective structure from motion: Here be dragons! Stereo Affine structure from motion Tomasi & Kanade (1993) Epipolar geometry
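For the stereo topic, one relation is worth seeing in code. Assuming an idealized rectified pair of identical pinhole cameras, the depth of a point is Z = f * B / d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. The function name `depth_from_disparity` and the numbers in the usage line are illustrative assumptions, not values from the slides.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    Assumes identical pinhole cameras with parallel optical axes.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 800 px focal length, 0.5 m baseline.
print(depth_from_disparity(800, 0.5, 40))  # prints: 10.0
```

Because depth is inversely proportional to disparity, distant points produce small disparities, and a one-pixel matching error costs far more accuracy at long range than up close.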

IV. Recognition Patch description and matching Clustering and visual vocabularies Bag-of-features models Classification
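The bag-of-features idea can be sketched in a few lines: each local descriptor is assigned to its nearest "visual word", and the image is represented by the histogram of word counts. The sketch below assumes the vocabulary already exists (in practice it would come from clustering, for example k-means); the 2D toy descriptors and the function names `nearest_word` and `bag_of_features` are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to `descriptor`
    (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(descriptor, vocabulary[i]))


def bag_of_features(descriptors, vocabulary):
    """Histogram of visual-word counts for one image."""
    histogram = [0] * len(vocabulary)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, vocabulary)] += 1
    return histogram


# Toy vocabulary of three visual words (assumed, normally learned by clustering).
vocabulary = [(0, 0), (10, 10), (0, 10)]

# Toy local descriptors extracted from one image.
descriptors = [(1, 1), (9, 9), (0, 9), (1, 0), (10, 11)]

print(bag_of_features(descriptors, vocabulary))  # prints: [2, 2, 1]
```

The histogram discards where each patch was found, so two images with the same parts in different arrangements look alike; a classifier is then trained on these fixed-length vectors.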

V. Advanced Topics Time permitting… Segmentation Articulated models Face detection Motion and tracking

Applications of computer vision Driver assistance (collision warning, lane departure warning, rear object detection) Factory inspection Monitoring for safety (Poseidon) Reading license plates, checks, ZIP codes Surveillance Autonomous driving, robot navigation

Applications of computer vision Assistive technologies Entertainment (Sony EyeToy) Movie special effects Digital cameras (face detection for setting focus, exposure) Visual search (MSR Lincoln)

Challenges: local ambiguity slide credit: Fei-Fei, Fergus & Torralba


H Vijayalakshmi
