[06-30213][06-30241][06-25024]
Computer Vision and Imaging & Robot Vision
Dr Hyung Jin Chang
h.j.chang@bham.ac.uk
School of Computer Science
Today’s agenda
• Part 1
– Topic overview
– Introduction to computer vision
• Part 2
– Module overview:
• Logistics and requirements
– Camera and Image Formation
Hyung Jin Chang
Lecture 1 – 2 01/02/2021
Robots
Industrial robots
Mobile robots in warehouses
Mars robot, underwater robots
UAV / Drone
Autonomous Vehicle
Humanoid robots
A humanoid robot is a robot whose body shape is built to resemble the human body. The design may serve functional purposes, such as interacting with human tools and environments, or experimental purposes, such as the study of locomotion.
Welcome to Robot Vision
Terminator 1984
Robot Vision?
Robot Vision? Computer Vision!
[Diagram labels: Computer Vision, Robot, Vision, Robot Vision (Korean: 로봇 비젼)]
Welcome to Computer Vision!
Slide credit: Fei-Fei Li
What is Computer/Robot Vision?
Evolution’s Big Bang
543 million years ago (palaeontology)
Vision is OUR Dominant Sense
50% of our neural tissue is directly or indirectly related to vision, which assists in visual learning
S.B. Sells and Richard S. Fixott, “Evaluation of Research on Effects of Visual Training on Visual Functions”, American Journal of Ophthalmology
Robot Vision
Enable machines to “see” the visual world as we do
The Start of Computer Vision
Lawrence G. Roberts, “Machine Perception of Three-Dimensional Solids”, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
The Start of Computer Vision
• We try to teach a computer how to see.
The Start of Computer Vision
David Marr, 1970s
David Courtenay Marr (19 January 1945 – 17 November 1980) was a British neuroscientist and physiologist. Marr integrated results from psychology, artificial intelligence, and neurophysiology into new models of visual processing.
Thought process from David Marr
David Marr, 1970s
Map of Computer Science
Multidisciplinary field
Slide credit: Fei-Fei Li
Robot/Computer Vision
• Automatic understanding of images and video
1. Computing properties of the 3D world from visual data (measurement)
1. Vision for measurement
Real-time stereo
NASA Mars Rover
Wang et al.
Structure from motion
Tracking
Slide credit: Kristen Grauman
Snavely et al.
Demirdjian et al.
Robot/Computer Vision
• Automatic understanding of images and video
1. Computing properties of the 3D world from visual data (measurement)
2. Algorithms and representations to allow a machine to recognise objects, people, scenes, and activities (perception and interpretation)
2. Vision for perception, interpretation
Objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions…
[Figure: amusement park photo annotated with labels: sky, water, tree, Ferris wheel, carousel, ride, The Wicked Twister, maxair, Cedar Point, Lake Erie, 12 E, deck, bench, umbrellas, people waiting in line, people sitting on ride, pedestrians, amusement park]
Slide credit: Kristen Grauman
Robot/Computer Vision
• Automatic understanding of images and video
1. Computing properties of the 3D world from visual data (measurement)
2. Algorithms and representations to allow a machine to recognise objects, people, scenes, and activities (perception and interpretation)
3. Algorithms to mine, search, and interact with visual data (search and organisation)
3. Visual search, organisation
Query → Image or video archives → Relevant content
Slide credit: Kristen Grauman
Why is vision difficult?
Visual illusions
Strong models for regularisation
Reliable results can be obtained when
• the data is well-behaved (follows the assumptions)
• we have strong models (usually designed by hand) to compensate for missing data, noise, and deviations from the assumptions
Perception is a kind of controlled hallucination [Max Clowes, Jan Koenderink]
What humans see
Slide credit: Larry Zitnick
What computers see
243 239 240 225 206 185 188 218 211 206 216 225
242 239 218 110  67  31  34 152 213 206 208 221
243 242 123  58  94  82 132  77 108 208 208 215
235 217 115 212 243 236 247 139  91 209 208 211
233 208 131 222 219 226 196 114  74 208 213 214
232 217 131 116  77 150  69  56  52 201 228 223
232 232 182 186 184 179 159 123  93 232 235 235
232 236 201 154 216 133 129  81 175 252 241 240
235 238 230 128 172 138  65  63 234 249 241 245
237 236 247 143  59  78  10  94 255 248 247 251
234 237 245 193  55  33 115 144 213 255 253 251
248 245 161 128 149 109 138  65  47 156 239 255
190 107  39 102  94  73 114  58  17   7  51 137
 23  32  33 148 168 203 179  43  27  17  12   8
 17  26  12 160 255 255 109  22  26  19  35  24
Slide credit: Larry Zitnick
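An image reaches a program as nothing more than a grid of intensity values like those on this slide. As a minimal sketch, assuming the first 36 values form three rows of 12 pixels (0 = black, 255 = white), a simple threshold already separates dark pixels from bright ones. The cutoff of 128 and the helper names `threshold` and `render` are illustrative choices, not part of the lecture.

```python
# First 36 values from the slide, assumed to be three rows of 12 pixels.
patch = [
    [243, 239, 240, 225, 206, 185, 188, 218, 211, 206, 216, 225],
    [242, 239, 218, 110,  67,  31,  34, 152, 213, 206, 208, 221],
    [243, 242, 123,  58,  94,  82, 132,  77, 108, 208, 208, 215],
]


def threshold(image, cutoff=128):
    """Return a binary mask: 1 where a pixel is darker than cutoff, else 0."""
    return [[1 if value < cutoff else 0 for value in row] for row in image]


def render(mask):
    """Draw the mask as text, '#' for dark pixels and '.' for bright ones."""
    return ["".join("#" if bit else "." for bit in row) for row in mask]


mask = threshold(patch)
for line in render(mask):
    print(line)
```

The printed mask shows a dark blob emerging in the second and third rows. Everything beyond that, such as deciding what the blob is, has to be inferred from the numbers alone.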
Why is vision difficult?
• Ill-posed problem: the real world is much more complex than what we can measure in images
– 3D → 2D
• Impossible to literally “invert” the image formation process
Slide credit: Kristen Grauman
Challenges: ambiguity
• Many different 3D scenes could have given rise to a particular 2D picture
Slide credit: Svetlana Lazebnik
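The ambiguity can be illustrated with a small sketch. It assumes an ideal pinhole camera at the origin looking along the Z axis, with a hypothetical focal length of 1; the function name `project` and the two example points are made up for illustration.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) with Z > 0 to image coordinates."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray through the camera centre:
# the second is four times farther away and four times farther off-axis.
near_point = (1.0, 0.5, 2.0)
far_point = (4.0, 2.0, 8.0)

print(project(near_point))  # (0.5, 0.25)
print(project(far_point))   # (0.5, 0.25)
```

Both points land on the same image location, so depth cannot be recovered from a single projected point without extra assumptions or additional views.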
Challenges: many nuisance parameters
Illumination
Object pose
Clutter
Occlusions
Intra-class appearance
Viewpoint
Slide credit: Kristen Grauman
Challenges: scale
Slide credit: Fei-Fei, Fergus, Torralba
Challenges: motion
Slide credit: Svetlana Lazebnik
Challenges: occlusion, clutter
Slide credit: Svetlana Lazebnik
Challenges: object intra-class variation
Slide credit: Fei-Fei, Fergus, Torralba
Challenges: context and human experience
Slide credit: Fei-Fei, Fergus, Torralba
Challenges: complexity
How many object categories are there?
Biederman 1987
Slide credit: Fei-Fei, Fergus, Torralba
Challenges: complexity
10 billion images
250 billion images
400 hours uploaded per minute
1 billion images served daily
Almost 90% of web traffic is visual!
10 billion images
Challenges: complexity
• Thousands to millions of pixels in an image
• 30+ degrees of freedom in the pose of articulated objects (humans)
• About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]
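The scale of these numbers can be made concrete with a rough count. The sketch below assumes, purely for illustration, a 640 × 480 greyscale image with 256 levels per pixel and a 30-joint pose with each angle quantised to only 10 values; neither figure comes from the lecture.

```python
import math


def log10_configurations(values_per_dimension, dimensions):
    """Return log10 of values_per_dimension ** dimensions (the power of ten)."""
    return dimensions * math.log10(values_per_dimension)


# Assumed image size: 640 x 480 pixels, 256 grey levels each.
pixels = 640 * 480
image_exponent = log10_configurations(256, pixels)

# Assumed pose model: 30 joint angles, 10 possible values per angle.
pose_exponent = log10_configurations(10, 30)

print(f"pixels per image: {pixels}")
print(f"distinct images: about 10^{image_exponent:.0f}")
print(f"distinct poses: about 10^{pose_exponent:.0f}")
```

Even under these coarse assumptions, exhaustive search over images or poses is out of the question, which is one reason vision systems lean on strong models and learned priors.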
Have we overcome those challenges?
Well …
Industrial Robot + Vision
Robot @ Ocado
Robot @ Dyson
Robot @ Boston Dynamics
Optical Character Recognition (OCR)
Digit recognition in 1993
yann.lecun.com
License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Sudoku grabber
http://sudokugrab.blogspot.com/
Automatic check processing
Source: S. Seitz, N. Snavely
Image Classification and Object Detection
Image classification
Object detection
Face Detection
Face Detection for Privacy Protection
Technology gone wild …
Biometrics
Fingerprint scanners
Finger vein recognition
Face recognition systems
Palm vein recognition
Apple’s Face ID
Interactive Systems
Shotton et al.
Interaction in Augmented Reality
Object Detection & Recognition
Tracking & Recognition
Autonomous Vehicle
Image Colourisation
3D Reconstruction from Photo Collections
Q. Shan, R. Adams, B. Curless, Y. Furukawa, and S. Seitz, The Visual Turing Test for Scene Reconstruction, 3DV 2013
Slide credit: Svetlana Lazebnik
From 2D to 3D
S. Saito et al., “PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization”, CVPR 2020
J. Kopf et al., “One Shot 3D Photography”, SIGGRAPH 2020
Photo Wake-Up
Medical Imaging
3D imaging MRI, CT
Image guided surgery
Grimson et al., MIT
Source: S. Seitz
Why Vision?
• As image sources multiply, so do applications
– Relieve humans of boring, easy tasks
– Enhance human abilities
– Advance human-computer interaction, visualisation
– Perception for robotics / autonomous agents
– Organise and give access to visual content
Summary
• Robot/Computer Vision is useful, interesting, and difficult
• A growing and exciting field
• Lots of cool and important applications
• New teams in existing companies, startups, etc.
Slide adapted from Devi Parikh
• Computer Vision is HOT!
CVPR 2019 industrial sponsors