
Fundamentals of Computer Vision
Lecture

Overview of today’s lecture
• Disparity.
• Stereo rectification.
• Stereo matching.
• Improving stereo matching.
• Structured light.

Slide credits
Most of these slides were adapted from:
• Kris Kitani (15-463, Fall 2016, Fall 2017), Ioannis Gkioulekas (16-385, Spring 2019), Robert Collins (454, Fall 2019).
Some slides were inspired by or taken from:
• Frédo Durand (MIT).

Stereo

Revisiting triangulation

How would you reconstruct 3D points?
Left image Right image

How would you reconstruct 3D points?
Left image Right image
1. Select a point in one image (how?)

How would you reconstruct 3D points?
Left image Right image
1. Select a point in one image (how?)
2. Form the epipolar line for that point in the second image (how?)

How would you reconstruct 3D points?
Left image Right image
1. Select a point in one image (how?)
2. Form the epipolar line for that point in the second image (how?)
3. Find the matching point along the line (how?)

How would you reconstruct 3D points?
Left image Right image
1. Select a point in one image (how?)
2. Form the epipolar line for that point in the second image (how?)
3. Find the matching point along the line (how?)
4. Perform triangulation (how?)

Triangulation
[Figure: a 3D point projected into the left image and right image; the left and right cameras each have their own camera matrix]
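The triangulation step can be sketched with the standard linear (DLT) method; this is a generic sketch, not necessarily the exact formulation used in lecture. `P1` and `P2` are assumed to be 3×4 camera projection matrices, and `x1`, `x2` the matched image points:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: stack the constraints x cross (P X) = 0
    from both views and take the null vector of the resulting system."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]              # homogeneous 3D point (null vector of A)
    return X[:3] / X[3]     # dehomogenize
```

With P1 = [I | 0] and P2 = [I | t], the recovered point reprojects exactly onto the two measurements.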

How would you reconstruct 3D points?
Left image Right image
1. Select a point in one image (how?)
2. Form the epipolar line for that point in the second image (how?)
3. Find the matching point along the line (how?)
4. Perform triangulation (how?)
What are the disadvantages of this procedure?

Stereo rectification

What’s different between these two images?

Do objects that are close move more or less?

The amount of horizontal movement is inversely proportional to …

The amount of horizontal movement is inversely proportional to …
… the distance from the camera.
More formally…

[Figure: a 3D point imaged by two cameras; image planes shown, camera centers separated by the baseline]

How is X related to x?
From similar triangles (baseline B): x = f X / z

How is X related to x’?
From similar triangles: x’ = f (X − B) / z

Disparity
(w.r.t. the camera origin of the image plane)
disparity = x − x’ = f B / z

Disparity is inversely proportional to depth

Another way to look at disparity
[Figure: point X at depth z viewed by cameras at O and O’ with baseline B and focal length f]

x − x’ = f (O − O’) / z

disparity = x − x’ = B · f / z

Disparity is inversely proportional to depth.
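The relation disparity = B·f/z inverts directly to depth. A minimal sketch (NumPy), where `focal_length` is in pixels and `baseline` in meters; the numbers below are hypothetical example values:

```python
import numpy as np

def depth_from_disparity(disparity, focal_length, baseline):
    """z = B * f / (x - x'); zero disparity maps to infinite depth."""
    disparity = np.asarray(disparity, dtype=float)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = baseline * focal_length / disparity[valid]
    return depth

# 10 px of disparity with f = 500 px and B = 0.1 m gives z = 0.1 * 500 / 10 = 5 m.
print(depth_from_disparity(np.array([10.0, 0.0]), 500.0, 0.1))
```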

Real-time stereo sensing
Nomad robot searches for meteorites in Antarctica
http://www.frc.ri.cmu.edu/projects/meteorobot/index.html

Pre-collision braking

What other vision system uses disparity for depth sensing?

This is how 3D movies work

Is disparity the only depth cue the human visual system uses?

So can I compute depth from any two images of the same object?

So can I compute depth from any two images of the same object?
1. Need sufficient baseline
2. Images need to be ‘rectified’ first (make epipolar lines horizontal)

1. Rectify images (make epipolar lines horizontal)
2. For each pixel:
a. Find the epipolar scanline in the right image
b. Search the scanline and pick the best match
c. Compute disparity x − x’
d. Compute depth from disparity
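The per-pixel scanline search can be sketched as brute-force SSD block matching on rectified images. This is a toy illustration; the window size and disparity range are arbitrary choices, not values from the lecture:

```python
import numpy as np

def block_match_row(left, right, row, half=3, max_disp=16):
    """Brute-force SSD block matching along one scanline of rectified images."""
    h, w = left.shape
    disparities = np.zeros(w, dtype=int)
    for x in range(half + max_disp, w - half):
        # Reference window around (row, x) in the left image.
        patch = left[row - half:row + half + 1, x - half:x + half + 1].astype(float)
        best_ssd, best_d = np.inf, 0
        for d in range(max_disp):
            # Candidate window at disparity d along the same scanline.
            cand = right[row - half:row + half + 1,
                         x - d - half:x - d + half + 1].astype(float)
            ssd = np.sum((patch - cand) ** 2)
            if ssd < best_ssd:
                best_ssd, best_d = ssd, d
        disparities[x] = best_d
    return disparities
```

Real implementations search every row, use smarter cost aggregation, and handle occlusions; this loop only shows the core idea.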

How can you make the epipolar lines horizontal?

[Figure: a 3D point imaged by two cameras with coplanar image planes]
What’s special about these two cameras?

When are epipolar lines horizontal?
When this relationship holds:
R = I, t = (T, 0, 0)
x
x’ t

When are epipolar lines horizontal?
When this relationship holds:
R = I, t = (T, 0, 0)
Let’s try this out…
x
x’ t
This always has to hold for rectified images

When are epipolar lines horizontal?
When this relationship holds:
R = I, t = (T, 0, 0)
Let’s try this out…
x
t
Write out the constraint
x’
This always has to hold for rectified images

When are epipolar lines horizontal?
When this relationship holds:
R = I, t = (T, 0, 0)
Let’s try this out…
x
x’ t
This always has to hold
Write out the constraint
The image of a 3D point will always be on the same horizontal line
y coordinate is always the same!

How can you make the epipolar lines horizontal?

Use stereo rectification?

What is stereo rectification?

What is stereo rectification?
Reproject image planes onto a common plane parallel to the line between camera centers
How can you do this?

What is stereo rectification?
Reproject image planes onto a common plane parallel to the line between camera centers
Need two homographies (3×3 transforms), one for each input image’s reprojection
C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. Computer Vision and Pattern Recognition (CVPR), 1999.

Stereo Rectification
1. Rotate the right camera by R
(aligns camera coordinate system orientation only)
2. Rotate (rectify) the left camera so that the epipole is at infinity
3. Rotate (rectify) the right camera so that the epipole is at infinity
4. Adjust the scale

Stereo Rectification:
1. Compute E to get R
2. Rotate right image by R
3. Rotate both images by Rrect
4. Scale both images by H

Stereo Rectification:
rotate by R
1. Compute E to get R
2. Rotate right image by R
3. Rotate both images by Rrect
4. Scale both images by H
(aligns camera coordinate system orientation only)

Stereo Rectification:
1. Compute E to get R
2. Rotate right image by R (aligns camera coordinate system orientation only)
3. Rotate both images by Rrect
4. Scale both images by H

Stereo Rectification:
rotate by Rrect
rotate by Rrect
1. Compute E to get R
2. Rotate right image by R
3. Rotate both images by Rrect
4. Scale both images by H

Stereo Rectification:
scale by H
scale by H
1. Compute E to get R
2. Rotate right image by R
3. Rotate both images by Rrect
4. Scale both images by H

Step 1: Compute E to get R
SVD: E = U Σ V^T
Let W = [0 −1 0; 1 0 0; 0 0 1]
We get FOUR solutions:
R = U W V^T or R = U W^T V^T (two possible rotations); t = ±u3, the third column of U (two possible translations)

We get FOUR solutions:
Which one do we choose?
Compute the determinant of R: a valid solution must have det(R) = 1
(note: det(R) = −1 means a rotation combined with a reflection)
Compute a 3D point using triangulation: a valid solution has a positive Z value
(note: negative Z means the point is behind the camera)
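The four-solution ambiguity can be made concrete. Below is a standard sketch of extracting the (R, t) candidates from E via SVD; the W matrix and sign normalizations follow the usual textbook recipe, and the variable names are mine:

```python
import numpy as np

def decompose_E(E):
    """Return the four (R, t) candidates encoded by an essential matrix."""
    U, _, Vt = np.linalg.svd(E)
    # Force proper rotations so that det(R) = +1 for every candidate.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt   # two possible rotations
    t = U[:, 2]                          # two possible translations: +t and -t
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

The physically valid candidate is the one whose triangulated points land in front of both cameras (positive Z), exactly as the slide states.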

Let’s visualize the four configurations…
[Figure: camera icon showing the image plane, camera center, and optical axis]
Find the configuration where the point is in front of both cameras

Find the configuration where the point is in front of both cameras

Find the configuration where the point is in front of both cameras

Stereo Rectification:
1. Compute E to get R
2. Rotate right image by R
3. Rotate both images by Rrect
4. Scale both images by H

When do epipolar lines become horizontal?

Parallel cameras
Where is the epipole?

Parallel cameras
epipole at infinity

Setting the epipole to infinity (building Rrect from e)
Let Rrect be the rectifying rotation; it gives the camera’s pose and will be specified in terms of its row vectors.
Given: the epipole e (using SVD on E; it equals the translation recovered from E).
The new X axis is parallel to the baseline: the epipole coincides with the translation vector.
The new Y axis is orthogonal to X: the cross product of e and the direction vector of the optical axis.
The new Z axis is orthogonal to X and Y (the orthogonal vector).
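The three axes translate directly into code. A minimal sketch, assuming e is the epipole/translation direction as a 3-vector (the sign chosen for the Y axis is a free choice):

```python
import numpy as np

def build_rrect(e):
    """Build the rectifying rotation whose rows are the new camera axes."""
    r1 = e / np.linalg.norm(e)         # new X axis: along the baseline (epipole direction)
    r2 = np.array([-e[1], e[0], 0.0])  # new Y axis: cross product of the optical axis and e
    r2 /= np.linalg.norm(r2)
    r3 = np.cross(r1, r2)              # new Z axis: orthogonal to both
    return np.vstack([r1, r2, r3])
```

By construction Rrect maps e to (||e||, 0, 0)^T, i.e. it sends the epipole to infinity along x. (The construction degenerates if e lies along the optical axis, which does not occur for a valid stereo baseline.)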

If r1 = e / ||e|| and r2, r3 are orthogonal to e, then

Rrect e = (||e||, 0, 0)^T

Where is this point located on the image plane?
At x-infinity (a homogeneous point of this form lies at infinity along the x axis).

Stereo Rectification Algorithm
1. Estimate E using the 8 point algorithm (SVD)
2. Estimate the epipole e (SVD of E)
3. Build Rrect from e
4. Decompose E into R and T
5. Set R1 = Rrect and R2 = R·Rrect
6. Rotate each left camera point (warp image): [x’ y’ z’] = R1 [x y z]
7. Compute rectified points as p = (f/z’) [x’ y’ z’]
8. Repeat 6 and 7 for right camera points using R2

Stereo Rectification Algorithm
1. Estimate E using the 8 point algorithm
2. Estimate the epipole e (solve Ee=0)
3. Build Rrect from e
4. Decompose E into R and T
5. Set R1 = Rrect and R2 = R·Rrect
6. Rotate each left camera point: x’ ~ Hx, where H = K R1
*You may need to alter the focal length (inside K) to keep points within the original image size
7. Repeat step 6 for the right camera points using R2
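Applied to pixel coordinates, the warp in step 6 becomes a homography. A sketch under the assumption that both cameras share the intrinsics K, so the map on homogeneous pixels is K R1 K^-1:

```python
import numpy as np

def rectify_pixels(pixels, K, R1):
    """Warp an (N, 2) array of pixel coordinates through H = K R1 K^-1:
    backproject to rays, rotate by R1, reproject with K."""
    H = K @ R1 @ np.linalg.inv(K)
    pts_h = np.hstack([pixels, np.ones((len(pixels), 1))])  # to homogeneous
    warped = (H @ pts_h.T).T
    return warped[:, :2] / warped[:, 2:3]                   # back to inhomogeneous
```

In practice the image itself is warped by resampling through H (inverse warping), and the focal length inside K may be adjusted as the slide notes to keep the warped points inside the image.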

What can we do after rectification?

Stereo matching

Depth Estimation via Stereo Matching

1. Rectify images (make epipolar lines horizontal)
2. For each pixel:
a. Find the epipolar scanline in the right image
b. Search the scanline and pick the best match
c. Compute disparity x − x’
d. Compute depth from disparity
How would you do this?

Reminder from filtering
How do we detect an edge?

Reminder from filtering
How do we detect an edge?
• We filter with something that looks like an edge.
[Figure: the original image convolved with a horizontal edge filter and a vertical edge filter, each with taps 1, 0, −1]
We can think of linear filtering as a way to evaluate how similar an image is locally to some template.

Find this template
How do we detect the template in the following image?

Find this template
How do we detect the template in the following image?
filter output
What will the output look like?
image
Solution 1: Filter the image using the template as filter kernel.

Find this template
How do we detect the template in the following image?
filter output
image
Solution 1: Filter the image using the template as filter kernel.
What went wrong?

Find this template
How do we detect the template in the following image?
filter output
image
Solution 1: Filter the image using the template as filter kernel.
Increases for higher local intensities.

Find this template
How do we detect the template in the following image?
filter
template mean
output
What will the output look like?
image
Solution 2: Filter the image using a zero-mean template.

Find this template
How do we detect the template in the following image?
filter
template mean
output
output
True detection
thresholding
What went wrong?
image
Solution 2: Filter the image using a zero-mean template.
False detections

Find this template
How do we detect the template in the following image?
filter
template mean
output
output
image
Solution 2: Filter the image using a zero-mean template.
Not robust to high-contrast areas

Find this template
How do we detect the template in the following image?
filter output
What will the output look like?
image
Solution 3: Use sum of squared differences (SSD).

How do we detect the template in the following image?
filter output
1-output
Find this template
True detection
image
Solution 3: Use sum of squared differences (SSD).
thresholding
What could go wrong?

How do we detect the template in the following image?
filter output
1-output
Find this template
image
Solution 3: Use sum of squared differences (SSD).
Not robust to local intensity changes
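The SSD search (Solution 3) can be sketched as a brute-force sliding window; the variable names are illustrative:

```python
import numpy as np

def ssd_match(image, template):
    """Return the SSD between the template and every same-size window."""
    ih, iw = image.shape
    th, tw = template.shape
    out = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            diff = image[y:y + th, x:x + tw].astype(float) - template
            out[y, x] = (diff * diff).sum()
    return out
```

The best match is the argmin of the SSD map; as the slides note, a brightness or gain change in the image shifts the scores, so SSD is not invariant to local intensity changes.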

Find this template
How do we detect the template in the following image?
Observations so far:
• subtracting the mean deals with brightness bias
• dividing by the standard deviation removes contrast bias
Can we combine the two effects?
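Combining the two fixes (subtract the mean, divide by the standard deviation) gives zero-mean normalized cross-correlation; a minimal sketch:

```python
import numpy as np

def zncc(patch, template):
    """Zero-mean NCC: invariant to local brightness (offset) and contrast (gain)."""
    a = patch.astype(float) - patch.mean()
    b = template.astype(float) - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```

The score lies in [−1, 1], and an affine intensity change a·I + b of the patch leaves it unchanged (for a > 0), which addresses both failure modes shown in the template-matching slides.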

References
Basic reading:
• Szeliski textbook, Section 8.1 (not 8.1.1–8.1.3), Chapter 11, Section 12.2.
• Hartley and Zisserman, Section 11.12.