Multiple View Geometry: Exercise Sheet 8
Prof. Dr. Florian Bernard, Florian Hofherr, Tarun Yenamandra
Computer Vision Group, TU Munich
Zoom Room: https://tum-conf.zoom.us/s/62772800235?pwd=SUpZN2QrV0JpeXJyR2R1TWx5cHEwdz09 (Password: 307238)
Exercise: June 9th, 2021
Part I: Theory
Download the ICRA 2013 paper "Robust Odometry Estimation for RGB-D Cameras" by Kerl, Sturm
and Cremers from the Publications section on our webpage [1]. Read the paper, focusing in particular
on Section III, Direct Motion Estimation.
1. Image Warping
(a) Look at the warping function $\tau(\xi, x)$ in Eq. (9). What do $\tau(\xi, x)$ and $r_i(\xi)$ look like at
$\xi = 0$?
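(b) Prove that the derivative of $r_i(\xi)$ w.r.t. $\xi$ at $\xi = 0$ is
$$
\left.\frac{\partial r_i(\xi)}{\partial \xi}\right|_{\xi=0}
= \left.\frac{1}{z}
\begin{pmatrix} I_x f_x & I_y f_y \end{pmatrix}
\begin{pmatrix}
1 & 0 & -\frac{x}{z} & -\frac{xy}{z} & z + \frac{x^2}{z} & -y \\
0 & 1 & -\frac{y}{z} & -z - \frac{y^2}{z} & \frac{xy}{z} & x
\end{pmatrix}
\right|_{(x,y,z)^\top = \pi^{-1}(x_i, Z_1(x_i))}
$$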
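To this end, apply the chain rule multiple times and use the following identity:
$$
\left.\frac{\partial T(g(\xi), p)}{\partial \xi}\right|_{\xi=0}
= \begin{pmatrix} \mathrm{Id}_3 & -\hat{p} \end{pmatrix} \in \mathbb{R}^{3 \times 6} .
$$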
Note: The notation $\partial f(x)/\partial x$ denotes the Jacobian matrix including all first-order partial
derivatives, where the number of rows is the number of dimensions of $f(x)$, and the
number of columns is the number of dimensions of $x$.
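As a quick illustration of this row/column convention, consider a made-up map $f$ (chosen only for this example, not taken from the paper) from $\mathbb{R}^2$ to $\mathbb{R}^3$:

```latex
f(x_1, x_2) = \begin{pmatrix} x_1 x_2 \\ x_1^2 \\ x_2 \end{pmatrix},
\qquad
\frac{\partial f(x)}{\partial x}
= \begin{pmatrix} x_2 & x_1 \\ 2 x_1 & 0 \\ 0 & 1 \end{pmatrix}
\in \mathbb{R}^{3 \times 2}
```

By the same convention, the scalar residual $r_i(\xi)$ with $\xi \in \mathbb{R}^6$ has a $1 \times 6$ Jacobian, which matches the product of a $1 \times 2$ and a $2 \times 6$ factor in the formula of (b).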
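(c) Following the derivation in (b), determine the derivative for arbitrary $\xi$,
$$
\left.\frac{\partial r_i(\Delta\xi \circ \xi)}{\partial \Delta\xi}\right|_{\Delta\xi=0} ,
$$
where $\circ$ is defined by
$$
\xi_1 \circ \xi_2 := \log\left( \exp(\hat{\xi}_1) \cdot \exp(\hat{\xi}_2) \right)^{\vee} ,
$$
and $\vee : \mathfrak{se}(3) \to \mathbb{R}^6$ is the inverse of the hat transform.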
Hint: Rewrite the problem such that you can make use of part (b).
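The concatenation operator defined in (c) can be sketched numerically with MATLAB's built-in expm and logm. This is only a minimal sketch under stated assumptions: the function names se3Concat, se3Hat and se3Vee are hypothetical, and the twist is assumed to be ordered as xi = [v; w] (translation part first, rotation part last), which is the ordering suggested by the identity (Id3, -p^) in (b).

```matlab
function xi = se3Concat(xi1, xi2)
% Sketch of xi1 o xi2 := log( exp(hat(xi1)) * exp(hat(xi2)) )^vee.
% Assumption: xi = [v; w], translation part v first, rotation part w last.
    T  = expm(se3Hat(xi1)) * expm(se3Hat(xi2)); % product of the two rigid-body motions
    xi = se3Vee(real(logm(T)));                 % real() drops round-off imaginary parts
end

function X = se3Hat(xi)
% Hat transform R^6 -> se(3) (4x4 matrix) for xi = [v; w].
    w = xi(4:6);
    X = [   0   -w(3)   w(2)  xi(1);
          w(3)     0   -w(1)  xi(2);
         -w(2)   w(1)     0   xi(3);
            0      0      0     0  ];
end

function xi = se3Vee(X)
% Inverse of the hat transform: se(3) -> R^6.
    xi = [X(1:3,4); X(3,2); X(1,3); X(2,1)];
end
```

The matrix logarithm is not unique for rotation angles close to pi, so this sketch is meant for the small and moderate motions that occur in the exercise.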
[1] http://vision.in.tum.de/publications
2. Image Pyramids
In order to handle large translational and rotational motions, a coarse-to-fine scheme is applied
in the paper. To go from one level $l$ to $l + 1$, the images $I^{(l)}$ (intensity) and $D^{(l)}$ (depth) are
downscaled by averaging over intensities or valid depth values, respectively:
$$
I^{(l+1)}(n,m) := \frac{1}{4} \cdot \sum_{(n',m') \in O(n,m)} I^{(l)}(n',m'),
\qquad
O(n,m) = \{(2n, 2m), (2n+1, 2m), (2n, 2m+1), (2n+1, 2m+1)\}
$$
$$
D^{(l+1)}(n,m) := \frac{1}{|O_d(n,m)|} \cdot \sum_{(n',m') \in O_d(n,m)} D^{(l)}(n',m'),
\qquad
O_d(n,m) = \{(n',m') \in O(n,m) : D^{(l)}(n',m') \neq 0\}
$$
How does the camera matrix $K$ change from level $l$ to $l + 1$? Write down $f_x^{(l+1)}$, $f_y^{(l+1)}$, $c_x^{(l+1)}$
and $c_y^{(l+1)}$ in terms of $f_x^{(l)}$, $f_y^{(l)}$, $c_x^{(l)}$ and $c_y^{(l)}$.
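The two averaging rules of this exercise can be checked by hand on a single 2 × 2 block. The pixel values below are invented for illustration only; one depth value is 0 and therefore counts as invalid:

```latex
I^{(l)}\big|_{O(n,m)} = \{10,\ 20,\ 30,\ 40\}
\;\Rightarrow\;
I^{(l+1)}(n,m) = \tfrac{1}{4}\,(10 + 20 + 30 + 40) = 25

D^{(l)}\big|_{O(n,m)} = \{1.0,\ 0,\ 2.0,\ 3.0\}
\;\Rightarrow\;
|O_d(n,m)| = 3,\quad
D^{(l+1)}(n,m) = \tfrac{1}{3}\,(1.0 + 2.0 + 3.0) = 2.0
```

Averaging the depth over all four pixels would instead give 1.5, which shows why invalid measurements have to be excluded from the mean.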
3. Optimization for Normally Distributed $p(r_i)$
(a) Confirm that a normally distributed $p(r_i)$ with a uniform prior on the camera motion leads
to normal least squares minimization. To this end, use
$$
p(r_i \mid \xi) = p(r_i) = A \exp\left( -\frac{r_i^2}{\sigma^2} \right)
$$
to show that with a constant prior $p(\xi)$, the maximum a posteriori estimate is given by
$$
\xi_{\mathrm{MAP}} = \arg\min_{\xi} \sum_i r_i(\xi)^2 .
$$
(b) Explicitly show that the weights
$$
w(r_i) = \frac{1}{r_i} \frac{\partial \log p(r_i)}{\partial r_i}
$$
are constant for normally distributed $p(r_i)$.
(c) Show that in the case of normally distributed $p(r_i)$ the update step $\Delta\xi$ can be computed as
$$
\Delta\xi = -\left( J^\top J \right)^{-1} J^\top r(0) .
$$
Part II: Practical Exercises
In this exercise you will implement direct image alignment as Gauss-Newton minimization on SE(3).
Download the package ex8.zip provided on the website. It contains a code framework, test images
and the corresponding camera calibration.
1. Implement the function [Id,Dd,Kd] = downscale(I,D,K,level), which (recursively)
halves the resolution of the image I and the depth map D and adjusts the corresponding camera
matrix K per pyramid level l. For an input frame of 640 × 480 pixels (l = 1), level 2
corresponds to 320 × 240 pixels, level 3 to 160 × 120 pixels, and so on. Use the
equations and results obtained in the theory part.
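A single halving step for the intensity and depth images could look roughly like the following sketch. It is not the reference solution: the helper name halveOnce is hypothetical, the inputs are assumed to be of class double with even width and height, and invalid depth is assumed to be encoded as 0, as in the theory part. The adjustment of K is deliberately left out, since it is the result of theory exercise 2.

```matlab
function [Id, Dd] = halveOnce(I, D)
% Sketch of one pyramid step (level l -> l+1) for intensity I and depth D.
% Assumptions: I and D are double matrices with even size; depth 0 means invalid.

    % Intensity: mean over each 2x2 block.
    Id = ( I(1:2:end, 1:2:end) + I(2:2:end, 1:2:end) ...
         + I(1:2:end, 2:2:end) + I(2:2:end, 2:2:end) ) / 4;

    % Depth: invalid pixels are 0, so the plain block sum equals the sum of valid values.
    Dsum = D(1:2:end, 1:2:end) + D(2:2:end, 1:2:end) ...
         + D(1:2:end, 2:2:end) + D(2:2:end, 2:2:end);

    % Number of valid depth values per 2x2 block, i.e. |O_d(n,m)|.
    V   = double(D ~= 0);
    cnt = V(1:2:end, 1:2:end) + V(2:2:end, 1:2:end) ...
        + V(1:2:end, 2:2:end) + V(2:2:end, 2:2:end);

    % Blocks without any valid depth keep the value 0 (0 / 1 = 0).
    Dd = Dsum ./ max(cnt, 1);
end
```

Calling such a step level - 1 times, and updating K after each call, would give one possible recursive structure for downscale.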
2. Complete the function r = calcErr(I1, D1, I2, xi, K) that takes the images and
their (assumed) relative pose, and calculates the per-pixel residual $r(\xi)$ as defined in the slides.
r should be an n × 1 vector, where n = w × h is the number of pixels. Visualize the residual as an
image for $\xi = 0$.
Hint: Perform tests on a coarse version of the image (e.g. 160 × 120) to make it run faster.
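3. Implement the function [J, r] = deriveNumeric(I1, D1, I2, xi, K) that differentiates
$r(\Delta\xi \circ \xi)$ numerically w.r.t. $\Delta\xi$: for each pixel $x_i$ compute
$$
\left.\frac{\partial r_i(\Delta\xi \circ \xi)}{\partial \Delta\xi}\right|_{\Delta\xi=0}
\approx
\left(
\frac{r_i((\epsilon e_1) \circ \xi) - r_i(\xi)}{\epsilon},\ \dots,\
\frac{r_i((\epsilon e_6) \circ \xi) - r_i(\xi)}{\epsilon}
\right)
$$
where $\epsilon$ is a small value (for Matlab $\epsilon = 10^{-6}$), $e_j$ is the $j$-th unit vector and the operator $\circ$ is
defined as in Exercise 1 (c) of Part I. J should be an n × 6 matrix. The per-pixel residuals $r(\xi)$
are returned as r.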
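The forward-difference scheme above can be sketched independently of the image code. In the following sketch, numericJacobianSketch, resFun and concatFun are hypothetical names: resFun is assumed to be a function handle that returns the n × 1 residual for a given twist (for example a wrapper around calcErr), and concatFun a handle that implements the operator ∘ from Part I.

```matlab
function [J, r] = numericJacobianSketch(resFun, xi, concatFun)
% Forward-difference approximation of d r(dxi o xi) / d dxi at dxi = 0.
% resFun:    handle, xi (6x1) -> residual vector (n x 1)      [assumed interface]
% concatFun: handle, (xi1, xi2) -> xi1 o xi2 (6x1)            [assumed interface]
    h = 1e-6;                    % step size suggested on the sheet for Matlab
    r = resFun(xi);              % residual at the current estimate
    J = zeros(numel(r), 6);
    for j = 1:6
        e    = zeros(6, 1);
        e(j) = h;                % h * e_j
        J(:, j) = (resFun(concatFun(e, xi)) - r) / h;
    end
end
```

Each call evaluates the residual seven times, which is the main reason the numeric variant is slow compared with an analytic Jacobian.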
4. Implement Gauss-Newton minimization for the photometric error
$$
E(\xi) = \sum_i r_i(\xi)^2 = \|r(\xi)\|_2^2
$$
according to the theory part. To this end, complete the script ex08 at lines 70 and 75. For an
update $\Delta\xi$, compute the updated motion as $\xi_{\mathrm{new}} = \Delta\xi \circ \xi_{\mathrm{old}}$. Use only one pyramid level, l = 3
(160 × 120), in the beginning, and then add the others.
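The iteration described here, with the update step from theory exercise 3 (c) and the left-multiplicative update of the motion, can be sketched as follows. This is a minimal sketch, not the content of the ex08 script: gaussNewtonSketch, jacFun and concatFun are hypothetical names, where jacFun is assumed to return [J, r] for a given twist and concatFun to implement the operator ∘. A fixed iteration count stands in for a proper convergence test.

```matlab
function xi = gaussNewtonSketch(jacFun, xi, concatFun, numIter)
% Gauss-Newton iterations for E(xi) = ||r(xi)||^2 on a single pyramid level.
% jacFun:    handle, xi -> [J, r] with J (n x 6) and r (n x 1)   [assumed interface]
% concatFun: handle, (xi1, xi2) -> xi1 o xi2                     [assumed interface]
    for it = 1:numIter
        [J, r] = jacFun(xi);
        % Update step: dxi = -(J'J)^(-1) J' r, solved without forming the inverse.
        dxi = -((J' * J) \ (J' * r));
        % Updated motion: xi_new = dxi o xi_old.
        xi = concatFun(dxi, xi);
    end
end
```

For a coarse-to-fine scheme, one would call such a routine once per pyramid level, starting at the coarsest level and passing the result on as the initial value for the next finer level.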
5. Implement a function J = deriveAnalytic(I1, D1, I2, xi, K) that differentiates
$r(\Delta\xi \circ \xi)$ analytically w.r.t. $\Delta\xi$. Use the result of the theory part, Exercise 1 (c). The use of
this analytical gradient instead of the numeric derivatives in the minimization should result in a
significant speed-up.
6. Run your implementation on the provided images using the script ex08.