L5_Transformation
L5: 3D Transformation
Hao Su
Machine Learning meets Geometry
3D Spatial Relationships
• How to represent the relationships between objects?
2Wang et al., “DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion”, CVPR 2019
3D Spatial Relationships
• How to represent the relationships between objects?
3Wang et al., “DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion”, CVPR 2019
Relative position and
orientation
Prereq: Topology
• Topology: Structural Properties of a Manifold
• Two surfaces and are topologically equivalent if
there is a differentiable bijection between and
M N
M N
4
Prereq: Topology
5
• More examples:
Rotation and SO(n)
Orientation
• We use “rotation” to represent the relative orientation
between two frames
• For example,
– Space Frame:
– Body Frame:
– rotates the frame of the space to the frame of
the body after the origins are aligned
{s} = { ̂xs, ̂ys, ̂zs}
{b} = { ̂xb, ̂yb, ̂zb}
Rsb
7
Rotation in ℝ2
8
1 Degree of Freedom
Rotation in ℝ3
9
3 Degree of Freedoms
The Set of Rotations
•
• : “Special Orthogonal Group”
• “Group”: a group under the matrix multiplication
• “Orthogonal”:
• “Special”:
• : 2D rotations, 1 DoF
• : 3D rotations, 3 DoF
SO(n) = {R ∈ ℝn×n : det(R) = 1,RRT = I}
SO(n)
RRT = I
det(R) = 1
SO(2)
SO(3)
10
Topology of SO(n)
• The topology of is the same as a circleSO(2)
11
Topology of SO(n)
• Circles do not have the same topology as
No differentiable bijections between and
• The topology of is also different from
Why do we care about the topology?
(−1,1)n
⟹ SO(2)
(−1,1)n
SO(3) (−1,1)n
12
( (
-1 1
Parameterizing Rotation in
Networks is Tricky
13
• An ideal parameterization to use
in networks:
1. The domain is (as network output)
f(θ) : U ↦ SO(2)
(−l, l)n
Parameterizing Rotation in
Networks is Tricky
14
• An ideal parameterization to use
in networks:
1. The domain is
2. is a differentiable bijection
f(θ) : U ↦ SO(2)
(−l, l)n
f
θ
θ′
R
f
f
• If input data points to network are
close, but the predictions happen
to be far after convergence, the
network (a continuous function) will
make awful predictions between the
two data points!
• Need special network design to
overcome the issue (will discuss in
future lectures)
θOtherwise:
near far
Parameterizing Rotation in
Networks is Tricky
15
• An ideal parameterization to use
in networks:
1. The domain is
2. is a differentiable bijection
3. with , there should
, such that and
for some constant and
small (all movement in should be
achieved by movement in the domain with a near
constant speed)
f(θ) : U ↦ SO(2)
(−l, l)n
f
∀θ ∀y ∈ Tf(θ) ∥y∥ = 1
∃x ∈ Tθ y = Df [x]
c + ϵ > ∥x∥ > c − ϵ c
ϵ SO(n)
Parameterizing Rotation in
Networks is Tricky
• An ideal parameterization to use
in networks:
1. The domain is
2. is a differentiable bijection
3. with , there should
, such that and
for some constant and
small
• However, 1 and 2 are contradictory by topology!
• For 3, it also creates troubles for the case.
f(θ) : U ↦ SO(2)
(−l, l)n
f
∀θ ∀y ∈ Tf(θ) ∥y∥ = 1
∃x ∈ Tθ y = Df [x]
c + ϵ > ∥x∥ > c − ϵ c
ϵ
SO(3)
16
Euler Angles
Euler Angle is Very Intuitive
18
https://www.programmersought.com/article/8900590257/
Euler Angle to Rotation Matrix
• Rotation about principal axis is represented as:
• for arbitrary rotation
Rx(α) :=
1 0 0
0 cos α −sin α
0 sin α cos α
Ry(β) :=
cos β 0 sin β
0 1 0
−sin β 0 cos β
Rz(γ) :=
cos γ −sin γ 0
sin γ cos γ 0
0 0 1
R = Rz(α)Ry(β)Rx(γ)
19
Inspection from Learning
Perspective
• Euler Angle is not unique for some rotations. For
example,
20https://www.mecademic.com/resources/Euler-angles/Euler-angles
Rz(45
∘)Ry(90
∘)Rx(45
∘) = Rz(90
∘)Ry(90
∘)Rx(90
∘)
= [
0 0 1
0 1 0
−1 0 0]
Inspection from Learning
Perspective
• Gimbal lock:
• is rank-deficient at some
• some movement in cannot be
achieved
Df θ
⇒ Tf(θ)(SO(3))
21https://www.mecademic.com/resources/Euler-angles/Euler-angles
Inspection from Learning
Perspective
• For example: When ,
since changing and has the same effects, a
degree of freedom disappears!
β = π/2
α γ
22https://www.mecademic.com/resources/Euler-angles/Euler-angles
R = Rz(α)Ry(π/2)Rx(γ)
=
0 0 1
sin(α + γ) cos(α + γ) 0
−cos(α + γ) sin(α + γ) 0
Summary
• Euler angle can parameterize every rotation and has
good interpretability
• It is not a unique representation at some points
• There are some points where not every change in the
target space (rotations) can be realized by a change in
the source space (Euler angles)
23
Axis-Angle
Euler Theorem
• Any rotation in is equivalent to rotation about a
fixed axis through a positive angle
• : unit vector of rotation axis ( )
• : angle of rotation
•
SO(3)
ω ∈ ℝ3 𝜃
ω̂ ∥ω̂∥ = 1
𝜃
R ∈ SO(3) := Rot(ω̂, θ)
25
Given and , what is ?ω̂ θ R ∈ SO(3)
Skew-Symmetric Matrix
• is skew-symmetric
• Skew-symmetric matrix operator:
,
• Cross product can be a linear transformation
–
• Lie Algebra of 3D rotation:
–
𝐴 A = − AT
ω =
ω1
ω2
ω3
[ω] :=
0 −ω3 ω2
ω3 0 −ω1
−ω2 ω1 0
a × b = [a]b
so(3) := {S ∈ ℝ3×3 : ST = − S}
27
Given and , what is ?ω̂ θ R ∈ SO(3)
• Consider a point . At time , the position is
• Rotate with unit angular velocity
around axis , i.e.,
–
–
𝑞 t = 0 𝑞0
𝑞
ω̂
𝑣 = ω̂ × r
·q(t) = ω̂ × q(t) = [ω̂]q(t)
28
(solution of the ODE)
the swept angle
(exponential map)
• is also called rotation vector or exponential
coordinate
·q(t) = ω̂ × q(t) = [ω̂]q(t)
⇒ q(t) = e[ω̂]tq0
∥ω̂∥ = 1
⇒ θ = ∥ω̂t∥ = t
⇒ q(θ) = e[ω̂]θq0
⇒ Rot(ω̂, θ) = e[ω̂]θ = e[ω̂θ]
⃗ω = ω̂θ
29
Given and , what is ?ω̂ θ R ∈ SO(3)
• Definition of Matrix Exponential:
• Sum of infinite series? Rodrigues Formula
– Can prove that
– Then, use Taylor expansion of sin and cos
–
[ω̂]3 = − [ω̂]
30
e[ω̂]θ = I + [ω̂]sin θ + [ω̂]2(1 − cos θ)
e[ω̂]θ = I + θ[ω̂] +
θ2
2!
[ω̂]2 +
θ3
3!
[ω̂]3 + ⋯
Given and , what is ?ω̂ θ R ∈ SO(3)
Given , what is and ?R ∈ SO(3) ω̂ θ
• First question: Is there a unique parametrization?
– No:
1. and give the same rotation
2. when , and can be arbitrary
• When 2 does not happen, and if we also restrict
, a unique parameterization exists:
– when , can be computed by
,
– when , they are the cases that for
rotations around x/y/z axis
(ω̂, θ) (−ω̂, − θ)
R = I θ = 0 ω̂
θ ∈ [0,π)
tr(R) ≠ − 1
θ = arccos
1
2
[tr(R) − 1] [ω̂] =
1
2 sin θ
(R − RT)
tr(R) = − 1 θ = π
31
Distance between Rotations
• How to measure the distance between rotations
?
• A natural view is to measure the (minimal) effort to
rotate the body at pose to pose:
(R1, R2)
R1 R2
32
∵ (R2R
T
1 )R1 = R2 ∴ dist(R1, R2) = θ(R2R
T
1 ) = arccos
1
2
[tr(R2R
T
1 ) − 1]
X
Z Y
X
Z
Y
R1
R2
XZ
Y
Inspection from Learning Perspective
• When used in networks, one prominent issue is:
– Suppose that you are estimating as a 3D vector
– To keep a unique parameterization, you assume
that
– Your current solution is
– is mapped to a neighborhood point in
, but it is not in the neighborhood of the
domain, hence gradient descent could not achieve
it
θω̂
θ ∈ (0,π]
πω̂
(π − ϵ)(−ω̂)
SO(3)
33
Summary of Axis-Angle
• Axis-Angle is an intuitive rotation representation
• By adding a constraint to the domain of , the
parameterization can be unique at most points
• Can be converted to and from rotation matrices by
exponential map and its inverse (when possible)
• Induced a distance between rotations which is a
metric in (independent of parameterization)
θ
SO(3)
34
Quaternion
Mathematical Definition
• Recall the complex number
• Quaternion is a more generalized complex number:
– is the real part and is the
imaginary part
– Imaginary:
– anti-commutative :
a + bi
q = w + xi + yj + zk
w ⃗v = (x, y, z)
i2 = j2 = k2 = ijk = − 1
ij = k = − ji, jk = i = − kj, ki = j = − ik
36
Properties of General Quaternions
• In vector-form, the product of two quaternions:
For and
• Conjugate:
• Norm:
• Inverse:
q1 = (w1, ⃗v 1) q2 = (w2, ⃗v 2)
q1q2 = (w1w2 − ⃗v T1 ⃗v 2, w1 ⃗v 2 + w2 ⃗v 1 + ⃗v 1 × ⃗v 2)
q* = (w, − ⃗v )
∥q∥2 = w2 + ⃗v T ⃗v = qq* = q*q
q−1 :=
q*
∥q∥2
37
Unit Quaternion as Rotation
• A unit quaternion 1 can represent a rotation
– Four numbers plus one constraint 3 DoF
• Geometrically, the shell of a 4D sphere
𝒒 =
→
38
Unit Quaternion as Rotation
• Rotate a vector by quaternion :
1. Augment to
2.
• Compose rotations by quaternion:
– : first rotate by and then by
– Since , we
conclude that composing rotations is as simple as
multiplying quaternions!
⃗x q
⃗x x = (0, ⃗x )
x′ = qxq−1
(q2(q1xq*1 )q*2 ) q1 q2
(q2(q1xq*1 )q*2 ) = (q2q1)x(q*1 q*2 )
39
Conversation between
Quaternions and Angle-Axis
• Exponential coordinate Quaternion:
Quaternion is very close to angle-axis representation!
• Exponential coordinate Quaternion:
,
→
q = [cos(θ/2), sin(θ/2)ω̂]
←
θ = 2 arccos(w) ω̂ =
1
sin(θ/2)
⃗v θ ≠ 0
0 θ = 0
40
Conversation between
Quaternion and Rotation Matrix
• Rotation Quaternion
where and
• Rotation Quaternion
– Rotation Angle-Axis Quaternion
←
R(q) = E(q)G(q)T
E(q) = [− ⃗v , wI + [ ⃗v ]]
G(q) = [− ⃗v , wI − [ ⃗v ]]
→
→ →
41
Inspection from Learning Perspective
• Each rotation corresponds to two quaternions
(“double-covering”)
• Need to normalize to unit length in networks. This
normalization may cause big/small gradients in
practice
42
More about Quaternion
• Quaternion is computationally cheap:
– Internal representation of Physical Engine and
Robot
– Pay attention to convention (w, x, y, z) or (x, y, z,
w)?
– (w, x, y, z): SAPIEN, transforms3d, Eigen, blender,
MuJoCo, V-Rep
– (x, y, z, w): ROS, PhysX, PyBullet
43
Summary of Quaternion
• Very useful and popular in practice
• 4D parameterization, compact and efficient to
compute
• Good numerical properties in general
44
? means no singularity with single exceptions
Summary of
Rotation Representations
45
Inverse? Composing?
Any local
movement in
SO(3) can be
achieved by local
movement in the
domain?
Rotation
Matrix
✔ ✔ N/A
Euler Angle Complicated Complicated No
Angle-axis ✔ Complicated ?
Skew-
symmetrical
Matrix
✔ Complicated ?
Quaternion ✔ ✔ ✔
Resources
• A useful torch library that you can play with is “kornia’’
• Use with cautious to its numerical properties
• “ceres” is a C++ library that is quite useful
46