[06-30213][06-30241][06-25024]
Computer Vision and Imaging &
Robot Vision
Dr Hyung Jin Chang Dr Yixing Gao
h.j.chang@bham.ac.uk y.gao.8@bham.ac.uk
School of Computer Science
MULTIPLE-VIEW GEOMETRY
(CHAPTERS 7, 8 (FORSYTH & PONCE) CHAPTER 11 (R. SZELISKI))
STEREO GEOMETRY AND SCENE RECONSTRUCTION
Depth and stereo: Basics
Left camera
Right camera
?
• Thebasicprincipleistriangulation
• Reconstructioncalculatedbyintersectionoftworays • Assume:
• Known camera position in 3D (calibration) • Correspondence between points is known
Slide credit: Steve Seitz
Triangulation
• Weknowprojectionsofa3Dpoint(andcamera projection matrices), how to find the coordinates of the original 3D point (a pre-image of projections)?
X?
x1 O1
x2
O2
Slide credit: Svetlana Lazebnik
Triangulation: Intersection
• Trytointersectapairofvisualrays,corresponding to 𝑥!and 𝑥”.
• But because of numerical errors and noise, the rays will not
intersect in practice!
R2
R1
X?
x1 O1
x2
O2
Slide credit: Svetlana Lazebnik
Triangulation: Geometric approach
• Findtheshortestsegmentconnectingthetworays and take the value 𝑋 in the middle.
• Not very principled…
X
x1 O1
x2
O2
Slide credit: Svetlana Lazebnik
Triangulation: Linear
algebraic approach
lx =PX x ́PX=0 11111
[x ]PX=0 1 ́ 1
[x ]PX=0 2 ́ 2
lx=PX 222
x ́PX=0 22
Recall: Vector product written in matrix form:
é0-az ayùébxù a ́b=ê a 0 -a úêb ú=[a ]b
ê-a a 0úêbú ëy x ûëzû
zxy ́
Slide credit: Svetlana Lazebnik
Triangulation: Linear
algebraic approach
lx =PX x ́PX=0 11111
[x ]PX=0 1 ́ 1
[x ]PX=0 2 ́ 2
lx=PX 222
x ́PX=0 22
Two independent equations each, 3 unknowns in X.
• Write a homogeneous system.
𝑨𝑿 = 𝟎
• Solve by SVD. Solution for 𝑿 is the eigenvector corresponding
to the smallest eigenvalue.
• Much better than geometric approach, since it easily generalizes to multiple cameras.
Slide credit: Svetlana Lazebnik
Triangulation: Nonlinear
• FindXthatminimizesasumofreprojectionerrors! d2(x ,PX)+d2(x ,P X)
X? x’1
x1 O1
x2 x’2
O2
approach
11 22
Slide credit: Svetlana Lazebnik
Triangulation: Nonlinear approach
• FindXthatminimizesasumofreprojectionerrors! d2(x ,PX)+d2(x ,P X)
11 22
• Mostaccurate,butdoesnothaveaclosed-form solution.
• Requiresiterativealgorithm(bundleadjustment)
• Initializebydirectlineartransformation(DLT).
• OptimizebyGradientdescentorGauss-Newtonor Levenberg-Marquardt
Slide credit: Bastian Leibe
Geometry of a simple stereo
• Whatifthecorrespondenceswereunknown?
• Givenapointintheleftimage,wheretolookforthe
corresponding point in the right image?
• Wewillconsiderasimplesystemfirst:
• Parallelopticalaxes
• Knowncameraparameters (i.e., a calibrated system)
𝑂&
𝑂%
𝑲𝟏
𝑲𝟐
𝒕,𝑹
Projection to image planes:
Point in 3D
Point in left image
x
Focal length
Point in right image
z
z
Depth of point p
x
Optical center of the left camera.
Distance between cameras centers (baseline)
Optical center of the right camera
Geometry of a simple stereo
• If the axes are parallel, and the camera parameters are known, then the point depth can be estimated by triangulation.
Similar triangles
(pl, P, pr) in (Ol, P, Or):
T+xl-xr =T Z-f Z
T
xr -xl
Z=f disparity
Disparity and depth
Image I(x,y) Disparity map d(x,y) Image I ́(x ́,y ́)
(𝑥, 𝑦)
𝑑(𝑥, 𝑦)
(𝑥, 𝑦)
𝑥 ́,𝑦 ́ =(𝑥+𝑑(𝑥,𝑦),𝑦)
A general calibrated system
• Parallelimageplanesnotrequiredforageneral stereo system.
vs.
Correspondence constraints
• Recallthesimplesetup:Parallelimageplanes,aligned by pixel lines…
• Ifthepointpintheleftimageisknown,wheretolook for its correspondence p’ in the right image?
• Somewherealongtheline.
Correspondence constraints
• Generalsetup:Imageplanesnolongerparallel…
• Ifthepointpintheleftimageisknown,wheretolook
for its correspondence p’ in the right image?
• Needsomemoregeometryresultstoanswerthis…
Correspondence constraints
• Ifthepointpintheleftimageisknown,wheretolook for its correspondence p’ in the right image?
Along projections
of all 3D points along the ray passing through 𝒑.
Correspondence constraints
• Multiple-view geometry constrains our search for correspondences!
epipolar line
epipolar plane
epipolar line
• Epipolar contraint: Why is this useful?
• Simplifies the correspondence problem to a 1D search along the corresponding epipolar line!
Slide adapted from Steve Seitz
Epipolar
• Baseline:alineconnectingthecameracenters.
• Epipole:pointwherethebaselinepuncturestheimageplane.
• Epipolarplane:planeconnectingtwoepipolesanda3Dpoint.
• Epipolarline:intersectionofepipolarplaneandimageplane.
geometry: Definitions
• All epipolar lines of a single image intersect at the camera epipole.
Slide credit: Marc Pollefeys
Epipolar
constraints
• Potential matches for p necessarily lie on the corresponding epipolar line l‘.
• Potential matches for p‘ necessarily lie on the corresponding
epipolar line l. http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html
Slide credit: Marc Pollefeys
Example
Slide credit: Kristen Grauman
Question?
• Howtoformulatetheepipolarconstraints,fora general stereo system?
• I.e.,givenapoint𝒙intheleftimagewhatistheequation of the epipolar line in the right image?
𝒙
𝒍
• Willlookattwocases:
• Calibratedcameras–Knowncalibrationmatrices
• Noncalibratedcameras–Calibrationmatricesunknown
Epipolar
• Both cameras are calibrated, so we know their intrinsic matrices K and K‘.
• This means that we can always transform image coordinates (pixels) into image plane coordinates (meters).
• Therefore the epipolar geometry of a calibrated system is calculated in image plane coordinates (meters)!
geometry:
A calibrated system
Geometry of a calibrated system
Note: X‘ is written in c.s. of O‘, while T and X are written in c.s. of O! Write transformation of 𝑋′ to 𝑋.
X
=R + X×(T ́X)=X×(T ́RX¢)
X’
T
T ́ X = T ́ RX ‘+ T ́ T =T ́RX’
=0
Slide credit: Kristen Grauman
plane normal (cross product)
Geometry of a calibrated system
Note: X‘ is written in c.s. of O‘, while T and X are written in c.s. of O! Write transformation of 𝑋′ to 𝑋.
X
=R + XT (T ́X)=XT (T ́RX¢)
X’
T
T ́ X = T ́ RX ‘+ T ́ T =T ́RX’
=0
This holds since the left side is 0!
https://en.wikipedia.org/wiki/Cross_product#Conversion_to_matrix_multiplication
Slide credit: Kristen Grauman
plane normal (cross product)
Geometry of a calibrated system
XT (T ́RX¢)=0
XT (TxRX¢)=0 Let E=TxR ,then
XTEX¢=0
• This holds for rays p and p’ (parallel with X and X‘), from
camera center to the corresponding point in image plane. Then this holds:
• Matrix 𝑬 is called an essential matrix, that relates the
corresponding image points [Longuet-Higgins 1981]
The vector cross product also can be expressed as the product of a skew-symmetric matrix and a vector
pTEp’=0
Slide credit: Kristen Grauman
Essential matrix and
Epipolar constraints:
A point p in one image corresponds to a point p‘ in the other image and is related by this equation.
(pT E)T Is a vector, representing the epipolar line l’, that contains p‘.
epipolar
lines
pTEp¢ = 0
Ep¢Is a vector, representing the epipolar line l, that contains p.
Slide credit: Kristen Grauman
Essential matrix and
• Relatesimagesofcorrespondingpointsinboth cameras at a given rotation and translation.
• (Assumingweknowtheintrinsicparameters.)
• Canbecalculatedfromknownextrinsicparameters.
E=TR ́
Translation and rotation of the second camera with respect (w.r.t.) the first.
epipolar
lines
Example:
P
Epipolar
lines in parallel cams
Transformation of c.s. from Ol to Or:
p
T…tx
pTEp =0 lr
(right) (left)
p‘
For a pair of parallel cameras, the image of any point in the first camera must lie on the same horizontal line in the other camera.
Slide credit: Kristen Grauman
Epipolar
Cannot transform coordinates of points in image (pixels) to coordinates in image plane
(meters) since we do not know K and K‘! (the system is not calibrated!)
x x’
• Camera calibration matrices K and K’ unknown.
• We can still write the epipolar constraints in terms of
unknown normalized image plane coordinates:
geometry:
A
noncalibrated
system
X
xT E x ‘ = 0
Projection from unknown image plane coordinates to pixels.
ˆ ˆ¢¢¢ x=Kx, x=Kx
Side adapted from: Svetlana Lazebnik
Epipolar
Important: 𝑥0, 𝑥0′ written in image pixels. X
x x’
ˆˆ
xT E x ‘ = 0 xT Fx ‘ = 0, with, F = K -T EK ‘-1
ˆ x = K -1 x
geometry:
A
noncalibrated
system
x=Kx
ˆ
ˆ¢ ¢¢ x=K x x=Kx
¢ ¢-1 ˆ¢
Fundamental matrix (Faugeras and Luong, 1992)
Side adapted from: Svetlana Lazebnik
Epipolar
xTEx =0 T -T -1 ˆ ˆ¢ xFx¢=0 with F=K EK¢
• Fx’ is epipolar line corresponding to x’ (l = F x’)
• FTx is epipolar line corresponding to x (l’ = FTx)
• Fe’=0 in FTe=0
• F is singular (rank=2)
• F has seven DoF
geometry:
X
Fundamenta
l matrix
x x’
Slide credit: Svetlana Lazebnik
Epipolar
• Assumeaknownepipolargeometry
• Ifweknowprojectionsofa3Dpointintotwoimages and all fundamental matrices, how do we calculate the projection into the third image?
transfer
x1
x2
? x3
Slide credit: Svetlana Lazebnik
Epipolar
• Assumeaknownepipolargeometry
• Ifweknowprojectionsofa3Dpointintotwoimages and all fundamental matrices, how do we calculate the projection into the third image?
transfer
x1
x2
l x3 l 31 32
l31 = FT13 x1 l32 = FT23 x2
Slide credit: Svetlana Lazebnik
Back to stereo reconstruction
• Ifparallelopticalaxesareassumed,thenthe corresponding point in left and right image are related by a disparity map:
[x ́,y ́]=[x+D(x,y), y]
Image I(x,y) Disparity map D(x,y) Image I ́(x ́,y ́)
• Butthisisnotassimpleinnonparallelsystems!
+
Stereo image rectification
• Convenient if the lines for searching the matches correspond
to the epipolar lines.
Algorithm:
• Reproject image planes into a common
plane, parallel to the baseline.
• Makes disparity estimation as
simple as in parallel cameras.
• Two homographies (3×3) – matrix transformation for reprojection of left and right image planes.
C. Loop & Z. Zhang, Computing Rectifying Homographies for Stereo Vision. CVPR’99
Slide adapted from Li Zhang
Stereo image rectification
Source: Alyosha Efros
Stereo reconstruction
• Mainsteps
• Calibratethecameras
• Rectifytheimages
• Calculatethedisparity • Estimatethedepth
Slide credit: Kristen Grauman
The correspondence problem
Epipolar geometry makes matching simpler, but does not solve it…
Multiple potential matches meet the epipolar constraints.
So the question remains: Which of the matches is correct?
Slika iz Gee & Cipolla 1999
Slide credit: Kristen Grauman
The correspondence problem
• Seemsthatcorrespondencesearchis not that simple after all…
• Assumptionsrequired!
• Foridentifyingmatchesofpairswewillassume:
• Mostpointsinthescenearevisibleinbothcameras.
• Regionsatthecorrespondingpointslocallylooksimilar
across the images.
• Thebaselineisrelativelysmall(comparedtothedepthof scene points)
Slide credit: Kristen Grauman
Additional constraints
• Similarity
• Uniqueness
• Order
• Disparitygradient
Slide credit: Kristen Grauman
Dense correspondences
• For each pixel in the left image:
• Find the corresponding epipolar line in the right image.
• Among all points along the epipolar line choose the most similar (e.g., Sum of Squared Differences, correlation)
• (Estimate depth by triangulation of matching points.)
• Simple when image lines correspond to epipolar lines Þ That is why we first rectify the images!
adapted from Svetlana Lazebnik, Li Zhang, B. Leibe
Dense correspondences
Left
Right
scanline
• Slide a window along the right scanline and compare contents of that window with the reference window in the left image
• Matching cost: SSD or normalized correlation
Matching cost
disparity
Dense correspondences
Left Right
scanline
SSD
Dense correspondences
scanline
Norm. corr
Example: Window search
• Data:UniversityofTsukuba
Slide credit: Kristen Grauman
Example: Window search
• Data:UniversityofTsukuba
• The point in the left and right image are compared by comparing small patches of pixels extracted at the points.
Slide credit: Kristen Grauman
Effects of the window size
– Smaller window + More detail
• Morenoise
W = 3
W = 20
– Larger window
+ Smoother disparity maps • Lessdetail
Slike: Li Zhang
Slide credit: Kristen Grauman
Local similarity problems
Textureless regions
????
Occlusion
Slide credit: Kristen Grauman
Local similarity problems
Non-Lambertian surfaces, specularities
Slide credit: Kristen Grauman
Sparse correspondences??
• Constrain the search only to the detected key-points.
• Instead of comparing intensities, apply more advanced descriptors
(e.g., SIFT).
• Still constrain the correspondence search by epipolar geometry.
Important to choose good (robust) descriptors!
Slide credit: Kristen Grauman
Additional constraints
• Similarity
• Uniqueness
• Order
• Disparitygradient
Slide credit: Kristen Grauman
Constraints: Uniqueness
• Foropaqueobjects,asinglepointintheleftimage corresponds to a single point in the right image.
Figure from Gee & Cipolla 1999
Slide credit: Kristen Grauman
Constraints: Order of points
• Pointsonasinglesurface,appearinthesameorder in both views.
Order of points constraint violated
Slide credit: Svetlana Lazebnik
Constraints:
• Assumingasmoothsurface,weexpectlocally smooth disparity.
Disparity gradient
Figure from Gee & Cipolla 1999
Slide credit: Kristen Grauman
Implementation of constraints
• Accountforeachcorrespondenceindependentlyof
its neighbors:
Order, smoothness, assumption violated!
• Optimizeallcorrespondencesjointly…
? how
Slide credit: Kristen Grauman
Along the lines
• Find matches that are coherent over the entire line
• Each line is still considered independently
Left image Right image
Slide credit: Kristen Grauman
Image intensity
“Shortest path” along stereo lines
Il
Right image Ir
𝑑!”#”$%&”‘( (𝑙” , 𝑟” !
Left image
Il Il
q t
Ir
Occlusion
in the left image
sp
Ir
Cost: Sum of correspondence similarities along the path.
Solve for 𝒎2 by dynamic programing Ohta & Kanade ’85, Cox et al. ‘96
ˆ
m = arg max C(m)
m
C(m)=åd iÎm
similarity
(l ,r) i i
Occlusion in the right image
correspondence
Coherent stereo in 2D field
• Results in jaggedness across neighboring lines
Without dynamic programing
With dynamic programing
• Better: Also account for regularization across lines
• Unfortunately we cannot apply dynamic programming to find a spatially coherent disparity on a 2D grid.
Solve by Graph Cuts
• Graphcutsefficientlyoptimizesuchcostfunctions Graph cuts (Boykov et al.) Ground truth
Joint combinatorial assignment by minimizing an appropriate energy…
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via
Graph Cuts, PAMI 2001.
Dataset of stereo comparison: http://vision.middlebury.edu/stereo/
Application: View interpolation
Right image
Slide credit: Svetlana Lazebnik
Application: View interpolation
Left image
Slide credit: Svetlana Lazebnik
Application: View interpolation
Disparity
Slide credit: Svetlana Lazebnik
Application: View interpolation
Slide credit: Svetlana Lazebnik
Application: View interpolation
L. Zitnick et al, High-quality video view interpolation using a layered representation, SIGGRAPH 2004.
Slide credit: Svetlana Lazebnik
A novel view video
Microsoft research
Slide credit: Svetlana Lazebnik
Dense vs. Sparse
• Sparse
• Effectiveness
• Canchooseonlykey-points,whicharelesssensitiveto intensity changes.
• But…
• Need to know how to choose the right ones… • Get only a sparse depth estimate.
• Dense
• Simple
• Usingmoredepthpointsbetterreconstructsasurface. • But…
• Still fails in textureless regions.
• Pixel intensities inappropriate for correspondences. • Inappropriate for wide-baseline systems.
• Efficientassignmentbyenergyminimization
Slide credit: Kristen Grauman
Quick Recap
• Multiple-viewgeometry:Asimplestereo Z=f
T
xr -xl
Quick Recap
• Astereowithnon-alignedcameras:Epipolargeometry
Correspondence search along the epipolar lines…
Quick Recap
• Acalibrated/noncalibratedsystem: pTEp¢ = 0
E=TR ́
xT F x¢ = 0
F = K-T EK¢-1
• Inpracticetransformtoalignedstereo:
Quick Recap
• Disparitycalculaton:
Regularization per lines:
By max similarity match:
Global regularization:
Stereo reconstruction
• Mainsteps
• Calibratethecamerasystem
• Rectifytheimageplanes
• Calculatethedisparity
• Estimatethedepth by triangulation
A moving camera
• Ifacameraismovingfreely,thestereocannotbe pre-calibrated (except from matrix K, that is)
• Actually,wearedealingwithmultiple“cameras”
Thank you
Hyung Jin Chang
14/03/2021