Mid-Level Vision (Multiview): Stereo
1. Two cameras with identical focal lengths are set up so that their image planes are coplanar and their x-axes are collinear. Derive the relationship between the distance (Z) of a point from the cameras and the distance (B) that separates the origins of the two cameras.
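For any camera, the image coordinates of a point (X, Y, Z) are
$$(u, v) = \left(\frac{fX}{Z}, \frac{fY}{Z}\right)$$
where all coordinates are relative to the camera's coordinate system.
Hence, for the left camera in the stereo pair:
$$(u_L, v_L) = \left(\frac{fX_L}{Z}, \frac{fY_L}{Z}\right)$$
For the right camera:
$$(u_R, v_R) = \left(\frac{fX_R}{Z}, \frac{fY_R}{Z}\right) = \left(\frac{f(X_L - B)}{Z}, \frac{fY_L}{Z}\right)$$
(using the coordinate system of the left camera). Hence:
$$u_L - u_R = \frac{fX_L}{Z} - \frac{f(X_L - B)}{Z} = \frac{fB}{Z}$$
$$Z = \frac{fB}{u_L - u_R}$$
[Figure: point P viewed by two cameras with origins OL and OR separated by the baseline B; XL and XR are the x-coordinates of P relative to each camera.]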
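The result Z = fB/(uL − uR) can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names `project_u` and `depth_from_disparity` and the example values (f = 500, B = 0.1, a point at XL = 0.2, Z = 2.0) are assumptions chosen for illustration.

```python
def project_u(f, x, z):
    """Horizontal image coordinate u = f*X/Z of a point in camera coordinates."""
    return f * x / z


def depth_from_disparity(f, baseline, u_left, u_right):
    """Depth Z = f*B / (uL - uR) for the coplanar, collinear camera pair."""
    disparity = u_left - u_right
    if disparity <= 0:
        # In this configuration a point in front of the cameras has uL > uR.
        raise ValueError("disparity must be positive")
    return f * baseline / disparity


# Hypothetical example values (focal length in pixels, lengths in metres).
f, baseline = 500.0, 0.1
x_left, z = 0.2, 2.0

u_left = project_u(f, x_left, z)              # left camera sees X = XL
u_right = project_u(f, x_left - baseline, z)  # right camera sees X = XL - B
estimated_z = depth_from_disparity(f, baseline, u_left, u_right)
print(estimated_z)
```

The right camera's x-coordinate is obtained by subtracting B, exactly as in the derivation, so the recovered depth should agree with the Z used to project the point.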
2. Comment on the accuracy with which an object’s depth can be measured with (a) changing distance, (b) changing baseline.
Accuracy will depend on the size of the disparity: the larger the disparity, the smaller the effects of small measurement errors.
$$d = u_L - u_R = \frac{fB}{Z}$$
(a) as the distance to the object decreases, Z decreases, and disparity increases. Hence, accuracy increases as the object comes closer.
(b) as the baseline, B, increases, the disparity increases. Hence, accuracy increases as the baseline gets longer.
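Both conclusions can be illustrated by perturbing the disparity by a fixed amount and observing how far the recovered depth moves. This is a rough sketch under assumed values: the function name `depth_error`, the focal length of 500 pixels, and the one-pixel disparity error are hypothetical, not taken from the original answer.

```python
def depth_error(f, baseline, depth, disparity_error):
    """Absolute depth error caused by a fixed error in the measured disparity.

    The true disparity is d = f*B/Z; the depth is then re-estimated from
    the perturbed disparity d + disparity_error using Z = f*B/d.
    """
    true_disparity = f * baseline / depth
    estimated_depth = f * baseline / (true_disparity + disparity_error)
    return abs(estimated_depth - depth)


f = 500.0    # assumed focal length in pixels
error = 1.0  # assumed one-pixel disparity error

near = depth_error(f, 0.1, 2.0, error)           # object at Z = 2
far = depth_error(f, 0.1, 4.0, error)            # same baseline, Z = 4
wide_baseline = depth_error(f, 0.2, 2.0, error)  # Z = 2, doubled baseline

print(near, far, wide_baseline)
```

Under these assumptions the error grows when the object is further away and shrinks when the baseline is lengthened, matching (a) and (b).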
3. Briefly explain what is meant by the Epipolar constraint on the stereo correspondence problem.
The Epipolar constraint reduces the search for a point's correspondence from the whole of the other image to a single line in that image (the epipolar line). Finding the epipolar line requires knowledge of the intrinsic and extrinsic parameters of the cameras.
For a simple coplanar configuration of cameras the epipolar lines are the horizontal scan lines of the images.
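For the coplanar configuration, the constraint means a match only has to be searched for along the same scan line. The sketch below illustrates this with a sum-of-squared-differences (SSD) window comparison on a single pair of rows; SSD is one common similarity measure chosen here for illustration, and the function name, window size, and `max_disparity` parameter are assumptions rather than anything prescribed by the answer above.

```python
def match_along_scanline(left_row, right_row, x_left, half_window, max_disparity):
    """Find the disparity d = uL - uR >= 0 whose window in the right row
    best matches (lowest SSD) the window centred on x_left in the left row.

    Only positions on the same scan line are examined, which is what the
    epipolar constraint allows for coplanar cameras.
    """
    if x_left - half_window < 0 or x_left + half_window >= len(left_row):
        raise ValueError("window around x_left falls outside the row")

    best_disparity, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        x_right = x_left - d
        if x_right - half_window < 0:
            break  # candidate window would leave the image
        cost = 0.0
        for k in range(-half_window, half_window + 1):
            diff = left_row[x_left + k] - right_row[x_right + k]
            cost += diff * diff
        if cost < best_cost:
            best_cost, best_disparity = cost, d
    return best_disparity


# Toy rows: the same intensity bump appears 3 pixels further left in the right image.
left_row = [0, 0, 0, 0, 0, 5, 9, 5, 0, 0]
right_row = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
disparity = match_along_scanline(left_row, right_row, x_left=6,
                                 half_window=1, max_disparity=5)
print(disparity)
```

Without the constraint the inner comparison would have to be repeated for every row of the right image, not just one.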
4. List other constraints applied to solving the stereo correspondence problem, and note circumstances in which they fail.
• Maximum disparity (limit search to $\pm \frac{fB}{Z_{\min}}$ around the point with zero disparity). Fails for points closer to the cameras than $Z_{\min}$.
• Continuity (assume neighbouring points have similar disparities). Fails at discontinuities between surfaces at different depths.
• Uniqueness (assume each element in one image matches exactly one element in the other). Fails for surfaces angled such that they project different numbers of elements to each image, or where there is occlusion of an element in one image but not the other.
• Ordering (assume that matching elements occur in the same order along the conjugate epipolar lines). Fails where surfaces are at different depths.
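The uniqueness and ordering constraints can be expressed as simple checks on a set of candidate matches along one pair of conjugate epipolar lines. The following is only an illustrative sketch: the representation of matches as `(x_left, x_right)` pairs and the function name `check_constraints` are assumptions.

```python
def check_constraints(matches):
    """Check candidate matches on one pair of conjugate epipolar lines.

    matches: list of (x_left, x_right) positions of matched elements.
    Returns (unique, ordered):
      unique  - no element in either image is used by more than one match
      ordered - sorting matches by x_left leaves x_right non-decreasing
    """
    lefts = [x_left for x_left, _ in matches]
    rights = [x_right for _, x_right in matches]
    unique = len(set(lefts)) == len(lefts) and len(set(rights)) == len(rights)

    by_left = sorted(matches)
    ordered = all(a[1] <= b[1] for a, b in zip(by_left, by_left[1:]))
    return unique, ordered


consistent = check_constraints([(2, 1), (5, 3), (9, 6)])
swapped = check_constraints([(2, 1), (5, 6), (9, 3)])   # order reversed in right image
shared = check_constraints([(2, 1), (5, 1)])            # two left elements, one right element
print(consistent, swapped, shared)
```

A matcher that enforces these checks will reject correct matches in exactly the situations listed above, for example the swapped order produced by a thin object in front of a distant surface.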
5. In a stereo vision system, the baseline between the camera centres is 400mm and the angle of convergence of the z-axes of the cameras is 60°. Assume the z-axes of each camera make an equal angle with the baseline (i.e., 60° in this case). If the line-of-sight of a scene point makes angles αL and αR with the z-axes of the left and right cameras respectively, then what is the distance of the point from the horopter when (a) αL = αR = 15°, (b) αL = +15° and αR = −15°, and (c) αL = −15° and αR = +15°?
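[Figure: cameras OL and OR on the baseline B, with z-axes zL and zR meeting at the fixation point on the horopter; the line-of-sight makes an angle of 15° with each z-axis.]
(a) αL = αR = 15°
Disparity = (αL − αR) = (15° − 15°) = 0°
By definition, the point is on the horopter, hence its distance from the horopter is zero.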
[Figure: geometry for case (b), with P at +15° from zL and −15° from zR.]
(b) αL = +15° and αR = −15°
Disparity = (αL − αR) = (15° − (−15°)) = 30°
The angle at OR between the baseline and the line from OR to P is 75°.
Hence, distance from the baseline to P is 0.5 B tan(75°) = 746.4mm
Distance of fixation point from baseline is 0.5 B tan(60°) = 346.4mm
Therefore distance between horopter and P is 746.4 − 346.4 = 400mm
[Figure: geometry for case (c), with P at −15° from zL and +15° from zR.]
(c) αL = −15° and αR = +15°
Disparity = (αL − αR) = (−15° − 15°) = −30°
The angle at OR between the baseline and the line from OR to P is 45°.
Hence, distance from the baseline to P is 0.5 B tan(45°) = 200mm
Distance of fixation point from baseline is 0.5 B tan(60°) = 346.4mm
Therefore distance between horopter and P is 200 − 346.4 = −146.4mm
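The three cases can be reproduced with a short triangulation routine. This is a sketch under stated assumptions, not the method of the worked answer: it takes the horopter to be the circle through both camera centres and the fixation point, and it infers the sign convention from the worked answers (the interior angle at OL is 60° + αL and at OR is 60° − αR). The function name `distance_from_horopter` is hypothetical.

```python
import math


def distance_from_horopter(baseline, axis_angle_deg, alpha_l_deg, alpha_r_deg):
    """Signed distance of the viewed point from the horopter circle.

    Assumptions (see lead-in): OL is at (0, 0), OR at (baseline, 0), each
    z-axis makes axis_angle_deg with the baseline, and the horopter is the
    circle through OL, OR and the fixation point. Positive results lie
    outside the circle (beyond the horopter), negative results inside it.
    """
    # Interior angles between the baseline and each line of sight.
    theta_l = math.radians(axis_angle_deg + alpha_l_deg)
    theta_r = math.radians(axis_angle_deg - alpha_r_deg)

    # Triangulate the point from the two lines of sight.
    height = baseline / (1.0 / math.tan(theta_l) + 1.0 / math.tan(theta_r))
    x = height / math.tan(theta_l)

    # Circle through (0, 0), (baseline, 0) and the fixation point.
    fixation_height = 0.5 * baseline * math.tan(math.radians(axis_angle_deg))
    centre_y = (fixation_height ** 2 - (baseline / 2.0) ** 2) / (2.0 * fixation_height)
    radius = fixation_height - centre_y

    return math.hypot(x - baseline / 2.0, height - centre_y) - radius


case_a = distance_from_horopter(400.0, 60.0, 15.0, 15.0)
case_b = distance_from_horopter(400.0, 60.0, 15.0, -15.0)
case_c = distance_from_horopter(400.0, 60.0, -15.0, 15.0)
print(round(case_a, 1), round(case_b, 1), round(case_c, 1))
```

For the symmetric cases (b) and (c) the point lies on the perpendicular bisector of the baseline, so the distance to the circle reduces to the difference of the two heights above the baseline used in the worked answers.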
6. Give two reasons why the recovery of depth information is important for object recognition.
Depth between objects provides a cue for segmentation, and hence, helps solve the problem of mid-level vision.
Depth within an object provides information about the shape of an object, and hence, helps solve the problem of high-level vision.
7. Briefly describe two oculomotor cues to depth.
Accommodation The shape of the lens in the eye, or the depth of the image plane in a camera, is related to the depth of objects that will be in focus. Hence, knowledge of these values provides information about the depth of the object being observed.
Convergence The rotation of eyes/cameras in a stereo vision system can vary to fixate objects at different depths. Hence, the angle of convergence provides information about the depth of the object being fixated.
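As a small illustration of the convergence cue, the distance of the fixation point from the baseline can be computed from the convergence angle. The sketch assumes symmetric fixation (both z-axes make the same angle with the baseline), and the function name `fixation_distance` is hypothetical.

```python
import math


def fixation_distance(baseline, convergence_deg):
    """Distance of the fixation point from the baseline, assuming both
    z-axes are rotated inwards by the same amount (symmetric fixation).

    Each axis then makes half the convergence angle with the perpendicular
    bisector of the baseline, giving distance = (B/2) / tan(convergence/2).
    """
    half_angle = math.radians(convergence_deg) / 2.0
    return (baseline / 2.0) / math.tan(half_angle)


# A 400mm baseline with 60 degrees of convergence.
distance_mm = fixation_distance(400.0, 60.0)
print(round(distance_mm, 1))
```

A smaller convergence angle gives a larger fixation distance, which is why the angle carries depth information.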
8. Briefly describe four monocular cues to depth.
Interposition Nearer objects may occlude more distant objects. Hence occlusion (or interposition) provides information about relative depth.
Size familiarity Objects of known size provide depth information, since the smaller the image of the object the greater its depth.
Texture gradient For uniformly textured surfaces, the texture elements get smaller and more closely spaced with increasing depth.
Linear perspective Lines that are parallel in the scene converge towards a vanishing point in the image. As the distance between the lines in the image decreases, depth increases.
Aerial perspective Due to the scattering of light by particles in the atmosphere, distant objects look fuzzier and have lower luminance contrast and colour saturation.
Shading The distribution of light and shadow on objects provides a cue for depth.
9. Briefly describe two motion-induced cues to depth.
Motion parallax As the camera moves sideways, objects closer than the fixation point appear to move in a direction opposite to the camera, while objects further away appear to move in the same direction. The speed of movement increases with distance from the fixation point.
Optic Flow As a camera moves forward or backward, objects closer to the camera move more quickly across the image plane.
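Accretion and deletion As the camera moves, nearer objects progressively cover or uncover more distant ones; these changes in occlusion provide information about relative depth.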
Structure from motion (kinetic depth) Movement of an object or of the camera can generate different views of an object that can be combined to recover 3D structure.