Depth
Recovery of depth information is important for:
• Controlling movement
  • mobile robot path planning
  • grasping an object with a robot arm
• 3D reconstruction
  • generating CAD models
  • virtual reality
• Object Recognition
  • depth (between objects) provides a cue for segmentation
  • depth (within an object) provides information about the shape of an object
Stereo is only one source of information about depth…
Computer Vision / Mid-Level Vision / Stereo and Depth 46
Cues to depth: Binocular cues
Disparity: difference in location of corresponding points in a stereo pair of images.
an autostereogram
Stereo vision is important (try catching a ball with one eye closed), and is sufficient (random dot stereograms and autostereograms demonstrate this as all other cues to depth have been removed).
However, disparity is only one of many depth cues.
Cues to depth: Oculomotor cues
Accommodation: the shape of the lens (and hence its refractive power) can be modified by muscles in the eye to bring objects at different depths into focus. Hence, the state of these muscles provides information about the depth of the object being observed.
Convergence: the rotation of eyes/cameras can vary to fixate objects at different depths. Hence, the angle of convergence provides information about the depth of the object being fixated.
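As a rough illustration of how a convergence angle could be turned into a depth estimate, the sketch below assumes two eyes/cameras separated by a known baseline that fixate symmetrically on a point straight ahead. The function name and the symmetric-fixation geometry are assumptions made for this example, not part of the slides.

```python
import math


def depth_from_convergence(baseline, convergence_angle):
    """Distance to the fixated point, assuming symmetric fixation.

    baseline: separation of the two optical centres (any length unit).
    convergence_angle: angle between the two optical axes, in radians.
    """
    if convergence_angle <= 0:
        # Parallel optical axes: the fixation point is at infinity.
        return math.inf
    # Each axis is rotated inwards by half the convergence angle, so
    # tan(angle / 2) = (baseline / 2) / depth.
    return (baseline / 2.0) / math.tan(convergence_angle / 2.0)
```

Under this geometry, larger convergence angles correspond to nearer fixation points, and the angle changes very little for distant points, so the cue is most informative at short range.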
Cues to depth: Monocular cues
i.e. cues that provide depth information with one eye closed
Interposition: when the view of one object is interrupted by the presence of another object, we use this pattern of occlusion to determine the relative depth of the objects. The near object is perceived as “interposed” between the far object and the observer.
Cues to depth: Monocular cues
Interposition: manipulating interposition depth cues can produce impossible objects, or illusory objects.
Penrose triangle
Kanizsa square
M. C. Escher’s Waterfall
Cues to depth: Monocular cues
Size familiarity: distant objects necessarily produce a smaller image than nearby objects of the same size. The larger of two identical objects tends to be perceived as closer than the smaller one.
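The relation between image size and distance can be made concrete with a pinhole camera model, under which image size is inversely proportional to depth. The sketch below is an illustration only: the function name, the pinhole assumption, and the idea that the true object size is known in advance are all assumptions, not something stated in the slides.

```python
def distance_from_known_size(focal_length_px, real_height, image_height_px):
    """Estimate distance to an object of known physical height.

    Pinhole model: image_height_px = focal_length_px * real_height / distance.
    The result is in the same unit as real_height.
    """
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_length_px * real_height / image_height_px
```

For example, with an assumed focal length of 800 pixels, a 1.8 m tall person imaged 90 pixels tall would be about 16 m away; halving the image height doubles the estimated distance.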
Cues to depth: Monocular cues
Texture gradients: for uniformly textured surfaces, the texture elements get smaller and more closely spaced with distance.
Cues to depth: Monocular cues
Linear perspective: parallel lines in the scene appear to converge towards a vanishing point in the image as they recede, which provides a cue to depth (and size).
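One way to make the convergence of parallel lines concrete is to compute where two image line segments meet. The sketch below uses homogeneous coordinates, where the line through two points and the intersection of two lines are both cross products. The helper names are hypothetical, and the segments are assumed to be images of lines that are parallel in the scene.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(segment1, segment2, eps=1e-12):
    """Intersection of the two image lines, or None if they do not meet.

    Each segment is a pair of image points ((x1, y1), (x2, y2)).
    """
    x, y, w = cross(line_through(*segment1), line_through(*segment2))
    if abs(w) < eps:
        # Lines are parallel in the image: vanishing point at infinity.
        return None
    return (x / w, y / w)
```

Lines that remain parallel in the image (for example, lines parallel to the image plane) return None here, which corresponds to a vanishing point at infinity.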
Cues to depth: Monocular cues
Linear perspective: manipulating perspective can produce unusual perceptions (overcoming size familiarity).
Ames room
Cues to depth: Monocular cues
Linear perspective: manipulating perspective can produce unusual perceptions (overcoming prior beliefs about gravity!).
Cues to depth: Monocular cues
Linear perspective: manipulating perspective can produce a strong perception of 3D structure, e.g. trompe l’oeil art.
Cues to depth: Monocular cues
Aerial perspective: due to the scattering of light by particles in the atmosphere, distant objects look fuzzier and have lower luminance contrast and colour saturation.
Cues to depth: Monocular cues
Shading: the distribution of light and shadow on objects provides a cue for depth (the brain assumes, usually, that light comes from above).
Hollow face illusion
Shading fails to recover the true depth when pitted against familiarity: a concave mask is still perceived as a convex face.
Cues to depth: Motion induced cues
Motion parallax: speed and direction of image motion induced by movement of the camera/eye varies with depth. Objects closer than the fixation point appear to move in a direction opposite to the observer, while objects further away appear to move in the same direction.
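The sign change around the fixation point can be reproduced with a simple model. The sketch below assumes an observer translating sideways while rotating to keep a point at the fixation depth stationary in the image, and considers only points near the line of sight (a small-angle approximation). The function name and this simplified geometry are assumptions for illustration.

```python
def parallax_image_velocity(focal_length, observer_speed, depth, fixation_depth):
    """Approximate horizontal image velocity of a static point.

    Positive values mean image motion in the same direction as the
    observer's motion; negative values mean the opposite direction.
    Translation alone gives -f * T / depth; the rotation that holds
    fixation adds +f * T / fixation_depth to every point near the axis.
    """
    return focal_length * observer_speed * (1.0 / fixation_depth - 1.0 / depth)
```

With this model a point nearer than the fixation depth gets a negative velocity (it moves against the observer), a point beyond it gets a positive velocity, and the fixated point itself has zero image motion.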
Cues to depth: Motion induced cues
Optic Flow: As a camera moves forward or backward, the pattern of stimulation across the entire visual field changes, producing a pattern of expanding or contracting “optic flow”. Points closer to the camera move more quickly across the image plane.
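For a camera translating purely along its optical axis, the flow at an image point can be sketched as follows. Image coordinates are assumed to be measured from the principal point, the camera is assumed not to rotate, and the function name is hypothetical.

```python
def forward_motion_flow(x, y, depth, forward_speed):
    """Image velocity (u, v) of a static point during forward motion.

    From x = f * X / Z with dZ/dt = -forward_speed it follows that
    u = x * forward_speed / depth and v = y * forward_speed / depth.
    Positive forward_speed gives expansion away from the image centre;
    negative forward_speed (moving backward) gives contraction.
    """
    scale = forward_speed / depth
    return (x * scale, y * scale)
```

At the same image position, halving the depth doubles the flow magnitude, which is the sense in which nearer points move more quickly across the image plane.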
Cues to depth: Motion induced cues
Accretion and deletion: parts of an object can appear or disappear when an observer moves relative to two surfaces that are at different depths.
Motion to left causes far surface to become occluded by near surface
Motion to right causes far surface to become uncovered by near surface
Cues to depth: Motion induced cues
Structure from motion (kinetic depth): movement of an object or of the camera can induce the perception of 3D structure.
Depth is conveyed solely by the variation in velocity and direction of each dot: no depth is seen in a static image.
Summary: Stereo Vision
Aim: to infer depth from 2 or more images taken from different viewpoints. Two sub-problems:
1. Correspondence. Determining which points in the left and right images are projections of the same point in the scene. Constraints:
● epipolar constraint (corresponding points lie along the epipolar line)
● maximum disparity (extent of search reduced by knowledge of baseline and minimum depth)
● continuity (disparity varies smoothly assuming continuous surfaces)
● uniqueness (one-to-one matches assuming surfaces are approximately parallel to image plane and no occlusion)
● ordering (points occur in same order in each image assuming no occlusion)
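To show how the epipolar and maximum-disparity constraints shrink the search, here is a minimal sketch of window matching along a single scanline. It assumes rectified images (so epipolar lines are image rows), uses the sum of absolute differences as the similarity measure, and takes disparity as the left x-coordinate minus the right x-coordinate. The function name and the choice of similarity measure are assumptions, and the other constraints (continuity, uniqueness, ordering) are not enforced.

```python
def match_along_scanline(left_row, right_row, x_left, half_window, max_disparity):
    """Disparity of the best match for the window centred at x_left.

    left_row, right_row: intensities of the same row in each image.
    Only disparities 0..max_disparity along this row are searched.
    Returns None if no candidate window fits inside the right row.
    """
    if x_left - half_window < 0 or x_left + half_window >= len(left_row):
        raise ValueError("window does not fit inside the left row")
    offsets = range(-half_window, half_window + 1)
    best_disparity, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        x_right = x_left - d
        if x_right - half_window < 0:
            break  # candidate window would leave the right image
        # Sum of absolute differences between the two windows.
        cost = sum(abs(left_row[x_left + k] - right_row[x_right + k])
                   for k in offsets)
        if cost < best_cost:
            best_disparity, best_cost = d, cost
    return best_disparity
```

Restricting the search to one row and to a bounded disparity range reduces a 2D search over the whole right image to at most max_disparity + 1 window comparisons per point.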
Summary: Stereo Vision
2. Reconstruction. Given the correspondence between points, and camera parameters, calculate the depth (z). Depth related to disparity (d).
For coplanar cameras:
$d = x'_L - x'_R$
$z = f \frac{B}{d}$
For non coplanar cameras:
$d = \alpha_L - \alpha_R$
d > 0 outside of the horopter
d < 0 inside the horopter
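For coplanar cameras, depth is inversely proportional to disparity: z = f B / d, with f the focal length and B the baseline between the cameras. A minimal sketch of this calculation is given below; the function name, the choice of pixels for f and d, and the handling of zero or negative disparity are assumptions for illustration.

```python
def depth_from_disparity(focal_length_px, baseline, disparity_px):
    """Depth z = f * B / d for coplanar (parallel-axis) cameras.

    focal_length_px and disparity_px are in pixels; the returned depth
    is in the same unit as baseline.
    """
    if disparity_px < 0:
        raise ValueError("negative disparity is not valid for coplanar cameras")
    if disparity_px == 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_length_px * baseline / disparity_px
```

For example, with an assumed focal length of 700 pixels and a 0.12 m baseline, a disparity of 21 pixels gives a depth of about 4 m. Because depth varies as 1/d, a one-pixel matching error matters far more for distant points than for near ones.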
Summary: Depth
Binocular Disparity
Oculomotor
• Accommodation (shape of lens in eye)
• Convergence (angle of rotation of cameras/eyes)
Monocular
• Interposition (occlusion of one object/part of object by another)
• Size familiarity (the smaller the image of an object, the greater its depth)
• Texture gradients (texture elements become smaller and denser with depth)
• Linear perspective (convergence of lines at vanishing points)
• Aerial perspective (reduction in contrast and saturation with depth)
• Shading (light and shadow)
Motion
• Motion parallax (depth related to image motion caused by camera motion)
• Optic Flow (depth related to speed of image expansion/contraction)
• Accretion and deletion (changes in occlusion due to camera movement)
• Structure from motion (depth perception due to object motion)