Image Formation
1. What determines where a point in the 3D world appears on a 2D image?
The camera's position and orientation relative to the 3D point.
2. What determines how bright the image of that point is?
Illumination (intensity, colour-spectrum, number of sources, locations of sources, etc.)
Reflectance (material properties, amount of light absorbed, reflected, etc.)
Sensor (sensitivity to different frequencies of light)
3. Given the RGB values of a pixel, how can we tell the colour of the surface that is shown at that point in the image?
We can’t. The RGB values of the image will depend both on the properties of the surface (including its colour) and the properties of the light it is reflecting. Without knowledge of the latter, we can’t know the former.
4. Briefly define what is meant by the terms “focus” and “exposure”.
Focus means that all rays coming from a scene point converge into a single image point.
Exposure is the length of time for which the sensor is exposed to light, i.e., the time needed to allow enough light through to form an image.
5. Write down the thin lens equation, which relates the focal length of a lens to the depths of the image and object.
1/f = 1/|z| + 1/|z′|
where:
f = focal length of the lens
z = distance of the object from the lens
z′ = distance of the image from the lens
6. Derive the thin lens equation.
From similar triangles (green rays):
    y′/y = z′/z  ⟹  y′ = z′y/z
and (blue rays):
    y′/y = (z′ − f)/f  ⟹  y′ = (z′ − f)y/f
Equating the two expressions for y′:
    z′y/z = (z′ − f)y/f
y cancels, hence:
    z′/z = (z′ − f)/f = z′/f − 1
Dividing both sides by z′:
    1/z = 1/f − 1/z′
⟹ 1/z + 1/z′ = 1/f
7. If a lens has a focal length of 35mm at what depth should the image plane be placed to bring an object 3m from the camera into focus? What if the object is at 0.5m?
Using 1/f = 1/|z| + 1/|z′|:

For object at 3m:
    1/35 = 1/3000 + 1/|z′|
    1/|z′| = 1/35 − 1/3000
    z′ = 35.4mm

For object at 0.5m:
    1/|z′| = 1/35 − 1/500
    z′ = 37.6mm
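As a sanity check, the rearrangement 1/z′ = 1/f − 1/z can be evaluated numerically; the sketch below is an illustrative addition (not part of the original answer) and assumes all distances are given in millimetres.

```python
# A minimal sketch of the thin-lens rearrangement 1/z' = 1/f - 1/z,
# assuming all distances are in millimetres (illustrative only).

def image_distance(f_mm, z_mm):
    """Return the image-plane distance z' for focal length f and object depth z."""
    return 1.0 / (1.0 / f_mm - 1.0 / z_mm)

print(round(image_distance(35, 3000), 1))  # ~35.4 mm for the object at 3 m
print(round(image_distance(35, 500), 1))   # ~37.6 mm for the object at 0.5 m
```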
8. If an object is at a distance of 3m from a pinhole camera, where should the image plane be placed to get a focused image?
A pinhole camera has an infinite depth of field (each scene point projects through the pinhole to a single image point), meaning that the image will be in focus no matter what distance the image plane is from the pinhole.
9. Briefly compare the mechanisms used for focusing a camera and an eye.
A camera lens has a fixed shape, and hence, a fixed focal length. Focusing is achieved by moving the lens to change the distance to the image plane.
The lens of the eye has an adjustable shape, and hence a variable focal length, whereas the distance between the lens and the image plane (the retina) is fixed. Focusing is achieved by changing the focal length of the lens.
10. Briefly compare the mechanisms used for sampling the image in a camera and in an eye.
Camera: Has sensing elements sensitive to three wavelength bands (RGB). These elements occur in a fixed ratio across the whole image plane. The sampling density is uniform across the whole image plane.
Eye: Has sensing elements sensitive to four wavelength bands (three cone types, roughly RGB, plus rods, W). These elements occur in variable ratios across the image plane (cone density highest at the fovea, rod density highest outside the fovea). The sampling density is non-uniform across the image plane (density is highest at the fovea).
11. A point in 3D space has coordinates [10,10,500] mm relative to the camera reference frame. If the image principal point is at coordinates [244,180] pixels, and the magnification factors in the x and y directions are -925 and -740, then determine the location of the point in the image. Assume that the camera does not suffer from skew or any other defect.
In homogeneous coordinates:

    [u]       1  [α  0  o_x] [1 0 0 0] [x]
    [v]  =  ---  [0  β  o_y] [0 1 0 0] [y]
    [1]       z  [0  0   1 ] [0 0 1 0] [z]
                                       [1]

Substituting α = −925, β = −740, o_x = 244, o_y = 180 and (x, y, z) = (10, 10, 500):

    u = αx/z + o_x = (−925 × 10)/500 + 244 = 225.5
    v = βy/z + o_y = (−740 × 10)/500 + 180 = 165.2

The answer is in pixels, so round to the nearest integer values: (u, v) = (226, 165).
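The same projection can be checked numerically; the sketch below is an illustrative addition (using numpy) that builds the intrinsic matrix and canonical projection from the values given in the question.

```python
import numpy as np

# A minimal sketch of the perspective projection above, using the intrinsic
# parameters from the question (alpha = -925, beta = -740, principal
# point (244, 180)); illustrative only.
K = np.array([[-925.0,    0.0, 244.0],
              [   0.0, -740.0, 180.0],
              [   0.0,    0.0,   1.0]])
P = np.hstack([np.eye(3), np.zeros((3, 1))])   # canonical projection [I | 0]

X = np.array([10.0, 10.0, 500.0, 1.0])         # homogeneous 3-D point (mm)
u, v, w = K @ P @ X
print(round(u / w), round(v / w))              # -> 226 165 (nearest pixel)
```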
12. The RGB channels for a small patch of image are shown below. Convert this image patch to a greyscale representation using an equal weighting for each channel and (a) 8 bits per pixel, (b) 2 bits per pixel.
    I_R = [205 195]   I_G = [143 138]   I_B = [154 145]
          [238 203]         [166 143]         [174 151]
(a) 8 bits per pixel gives 256 grey levels, represented by the integers 0 to 255.
(205+143+154)/3 = 167.3333
(195+138+145)/3 = 159.3333
(238+166+174)/3 = 192.6667
(203+143+151)/3 = 165.6667
Therefore, rounding to nearest integer:
    I_Grey = [167 159]
             [193 166]
(b) 2 bits per pixel gives 4 grey levels, represented by the integers 0 to 3. Hence, 8-bit values in range [0, 63] map to 0, [64, 127] map to 1, [128, 191] map to 2, [192, 255] map to 3.
    I_Grey = [2 2]
             [3 2]
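The conversion and requantisation can also be written as a short array computation; the sketch below is an illustrative addition using numpy.

```python
import numpy as np

# A minimal sketch of the equal-weight greyscale conversion, followed by
# requantisation from 8 bits to 2 bits per pixel (integer division by 64).
IR = np.array([[205, 195], [238, 203]], dtype=float)
IG = np.array([[143, 138], [166, 143]], dtype=float)
IB = np.array([[154, 145], [174, 151]], dtype=float)

grey8 = np.rint((IR + IG + IB) / 3).astype(int)   # (a) 8 bpp: values in 0..255
grey2 = grey8 // 64                               # (b) 2 bpp: values in 0..3
print(grey8)   # [[167 159] [193 166]]
print(grey2)   # [[2 2] [3 2]]
```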
13. The image below shows the values of the red, green and blue pixels in a Bayer masked image. Calculate the RGB values for each pixel at the four central locations (i.e., locations 33, 34, 43, and 44), using: a) bilinear interpolation; b) smooth hue transition interpolation; c) edge-directed interpolation.
a) bilinear interpolation
At Pixel 33:
Red = R33
Green = (G23+G34+G32+G43) / 4
Blue = (B22+B24+B42+B44) / 4

At Pixel 34:
Red = (R33+R35) / 2
Green = G34
Blue = (B24+B44) / 2

At Pixel 43:
Red = (R33+R53) / 2
Green = G43
Blue = (B42+B44) / 2

At Pixel 44:
Red = (R33+R35+R53+R55) / 4
Green = (G34+G43+G45+G54) / 4
Blue = B44
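The per-pixel averages above can be expressed as one array operation per channel; the sketch below is an illustrative implementation for the green channel only, assuming hypothetical inputs raw (the Bayer mosaic as a 2-D array) and is_green (a boolean mask of the green sample positions), neither of which is defined in the question.

```python
import numpy as np

def interpolate_green_bilinear(raw, is_green):
    """Fill in missing green values by averaging the four green neighbours,
    e.g. Green(3,3) = (G23 + G34 + G32 + G43) / 4 as above.
    Borders are handled by wrap-around (np.roll), a simplification."""
    green = np.where(is_green, raw, 0.0).astype(float)
    count = is_green.astype(float)
    neighbour_sum = (np.roll(green, 1, 0) + np.roll(green, -1, 0) +
                     np.roll(green, 1, 1) + np.roll(green, -1, 1))
    neighbour_cnt = (np.roll(count, 1, 0) + np.roll(count, -1, 0) +
                     np.roll(count, 1, 1) + np.roll(count, -1, 1))
    interpolated = neighbour_sum / np.maximum(neighbour_cnt, 1)
    # Keep the measured green samples; use the neighbour average elsewhere.
    return np.where(is_green, raw, interpolated)
```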
b) smooth hue transition interpolation
Interpolation of green pixels as above, call these values G33, G34, G43, and G44.
At Pixel 33:
Red = R33
Blue = G33 * (B22 / G22 + B24 / G24 + B42 / G42 + B44 / G44) / 4
At Pixel 34:
Red = G34 * (R33 / G33 + R35 / G35) / 2
Blue = G34 * (B24 / G24 + B44 / G44) / 2

At Pixel 43:
Red = G43 * (R33 / G33 + R53 / G53) / 2
Blue = G43 * (B42 / G42 + B44 / G44) / 2

At Pixel 44:
Red = G44 * (R33 / G33 + R35 / G35 + R53 / G53 + R55 / G55) / 4
Blue = B44
c) edge-directed interpolation
At Pixel 33:
∆H = |G32 − G34|, ∆V = |G23 − G43|
Green = (G32+G34)/2 if ∆H < ∆V
Green = (G23+G43)/2 if ∆H > ∆V
Green = (G32+G34+G23+G43)/4 if ∆H = ∆V

At Pixel 34:
Green = G34

At Pixel 43:
Green = G43

At Pixel 44:
∆H = |G43 − G45|, ∆V = |G34 − G54|
Green = (G43+G45)/2 if ∆H < ∆V
Green = (G34+G54)/2 if ∆H > ∆V
Green = (G43+G45+G34+G54)/4 if ∆H = ∆V

Interpolation of Red and Blue pixels as in (b).
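The gradient test above can be written directly as a small function; the sketch below is an illustrative addition, assuming a hypothetical 2-D array raw holding the Bayer mosaic, indexed so that raw[3, 4] corresponds to G34, and so on.

```python
def edge_directed_green(raw, r, c):
    """Interpolate green at a red/blue location (r, c) by comparing the
    horizontal and vertical green gradients, as in the rules above."""
    dH = abs(raw[r, c - 1] - raw[r, c + 1])   # e.g. |G32 - G34| at pixel 33
    dV = abs(raw[r - 1, c] - raw[r + 1, c])   # e.g. |G23 - G43| at pixel 33
    if dH < dV:
        return (raw[r, c - 1] + raw[r, c + 1]) / 2   # interpolate along the row
    if dH > dV:
        return (raw[r - 1, c] + raw[r + 1, c]) / 2   # interpolate along the column
    return (raw[r, c - 1] + raw[r, c + 1] +
            raw[r - 1, c] + raw[r + 1, c]) / 4       # no preferred direction
```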
14. Briefly describe the different characteristics of the fovea and periphery of the retina.
Fovea:
• high resolution (acuity) – due to high density of photoreceptors
• colour – due to photoreceptors being cones
• low sensitivity – due to response characteristics of cones
Periphery:
• low resolution (acuity) – due to low density of photoreceptors and large convergence of photoreceptors to ganglion cells
• monochrome – due to photoreceptors being rods
• high sensitivity – due to response characteristics of rods
15. Describe the general receptive field structure of a retinal ganglion cell. Describe the full range of ganglion cell RFs found in the normally functioning human retina.
Ganglion cells have centre-surround RFs. There are two classes: on-centre, off-surround (excited by a central stimulus and inhibited by the surround) and off-centre, on-surround (excited by the surround and inhibited by the centre).
The input to the centre and surround can come from rods or cones, giving rise to various combinations of RF properties:
Centre   Surround
W+       W-
W-       W+
R+       G-
R-       G+
G+       R-
G-       R+
B+       Y-
B-       Y+
Y+       B-
Y-       B+
16. Briefly describe how Ganglion cell RFs give rise to: (a) efficient image coding, (b) invariance to illumination, (c) edge enhancement.
A centre-surround cell only responds strongly when there is a difference between the intensity of the stimulus in the centre and that in the surround (e.g. the centre is bright and the surround is dark). When the intensity across the RF is uniform, the cell responds weakly.
(a) efficient image coding.
In a typical image, intensity changes slowly across a large proportion of the image. This means that only relatively few ganglion cells respond strongly, which minimises the number of active neurons. Also, each active neuron encodes the difference between the centre and surround; this difference in intensity is smaller than the absolute intensity, and hence can be encoded to the same precision using a lower bandwidth.
(b) invariance to illumination.
The absolute intensity of different parts of the image will change with the intensity of the illumination; however, the relative intensities of different parts of the image will remain constant. Each active neuron encodes the difference between the centre and surround, and hence its response is relatively unaffected by illumination strength.
(c) edge enhancement.
Edges cause changes in intensity, and changes in intensity cause centre-surround cells to become active. Hence, edges cause strong responses from ganglion cells.
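A centre-surround RF is often modelled as a difference of Gaussians; the sketch below is an illustrative addition (the sigma values and test image are arbitrary choices, not from the original text) showing that the response is near zero in uniform regions and large at an edge.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Model an on-centre, off-surround RF as a difference of Gaussians (DoG):
# a narrow "centre" blur minus a wider "surround" blur. Illustrative only.
image = np.zeros((64, 64))
image[:, 32:] = 1.0                              # a vertical step edge

centre   = gaussian_filter(image, sigma=1.0)     # small-scale blur ~ RF centre
surround = gaussian_filter(image, sigma=3.0)     # large-scale blur ~ RF surround
response = centre - surround                     # DoG (centre-surround) output

print(np.abs(response[:, :16]).max())            # uniform region: ~0
print(np.abs(response[:, 28:36]).max())          # near the edge: clearly non-zero
```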