Low-Level Vision (Biological)
1. List the range of image properties to which V1 cells show selectivity.
• colour
• orientation
• direction of motion
• spatial frequency
• eye of origin
• binocular disparity
• position
2. What is a hyper-column?
A hyper-column is a region of V1 that contains neurons covering the full range of RF types for a single spatial location.
3. Briefly describe the stimulus selectivities of simple and complex cells in V1.
Simple cell: optimum response to an appropriately oriented stimulus, of the right contrast polarity, placed at a certain position within the receptive field.
Complex cell: optimum response to an appropriately oriented stimulus, with any contrast polarity, placed anywhere within the receptive field.
4. Describe how simple cell responses could be modelled using convolution.
A simple cell RF can be well described by a Gabor function.
Convolving the image with a Gabor mask will simulate the response of all simple cells selective for the same parameters across all hyper-columns.
Repeating the convolution with Gabor masks with different parameters (e.g. orientation, spatial frequency, phase, aspect ratio) will simulate the responses of all different types of V1 simple cell.
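The convolution described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive model: the function names `gabor_kernel` and `convolve2d_valid`, the parameter names, and the default aspect ratio `gamma=0.5` are assumptions made for this example.

```python
import math

def gabor_kernel(size, wavelength, theta, phase, sigma, gamma=0.5):
    """Return a size x size Gabor mask as a list of rows (size should be odd).

    All parameter names are illustrative. With theta = 0 the sinusoidal
    carrier varies along x, so the mask prefers vertical stripes.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope (gamma sets the aspect ratio) times cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def convolve2d_valid(image, kernel):
    """'Valid' 2-D convolution: one output per position where the mask fits.

    Each output value stands for the response of one model simple cell
    centred at that image location.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            total = 0.0
            for u in range(kh):
                for v in range(kw):
                    # The kernel is flipped, so this is convolution, not correlation.
                    total += image[i + u][j + v] * kernel[kh - 1 - u][kw - 1 - v]
            row.append(total)
        out.append(row)
    return out

# Usage: a vertical grating drives the theta = 0 mask far more strongly
# than the orthogonal (theta = pi/2) mask.
grating = [[math.cos(2 * math.pi * x / 4) for x in range(15)] for _ in range(15)]
preferred = convolve2d_valid(grating, gabor_kernel(7, 4.0, 0.0, 0.0, 2.0))
orthogonal = convolve2d_valid(grating, gabor_kernel(7, 4.0, math.pi / 2, 0.0, 2.0))
```

Calling `gabor_kernel` again with other orientations, wavelengths, or phases, and convolving each mask with the same image, gives one response map per simple cell type.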
5. Describe how complex cell responses could be modelled using convolution.
A complex cell can be modelled by combining the outputs of two or more simple cells.
For example, the response from a quadrature pair of Gabor functions (two Gabors with a phase difference of π/2) can be used as input to a model of a complex cell. These inputs need to be combined by taking the square-root of the sum of the squares of the two inputs.
Hence, complex cell responses can be modelled by performing multiple convolutions with different Gabor masks to simulate simple cell responses (as described in the answer to the previous question), and subsequently combining (as described above) those responses for Gabor masks with identical parameters except with different phases.
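The quadrature-pair combination can be illustrated with a short Python sketch. To keep it small, it uses a one-dimensional signal instead of an image, which is a simplification made here and not part of the answer above; the names `gabor_1d`, `convolve_valid`, and `complex_cell_responses`, and all default parameter values, are likewise assumptions for illustration.

```python
import math

def gabor_1d(size, wavelength, phase, sigma):
    """1-D Gabor mask: Gaussian envelope times a cosine carrier (size should be odd)."""
    half = size // 2
    return [math.exp(-x * x / (2 * sigma ** 2))
            * math.cos(2 * math.pi * x / wavelength + phase)
            for x in range(-half, half + 1)]

def convolve_valid(signal, kernel):
    """'Valid' 1-D convolution (kernel flipped), one output per full overlap."""
    k = len(kernel)
    return [sum(signal[i + u] * kernel[k - 1 - u] for u in range(k))
            for i in range(len(signal) - k + 1)]

def complex_cell_responses(signal, size=21, wavelength=8.0, sigma=4.0):
    """Combine a quadrature pair of simple cell responses.

    The two masks are identical except for a phase difference of pi/2;
    the outputs are combined as sqrt(even^2 + odd^2) at each position.
    """
    even = convolve_valid(signal, gabor_1d(size, wavelength, 0.0, sigma))
    odd = convolve_valid(signal, gabor_1d(size, wavelength, math.pi / 2, sigma))
    return [math.sqrt(e * e + o * o) for e, o in zip(even, odd)]

# Usage: a grating at the preferred wavelength, with an arbitrary phase offset.
signal = [math.cos(2 * math.pi * x / 8 + 0.7) for x in range(64)]
energy = complex_cell_responses(signal)
simple = convolve_valid(signal, gabor_1d(21, 8.0, 0.0, 4.0))
```

In this example the individual simple cell output `simple` swings between positive and negative values as the grating moves across the mask, while `energy` stays nearly constant. This corresponds to the complex cell responding to the stimulus with any contrast polarity, anywhere within its receptive field.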
6. Gabor functions are the components of natural images under the “sparsity” constraint. What is the sparsity constraint and how is this relevant to efficient coding?
The sparsity constraint requires that each image is represented using the minimum number of active components.
Hence, by using Gabors as the components by which an image is represented, an image can be coded accurately and efficiently (with a minimum number of active components).
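One common way to write the sparsity constraint formally is as a reconstruction error plus a penalty on the size of the coefficients. The answer above does not specify a formula, so the notation here is an assumption: I is an image patch, φ_i are the components, a_i are their coefficients, and λ sets the trade-off between accuracy and sparsity.

```latex
\min_{a}\; \Bigl\| I - \sum_{i} a_i \phi_i \Bigr\|_2^2 \;+\; \lambda \sum_{i} |a_i|
```

The first term rewards accurate coding of the image and the second term rewards using few active components, which matches the two requirements stated in the answer.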
7. Briefly describe what is meant by the classical receptive field and the non-classical receptive field.
Classical Receptive Field (cRF) = the region of visual space / the stimulus properties that can elicit a response from a neuron.
Non-classical Receptive Field (ncRF) = the region of visual space / the stimulus properties that can modulate the response from a neuron, but not generate a response from the neuron in the absence of input to the cRF.
8. What is an “association field”? Describe the association field for a V1 cell with an orientation preference.
An association field is the pattern of long-range lateral connections received by an orientation selective neuron in V1. It defines the ncRF of such a neuron.
A V1 neuron with a cRF selective for a particular orientation will receive lateral excitation from neighbouring V1 cells with similar orientation preferences that are aligned so that they are collinear or co-circular with it. It will receive lateral inhibition from other neighbouring V1 cells with similar orientation preferences.
9. How do lateral connections in V1 give rise to (a) contour integration, (b) pop-out, (c) texture segmentation?
(a) contour integration is generated principally by lateral excitation between cells with nearly co-linear/co-circular orientation preferences. These cells enhance each other’s response, and hence, make linear or circular contours more visible.
(b) pop-out is generated principally by lateral inhibition between cells with similar preferences. These cells suppress each other’s response making cells responding to different image features relatively more active, and hence, dissimilar stimuli more visible.
(c) texture segmentation is generated principally by lateral inhibition between cells with similar preferences. Hence, cells responding to features within a uniform region of texture suppress each other’s response. However, cells on the border between different textures only receive inhibition from one side, giving these cells a relatively higher response, and hence, making the border more visible.
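The effect of lateral inhibition in (b) and (c) can be illustrated with a toy Python sketch. It is a deliberately simplified one-dimensional model and not a description of real V1 circuitry: the function name, the uniform feedforward response of 1.0, and the `radius` and `weight` values are all assumptions chosen for illustration.

```python
def responses_with_lateral_inhibition(features, radius=2, weight=0.2):
    """Toy model of lateral inhibition between similarly tuned cells.

    `features` lists the preferred feature of the cell at each position
    (for example "|" or "-"). Every cell starts with a feedforward
    response of 1.0 and is suppressed by `weight` for each neighbour
    within `radius` positions that prefers the same feature.
    """
    n = len(features)
    out = []
    for i, feature in enumerate(features):
        similar = sum(1
                      for j in range(max(0, i - radius), min(n, i + radius + 1))
                      if j != i and features[j] == feature)
        # Responses are clipped at zero: a firing rate cannot be negative.
        out.append(max(0.0, 1.0 - weight * similar))
    return out

# (c) Texture segmentation: two uniform textures side by side.
texture = responses_with_lateral_inhibition(list("||||||------"))

# (b) Pop-out: one odd element among identical distractors.
popout = responses_with_lateral_inhibition(list("|||||-|||||"))
```

In `texture`, the cells at positions 5 and 6 (either side of the border) are inhibited from one side only and so respond more strongly than cells in the interior of each texture. In `popout`, the odd element at position 5 receives no inhibition and has the highest response. Cells at the two ends of each list also have fewer neighbours, which is an artefact of the finite array in this sketch.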
Mid-Level Vision: Segmentation (Biological)
1. Briefly describe the difference between bottom-up and top-down influences on grouping.
Top-down influences come from prior knowledge and experience. They cause image elements to be grouped because of prior expectations about what elements belong to the same object.
Bottom-up influences come from image properties. They cause image elements to be grouped because they have similar properties.
2. For each of the following images identify the “Gestalt Law” that gives rise to the observed grouping.
[Figure: six example images, panels (a) to (f), not reproduced here.]
(a) similarity
(b) common region
(c) proximity
(d) connectivity
(e) closure
(f) continuity
3. Explain how lateral connections in V1 give rise to the Gestalt biases of similarity and continuity.
Lateral inhibitory connections cause mutual suppression of neurons representing similar image elements. At borders between dissimilar elements there is less inhibition, and hence, the border is enhanced.
Lateral excitatory connections cause mutual enhancement of neurons representing co-linearly orientated image elements. Hence, the response of elements that form continuous contours is enhanced.
4. Explain what is meant by border ownership.
Border ownership refers to the fact that the boundary between two regions in an image is perceived as part of one region (the foreground) and not the other region (the background). This means that foreground objects have a defined shape (delineated by the border), whereas background objects appear shapeless.