2812ICT Perceptual Computing
Image Representation & Processing
Outline
• Image Filtering
• Convolution and Correlation • Template matching
Images as Discrete Functions
• We can think of an image as a function, f, from R2 to R
• f( x, y ) gives the intensity at position ( x, y )
• Defined over a rectangle, with a finite range:
• Images are usually digital (discrete)
• Sample the 2D space on a regular grid
• Represented as a matrix of integer values
• A color image can be written as a “vector-valued” function:
r(x, y) f (x, y) = g(x, y)
b(x, y)
f(x, y)
pixel
y x
Image neighborhoods
• Q: What happens if we reshuffle all pixels within the images?
• A: Its histogram won’t change. Point-wise processing unaffected.
• Need to measure properties relative to small neighborhoods of pixels
Image Filtering
• Filtering:
• Form a new image whose pixels are a combination of original pixel values
• Goals:
• Extract useful information from the images
• Features (edges, corners, blobs…) • Modifyorenhanceimageproperties
• de‐noising; super-resolution; in-painting
Filter example #1: Moving Average
• Let’s replace each pixel with an average of all the values in its neighborhood
• Eg. Neighborhood of +/- 2 pixels • Moving average in 1D:
Source: S. Marschner
Weighted Moving Average
• Can add weights to our moving average •Weights [1,1,1,1,1] /5
Source: S. Marschner
Weighted Moving Average
• Non-uniform weights [1, 4, 6, 4, 1] / 16
Source: S. Marschner
Moving Average In 2D
• 2D moving average over a 3 × 3 window of neighbourhood
• A convolution operation:
• Replaces each pixel with an average of its neighborhood. • Achieve smoothing effect (remove sharp features)
Moving Average In 2D
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
Source: S. Seitz
Moving Average In 2D
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
10
Source: S. Seitz
Moving Average In 2D
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
10
20
Source: S. Seitz
Moving Average In 2D
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
10
20
30
Source: S. Seitz
Moving Average In 2D
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
10
20
30
30
Source: S. Seitz
Moving Average In 2D
0
10
20
30
30
30
20
10
0
20
40
60
60
60
40
20
0
30
60
90
90
90
60
30
0
30
50
80
80
90
60
30
0
30
50
80
80
90
60
30
0
20
30
50
50
60
40
20
10
20
30
30
30
30
20
10
10
10
10
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
Source: S. Seitz
Moving Average In 2D
Gaussian filter
• What if we want nearest neighboring pixels to have the most influence on the output?
This kernel is an approximation of a Gaussian function:
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
90
0
90
90
90
0
0
0
0
0
90
90
90
90
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
90
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
2
1
2
4
2
1
2
1
Source: S. Seitz
Gaussian filters
• What parameters matter here? • Size of kernel or mask
• Note, Gaussian function has infinite support, but discrete filters use finite kernels
σ = 5 with 10 σ = 5 with 30 x 10 kernel x 30 kernel
Gaussian filters
• What parameters matter here?
• Variance of Gaussian: determines extent of
smoothing
σ = 2 with 30 σ = 5 with 30 x 30 kernel x 30 kernel
Smoothing with a Gaussian
Parameter σ is the “scale” / “width” / “spread” of the Gaussian kernel, and controls the amount of smoothing.
…
Boundary issues
• What about near the edge?
• the filter window falls off the edge of the image
• need to extrapolate
• methods:
• clip filter (black)
• wrap around
• copyedge
• reflect across edge
Source: S. Marschner
Convolution
• Convolution:
• Flip the filter in both dimensions (bottom to top, right to left) • Thenapplycross-correlation
H
F
Notation for convolution operator
Convolution in operation
Source: Fei-Fei Li
Convolution in operation
Source: Fei-Fei Li
Shift invariant linear system
• Shift invariant:
• Operator behaves the same everywhere, i.e. the value of the output depends on the pattern in the image neighborhood, not the position of the neighborhood.
• Linear:
• Superposition: h * (f1 + f2) = (h * f1) + (h * f2)
• Scaling: h * (k f) = k (h * f)
Properties of convolution
• Linear & shift invariant • Commutative:
f*g=g*f • Associative
(f * g) * h = f * (g * h) • Identity:
unit impulse e = […, 0, 0, 1, 0, 0, …]. • Differentiation:
f * e = f
Separability
• In some cases, filter is separable, and we can factor into two steps: • Convolve all rows
• Convolve all columns
(Cross) correlation (symbol )
• Cross correlation of two 2D signals f[n,m] and g[n,m]
(k, l) is called the lag
• Equivalent to a convolution without the flip
(g* is defined as the complex conjugate of g. When g(n,m) are real numbers, g*=g.)
Convolution vs. correlation
Convolution
Cross-correlation
For a Gaussian or box filter, how will the outputs differ?
If the input is an impulse signal, how will the outputs differ?
Convolution vs. correlation
https://commons.wikimedia.org/wiki/File:Comparison_convolution_correlation.svg
Convolution vs. correlation
• A convolution is an integral that expresses the amount of overlap of one function as it is shifted over another function.
• Convolution is a filtering operation
• Correlation compares the similarity of data. Correlation computes a measure of similarity of two images as they are shifted by one another. The correlation result reaches a maximum at the time when the two images match best.
• Correlation is a measure of relatedness of two signals
(Cross) correlation – example
thresholding
thresholding
Filters for features
• Filters allow us to abstract higher-level “features”.
• Map raw pixels to an intermediate representation that will be used for subsequent processing
• Goal: reduce amount of data, discard redundancy, preserve what’s useful
Template matching
• Filters as templates:
Note that filters look like the effects they are intended to
find — “matched filters”
• Use normalized cross-correlation score to find a given pattern (template) in the image.
• Normalization needed to control for relative brightness.
Template matching
Scene
Template (mask)
A toy example
Template matching
Detected template
Template
Template matching
Detected template Correlation map
Where’s Waldo?
Scene
Template
Where’s Waldo?
Scene
Template
Where’s Waldo?
Detected template Correlation map
Template matching
Scene
What if the template is not identical to some subimage in the scene?
Template
Template matching
Detected template
Match can be meaningful, if scale, orientation, and general appearance is right.
Template
Fast convolution/correlation
• Convolution can be done efficiently in the frequency domain through the use of Fast Fourier Transform.
• The Convolution Theorem (https://en.wikipedia.org/wiki/Convolution_theorem) stated that under suitable conditions the Fourier transform of a convolution is the pointwise product of the Fourier transforms, i.e.
• In other words, we can perform a convolution by taking the Fourier transform of both functions, multiplying the results, and then performing an inverse Fourier transform
• Convolution theorem is used extensively in audio signal analysis and image filtering operations.
Next week:
• Image features extraction