Fundamentals of Computer Vision
Lecture 5
• Image noise
• Edge detection
• Fourier series.
• Frequency domain.
• Fourier transform.
• Frequency-domain filtering.
• Sampling and Aliasing.
• Gaussian image pyramid.
Overview of today’s lecture
Slide credits
Most of these slides were adapted directly from:
• Robert Colin (454, Fall 2019), Kris Kitani (15-463, Fall 2016), Ioannis Gkioulekas (16-385, Spring 2019)
Some slides were inspired or taken from:
• Fredo Durand (MIT).
• Bernd Girod (Stanford University).
• James Hays (Georgia Tech).
• Steve Marschner (Cornell University).
• Steve Seitz (University of Washington).
Review
Convolution vs. Cross Correlation
• A convolution is an integral that expresses the amount of overlap of one function as it is shifted over another function.
• convolutionisafilteringoperation
• Correlation compares the similarity of two sets of data. Correlation computes a measure of similarity of two input signals as they are shifted by one another. The correlation result reaches a maximum at the time when the two signals match best .
• correlationisameasureofrelatednessoftwosignals
Source: J. Niebles and R. Krishna
(Cross) correlation – example
scanline
Left Right
Norm. cross corr. score
Convolution in 2D – examples
•0
•0
•0
•0
•2
•0
•0
•0
•0
•1
•1
•1
•1
•1
•1
•1
•1
•1
–
(Note that filter sums to 1)
“details of the image”
–
=?
Original
•0
•0
•0
•0
•1
•0
•0
•0
•0
•0
•0
•0
•0
•1
•0
•0
•0
•0
•1
•1
•1
•1
•1
•1
•1
•1
•1
+
Source: J. Niebles and R. Krishna
What does blurring take away?
–
=
original
• Let’s add it back:
smoothed (5×5)
Detailed
original
Detailed
Sharpened
+
=
Source: J. Niebles and R. Krishna
Convolution in 2D – Sharpening filter
•0
•0
•0
•0
•2
•0
•0
•0
•0
•1
•1
•1
•1
•1
•1
•1
•1
•1
–
Original
Sharpening filter: Accentuates differences with local average
=
Source: J. Niebles and R. Krishna
Image support and edge effect
•A computer will only convolve finite support signals. • What happens at the edge?
• zero “padding”
• edge value replication h • mirror extension
• more (beyond the scope of this class)
f
Source: J. Niebles and R. Krishna
Practical Issue: Border Handling
• Problem: what do we do for border pixels where the kernel does not completely overlap the image?
???
for interior pixels where there is full overlap, we know what to do.
but what values do we use for pixels that are “off the image” ?
Practical Issue: Border Handling
• Differentborderhandlingmethodsspecifydifferentways of defining values for pixels that are off the image.
• Oneofthesimplestmethodsiszero-padding,whichwe used by default in earlier examples.
0 0 0 0 … 0
0
0
0
Practical Issue: Border Handling
• Othermethods…
• Replication–replaceeachoff-imagepixelwiththevalue from the nearest pixel that IS in the image.
Example:
123 123 123
111 333 444 666 777 999
7s 7 8 9 9s 789
789
1s
3s
123 456 789
123 456 789
Practical Issue: Border Handling
• Othermethods…
• Reflection–reflectpixelvaluesattheborder(asifthere was a little mirror there)
Example:
987789987 654456654 321123321
321 321 654 654 987 987
987789987 654456654 321123321
123 456 789
123 456 789
Practical Issue: Border Handling
• Othermethods…
• Wraparound–whengoingofftherightborderoftheimage,youwrap around to the left border. Similarly, when leaving the bottom of the image you reenter at the top. Basically, the image is a big donut (or torus).
Example:
123 456 789
123123 456456 789789
123 123 456 456 789 789
123123123 456456456 789789789
123 456 789
123 456 789
Math Example : 1D Gradient
Consider function f(x) = 100 – 0.5 * x2
Math Example : 1D Gradient
Consider function f(x) = 100 – 0.5 * x2 Gradientisdf(x)/dx=-2*0.5*x = -x
Geometric interpretation:
gradient at x0 is slope of tangent line to curve at point x0
tangent line
Dy x0
Dx f(x)
slope = Dy / Dx = df(x)/dx
x0
Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2 df(x)/dx = – x
grad = slope = 2
x = -2
Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2
df(x)/dx = – x
1 2
0
3
gradients
-1
-2
-3
Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2
df(x)/dx = – x
Gradients on this side of peak are positive
0
1 -1
2 -2
Gradients on this side of peak are negative
3
gradients
-3
Note: Sign of gradient at point tells you what direction to go to travel “uphill”
Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2
df(x)/dx = – x
Gradients on this side of peak are positive
0
1 -1
2 -2
Gradients on this side of peak are negative
3
gradients
-3
Note: Magnitude of gradient at point tells how steep the slope is
Math Example : 2D Gradient
f(x,y)=100-0.5*x2 -0.5*y2
Gradient = [df(x,y)/dx , df(x,y)/dy] = [- x , – y]
Let g=[gx,gy] be the gradient vector at point/pixel (x0,y0)
Vector g points uphill
(direction of steepest ascent)
Vector – g points downhill
(direction of steepest descent)
Vector [gy, -gx] is perpendicular, and denotes direction of constant elevation. i.e.tangenttocontour line passing through point (x0,y0)
• Backward filter: • Forward:
• Central:
[0 1 -1] [-1 1 0] [ 1 0 -1]
1D discrete derivate filters
We can compute derivatives with convolutions.
Source: J. Niebles and R. Krishna
3×3 image gradient filters
1 0 −1 1 0 −1 1 0 −1
1 3
Source: J. Niebles and R. Krishna
Intensity profile
Source: D. Hoiem
Source: J. Niebles and R. Krishna
Gradient
Intensity
Effect of Noise
Effects of noise
How do you find the edge of this signal?
Noisy input image
Using a derivative filter
Where is the edge?
Effects of noise
• Finite difference filters respond strongly to noise
– Image noise results in pixels that look very different from their neighbors – Generally, the larger the noise the stronger the response
• What is to be done?
– Smoothing the image should help, by forcing pixels different to their neighbors (=noise pixels?) to look more like neighbors
Source: D. Forsyth
Solution: smooth first
f
h f*h
To find edges, look for peaks in
Source: S. Seitz
Derivative of Gaussian (DoG) filter
Derivative theorem of convolution:
input
derivative of Gaussian
output (same as before)
Laplace filter
Basically a second derivative filter.
• We can use finite differences to derive it, as with first derivative filter.
first-order finite difference
second-order finite difference
1D derivative filter
Laplace filter
1
0
-1
1
-2
1
Derivative of Gaussian filter
2D-gaussian
x – derivative
* [10-1]=
Derivative of Gaussian filter
x-direction
y-direction
Derivative of Gaussian filter
2D Edge Detection Filters
•
is the Laplacian operator:
Laplacian of Gaussian LoG
Gaussian
Derivative of Gaussian DoG
Slide credit: Steve Seitz
Efficient Implementation Approximating LoG with DoG
LoG can be approximate by a Difference of two Gaussians (DoG) at different scales
s1/s2 1.6
M.Hebert, CMU
1D example
Laplacian of Gaussian (LoG) filter
As with derivative, we can combine Laplace filtering with Gaussian filtering
input
Laplacian of Gaussian
output
?
Laplacian of Gaussian (LoG) filter
As with derivative, we can combine Laplace filtering with Gaussian filtering
input
Laplacian of Gaussian
output “zero crossings” at edges
Laplacian of Gaussian vs Derivative of Gaussian
Laplacian of Gaussian filtering Derivative of Gaussian filtering
Laplacian of Gaussian vs Derivative of Gaussian
zero- crossing
peak
Laplacian of Gaussian filtering Derivative of Gaussian filtering
Zero crossings are more accurate at localizing edges (but not very convenient).
Smoothing with a Gaussian
Recall: parameter σ is the “scale” / “width” / “spread” of the Gaussian kernel, and controls the amount of smoothing.
…
Slide credit: Kristen Grauman
Effect of σ on derivatives
σ = 1 pixel
σ = 3 pixels
The apparent structures differ depending on Gaussian’s scale parameter.
Larger values: larger scale edges detected Smaller values: finer features detected
Slide credit: Kristen Grauman
Tradeoff between smoothing at different scales
1 pixel 3 pixels 7 pixels
• Smoothed derivative removes noise, but blurs edge. Also finds edges at different “scales”.
Source: D. Forsyth
So, what scale to choose?
It depends what we’re looking for.
Slide credit: Kristen Grauman
Some history
Who is this guy?
What is he famous for?
Jean Baptiste Joseph Fourier (1768-1830)
Jean Baptiste Joseph Fourier (1768-1830)
What is he famous for?
The Fourier series claim (1807):
‘Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.’
… and apparently also for the discovery of the greenhouse effect
Is this claim true?
The Fourier series claim (1807):
‘Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.’
Jean Baptiste Joseph Fourier (1768-1830)
Is this claim true?
The Fourier series claim (1807):
‘Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.’
Well, almost.
• The theorem requires additional conditions.
• Close enough to be named after him.
• Very surprising result at the time.
Jean Baptiste Joseph Fourier (1768-1830)
The committee examining his paper had expressed skepticism, in part due to not so rigorous proofs
Malus
Lagrange
Legendre
Laplace
Amusing aside
Only known portrait of Adrien-Marie Legendre
1820 watercolor caricatures of French mathematicians Adrien- Marie Legendre (left) and Joseph Fourier (right) by French artist Julien-Leopold Boilly
For two hundred years, people were misidentifying this portrait as him
Louis Legendre (same last name, different person)
Fourier series
Basic building block
Fourier’s claim: Add enough of these to get any periodic signal you want!
Basic building block
variable phase angular
frequency
Fourier’s claim: Add enough of these to get any periodic signal you want!
amplitudesinusoid
Examples
How would you generate this function?
=?+?
Examples
How would you generate this function?
?
=+?
Examples
How would you generate this function?
?
?
=+
Examples
How would you generate this function?
square wave
=?+?
Examples
How would you generate this function?
?
?
square wave
≈+ =
Examples
How would you generate this function?
?
?
square wave
≈+ =
Examples
How would you generate this function?
?
?
square wave
≈+ =
Examples
How would you generate this function?
?
?
square wave
≈+ =
Examples
How would you generate this function?
?
?
square wave
≈+ =
How would you express this mathematically?
Examples
=
square wave
How would could you visualize this in the frequency domain?
infinite sum of sine waves
square wave
=
magnitude
Examples
infinite sum of sine waves
frequency
Frequency domain
Visualizing the frequency spectrum
amplitude
frequency
Visualizing the frequency spectrum
amplitude
Recall the temporal domain visualization
=+
frequency
Visualizing the frequency spectrum
=+
amplitude
Recall the temporal domain visualization
How do we plot …
frequency
Visualizing the frequency spectrum
amplitude
Recall the temporal domain visualization
=+
frequency
Visualizing the frequency spectrum
Recall the temporal domain visualization
=+
amplitude
not visualizing the symmetric negative part
frequency
Need to understand this to understand the 2D version!
Examples
Spatial domain visualization Frequency domain visualization 1D
2D
?
Examples
Spatial domain visualization Frequency domain visualization 1D
2D
What do these dots correspond to?
Examples
Spatial domain visualization Frequency domain visualization ?
Examples
Spatial domain visualization Frequency domain visualization
Examples
How would you generate this image with sine waves?
Examples
How would you generate this image with sine waves?
Has both an x and y components
Examples
+=
?
Examples
+=
?
Examples
+=
Basic building block
amplitudesinusoid
variable phase angular
frequency
Fourier’s claim: Add enough of these to get any periodic signal you want!
What about non- periodic signals?
Fourier transform
Complex numbers have two parts:
rectangular coordinates
Recalling some basics
real imaginary Alternative representation:
polar transform
polar coordinates
polar transform
Complex numbers have two parts:
rectangular coordinates
Recalling some basics
real imaginary Alternative representation:
polar transform
How do you write these in exponential form?
polar coordinates
polar transform
Complex numbers have two parts:
Recalling some basics
rectangular coordinates
real imaginary Alternative representation:
polar
coordinates polar transform or equivalently
exponential how did we get this? form
Complex numbers have two parts: rectangular
Recalling some basics
coordinates
real imaginary Alternative representation:
polar
coordinates polar transform or equivalently
exponential Euler’s formula form
This will help us understand the Fourier transform equations
Fourier transform
Fourier transform inverse Fourier transform
Where is the connection to the ‘summation of sine waves’ idea?
discrete continuous
Fourier transform
Fourier transform inverse Fourier transform
Where is the connection to the ‘summation of sine waves’ idea?
discrete continuous
Fourier transform
Where is the connection to the ‘summation of sine waves’ idea?
sum over frequencies
Euler’s formula
scaling parameter wave components
Note the symmetry: duality property of Fourier transform
Fourier transform pairs
spatial domain frequency domain
Computing the discrete Fourier transform (DFT)
Computing the discrete Fourier transform (DFT)
is just a matrix multiplication:
In practice this is implemented using the fast Fourier transform (FFT) algorithm.
Fourier transforms of natural images
original
amplitude phase
Image phase matters!
Fourier transforms of natural images
cheetah phase with zebra amplitude
zebra phase with cheetah amplitude
Frequency-domain filtering
Why do we care about all this?
The convolution theorem
The Fourier transform of the convolution of two functions is the product of their Fourier transforms:
The inverse Fourier transform of the product of two Fourier transforms is the convolution of the two inverse Fourier transforms:
Convolution in spatial domain is equivalent to multiplication in frequency domain!
What do we use convolution for?
Convolution for 1D continuous signals
Definition of linear shift-invariant filtering as convolution:
filtered signal filter input signal
Using the convolution theorem, we can interpret and implement all types of linear shift-invariant filtering as multiplication in frequency domain.
Why implement convolution in frequency domain?
Frequency-domain filtering in Matlab
Filtering with fft:
im = double(imread(‘…’))/255;
im = rgb2gray(im); % “im” should be a gray-scale floating point image
[imh, imw] = size(im);
hs = 50; % filter half-size
fil = fspecial(‘gaussian’, hs*2+1, 10);
fftsize = 1024; % should be order of 2 (for speed) and include padding
im_fft = fft2(im, fftsize, fftsize);
padding
fil_fft = fft2(fil, fftsize, fftsize);
same size as image
% 1) fft im with
% 2) fft fil, pad to
% 3) multiply fft
% 4) inverse fft2
im_fil_fft = im_fft .* fil_fft;
images
im_fil = ifft2(im_fil_fft);
im_fil = im_fil(1+hs:size(im,1)+hs, 1+hs:size(im, 2)+hs); % 5) remove padding
Displaying with fft:
figure(1), imagesc(log(abs(fftshift(im_fft)))), axis image, colormap jet
Spatial domain filtering
=
Fourier transform
= Frequency domain filtering
filter kernel
inverse Fourier transform
Revisiting blurring
Why does the Gaussian give a nice smooth image, but the square filter give edgy artifacts?
Gaussian filter
Box filter
Gaussian blur
Box blur
More filtering examples
? ?
filters shown in frequency- domain
More filtering examples
low-pass
filters shown band-pass in frequency-
domain
More filtering examples
high-pass
?
More filtering examples
high-pass
More filtering examples
original image low-pass filter
frequency magnitude
?
More filtering examples
original image low-pass filter
frequency magnitude
More filtering examples
original image high-pass filter
frequency magnitude
?
More filtering examples
original image high-pass filter
frequency magnitude
More filtering examples
original image band-pass filter
frequency magnitude
More filtering examples
original image band-pass filter
frequency magnitude
More filtering examples
original image band-pass filter
frequency magnitude
More filtering examples
original image band-pass filter
frequency magnitude
Image Downsampling
Image Scaling
This image is too big to fit on the screen. How can we generate a half-sized version?
Source: S. Seitz
Image sub-sampling
delete even rows, delete even columns
Throw away every other row and column to create a 1/2 size image
– called image sub-sampling
1/4
1/8
Source: S. Seitz
Naïve Image Downsampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)
What is the 1/8 image so pixelated (and do you know what this effect is called)?
Downsample
However: Subsampling is a bad idea unless you have previously blurred/smoothed the image! (because it leads to aliasing)
Image sub-sampling
Source: F. Durand
Reminder
Images are a discrete, or sampled, representation of a continuous world
Very simple example: a sine wave
Sampling
How would you discretize this signal?
Very simple example: a sine wave
Sampling
Very simple example: a sine wave
Sampling
How many samples should I take?
Can I take as many samples as I want?
Very simple example: a sine wave
Sampling
How many samples should I take?
Can I take as few samples as I want?
Undersampling
Very simple example: a sine wave
Unsurprising effect: information is lost.
Undersampling
Very simple example: a sine wave
Unsurprising effect: information is lost.
Surprising effect: can confuse the signal with one of lower frequency.
Undersampling
Very simple example: a sine wave
Unsurprising effect: information is lost.
Surprising effect: can confuse the signal with one of lower frequency. Note: we could always confuse the signal with one of higher frequency.
Aliasing
Fancy term for: Undersampling can disguise a signal as one of a lower frequency
Unsurprising effect: information is lost.
Surprising effect: can confuse the signal with one of lower frequency. Note: we could always confuse the signal with one of higher frequency.
Aliasing in textures
Temporal aliasing
Temporal aliasing
Temporal aliasing
Anti-aliasing
How would you deal with aliasing?
Anti-aliasing
How would you deal with aliasing? Approach 1: Oversample the signal
Anti-aliasing in textures
aliasing artifacts
anti-aliasing by oversampling
Anti-aliasing
How would you deal with aliasing? Approach 1: Oversample the signal
Approach 2: Smooth the signal
• Remove some of the detail effects that cause aliasing.
• Lose information, but better than aliasing artifacts.
How would you smooth a signal?
Better image downsampling
Apply a smoothing filter first, then throw away half the rows and columns
1/2
Gaussian filter delete even rows delete even columns
Gaussian filter delete even rows delete even columns
1/4
1/8
Better image downsampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)
Naïve image downsampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)
Anti-aliasing
Question 1: How much smoothing do I need to do to avoid aliasing? Question 2: How many samples do I need to take to avoid aliasing? Answer to both: Enough to reach the Nyquist limit.
The Nyquist-Shannon sampling theorem
A continuous signal can be perfectly reconstructed from its discrete version using linear interpolation, if sampling occurred with frequency:
This is called the Nyquist frequency
Equivalent reformulation: When downsampling, aliasing does not occur if samples are taken at the Nyquist frequency or higher.
Gaussian Image Pyramid
Idea for Today:
Form a Multi-Resolution Representation
original
sigma = 1
sigma = 3
sigma = 10
Variable frequency sensitivity
Experiment: Where do you see the stripes?
frequency
contrast
Variable frequency sensitivity
Campbell-Robson contrast sensitivity curve
Our eyes are sensitive to mid-range frequencies
• Early processing in humans filters for various orientations and scales of frequency
• Perceptual cues in the mid frequencies dominate perception
frequency
contrast
Pyramid Representations
Because a large amount of smoothing limits the frequency of features in the image, we do not need to keep all the pixels around!
Strategy: progressively reduce the number of pixels as we smooth more and more.
Leads to a “pyramid” representation if we subsample at each level.
Gaussian Pyramid
• Synthesis: Smooth image with a Gaussian and downsample. Repeat.
• Gaussian is used because it is self-reproducing (enables incremental smoothing).
• Computational cost typically dominated by convolution at the two lowest (highest resolution) levels.
Gaussian pyramid
How does the Nyquist-Shannon theorem relate to the Gaussian pyramid?
Gaussian pyramid
How does the Nyquist-Shannon theorem relate to the Gaussian pyramid?
• Gaussian blurring is low-pass filtering.
• By blurring we try to sufficiently decrease
the Nyquist frequency to avoid aliasing.
Reminder
• Cascaded Gaussians
– Repeated convolution by a smaller Gaussian to simulate effects of a larger one.
• G*(G*f) = (G*G)*f [associativity]
Gaussian Pyramid
Low res
High res
Emphasis: Smaller Images have Lower Resolution
Gaussia{n pyramid
F0 F1 F2
blur subsample blur subsample
…
F0
*
H F1
*
H
Low resolution
G =(G *gaussian)¯2
Gaussian Pyramid
4 3 blur
G =(G *gaussian)¯2
32
blur
G =(G *gaussian)¯2 21
blur
G =(G *gaussian)¯2 1 0
High resolution
G0 =Image blur
L =G -expand(G ) ii i+1
G =L +expand(G ) ii i+1
G0
G L=G nnn
G2 – = L2
G1 – = L1
-=
L0
The Laplacian Pyramid
Gaussian Pyramid Laplacian Pyramid
Basic reading:
• Szeliski textbook, Sections 3.4, 3.5
References
Additional reading:
• Burt and Adelson, “The Laplacian Pyramid as a Compact Image Code,” IEEE ToC 1983.
The original Laplacian pyramid paper.
• Hubel and Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat’s
visual cortex,” The Journal of Physiology 1962
A foundational paper describing information processing in the visual system, including the
different types of filtering it performs;; Hubel and Wiesel won the Nobel Prize in Medicine in 1981 for the discoveries described in this paper.