程序代写代做代考 algorithm kernel go Fundamentals of Computer Vision

Fundamentals of Computer Vision
Lecture 5

• Image noise
• Edge detection
• Fourier series.
• Frequency domain.
• Fourier transform.
• Frequency-domain filtering.
• Sampling and Aliasing.
• Gaussian image pyramid.
Overview of today’s lecture

Slide credits
Most of these slides were adapted directly from:
• Robert Colin (454, Fall 2019), Kris Kitani (15-463, Fall 2016), Ioannis Gkioulekas (16-385, Spring 2019)
Some slides were inspired or taken from:
• Fredo Durand (MIT).
• Bernd Girod (Stanford University).
• James Hays (Georgia Tech).
• Steve Marschner (Cornell University).
• Steve Seitz (University of Washington).

Review

Convolution vs. Cross Correlation
• A convolution is an integral that expresses the amount of overlap of one function as it is shifted over another function.
• convolutionisafilteringoperation
• Correlation compares the similarity of two sets of data. Correlation computes a measure of similarity of two input signals as they are shifted by one another. The correlation result reaches a maximum at the time when the two signals match best .
• correlationisameasureofrelatednessoftwosignals
Source: J. Niebles and R. Krishna

(Cross) correlation – example
scanline
Left Right
Norm. cross corr. score

Convolution in 2D – examples
•0
•0
•0
•0
•2
•0
•0
•0
•0
•1
•1
•1
•1
•1
•1
•1
•1
•1

(Note that filter sums to 1)
“details of the image”

=?
Original
•0
•0
•0
•0
•1
•0
•0
•0
•0
•0
•0
•0
•0
•1
•0
•0
•0
•0
•1
•1
•1
•1
•1
•1
•1
•1
•1
+
Source: J. Niebles and R. Krishna

What does blurring take away?

=
original
• Let’s add it back:
smoothed (5×5)
Detailed
original
Detailed
Sharpened
+
=
Source: J. Niebles and R. Krishna

Convolution in 2D – Sharpening filter
•0
•0
•0
•0
•2
•0
•0
•0
•0
•1
•1
•1
•1
•1
•1
•1
•1
•1

Original
Sharpening filter: Accentuates differences with local average
=
Source: J. Niebles and R. Krishna

Image support and edge effect
•A computer will only convolve finite support signals. • What happens at the edge?
• zero “padding”
• edge value replication h • mirror extension
• more (beyond the scope of this class)
f
Source: J. Niebles and R. Krishna

Practical Issue: Border Handling
• Problem: what do we do for border pixels where the kernel does not completely overlap the image?
???
for interior pixels where there is full overlap, we know what to do.
but what values do we use for pixels that are “off the image” ?

Practical Issue: Border Handling
• Differentborderhandlingmethodsspecifydifferentways of defining values for pixels that are off the image.
• Oneofthesimplestmethodsiszero-padding,whichwe used by default in earlier examples.
0 0 0 0 … 0
0
0
0

Practical Issue: Border Handling
• Othermethods…
• Replication–replaceeachoff-imagepixelwiththevalue from the nearest pixel that IS in the image.
Example:
123 123 123
111 333 444 666 777 999
7s 7 8 9 9s 789
789
1s
3s
123 456 789
123 456 789

Practical Issue: Border Handling
• Othermethods…
• Reflection–reflectpixelvaluesattheborder(asifthere was a little mirror there)
Example:
987789987 654456654 321123321
321 321 654 654 987 987
987789987 654456654 321123321
123 456 789
123 456 789

Practical Issue: Border Handling
• Othermethods…
• Wraparound–whengoingofftherightborderoftheimage,youwrap around to the left border. Similarly, when leaving the bottom of the image you reenter at the top. Basically, the image is a big donut (or torus).
Example:
123 456 789
123123 456456 789789
123 123 456 456 789 789
123123123 456456456 789789789
123 456 789
123 456 789

Math Example : 1D Gradient
Consider function f(x) = 100 – 0.5 * x2

Math Example : 1D Gradient
Consider function f(x) = 100 – 0.5 * x2 Gradientisdf(x)/dx=-2*0.5*x = -x
Geometric interpretation:
gradient at x0 is slope of tangent line to curve at point x0
tangent line
Dy x0
Dx f(x)
slope = Dy / Dx = df(x)/dx
x0

Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2 df(x)/dx = – x
grad = slope = 2
x = -2

Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2
df(x)/dx = – x
1 2
0
3
gradients
-1
-2
-3

Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2
df(x)/dx = – x
Gradients on this side of peak are positive
0
1 -1
2 -2
Gradients on this side of peak are negative
3
gradients
-3
Note: Sign of gradient at point tells you what direction to go to travel “uphill”

Math Example : 1D Gradient
f(x) = 100 – 0.5 * x2
df(x)/dx = – x
Gradients on this side of peak are positive
0
1 -1
2 -2
Gradients on this side of peak are negative
3
gradients
-3
Note: Magnitude of gradient at point tells how steep the slope is

Math Example : 2D Gradient
f(x,y)=100-0.5*x2 -0.5*y2
Gradient = [df(x,y)/dx , df(x,y)/dy] = [- x , – y]
Let g=[gx,gy] be the gradient vector at point/pixel (x0,y0)
Vector g points uphill
(direction of steepest ascent)
Vector – g points downhill
(direction of steepest descent)
Vector [gy, -gx] is perpendicular, and denotes direction of constant elevation. i.e.tangenttocontour line passing through point (x0,y0)

• Backward filter: • Forward:
• Central:
[0 1 -1] [-1 1 0] [ 1 0 -1]
1D discrete derivate filters
We can compute derivatives with convolutions.
Source: J. Niebles and R. Krishna

3×3 image gradient filters
1 0 −1 1 0 −1 1 0 −1
1 3
Source: J. Niebles and R. Krishna

Intensity profile
Source: D. Hoiem
Source: J. Niebles and R. Krishna
Gradient
Intensity

Effect of Noise

Effects of noise
How do you find the edge of this signal?
Noisy input image
Using a derivative filter
Where is the edge?

Effects of noise
• Finite difference filters respond strongly to noise
– Image noise results in pixels that look very different from their neighbors – Generally, the larger the noise the stronger the response
• What is to be done?
– Smoothing the image should help, by forcing pixels different to their neighbors (=noise pixels?) to look more like neighbors
Source: D. Forsyth

Solution: smooth first
f
h f*h
To find edges, look for peaks in
Source: S. Seitz

Derivative of Gaussian (DoG) filter
Derivative theorem of convolution:
input
derivative of Gaussian
output (same as before)

Laplace filter
Basically a second derivative filter.
• We can use finite differences to derive it, as with first derivative filter.
first-order finite difference
second-order finite difference
1D derivative filter
Laplace filter
1
0
-1
1
-2
1

Derivative of Gaussian filter
2D-gaussian
x – derivative
* [10-1]=

Derivative of Gaussian filter
x-direction
y-direction

Derivative of Gaussian filter

2D Edge Detection Filters

is the Laplacian operator:
Laplacian of Gaussian LoG
Gaussian
Derivative of Gaussian DoG
Slide credit: Steve Seitz

Efficient Implementation Approximating LoG with DoG
LoG can be approximate by a Difference of two Gaussians (DoG) at different scales
s1/s2 1.6
M.Hebert, CMU
1D example

Laplacian of Gaussian (LoG) filter
As with derivative, we can combine Laplace filtering with Gaussian filtering
input
Laplacian of Gaussian
output
?

Laplacian of Gaussian (LoG) filter
As with derivative, we can combine Laplace filtering with Gaussian filtering
input
Laplacian of Gaussian
output “zero crossings” at edges

Laplacian of Gaussian vs Derivative of Gaussian
Laplacian of Gaussian filtering Derivative of Gaussian filtering

Laplacian of Gaussian vs Derivative of Gaussian
zero- crossing
peak
Laplacian of Gaussian filtering Derivative of Gaussian filtering
Zero crossings are more accurate at localizing edges (but not very convenient).

Smoothing with a Gaussian
Recall: parameter σ is the “scale” / “width” / “spread” of the Gaussian kernel, and controls the amount of smoothing.

Slide credit: Kristen Grauman

Effect of σ on derivatives
σ = 1 pixel
σ = 3 pixels
The apparent structures differ depending on Gaussian’s scale parameter.
Larger values: larger scale edges detected Smaller values: finer features detected
Slide credit: Kristen Grauman

Tradeoff between smoothing at different scales
1 pixel 3 pixels 7 pixels
• Smoothed derivative removes noise, but blurs edge. Also finds edges at different “scales”.
Source: D. Forsyth

So, what scale to choose?
It depends what we’re looking for.
Slide credit: Kristen Grauman

Some history

Who is this guy?

What is he famous for?
Jean Baptiste Joseph Fourier (1768-1830)

Jean Baptiste Joseph Fourier (1768-1830)
What is he famous for?
The Fourier series claim (1807):
‘Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.’
… and apparently also for the discovery of the greenhouse effect

Is this claim true?
The Fourier series claim (1807):
‘Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.’
Jean Baptiste Joseph Fourier (1768-1830)

Is this claim true?
The Fourier series claim (1807):
‘Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.’
Well, almost.
• The theorem requires additional conditions.
• Close enough to be named after him.
• Very surprising result at the time.
Jean Baptiste Joseph Fourier (1768-1830)
The committee examining his paper had expressed skepticism, in part due to not so rigorous proofs
Malus
Lagrange
Legendre
Laplace

Amusing aside
Only known portrait of Adrien-Marie Legendre
1820 watercolor caricatures of French mathematicians Adrien- Marie Legendre (left) and Joseph Fourier (right) by French artist Julien-Leopold Boilly
For two hundred years, people were misidentifying this portrait as him
Louis Legendre (same last name, different person)

Fourier series

Basic building block
Fourier’s claim: Add enough of these to get any periodic signal you want!

Basic building block
variable phase angular
frequency
Fourier’s claim: Add enough of these to get any periodic signal you want!
amplitudesinusoid

Examples
How would you generate this function?
=?+?

Examples
How would you generate this function?
?
=+?

Examples
How would you generate this function?
?
?
=+

Examples
How would you generate this function?
square wave
=?+?

Examples
How would you generate this function?
?
?
square wave
≈+ =

Examples
How would you generate this function?
?
?
square wave
≈+ =

Examples
How would you generate this function?
?
?
square wave
≈+ =

Examples
How would you generate this function?
?
?
square wave
≈+ =

Examples
How would you generate this function?
?
?
square wave
≈+ =
How would you express this mathematically?

Examples
=
square wave
How would could you visualize this in the frequency domain?
infinite sum of sine waves

square wave
=
magnitude
Examples
infinite sum of sine waves
frequency

Frequency domain

Visualizing the frequency spectrum
amplitude
frequency

Visualizing the frequency spectrum
amplitude
Recall the temporal domain visualization
=+
frequency

Visualizing the frequency spectrum
=+
amplitude
Recall the temporal domain visualization
How do we plot …
frequency

Visualizing the frequency spectrum
amplitude
Recall the temporal domain visualization
=+
frequency

Visualizing the frequency spectrum
Recall the temporal domain visualization
=+
amplitude
not visualizing the symmetric negative part
frequency
Need to understand this to understand the 2D version!

Examples
Spatial domain visualization Frequency domain visualization 1D
2D
?

Examples
Spatial domain visualization Frequency domain visualization 1D
2D
What do these dots correspond to?

Examples
Spatial domain visualization Frequency domain visualization ?

Examples
Spatial domain visualization Frequency domain visualization

Examples
How would you generate this image with sine waves?

Examples
How would you generate this image with sine waves?
Has both an x and y components

Examples
+=
?

Examples
+=
?

Examples
+=

Basic building block
amplitudesinusoid
variable phase angular
frequency
Fourier’s claim: Add enough of these to get any periodic signal you want!
What about non- periodic signals?

Fourier transform

Complex numbers have two parts:
rectangular coordinates
Recalling some basics
real imaginary Alternative representation:
polar transform
polar coordinates
polar transform

Complex numbers have two parts:
rectangular coordinates
Recalling some basics
real imaginary Alternative representation:
polar transform
How do you write these in exponential form?
polar coordinates
polar transform

Complex numbers have two parts:
Recalling some basics
rectangular coordinates
real imaginary Alternative representation:
polar
coordinates polar transform or equivalently
exponential how did we get this? form

Complex numbers have two parts: rectangular
Recalling some basics
coordinates
real imaginary Alternative representation:
polar
coordinates polar transform or equivalently
exponential Euler’s formula form
This will help us understand the Fourier transform equations

Fourier transform
Fourier transform inverse Fourier transform
Where is the connection to the ‘summation of sine waves’ idea?
discrete continuous

Fourier transform
Fourier transform inverse Fourier transform
Where is the connection to the ‘summation of sine waves’ idea?
discrete continuous

Fourier transform
Where is the connection to the ‘summation of sine waves’ idea?
sum over frequencies
Euler’s formula
scaling parameter wave components

Note the symmetry: duality property of Fourier transform
Fourier transform pairs
spatial domain frequency domain

Computing the discrete Fourier transform (DFT)

Computing the discrete Fourier transform (DFT)
is just a matrix multiplication:
In practice this is implemented using the fast Fourier transform (FFT) algorithm.

Fourier transforms of natural images
original
amplitude phase

Image phase matters!
Fourier transforms of natural images
cheetah phase with zebra amplitude
zebra phase with cheetah amplitude

Frequency-domain filtering

Why do we care about all this?

The convolution theorem
The Fourier transform of the convolution of two functions is the product of their Fourier transforms:
The inverse Fourier transform of the product of two Fourier transforms is the convolution of the two inverse Fourier transforms:
Convolution in spatial domain is equivalent to multiplication in frequency domain!

What do we use convolution for?

Convolution for 1D continuous signals
Definition of linear shift-invariant filtering as convolution:
filtered signal filter input signal
Using the convolution theorem, we can interpret and implement all types of linear shift-invariant filtering as multiplication in frequency domain.
Why implement convolution in frequency domain?

Frequency-domain filtering in Matlab
Filtering with fft:
im = double(imread(‘…’))/255;
im = rgb2gray(im); % “im” should be a gray-scale floating point image
[imh, imw] = size(im);
hs = 50; % filter half-size
fil = fspecial(‘gaussian’, hs*2+1, 10);
fftsize = 1024; % should be order of 2 (for speed) and include padding
im_fft = fft2(im, fftsize, fftsize);
padding
fil_fft = fft2(fil, fftsize, fftsize);
same size as image
% 1) fft im with
% 2) fft fil, pad to
% 3) multiply fft
% 4) inverse fft2
im_fil_fft = im_fft .* fil_fft;
images
im_fil = ifft2(im_fil_fft);
im_fil = im_fil(1+hs:size(im,1)+hs, 1+hs:size(im, 2)+hs); % 5) remove padding
Displaying with fft:
figure(1), imagesc(log(abs(fftshift(im_fft)))), axis image, colormap jet

Spatial domain filtering
=
Fourier transform
= Frequency domain filtering
filter kernel
inverse Fourier transform

Revisiting blurring
Why does the Gaussian give a nice smooth image, but the square filter give edgy artifacts?
Gaussian filter
Box filter

Gaussian blur

Box blur

More filtering examples
? ?
filters shown in frequency- domain

More filtering examples
low-pass
filters shown band-pass in frequency-
domain

More filtering examples
high-pass
?

More filtering examples
high-pass

More filtering examples
original image low-pass filter
frequency magnitude
?

More filtering examples
original image low-pass filter
frequency magnitude

More filtering examples
original image high-pass filter
frequency magnitude
?

More filtering examples
original image high-pass filter
frequency magnitude

More filtering examples
original image band-pass filter
frequency magnitude

More filtering examples
original image band-pass filter
frequency magnitude

More filtering examples
original image band-pass filter
frequency magnitude

More filtering examples
original image band-pass filter
frequency magnitude

Image Downsampling

Image Scaling
This image is too big to fit on the screen. How can we generate a half-sized version?
Source: S. Seitz

Image sub-sampling
delete even rows, delete even columns
Throw away every other row and column to create a 1/2 size image
– called image sub-sampling
1/4
1/8
Source: S. Seitz

Naïve Image Downsampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)
What is the 1/8 image so pixelated (and do you know what this effect is called)?

Downsample
However: Subsampling is a bad idea unless you have previously blurred/smoothed the image! (because it leads to aliasing)

Image sub-sampling
Source: F. Durand

Reminder
Images are a discrete, or sampled, representation of a continuous world

Very simple example: a sine wave
Sampling
How would you discretize this signal?

Very simple example: a sine wave
Sampling

Very simple example: a sine wave
Sampling
How many samples should I take?
Can I take as many samples as I want?

Very simple example: a sine wave
Sampling
How many samples should I take?
Can I take as few samples as I want?

Undersampling
Very simple example: a sine wave
Unsurprising effect: information is lost.

Undersampling
Very simple example: a sine wave
Unsurprising effect: information is lost.
Surprising effect: can confuse the signal with one of lower frequency.

Undersampling
Very simple example: a sine wave
Unsurprising effect: information is lost.
Surprising effect: can confuse the signal with one of lower frequency. Note: we could always confuse the signal with one of higher frequency.

Aliasing
Fancy term for: Undersampling can disguise a signal as one of a lower frequency
Unsurprising effect: information is lost.
Surprising effect: can confuse the signal with one of lower frequency. Note: we could always confuse the signal with one of higher frequency.

Aliasing in textures

Temporal aliasing

Temporal aliasing

Temporal aliasing

Anti-aliasing
How would you deal with aliasing?

Anti-aliasing
How would you deal with aliasing? Approach 1: Oversample the signal

Anti-aliasing in textures
aliasing artifacts
anti-aliasing by oversampling

Anti-aliasing
How would you deal with aliasing? Approach 1: Oversample the signal
Approach 2: Smooth the signal
• Remove some of the detail effects that cause aliasing.
• Lose information, but better than aliasing artifacts.
How would you smooth a signal?

Better image downsampling
Apply a smoothing filter first, then throw away half the rows and columns
1/2
Gaussian filter delete even rows delete even columns
Gaussian filter delete even rows delete even columns
1/4
1/8

Better image downsampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)

Naïve image downsampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)

Anti-aliasing
Question 1: How much smoothing do I need to do to avoid aliasing? Question 2: How many samples do I need to take to avoid aliasing? Answer to both: Enough to reach the Nyquist limit.

The Nyquist-Shannon sampling theorem
A continuous signal can be perfectly reconstructed from its discrete version using linear interpolation, if sampling occurred with frequency:
This is called the Nyquist frequency
Equivalent reformulation: When downsampling, aliasing does not occur if samples are taken at the Nyquist frequency or higher.

Gaussian Image Pyramid

Idea for Today:
Form a Multi-Resolution Representation
original
sigma = 1
sigma = 3
sigma = 10

Variable frequency sensitivity
Experiment: Where do you see the stripes?
frequency
contrast

Variable frequency sensitivity
Campbell-Robson contrast sensitivity curve
Our eyes are sensitive to mid-range frequencies
• Early processing in humans filters for various orientations and scales of frequency
• Perceptual cues in the mid frequencies dominate perception
frequency
contrast

Pyramid Representations
Because a large amount of smoothing limits the frequency of features in the image, we do not need to keep all the pixels around!
Strategy: progressively reduce the number of pixels as we smooth more and more.
Leads to a “pyramid” representation if we subsample at each level.

Gaussian Pyramid
• Synthesis: Smooth image with a Gaussian and downsample. Repeat.
• Gaussian is used because it is self-reproducing (enables incremental smoothing).
• Computational cost typically dominated by convolution at the two lowest (highest resolution) levels.

Gaussian pyramid
How does the Nyquist-Shannon theorem relate to the Gaussian pyramid?

Gaussian pyramid
How does the Nyquist-Shannon theorem relate to the Gaussian pyramid?
• Gaussian blurring is low-pass filtering.
• By blurring we try to sufficiently decrease
the Nyquist frequency to avoid aliasing.

Reminder
• Cascaded Gaussians
– Repeated convolution by a smaller Gaussian to simulate effects of a larger one.
• G*(G*f) = (G*G)*f [associativity]

Gaussian Pyramid
Low res
High res

Emphasis: Smaller Images have Lower Resolution

Gaussia{n pyramid
F0 F1 F2
blur subsample blur subsample

F0
*
H F1
*
H

Low resolution
G =(G *gaussian)¯2
Gaussian Pyramid
4 3 blur
G =(G *gaussian)¯2
32
blur
G =(G *gaussian)¯2 21
blur
G =(G *gaussian)¯2 1 0
High resolution
G0 =Image blur

L =G -expand(G ) ii i+1
G =L +expand(G ) ii i+1
G0
G L=G nnn
G2 – = L2
G1 – = L1
-=
L0
The Laplacian Pyramid
Gaussian Pyramid Laplacian Pyramid

Basic reading:
• Szeliski textbook, Sections 3.4, 3.5
References
Additional reading:
• Burt and Adelson, “The Laplacian Pyramid as a Compact Image Code,” IEEE ToC 1983.
The original Laplacian pyramid paper.
• Hubel and Wiesel, “Receptive fields, binocular interaction and functional architecture in the cat’s
visual cortex,” The Journal of Physiology 1962
A foundational paper describing information processing in the visual system, including the
different types of filtering it performs;; Hubel and Wiesel won the Nobel Prize in Medicine in 1981 for the discoveries described in this paper.