Computer Vision
Image Processing I
1
Image processing
• Image processing = image in > image out
• Aims to suppress distortions and enhance relevant information
• Prepares images for further analysis and interpretation
• Image analysis = image in > features out
• Computer vision = image in > interpretation out
2
Types of image processing
• Two main types of image processing operations:
– Spatial domain operations (in image space)
– Frequency domain operations (mainly in Fourier space)
• Two main types of spatial domain operations:
– Point operations (intensity transformations on individual pixels)
– Neighbourhood operations (spatial filtering on groups of pixels)
3
Topics and learning goals
• Describe the workings of basic point operations
Contrast stretching, thresholding, inversion, log/power transformations
• Understand and use the intensity histogram
Histogram specification, equalization, matching
• Define arithmetic and logical operations
Summation, subtraction, AND, OR, et cetera
5
Spatial domain operations
• General form of spatial domain operations
$g(x, y) = T[f(x, y)]$
where
$f(x, y)$ is the input image
$g(x, y)$ is the processed image
$T[\cdot]$ is the operator applied at $(x, y)$
6
Spatial domain operations
• Point operations: $T$ operates on individual pixels
$T: \mathbb{R} \to \mathbb{R}$, $g(x, y) = T[f(x, y)]$
• Neighbourhood operations: $T$ operates on multiple pixels
$T: \mathbb{R}^2 \to \mathbb{R}$, $g(x, y) = T[f(x, y), f(x+1, y), f(x-1, y), \ldots]$
7
Point operations
8
Neighbourhood operations
9
Contrast stretching
[Figure: input image, output image, and transformation T with break points L and H; axes: input intensity level, output intensity level]
10
Contrast stretching
• Produces images of higher contrast
• Puts values below 𝐿 in the input to black in the output
• Puts values above 𝐻 in the input to white in the output
• Linearly scales values between 𝐿 and 𝐻 in the input to the maximum range in the output
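The mapping described above can be sketched in a few lines. This is only an illustration, assuming NumPy and an 8-bit output range; the function name `contrast_stretch` and its parameters `low`, `high`, and `max_value` are hypothetical names for L, H, and the output maximum.

```python
import numpy as np

def contrast_stretch(image, low, high, max_value=255):
    """Map values <= low to 0, values >= high to max_value, scale linearly in between."""
    img = image.astype(np.float64)
    scaled = (img - low) / (high - low) * max_value
    # Clipping sends everything below L to black and everything above H to white.
    return np.clip(scaled, 0, max_value).round().astype(np.uint8)

pixels = np.array([10, 50, 75, 150, 200], dtype=np.uint8)
stretched = contrast_stretch(pixels, low=50, high=150)
```

Values at or below 50 become 0, values at or above 150 become 255, and 75 (a quarter of the way from L to H) lands a quarter of the way up the output range.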
11
Intensity thresholding
[Figure: input image, output image, and transformation T with a single threshold; axes: input intensity level, output intensity level]
12
Intensity thresholding
• Limiting case of contrast stretching
• Produces binary images from gray-scale images
• Puts values below the threshold to black in the output
• Puts values equal/above the threshold to white in the output
• Popular method for image segmentation (discussed later)
• Useful only if object and background intensities are very different
13
Automatic intensity thresholding
• Otsu’s method for computing the threshold
Exhaustively searches for the threshold minimising the intra-class variance
$\sigma_W^2 = p_0 \sigma_0^2 + p_1 \sigma_1^2$
Equivalent to maximising the inter-class variance (much faster to compute)
$\sigma_B^2 = p_0 p_1 (\mu_0 - \mu_1)^2$
Here, $p_0$ is the fraction of pixels below the threshold (class 0), $p_1$ is the fraction of pixels equal to or above the threshold (class 1), $\mu_0$ and $\mu_1$ are the mean intensities of pixels in class 0 and class 1, $\sigma_0^2$ and $\sigma_1^2$ are the intensity variances, and $p_0 + p_1 = 1$ and $\sigma_W^2 + \sigma_B^2 = \sigma^2$, where $\sigma^2$ is the total intensity variance
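A minimal sketch of the exhaustive search, assuming NumPy and an integer-valued image. It maximises the inter-class variance $p_0 p_1 (\mu_0 - \mu_1)^2$ over all candidate thresholds; the function name `otsu_threshold` and the `levels` parameter are illustrative, not taken from the slides.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold t that maximises p0 * p1 * (mu0 - mu1)**2.

    Class 0 holds pixels < t, class 1 holds pixels >= t.
    """
    hist = np.bincount(image.ravel(), minlength=levels).astype(np.float64)
    prob = hist / hist.sum()
    intensities = np.arange(levels)
    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        p0 = prob[:t].sum()
        p1 = 1.0 - p0
        if p0 == 0.0 or p1 == 0.0:
            continue  # one class is empty, so there is nothing to separate
        mu0 = (intensities[:t] * prob[:t]).sum() / p0
        mu1 = (intensities[t:] * prob[t:]).sum() / p1
        between = p0 * p1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

image = np.array([[10, 12, 11, 200], [13, 205, 210, 198]], dtype=np.uint8)
t = otsu_threshold(image)
binary = image >= t  # True marks class 1 (white)
```

Any threshold between the dark and bright groups gives the same split here, so the search returns the first such value.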
14
Otsu thresholding
15
Automatic intensity thresholding
• Iso-data method for computing the threshold
1. Select an arbitrary initial threshold 𝑡
2. Compute 𝜇0 and 𝜇1 with respect to the threshold
3. Update the threshold to the mean of the means: $t = (\mu_0 + \mu_1)/2$
4. If the threshold changed in Step 3, go to Step 2
Upon convergence, the threshold is midway between the two class means
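The four steps above can be sketched as follows, assuming NumPy. The stopping tolerance `tol`, the iteration cap `max_iter`, and the default starting point (the image mean) are assumptions added to make the loop terminate cleanly; the slides only ask for an arbitrary initial threshold.

```python
import numpy as np

def isodata_threshold(image, initial=None, tol=0.5, max_iter=100):
    """Iterate t = (mu0 + mu1) / 2 until the threshold stops changing."""
    pixels = image.astype(np.float64).ravel()
    t = pixels.mean() if initial is None else float(initial)
    for _ in range(max_iter):
        below, above = pixels[pixels < t], pixels[pixels >= t]
        if below.size == 0 or above.size == 0:
            break  # degenerate split: one class is empty
        new_t = (below.mean() + above.mean()) / 2.0
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t

image = np.array([[10, 12, 11, 200], [13, 205, 210, 198]], dtype=np.uint8)
t = isodata_threshold(image, initial=50)
```

For this image the class means are 11.5 and 203.25, so the threshold settles midway between them.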
16
Iso-data thresholding
17
Multi-level thresholding
[Figure: input image, output image, and multi-level transformation T; axes: input intensity I, output intensity O]
18
Intensity inversion
[Figure: input image, output image, and inverting transformation T; axes: input intensity I, output intensity O]
19
Intensity inversion
Useful for enhancing gray/white details in images within dominant black areas
20
Log transformation
• Definition of log transformation
$s = c \log(1 + r)$
where 𝑟 is the input intensity, 𝑠 is the output intensity, and 𝑐 is a constant
– Maps narrow range of low gray-level values into wider range of output values, and opposite for higher gray-level values
– Also compresses dynamic range of images with large variations in pixel values (such as Fourier spectra, discussed later)
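A short sketch of the log transformation, assuming NumPy and 8-bit images. The choice of $c$ below (so that the maximum input maps to the maximum output) is an assumption for illustration; the slides leave $c$ as a free constant.

```python
import numpy as np

def log_transform(image, max_value=255):
    """Apply s = c * log(1 + r), with c chosen so that max_value maps to itself."""
    img = image.astype(np.float64)
    c = max_value / np.log1p(max_value)  # assumption: normalise to the full output range
    return (c * np.log1p(img)).round().astype(np.uint8)

pixels = np.array([0, 10, 255], dtype=np.uint8)
out = log_transform(pixels)
```

The low input value 10 is pushed up to 110, which shows how a narrow range of dark values is spread over a wider output range.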
21
Log transformation
22
Power transformation
• Definition of power transformation
$s = c\, r^{\gamma}$
where $c$ and $\gamma$ are constants
– Similar to log transformation
– Represents a whole family of transformations by varying 𝛾
– Many devices respond according to a power law (gamma correction)
– Useful for general-purpose contrast manipulation
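A hedged sketch of the power transformation, assuming NumPy and 8-bit images. Intensities are normalised to [0, 1] before applying the power so that the end points stay fixed when $c = 1$; that normalisation and the name `power_transform` are assumptions, not part of the slide definition.

```python
import numpy as np

def power_transform(image, gamma, c=1.0, max_value=255):
    """Apply s = c * r**gamma on intensities normalised to [0, 1]."""
    r = image.astype(np.float64) / max_value
    s = c * r ** gamma
    return np.clip(s * max_value, 0, max_value).round().astype(np.uint8)

pixels = np.array([0, 64, 255], dtype=np.uint8)
brighter = power_transform(pixels, gamma=0.5)  # gamma < 1 lifts dark values
darker = power_transform(pixels, gamma=2.0)    # gamma > 1 suppresses dark values
```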
23
Power transformation
24
Power transformation
[Figure: input image and power-transformed outputs for $c = 1$ with $\gamma = 3$, $\gamma = 4$, and $\gamma = 5$]
25
Piecewise linear transformations
• Complementary to other transformation methods
• Enable more fine-tuned design of transformations
• Can have very complex shapes
• Require more user input
26
Piecewise contrast stretching
• One of the simplest piecewise linear transformations
• Increases the dynamic range of gray levels in images
• Used in display devices or recording media to span full range
27
Piecewise contrast stretching
[Figure: transform, input image, transformed image, and binary thresholding result]
28
Gray-level slicing
• Used to highlight specific range of gray levels
• Two different slicing approaches:
1) High value for all gray levels in a range of interest and low value for all others (produces a binary image)
2) Brighten a desired range of gray levels while preserving background and other gray-scale tones of the image
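Both slicing approaches can be sketched with a boolean range test, assuming NumPy. The function names and the `lo`, `hi`, `high_value`, and `low_value` parameters are hypothetical.

```python
import numpy as np

def slice_binary(image, lo, hi, high_value=255, low_value=0):
    """Approach 1: high value inside [lo, hi], low value elsewhere."""
    in_range = (image >= lo) & (image <= hi)
    return np.where(in_range, high_value, low_value).astype(np.uint8)

def slice_preserve(image, lo, hi, high_value=255):
    """Approach 2: brighten [lo, hi], keep all other gray levels unchanged."""
    in_range = (image >= lo) & (image <= hi)
    return np.where(in_range, high_value, image).astype(np.uint8)

pixels = np.array([20, 100, 150, 230], dtype=np.uint8)
binary = slice_binary(pixels, lo=90, hi=160)
preserved = slice_preserve(pixels, lo=90, hi=160)
```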
29
Gray-level slicing
[Figure: Transform 1, Transform 2, input image, and result of Transform 1]
30
Bit-plane slicing
• Highlights contribution to total image by specific bits
• An image with n-bits/pixel has n bit-planes
• Slicing can be useful for image compression
31
Bit-planes of an 8-bit image
Example: 151 = 1 0 0 1 0 1 1 1 in binary, ordered from bit-plane 7 (most significant) to bit-plane 0 (least significant)
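Extracting a bit-plane amounts to a shift and a mask. The sketch below, assuming NumPy, reproduces the decomposition of the value 151; the helper name `bit_plane` is illustrative.

```python
import numpy as np

def bit_plane(image, k):
    """Return bit-plane k (0 = least significant) as an array of 0s and 1s."""
    return (image >> k) & 1

pixel = np.array([151], dtype=np.uint8)
# List the bits from plane 7 down to plane 0.
planes = [int(bit_plane(pixel, k)[0]) for k in range(7, -1, -1)]
```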
32
Bit-planes of an 8-bit image
Input Bit-planes
33
Histogram of pixel intensities
• For every possible intensity level, count the number of pixels having that level, and plot the pixel counts as a function of the level
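$h(r)$ is the number of pixels with intensity level $r$
For an 8-bit image, $L = 2^8 = 256$; with $N$ = #pixels,
$\sum_{r=0}^{L-1} h(r) = N$
Normalized histogram = probability function
$p(r) = \frac{1}{N} h(r)$
[Figure: histogram of an 8-bit image; axes: intensity level, count]
34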
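A minimal sketch of the histogram and its normalised form, assuming NumPy and an integer-valued image. The function name `histogram` and the `levels` parameter are illustrative.

```python
import numpy as np

def histogram(image, levels=256):
    """Return the pixel counts h(r) and the normalised histogram p(r) = h(r) / N."""
    h = np.bincount(image.ravel(), minlength=levels)
    p = h / image.size
    return h, p

image = np.array([[0, 0, 1], [2, 2, 2]], dtype=np.uint8)
h, p = histogram(image)
```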
Histogram processing
• Histogram equalization
Aim: To get an image with equally distributed intensity levels over the full intensity range
• Histogram specification (also called histogram matching)
Aim: To get an image with a specified intensity distribution, determined by the shape of the histogram
35
Histogram processing
36
Histogram equalization
Enhances contrast for intensity values near histogram maxima and decreases contrast near histogram minima
Histogram bins are much more “equal” here
37
Histogram equalization
• Let $r \in [0, L-1]$ represent gray levels of the image
$r = 0$ represents black and $r = L-1$ represents white
• Consider transformations $s = T(r)$, $0 \le r \le L-1$, satisfying
1) $T(r)$ is single-valued and monotonically increasing in $0 \le r \le L-1$
This guarantees that the inverse transformation exists
2) $0 \le T(r) \le L-1$ for $0 \le r \le L-1$
This guarantees that the input and output ranges will be the same
38
Histogram equalization (continuous case)
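Consider $r$ and $s$ as continuous random variables over $[0, L-1]$ with PDFs $p_r(r)$ and $p_s(s)$
If $p_r(r)$ and $T(r)$ are known and $T^{-1}(s)$ satisfies monotonicity, then, from probability theory
$p_s(s) = p_r(r) \left| \frac{dr}{ds} \right|$
Let us choose:
$s = T(r) = (L-1) \int_0^r p_r(\xi)\, d\xi$
This is the CDF of $r$ scaled by $(L-1)$, which satisfies conditions (1) and (2)
Now:
$\frac{ds}{dr} = \frac{dT(r)}{dr} = (L-1) \frac{d}{dr} \left[ \int_0^r p_r(\xi)\, d\xi \right] = (L-1)\, p_r(r)$
Therefore:
$p_s(s) = p_r(r) \left| \frac{dr}{ds} \right| = p_r(r) \left| \frac{1}{(L-1)\, p_r(r)} \right| = \frac{1}{L-1}$ for $0 \le s \le L-1$
This is a uniform distribution!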
39
Histogram equalization (discrete case)
For discrete values we get probabilities and summations instead of PDFs and integrals:
$p_r(r_k) = \frac{n_k}{MN}$ for $k = 0, 1, \ldots, L-1$
where $MN$ is total number of pixels in image, $n_k$ is the number of pixels with gray level $r_k$ and $L$ is the total number of gray levels in the image
Thus
$s_k = T(r_k) = (L-1) \sum_{j=0}^{k} p_r(r_j) = \frac{L-1}{MN} \sum_{j=0}^{k} n_j$ for $k = 0, 1, \ldots, L-1$
This transformation is called histogram equalization
However, in practice, getting a perfectly uniform distribution for discrete images is rare
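The discrete transformation is a lookup table built from the cumulative histogram. The sketch below assumes NumPy and an 8-bit image; rounding $s_k$ to the nearest integer is an assumption, since the slides do not say how to quantise the result.

```python
import numpy as np

def equalize(image, levels=256):
    """Histogram equalization: s_k = (L - 1) * sum_{j<=k} n_j / MN."""
    n_k = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(n_k) / image.size
    lookup = np.round((levels - 1) * cdf).astype(np.uint8)  # one output level per input level
    return lookup[image]

image = np.array([[52, 52, 53, 54, 55]], dtype=np.uint8)
equalized = equalize(image)
```

The five input pixels occupy a narrow band (52 to 55), and the output spreads them over the range up to 255. The output histogram is still not perfectly flat, which matches the remark above.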
40
Histogram matching (continuous case)
Assume that $r$ and $s$ are continuous intensities and $p_z(z)$ is the target distribution for the output image
From our previous analysis we know that the following transformation results in a uniform distribution:
$s = T(r) = (L-1) \int_0^r p_r(\xi)\, d\xi$
Now we can define a function $G(z)$ as:
$G(z) = (L-1) \int_0^z p_z(\xi)\, d\xi = s$
Therefore:
$z = G^{-1}(s) = G^{-1}(T(r))$
41
Histogram matching (discrete case)
For discrete image values we can write:
$s_k = T(r_k) = (L-1) \sum_{j=0}^{k} p_r(r_j) = \frac{L-1}{MN} \sum_{j=0}^{k} n_j$, $k = 0, 1, \ldots, L-1$
$G(z_q) = (L-1) \sum_{i=0}^{q} p_z(z_i)$
and therefore:
$z_q = G^{-1}(s_k)$
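A hedged sketch of discrete histogram matching, assuming NumPy. Because $G$ is a step function, $G^{-1}$ has to be discretised somehow; here $z_q$ is taken as the smallest level with $G(z_q) \ge s_k$, which is one possible choice and an assumption of this example (picking the closest $G(z_q)$ is another). The name `match_histogram` and the `target_pdf` argument are illustrative.

```python
import numpy as np

def match_histogram(image, target_pdf, levels=256):
    """Map image intensities so that their distribution approximates target_pdf."""
    p_r = np.bincount(image.ravel(), minlength=levels) / image.size
    T = (levels - 1) * np.cumsum(p_r)         # s_k = T(r_k)
    G = (levels - 1) * np.cumsum(target_pdf)  # G(z_q)
    # z_q = G^{-1}(s_k): smallest q with G(z_q) >= s_k (assumed discretisation)
    lookup = np.searchsorted(G, T, side="left")
    lookup = np.clip(lookup, 0, levels - 1).astype(np.uint8)
    return lookup[image]

image = np.array([0, 0, 1, 3], dtype=np.uint8)
uniform = np.full(4, 0.25)  # target: uniform distribution over 4 levels
matched = match_histogram(image, uniform, levels=4)
```

Matching an image to its own histogram leaves it unchanged, which is a convenient sanity check.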
42
Arithmetic and logical operations
• Defined on a pixel-by-pixel basis between two images
[Figure: image (operator) image = result, with operators +, −, ∗, /, ^, AND, OR, XOR, …]
43
Arithmetic and logical operations
• Useful arithmetic operations include addition and subtraction
44
Arithmetic and logical operations
• Useful logical operations include AND and OR
[Figure: input image, mask, and input AND mask]
45
Arithmetic/Logic Operations
• Performed on a pixel-by-pixel basis between two or more images
• AND and OR operations are used for masking, i.e. selecting subimages as regions of interest (RoI)
• Subtraction and addition are the most useful arithmetic operations
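Masking and pixel-wise arithmetic can be sketched as follows, assuming NumPy and 8-bit images. The example arrays are made up, and clipping the result to [0, 255] is one way (an assumption here) to handle sums and differences that leave the 8-bit range.

```python
import numpy as np

image = np.array([[10, 200], [90, 255]], dtype=np.uint8)
mask = np.array([[255, 0], [255, 0]], dtype=np.uint8)  # 255 = keep, 0 = discard
other = np.array([[20, 100], [100, 10]], dtype=np.uint8)

# Bitwise AND with an all-ones / all-zeros mask selects the region of interest.
masked = np.bitwise_and(image, mask)

# Widen to int16 first so that uint8 arithmetic does not wrap around.
added = np.clip(image.astype(np.int16) + other, 0, 255).astype(np.uint8)
difference = np.clip(image.astype(np.int16) - other, 0, 255).astype(np.uint8)
```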
46
Chapter 3
Image Enhancement in the Spatial Domain
47
Image Averaging
• Noisy image g(x, y) formed by adding noise n(x, y) to uncorrupted image f(x, y):
g(x, y) = f(x, y) + n(x, y)
• Assume that at each (x, y), the noise is uncorrelated and has zero average value.
• Aim: To obtain smoothed result by averaging a set of noisy images $g_i(x, y)$, $i = 1, 2, \ldots, K$
$\bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y)$
• As K increases, the variability of the pixel values decreases
• Assumes that images are spatially registered
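A small simulation illustrates the effect of averaging, assuming NumPy. The constant test image, the Gaussian noise with standard deviation 10, and K = 64 are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.full((32, 32), 100.0)  # uncorrupted image f(x, y)
K = 64
# K registered noisy copies g_i = f + n_i with zero-mean, uncorrelated noise
noisy = clean + rng.normal(0.0, 10.0, size=(K, 32, 32))
averaged = noisy.mean(axis=0)     # (1 / K) * sum_i g_i

single_error = (noisy[0] - clean).std()
averaged_error = (averaged - clean).std()
```

For uncorrelated zero-mean noise the standard deviation of the average falls roughly as $1/\sqrt{K}$, so `averaged_error` should be about one eighth of `single_error` here.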
48
Spatial Filtering
• These methods use a small neighbourhood of a pixel in the input image to produce a new brightness value for that pixel
• Also called filtering techniques
• Neighbourhood of $(x, y)$ is usually a square or rectangular subimage centred at $(x, y)$, called filter / mask / kernel / template / window
• A linear transformation calculates a value in the output image $g(x, y)$ as a linear combination of brightnesses in a local neighbourhood of the pixel in the input image $f(x, y)$, weighted by coefficients $h$:
$g(x, y) = \sum_{i=-a}^{a} \sum_{j=-b}^{b} h(i, j)\, f(x - i, y - j)$
• This is called a discrete convolution with a convolution mask h
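The double sum can be written directly as shifted additions. This sketch assumes NumPy, an odd-sized mask stored with $h(0, 0)$ at its centre, and zero padding at the border (the border rule is an assumption; the formula itself does not specify one).

```python
import numpy as np

def convolve2d(f, h):
    """g(x, y) = sum_i sum_j h(i, j) * f(x - i, y - j), with zero padding."""
    a, b = h.shape[0] // 2, h.shape[1] // 2
    padded = np.pad(f.astype(np.float64), ((a, a), (b, b)))
    g = np.zeros(f.shape, dtype=np.float64)
    for i in range(-a, a + 1):
        for j in range(-b, b + 1):
            # padded[x + a - i, y + b - j] holds f(x - i, y - j)
            shifted = padded[a - i : a - i + f.shape[0], b - j : b - j + f.shape[1]]
            g += h[i + a, j + b] * shifted
    return g

impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
mask = np.arange(9, dtype=np.float64).reshape(3, 3)
response = convolve2d(impulse, mask)
```

Convolving a unit impulse reproduces the mask around the impulse position, which distinguishes convolution from correlation (the latter would give the mask rotated by 180 degrees).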
50
Spatial Filtering
Convolution
51
Smoothing Spatial Filters
Used for blurring, noise reduction
Neighbourhood Averaging (Mean Filter)
$g(x, y) = \frac{1}{P} \sum_{(n, m) \in S} f(n, m)$
where $S$ is the neighbourhood of $(x, y)$ and $P$ is the number of pixels in $S$
• Replace intensity at pixel (x, y) with the average of the intensities in a neighbourhood of (x, y).
• We can also use a weighted average, giving more importance to some pixels over others in the neighbourhood, which reduces blurring
• Neighbourhood averaging blurs edges
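A minimal mean filter, assuming NumPy and a square neighbourhood of odd size. Repeating edge pixels at the border is an assumption made so the output keeps the input size.

```python
import numpy as np

def mean_filter(f, size=3):
    """Replace each pixel by the average over a size x size neighbourhood."""
    r = size // 2
    padded = np.pad(f.astype(np.float64), r, mode="edge")
    g = np.zeros(f.shape, dtype=np.float64)
    for dx in range(size):
        for dy in range(size):
            g += padded[dx : dx + f.shape[0], dy : dy + f.shape[1]]
    return g / (size * size)  # P = size * size pixels in the neighbourhood

spike = np.zeros((5, 5))
spike[2, 2] = 9.0
smoothed = mean_filter(spike)
```

The single spike of height 9 is spread into a 3 x 3 patch of height 1, which is the blurring effect noted above.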
52
Another example
Consider an image of constant intensity, with widely isolated pixels with different intensity from the background. We wish to detect these pixels.
Use the following mask:
-1 -1 -1
-1  8 -1
-1 -1 -1
Smoothing Spatial Filters
• Aim: To suppress noise and other small fluctuations in the image, which may result from sampling, quantization, transmission, or environmental disturbances during acquisition
• Uses redundancy in the image data
• May blur sharp edges, so care is needed
55
Gaussian Filter
$g(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$
• Replace intensity at pixel (x, y) with the weighted average of the intensities in a neighbourhood of (x, y).
• It is a set of weights that approximate the profile of a Gaussian function.
• It is very effective in reducing noise and also reducing details (image blurring)
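The set of weights can be sampled from the Gaussian formula above. This sketch assumes NumPy; truncating the kernel at three standard deviations and renormalising the weights to sum to 1 are common practical choices, stated here as assumptions.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sample g(x, y, sigma) on an integer grid and normalise the weights."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))  # assumption: truncate at 3 sigma
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax, indexing="ij")
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return g / g.sum()  # renormalise so a constant image stays unchanged

kernel = gaussian_kernel(1.0)
```

The resulting mask can be used as the coefficient array of a linear filter; the centre weight is the largest and the weights fall off symmetrically.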
56
Gaussian Filter
57
Non-linear Spatial Filters
Also called order-statistics filters: the response is based on ordering the pixels in the neighbourhood and replacing the centre pixel with the ranking result.
Median Filter
• Intensity of each pixel is replaced by the median of the intensities in the neighbourhood of that pixel
• Median M of a set of values is the middle value such that half the values in the set are less than M and the other half greater than M
• Median filtering forces points with distinct intensities to be more like their neighbours, thus eliminating isolated intensity spikes
• Also, isolated pixel clusters (light or dark) whose area is <= n^2/2 are eliminated by an n x n median filter
• Good for impulse noise (salt-and-pepper noise)
• Other examples of order-statistics filters are max and min filters
58
Median Filter
Neighbourhood values: 69 37 19 51 43 44 50 58 68
Sorted values: 19 37 43 44 50 51 58 68 69
Median (middle of the nine sorted values): 50
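The worked example can be checked in code, and the same idea extends to a full median filter. The sketch assumes NumPy (version 1.20 or later for `sliding_window_view`); arranging the nine values row by row in a 3 x 3 window and repeating edge pixels at the border are assumptions.

```python
import numpy as np

window = np.array([[69, 37, 19], [51, 43, 44], [50, 58, 68]])
ordered = np.sort(window.ravel())
median = int(ordered[ordered.size // 2])  # middle of the nine sorted values

def median_filter(f, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    r = size // 2
    padded = np.pad(f, r, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1)).astype(f.dtype)

noisy = np.full((5, 5), 10, dtype=np.uint8)
noisy[2, 2] = 255  # isolated "salt" pixel
cleaned = median_filter(noisy)
```

The isolated bright pixel disappears completely, whereas a mean filter would only spread it out.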
59
Pooling
Max / average/ median pooling
− Provides translation invariance
− Reduces computations
− Popular in deep convolutional neural networks (deep learning)
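A short sketch of max pooling over non-overlapping blocks, assuming NumPy. The block size of 2 and the decision to drop rows and columns that do not fill a whole block are assumptions of this example.

```python
import numpy as np

def max_pool(f, size=2):
    """Take the maximum over non-overlapping size x size blocks."""
    h = f.shape[0] // size * size
    w = f.shape[1] // size * size
    blocks = f[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(16).reshape(4, 4)
pooled = max_pool(image)
```

Replacing `max` with `mean` gives average pooling on the same block layout.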
61
Sharpening Spatial Filters - Edge Detection
• Goal is to highlight fine details, or enhance details that have been blurred
• Spatial differentiation is the tool: the strength of the response of a derivative operator is proportional to the degree of discontinuity of the image at the point where the operator is applied
• Image differentiation enhances edges, and de-emphasizes slowly varying gray-level values.
62
Derivative definitions
• For 1-D function f(x), the first order derivative is approximated as:
$\frac{df}{dx} = f(x+1) - f(x)$
• The second-order derivative is approximated as:
$\frac{d^2 f}{dx^2} = f(x+1) + f(x-1) - 2f(x)$
• These are partial derivatives, so that extension to 2D is easy.
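Both approximations can be evaluated on a 1-D intensity profile, assuming NumPy. The ramp-shaped profile below is a made-up example.

```python
import numpy as np

f = np.array([5, 5, 4, 3, 2, 1, 1], dtype=np.float64)  # flat, ramp down, flat

first = f[1:] - f[:-1]                  # f(x+1) - f(x) for x = 0 .. n-2
second = f[2:] + f[:-2] - 2 * f[1:-1]   # f(x+1) + f(x-1) - 2 f(x) for x = 1 .. n-2
```

The first difference is constant along the ramp, while the second difference is non-zero only where the ramp starts and ends.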
63
Basic idea - Derivatives
• Horizontal scan of the image
• Edge modelled as a ramp, to represent blurring due to sampling
• First derivative is
– Non-zero along ramp
– zero in regions of constant intensity
– constant during an intensity transition
• Second derivative is
– Nonzero at onset and end of ramp
– Stronger response at isolated noise point
– zero everywhere except at onset and termination of intensity transition
• Thus, magnitude of first derivative can be used to detect the presence of an edge, and sign of second derivative to determine whether a pixel lies on dark or light side of an edge.
65
Summary - Derivatives
• First-order derivatives produce thicker edges, have stronger response to gray-level step
• Second-order derivatives produce stronger response to fine detail (thin lines, isolated points), produce double response at step changes in gray level
66
Gradient Operator
First-order derivatives implemented using magnitude of the gradient
For function f(x, y), the gradient of f at (x, y) is G with x and y components $G_x$, $G_y$
The magnitude of the gradient vector is
$G[f(x, y)] = \sqrt{G_x^2 + G_y^2}$
This is commonly approximated by:
$G[f(x, y)] = |G_x| + |G_y|$
$G_x$ and $G_y$ are linear and may be obtained by using masks
We use numerical techniques to compute these, which give rise to different masks, e.g. Roberts' 2x2 cross-gradient operators, Sobel's 3x3 masks
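A sketch of the gradient magnitude using the standard Sobel masks, assuming NumPy (1.20 or later). Here x is taken to run along rows and y along columns, and edge pixels are repeated at the border; both conventions, and the helper names, are assumptions of this example.

```python
import numpy as np

SOBEL_X = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)  # change along rows
SOBEL_Y = SOBEL_X.T                                                          # change along columns

def apply_mask(f, mask):
    """Weighted sum of each 3 x 3 neighbourhood with the mask (correlation)."""
    padded = np.pad(f.astype(np.float64), 1, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    return (windows * mask).sum(axis=(-2, -1))

def gradient_magnitude(f, approximate=False):
    gx, gy = apply_mask(f, SOBEL_X), apply_mask(f, SOBEL_Y)
    if approximate:
        return np.abs(gx) + np.abs(gy)  # |Gx| + |Gy|
    return np.sqrt(gx**2 + gy**2)

step = np.zeros((5, 6))
step[:, 3:] = 10.0  # vertical step edge between columns 2 and 3
magnitude = gradient_magnitude(step)
```

The response is non-zero only in the two columns next to the step. For an edge aligned with one axis, the exact and approximate magnitudes coincide.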
67
The Laplacian
Second order derivatives based on the Laplacian.
For a function $f(x, y)$, the Laplacian is defined by
$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$
This is a linear operator, as all derivative operators are.
In discrete form:
$\frac{\partial^2 f}{\partial x^2} = f(x+1, y) + f(x-1, y) - 2f(x, y)$
and similarly in y direction. Summing them gives us
$\nabla^2 f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y)$
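The discrete Laplacian is a sum of four shifted copies minus four times the image. The sketch assumes NumPy and repeats edge pixels at the border (an assumption).

```python
import numpy as np

def laplacian(f):
    """f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)."""
    p = np.pad(f.astype(np.float64), 1, mode="edge")
    return (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2]
            - 4.0 * p[1:-1, 1:-1])

spike = np.zeros((5, 5))
spike[2, 2] = 1.0
response = laplacian(spike)
```

An isolated point gives a strong response (here -4 at the point and +1 at its four neighbours), while a constant region gives zero.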
69
Laplacian (continued)
• There are other forms of the Laplacian, which can include diagonal directions, for example
• Laplacian highlights grey-level discontinuities and produces dark featureless backgrounds
• The background can be recovered by adding the Laplacian image to, or subtracting it from, the original image
71
Padding
• When we use spatial filters for pixels on the boundary of an image, we do not have enough neighbours
• To get an image with the same size as the input image, pad the image using one of:
o Zero: set all pixels outside the source image to 0
o Constant: set all pixels outside the source image to a specified border value
o Clamp: repeat edge pixels indefinitely
o Wrap: copy pixels from opposite side of the image
o Mirror: reflect pixels across the image edge
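The padding modes above map closely onto `numpy.pad`, shown here on a 1-D row for readability. The pairing of "Mirror" with NumPy's `symmetric` mode is an assumption: `symmetric` repeats the edge pixel in the reflection, whereas NumPy's `reflect` mode does not.

```python
import numpy as np

row = np.array([1, 2, 3, 4])

zero = np.pad(row, 2, mode="constant", constant_values=0)
constant = np.pad(row, 2, mode="constant", constant_values=9)
clamp = np.pad(row, 2, mode="edge")
wrap = np.pad(row, 2, mode="wrap")
mirror = np.pad(row, 2, mode="symmetric")
```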
76
Padding Example
Szeliski, “Computer Vision”, Chapter 3
77
References and acknowledgements
• Chapter 3 of Gonzalez and Woods 2002
• Sections 3.1-3.3 of Szeliski
• Some images drawn from above resources
78