
Computer Vision
Image Processing I

Image processing
• Image processing = image in > image out
• Aims to suppress distortions and enhance relevant information
• Prepares images for further analysis and interpretation
• Image analysis = image in > features out
• Computer vision = image in > interpretation out

Types of image processing
• Two main types of image processing operations:
– Spatial domain operations (in image space)
– Frequency domain operations (mainly in Fourier space)
• Two main types of spatial domain operations:
– Point operations (intensity transformations on individual pixels)
– Neighbourhood operations (spatial filtering on groups of pixels)


Topics and learning goals
• Describe the workings of basic point operations
Contrast stretching, thresholding, inversion, log/power transformations
• Understand and use the intensity histogram
Histogram specification, equalization, matching
• Define arithmetic and logical operations
Summation, subtraction, AND, OR, et cetera

Spatial domain operations
• General form of spatial domain operations:
$$g(x, y) = T[f(x, y)]$$
where
$f(x, y)$ is the input image
$g(x, y)$ is the processed image
$T[\cdot]$ is the operator applied at $(x, y)$

Spatial domain operations
• Point operations: $T$ operates on individual pixels
$$T: \mathbb{R} \to \mathbb{R}, \quad g(x, y) = T[f(x, y)]$$
• Neighbourhood operations: $T$ operates on multiple pixels
$$T: \mathbb{R}^2 \to \mathbb{R}, \quad g(x, y) = T[f(x, y), f(x+1, y), f(x-1, y), \ldots]$$

Point operations

Neighbourhood operations

Contrast stretching
[Figure: transformation $T$ mapping input intensity level to output intensity level, with cutoffs $L$ and $H$]

Contrast stretching
• Produces images of higher contrast
• Maps values below $L$ in the input to black in the output
• Maps values above $H$ in the input to white in the output
• Linearly scales values between $L$ and $H$ in the input to the full output range

Intensity thresholding
[Figure: thresholding transformation $T$ mapping input intensity level to output intensity level, with a single threshold]

Intensity thresholding
• Limiting case of contrast stretching
• Produces binary images of gray-scale images
• Puts values below the threshold to black in the output
• Puts values equal/above the threshold to white in the output
• Popular method for image segmentation (discussed later)
• Useful only if object and background intensities are very different

Automatic intensity thresholding
• Otsu’s method for computing the threshold
Exhaustively searches for the threshold minimising the intra-class (within-class) variance
$$\sigma_W^2 = p_0 \sigma_0^2 + p_1 \sigma_1^2$$
Equivalent to maximising the inter-class (between-class) variance (much faster to compute)
$$\sigma_B^2 = p_0 p_1 (\mu_0 - \mu_1)^2$$
Here, $p_0$ is the fraction of pixels below the threshold (class 0), $p_1$ is the fraction of pixels equal to or above the threshold (class 1), $\mu_0$ and $\mu_1$ are the mean intensities of pixels in class 0 and class 1, $\sigma_0^2$ and $\sigma_1^2$ are the intensity variances, and $p_0 + p_1 = 1$ and $\sigma_W^2 + \sigma_B^2 = \sigma^2$
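Below is a minimal sketch of Otsu's exhaustive search, assuming an 8-bit grayscale image held in a NumPy array (the function name and NumPy usage are illustrative, not from the slides):

```python
import numpy as np

def otsu_threshold(image):
    """Exhaustively search for the threshold t maximising the
    inter-class variance sigma_B^2 = p0 * p1 * (mu0 - mu1)^2."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    p = hist / hist.sum()                        # normalised histogram
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        p0, p1 = p[:t].sum(), p[t:].sum()        # class fractions
        if p0 == 0 or p1 == 0:
            continue
        mu0 = (levels[:t] * p[:t]).sum() / p0    # class means
        mu1 = (levels[t:] * p[t:]).sum() / p1
        var_b = p0 * p1 * (mu0 - mu1) ** 2       # inter-class variance
        if var_b > best_var:
            best_var, best_t = var_b, t
    return best_t
```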

Otsu thresholding

Automatic intensity thresholding
• Iso-data method for computing the threshold
1. Select an arbitrary initial threshold 𝑡
2. Compute 𝜇0 and 𝜇1 with respect to the threshold
3. Update the threshold to the mean of the means: $t = (\mu_0 + \mu_1)/2$
4. If the threshold changed in Step 3, go to Step 2
Upon convergence, the threshold is midway between the two class means
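A minimal sketch of the iso-data iteration, assuming a NumPy image; the initial threshold and the convergence tolerance are illustrative choices:

```python
import numpy as np

def isodata_threshold(image, t=128.0, tol=0.5):
    """Iterate t = (mu0 + mu1) / 2 until the threshold stops changing."""
    img = image.astype(float)
    while True:
        below, above = img[img < t], img[img >= t]   # class 0, class 1
        if below.size == 0 or above.size == 0:
            return t
        new_t = (below.mean() + above.mean()) / 2    # mean of the means
        if abs(new_t - t) < tol:                     # converged
            return new_t
        t = new_t
```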

Iso-data thresholding

Multi-level thresholding
[Figure: multi-level thresholding transformation $T$ mapping input intensity to several output levels]

Intensity inversion
[Figure: intensity inversion transformation $T$; output intensity decreases linearly with input intensity]

Intensity inversion
Useful for enhancing gray/white details in images within dominant black areas
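For an image with $L$ gray levels, the inversion can be written as the point operation $s = (L - 1) - r$.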

Log transformation
• Definition of log transformation:
$$s = c \log(1 + r)$$
where 𝑟 is the input intensity, 𝑠 is the output intensity, and 𝑐 is a constant
– Maps narrow range of low gray-level values into wider range of output values, and opposite for higher gray-level values
– Also compresses dynamic range of images with large variations in pixel values (such as Fourier spectra, discussed later)
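A small sketch of the log transformation for 8-bit images; here $c$ is chosen as $255 / \log(256)$ so that the full input range maps onto the full output range (this choice of $c$ is an assumption, not from the slides):

```python
import numpy as np

def log_transform(image):
    # s = c * log(1 + r), with c scaled for 8-bit input and output
    c = 255.0 / np.log(256.0)
    s = c * np.log1p(image.astype(float))
    return np.round(s).astype(np.uint8)
```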

Log transformation

Power transformation
• Definition of power transformation:
$$s = c\, r^{\gamma}$$
where $c$ and $\gamma$ are constants
– Similar to log transformation
– Represents a whole family of transformations by varying 𝛾
– Many devices respond according to a power law (gamma correction)
– Useful for general-purpose contrast manipulation
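A matching sketch of the power (gamma) transformation, assuming 8-bit input normalised to $[0, 1]$ before applying $s = c\, r^{\gamma}$:

```python
import numpy as np

def power_transform(image, gamma, c=1.0):
    # s = c * r^gamma (gamma correction), with r normalised to [0, 1]
    r = image.astype(float) / 255.0
    s = np.clip(c * r ** gamma, 0.0, 1.0)
    return np.round(s * 255.0).astype(np.uint8)
```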

Power transformation

Power transformation
[Figure: input image and power transformations with $c = 1$ and $\gamma = 3, 4, 5$]

Piecewise linear transformations
• Complementary to other transformation methods
• Enable more fine-tuned design of transformations
• Can have very complex shapes
• Require more user input

Piecewise contrast stretching
• One of the simplest piecewise linear transformations
• Increases the dynamic range of gray levels in images
• Used in display devices or recording media to span full range

Piecewise contrast stretching
[Figure: input image, piecewise transform, transformed image, and binary thresholding result]

Gray-level slicing
• Used to highlight specific range of gray levels
• Two different slicing approaches:
1) High value for all gray levels in a range of interest and low value for all others (produces a binary image)
2) Brighten a desired range of gray levels while preserving background and other gray-scale tones of the image

Gray-level slicing
[Figure: two slicing transforms (Transform 1 and Transform 2), input image, and the result of Transform 1]

Bit-plane slicing
• Highlights contribution to total image by specific bits
• An image with n bits/pixel has n bit-planes
• Slicing can be useful for image compression
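Bit-plane slicing is a one-liner on integer images; a sketch assuming a NumPy uint8 array:

```python
def bit_plane(image, k):
    # Extract bit-plane k (0 = least significant, 7 = most significant
    # for an 8-bit image) as a binary image of 0s and 1s.
    return (image >> k) & 1
```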

Bit-planes of an 8-bit image
151 = 1 0 0 1 0 1 1 1
from bit-plane 7 (most significant) down to bit-plane 0 (least significant)

Bit-planes of an 8-bit image
[Figure: input image and its eight bit-planes]

Histogram of pixel intensities
• For every possible intensity level, count the number of pixels having that level, and plot the pixel counts as a function of the level
For an 8-bit image: $L = 2^8 = 256$, $N$ = number of pixels
$$\sum_{r=0}^{L-1} h(r) = N$$
Normalized histogram = probability function:
$$p(r) = \frac{1}{N} h(r)$$
[Figure: histogram of an 8-bit image, intensity level vs. count]

Histogram processing
• Histogram equalization
Aim: To get an image with equally distributed intensity levels over the full intensity range
• Histogram specification (also called histogram matching)
Aim: To get an image with a specified intensity distribution, determined by the shape of the histogram

Histogram processing

Histogram equalization
Enhances contrast for intensity values near histogram maxima and decreases contrast near histogram minima
Histogram bins are much more “equal” here

Histogram equalization
• Let $r \in [0, L-1]$ represent gray levels of the image
$r = 0$ represents black and $r = L-1$ represents white
• Consider transformations $s = T(r)$, $0 \le r \le L-1$, satisfying
1) $T(r)$ is single-valued and monotonically increasing in $0 \le r \le L-1$
This guarantees that the inverse transformation exists
2) $0 \le T(r) \le L-1$ for $0 \le r \le L-1$
This guarantees that the input and output ranges will be the same

Histogram equalization (continuous case)
Consider $r$ and $s$ as continuous random variables over $[0, L-1]$ with PDFs $p_r(r)$ and $p_s(s)$.
If $p_r(r)$ and $T(r)$ are known and $T^{-1}(s)$ satisfies monotonicity, then, from probability theory:
$$p_s(s) = p_r(r) \left| \frac{dr}{ds} \right|$$
Let us choose:
$$s = T(r) = (L-1) \int_0^r p_r(\xi)\, d\xi$$
This is the CDF of $r$, which satisfies conditions (1) and (2)
Now:
$$\frac{ds}{dr} = \frac{dT(r)}{dr} = (L-1) \frac{d}{dr} \int_0^r p_r(\xi)\, d\xi = (L-1)\, p_r(r)$$
Therefore:
$$p_s(s) = p_r(r) \left| \frac{dr}{ds} \right| = p_r(r) \frac{1}{(L-1)\, p_r(r)} = \frac{1}{L-1} \quad \text{for } 0 \le s \le L-1$$
This is a uniform distribution!
Histogram equalization (discrete case)
For discrete values we get probabilities and summations instead of PDFs and integrals:
$$p_r(r_k) = n_k / MN \quad \text{for } k = 0, 1, \ldots, L-1$$
where 𝑀𝑁 is total number of pixels in image, 𝑛𝑘 is the number of pixels with gray level 𝑟𝑘
and 𝐿 is the total number of gray levels in the image
Thus:
$$s_k = T(r_k) = (L-1) \sum_{j=0}^{k} p_r(r_j) = \frac{L-1}{MN} \sum_{j=0}^{k} n_j \quad \text{for } k = 0, 1, \ldots, L-1$$
This transformation is called histogram equalization
However, in practice, getting a perfectly uniform distribution for discrete images is rare
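A compact sketch of discrete histogram equalization for 8-bit images, following the summation above (NumPy-based, illustrative):

```python
import numpy as np

def equalize_histogram(image):
    # s_k = (L-1)/(MN) * sum_{j=0..k} n_j, with L = 256
    L = 256
    hist, _ = np.histogram(image, bins=L, range=(0, L))
    cdf = hist.cumsum()                      # running sum of n_j
    s = np.round((L - 1) * cdf / cdf[-1])    # the transformation T(r_k)
    return s[image].astype(np.uint8)         # apply T to every pixel
```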

Histogram matching (continuous case)
Assume that 𝑟 and 𝑠 are continuous intensities and 𝑝𝑧(𝑧) is the target distribution for the output image From our previous analysis we know that the following transformation results in a uniform distribution:
$$s = T(r) = (L-1) \int_0^r p_r(\xi)\, d\xi$$
Now we can define a function $G(z)$ as:
$$G(z) = (L-1) \int_0^z p_z(\xi)\, d\xi = s$$
Therefore:
$$z = G^{-1}(s) = G^{-1}(T(r))$$
Histogram matching (discrete case)
For discrete image values we can write:
$$s_k = T(r_k) = (L-1) \sum_{j=0}^{k} p_r(r_j) = \frac{L-1}{MN} \sum_{j=0}^{k} n_j, \quad k = 0, 1, \ldots, L-1$$
and therefore:
$$G(z_q) = (L-1) \sum_{i=0}^{q} p_z(z_i)$$
$$z_q = G^{-1}(s_k)$$
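A sketch of discrete histogram matching; `target_hist` (a 256-bin histogram describing $p_z$) is a hypothetical parameter name, and $G^{-1}$ is approximated by a nearest-index lookup:

```python
import numpy as np

def match_histogram(image, target_hist):
    L = 256
    hist, _ = np.histogram(image, bins=L, range=(0, L))
    T = (L - 1) * hist.cumsum() / hist.sum()               # s_k = T(r_k)
    G = (L - 1) * np.cumsum(target_hist) / np.sum(target_hist)
    z = np.searchsorted(G, T)            # z_q ~ G^{-1}(s_k) by lookup
    return np.clip(z, 0, L - 1)[image].astype(np.uint8)
```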

Arithmetic and logical operations
• Defined on a pixel-by-pixel basis between two images
• Arithmetic operators: +, −, ∗, /, ^
• Logical operators: AND, OR, XOR

Arithmetic and logical operations
• Useful arithmetic operations include addition and subtraction


Arithmetic and logical operations
• Useful logical operations include AND and OR
[Figure: input image, mask, and the result of Input AND Mask]

Arithmetic/Logic Operations
• Defined on a pixel-by-pixel basis between two or more images
• AND and OR operations are used for masking: selecting subimages as regions of interest (RoI)
• Subtraction and addition are the most useful arithmetic operations

[Figure slide: examples from Gonzalez and Woods, Chapter 3, Image Enhancement in the Spatial Domain, panels (1) and (2)]

Image Averaging
• Noisy image g(x, y) formed by adding noise n(x, y) to uncorrupted image f(x, y):
g(x, y) = f(x, y) + n(x, y)
• Assume that at each (x, y), the noise is uncorrelated and has zero average value.
• Aim: To obtain smoothed result by averaging a set of noisy images gi(x,y), i=1,2,…,K
$$\bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y)$$
• As K increases, the variability of the pixel values decreases
• Assumes that the images are spatially registered
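A short sketch of image averaging over $K$ registered noisy frames (NumPy-based, illustrative):

```python
import numpy as np

def average_images(images):
    # images: iterable of K spatially registered noisy frames g_i(x, y)
    stack = np.stack([img.astype(float) for img in images])
    return stack.mean(axis=0)   # pixel variability decreases as K grows
```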

[Figure slide: image averaging examples from Gonzalez and Woods, Chapter 3, Image Enhancement in the Spatial Domain]

Spatial Filtering
• These methods use a small neighbourhood of a pixel in the input image to produce a new brightness value for that pixel
• Also called filtering techniques
• Neighbourhood of $(x, y)$ is usually a square or rectangular subimage centred at $(x, y)$, called a filter / mask / kernel / template / window
• A linear transformation calculates a value in the output image g(i, j) as a linear combination of brightnesses in a local neighbourhood of the pixel in the input image f(i, j), weighted by coefficients h:
$$g(x, y) = \sum_{i=-a}^{a} \sum_{j=-b}^{b} h(i, j)\, f(x - i, y - j)$$
• This is called a discrete convolution with a convolution mask h
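A direct, unoptimised sketch of the discrete convolution above, assuming a NumPy image `f`, an odd-sized mask `h`, and zero padding at the borders:

```python
import numpy as np

def convolve2d(f, h):
    """g(x, y) = sum_ij h(i, j) * f(x - i, y - j), zero-padded."""
    a, b = h.shape[0] // 2, h.shape[1] // 2
    padded = np.pad(f.astype(float), ((a, a), (b, b)))
    g = np.zeros(f.shape, dtype=float)
    for i in range(-a, a + 1):
        for j in range(-b, b + 1):
            # shifted copy of f corresponding to f(x - i, y - j)
            g += h[i + a, j + b] * padded[a - i : a - i + f.shape[0],
                                          b - j : b - j + f.shape[1]]
    return g
```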

Spatial Filtering
Convolution

Smoothing Spatial Filters
Used for blurring, noise reduction
Neighbourhood Averaging (Mean Filter)
$$g(x, y) = \frac{1}{P} \sum_{(n, m) \in S} f(n, m)$$
where $S$ is the neighbourhood of $(x, y)$ and $P$ is the number of pixels in $S$
• Replace intensity at pixel (x, y) with the average of the intensities in a neighbourhood of (x, y).
• We can also use a weighted average, giving more importance to some pixels over others in the neighbourhood, which reduces blurring
• Neighbourhood averaging blurs edges
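As a concrete case, a 3×3 mean filter is just the convolution above with a uniform mask; with the hypothetical `convolve2d` sketch from earlier, `g = convolve2d(f, np.ones((3, 3)) / 9)`.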

[Figure slides: smoothing examples from Gonzalez and Woods, Chapter 3, Image Enhancement in the Spatial Domain]

Another example
Consider an image of constant intensity, with widely isolated pixels with different intensity from the background. We wish to detect these pixels.
Use the following mask:
-1 -1 -1
-1  8 -1
-1 -1 -1
Smoothing Spatial Filters
• Aim: To suppress noise and other small fluctuations in the image, which may result from sampling, quantization, transmission, or environmental disturbances during acquisition
• Uses redundancy in the image data
• May blur sharp edges, so care is needed

Gaussian Filter
$$g(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
• Replace intensity at pixel (x, y) with the weighted average of the intensities in a neighbourhood of (x, y).
• It is a set of weights that approximate the profile of a Gaussian function.
• It is very effective in reducing noise and also reducing details (image blurring)
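A sketch of building a discrete Gaussian mask by sampling the function above and normalising the weights to sum to 1 (the 3-sigma radius is a common rule of thumb, assumed here):

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    r = radius if radius is not None else int(3 * sigma)
    y, x = np.mgrid[-r : r + 1, -r : r + 1]
    k = np.exp(-(x**2 + y**2) / (2 * sigma**2))  # unnormalised weights
    return k / k.sum()                           # weights sum to 1
```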

Gaussian Filter

Non-linear Spatial Filters
Also called order-statistics filters: the response is based on ordering the pixels in the neighbourhood and replacing the centre pixel with the ranking result.
Median Filter
• Intensity of each pixel is replaced by the median of the intensities in a neighbourhood of that pixel
• The median M of a set of values is the middle value such that half the values in the set are less than M and the other half are greater than M
• Median filtering forces points with distinct intensities to be more like their neighbours, thus eliminating isolated intensity spikes
• Also, isolated pixel clusters (light or dark) whose area is ≤ n²/2 are eliminated by an n×n median filter
• Good for impulse noise (salt-and-pepper noise)
• Other examples of order-statistics filters are the max and min filters

Median Filter
Example: a 3×3 neighbourhood
69 37 19
51 43 44
50 58 68
Sorted: 19 37 43 44 50 51 58 68 69, so the centre pixel is replaced by the median, 50
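A direct sketch of the n×n median filter, assuming a NumPy image and clamp (edge-repeat) padding at the borders; applied to the 3×3 example above, the centre value becomes 50:

```python
import numpy as np

def median_filter(image, n=3):
    r = n // 2
    padded = np.pad(image, r, mode='edge')   # clamp padding
    out = np.empty_like(image)
    for x in range(image.shape[0]):
        for y in range(image.shape[1]):
            # median of the n x n neighbourhood centred at (x, y)
            out[x, y] = np.median(padded[x : x + n, y : y + n])
    return out
```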
[Figure slide: examples from Gonzalez and Woods, Chapter 3, Image Enhancement in the Spatial Domain]

Pooling
Max / average / median pooling
− Provides translation invariance
− Reduces computations
− Popular in deep convolutional neural networks (deep learning)

Sharpening Spatial Filters: Edge Detection
• Goal is to highlight fine details, or enhance details that have been blurred
• Spatial differentiation is the tool: the strength of the response of a derivative operator is proportional to the degree of discontinuity of the image at the point where the operator is applied
• Image differentiation enhances edges and de-emphasizes slowly varying gray-level values

Derivative definitions
• For a 1-D function $f(x)$, the first-order derivative is approximated as:
$$\frac{df}{dx} = f(x+1) - f(x)$$
• The second-order derivative is approximated as:
$$\frac{d^2 f}{dx^2} = f(x+1) + f(x-1) - 2f(x)$$
• These are partial derivatives, so that extension to 2D is easy

[Figure slide: examples from Gonzalez and Woods, Chapter 3]

Basic idea - Derivatives
• Horizontal scan of the image
• Edge modelled as a ramp, to represent blurring due to sampling
• First derivative is
– Non-zero along the ramp
– Zero in regions of constant intensity
– Constant during an intensity transition
• Second derivative is
– Non-zero at the onset and end of the ramp
– Stronger response at an isolated noise point
– Zero everywhere except at the onset and termination of an intensity transition
• Thus, the magnitude of the first derivative can be used to detect the presence of an edge, and the sign of the second derivative to determine whether a pixel lies on the dark or light side of an edge

Summary - Derivatives
• First-order derivatives produce thicker edges and have a stronger response to a gray-level step
• Second-order derivatives produce a stronger response to fine detail (thin lines, isolated points) and produce a double response at step changes in gray level

Gradient Operator
First-order derivatives are implemented using the magnitude of the gradient.
For a function $f(x, y)$, the gradient of $f$ at $(x, y)$ is the vector $G$ with components $G_x$ and $G_y$.
The magnitude of the gradient vector is
$$G[f(x, y)] = \sqrt{G_x^2 + G_y^2}$$
This is commonly approximated by:
$$G[f(x, y)] \approx |G_x| + |G_y|$$
$G_x$ and $G_y$ are linear and may be obtained by using masks. We use numerical techniques to compute these, which give rise to different masks, e.g. Roberts' 2×2 cross-gradient operators and Sobel's 3×3 masks

[Figure slide: examples from Gonzalez and Woods, Chapter 3]

The Laplacian
Second-order derivatives are based on the Laplacian. For a function $f(x, y)$, the Laplacian is defined by
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$$
This is a linear operator, as all derivative operators are. In discrete form:
$$\frac{\partial^2 f}{\partial x^2} = f(x+1, y) + f(x-1, y) - 2f(x, y)$$
and similarly in the y direction. Summing them gives us:
$$\nabla^2 f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y)$$

[Figure slide: examples from Gonzalez and Woods, Chapter 3]

Laplacian ctd
• There are other forms of the Laplacian, which can include diagonal directions, for example
• The Laplacian highlights grey-level discontinuities and produces dark featureless backgrounds
• The background can be recovered by adding or subtracting the Laplacian image to or from the original image

[Figure slides: examples from Gonzalez and Woods, Chapter 3]

Padding
• When we use spatial filters for pixels on the boundary of an image, we do not have enough neighbours
• To get an image with the same size as the input image:
o Zero: set all pixels outside the source image to 0
o Constant: set all pixels outside the source image to a specified border value
o Clamp: repeat edge pixels indefinitely
o Wrap: copy pixels from the opposite side of the image
o Mirror: reflect pixels across the image edge

Padding Example
Szeliski, "Computer Vision", Chapter 3

References and acknowledgements
• Chapter 3 of Gonzalez and Woods, 2002
• Sections 3.1-3.3 of Szeliski
• Some images drawn from the above resources