2/18/2021
CSE 473/573
Introduction to Computer Vision and Image Processing
PROJECT #1
Optical character recognition (OCR)
Digit recognition (AT&T Labs), license plate readers
http://www.research.att.com/~yann/
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software
Optical Character Recognition
• Project 1 is NOT a general OCR.
• You are given "templates" to look for in a document.
• Only the given examples should be "recognized".
What you need to do
Enrollment
• Decide what "features" you will use to match these targets
• Target characters should be considered scale-independent
Detection
• Process the test image to identify candidates
• Connected component analysis and segmentation
• Processing to prepare for recognition
Recognition
• Extract features and develop a classification strategy
• Classify each candidate as one of the targets or UNKNOWN
Key: do all of this as efficiently as possible
Hints
• Connected Components
• https://aishack.in/tutorials/connected-component-labelling/
• You MUST implement it yourself
• Efficiency is key – consider what assumptions you can make
• Consider making and sharing other examples with your classmates.
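As a sketch only (you still must write your own version), classic two-pass connected component labeling with union-find might be organized like this, assuming a binary numpy image with foreground pixels equal to 1 and 4-connectivity:

```python
# Two-pass connected component labeling sketch (4-connectivity).
# Assumes `binary` is a 2-D numpy array of 0/1 values.
import numpy as np

def label_components(binary):
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    parent = [0]                      # union-find forest; index 0 = background
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a
    next_label = 1
    # Pass 1: provisional labels, recording equivalences from left/up neighbors.
    for y in range(h):
        for x in range(w):
            if not binary[y, x]:
                continue
            left = labels[y, x - 1] if x > 0 else 0
            up = labels[y - 1, x] if y > 0 else 0
            if left == 0 and up == 0:
                parent.append(next_label)        # brand-new component
                labels[y, x] = next_label
                next_label += 1
            elif left and up:
                rl, ru = find(left), find(up)    # merge the two components
                labels[y, x] = min(rl, ru)
                parent[max(rl, ru)] = min(rl, ru)
            else:
                labels[y, x] = left or up
    # Pass 2: replace each provisional label by its set representative.
    for y in range(h):
        for x in range(w):
            if labels[y, x]:
                labels[y, x] = find(labels[y, x])
    return labels
```

The Python loops are for clarity; for efficiency you would likely process runs of pixels rather than individual pixels.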
• Classification
• Template matching at the pixel level will earn at most 90%
• Consider other features to match
• Consider ways to make matching more efficient
• Remember, target characters are scale-independent
• Data
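Since pixel-level template matching is the stated baseline, here is a hedged sketch of making it scale-independent: resize candidate and template to a common size and compare with normalized cross-correlation. The `resize_nearest` helper, the `ncc` name, and the 16x16 size are illustrative assumptions, not part of the project spec:

```python
# Scale-independent template comparison sketch (illustrative, not the spec):
# nearest-neighbor resize to a common size, then normalized cross-correlation.
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbor resize of a 2-D array to size x size."""
    h, w = img.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return img[np.ix_(ys, xs)]

def ncc(candidate, template, size=16):
    """Normalized cross-correlation in [-1, 1]; 1 means a perfect match."""
    a = resize_nearest(candidate, size).astype(float).ravel()
    b = resize_nearest(template, size).astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0
```

A candidate is then assigned the template with the highest score, or UNKNOWN if every score falls below a threshold you choose.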
FEATURE EXTRACTION
Questions from Last Class?
Canny edge detector
• Filter image with derivative of Gaussian
• Find magnitude and orientation of gradient
• Non-maximum suppression:
• Thin multi-pixel-wide "ridges" down to single-pixel width
• Linking and thresholding (hysteresis):
• Define two thresholds: low and high
• Use the high threshold to start edge curves and the low threshold to continue them
Source: D. Lowe, L. Fei-Fei
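The first two steps can be sketched in numpy. As simplifying assumptions, a separable Gaussian blur followed by `np.gradient` central differences stands in for a true derivative-of-Gaussian filter:

```python
# Sketch of Canny steps 1-2: smooth, then compute gradient magnitude
# and orientation. Assumes a grayscale float image.
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()                       # normalize to sum 1

def smooth(img, sigma=1.0):
    k = gaussian_kernel_1d(sigma)
    # Separable convolution: filter rows, then columns.
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, tmp)

def gradient(img, sigma=1.0):
    s = smooth(img.astype(float), sigma)
    gy, gx = np.gradient(s)                  # central differences
    mag = np.hypot(gx, gy)                   # gradient magnitude
    theta = np.arctan2(gy, gx)               # orientation in radians
    return mag, theta
```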
The Canny edge detector
original image (Lena)
The Canny edge detector
norm of the gradient
The Canny edge detector
thresholding
The Canny edge detector
thresholding
How to turn these thick regions of the gradient into curves?
Non-maximum suppression
Check whether each pixel is a local maximum along the gradient direction; select the single maximum across the width of the edge
• requires checking interpolated pixels p and r
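A minimal non-maximum suppression sketch. Rather than interpolating the neighbors p and r along the exact gradient direction, this common approximation quantizes the orientation into four sectors:

```python
# Non-maximum suppression sketch with quantized gradient directions.
# `mag` is the gradient magnitude, `theta` the orientation in radians.
import numpy as np

def non_max_suppression(mag, theta):
    h, w = mag.shape
    out = np.zeros_like(mag)
    angle = (np.rad2deg(theta) + 180) % 180      # fold to [0, 180)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = angle[y, x]
            if a < 22.5 or a >= 157.5:           # ~horizontal gradient
                p, r = mag[y, x - 1], mag[y, x + 1]
            elif a < 67.5:                       # ~45 degrees
                p, r = mag[y - 1, x + 1], mag[y + 1, x - 1]
            elif a < 112.5:                      # ~vertical gradient
                p, r = mag[y - 1, x], mag[y + 1, x]
            else:                                # ~135 degrees
                p, r = mag[y - 1, x - 1], mag[y + 1, x + 1]
            if mag[y, x] >= p and mag[y, x] >= r:
                out[y, x] = mag[y, x]            # keep only local maxima
    return out
```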
The Canny edge detector
thinning (non-maximum suppression)
Problem: pixels along this edge didn’t survive the thresholding
Hysteresis thresholding
• Check that the maximum gradient value is sufficiently large
• drop-outs? use hysteresis
• use a high threshold to start edge curves and a low threshold to continue them.
Source: S. Seitz
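The linking step can be sketched as a flood fill that seeds edges above the high threshold and grows them through pixels above the low threshold; the explicit stack and 8-connectivity are implementation choices:

```python
# Hysteresis thresholding sketch: strong pixels seed edges, weak pixels
# extend them if connected (8-connectivity).
import numpy as np

def hysteresis(mag, low, high):
    strong = mag >= high
    weak = mag >= low
    edges = np.zeros(mag.shape, dtype=bool)
    stack = list(zip(*np.nonzero(strong)))       # all strong seeds
    while stack:
        y, x = stack.pop()
        if edges[y, x]:
            continue
        edges[y, x] = True
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < mag.shape[0] and 0 <= nx < mag.shape[1]
                        and weak[ny, nx] and not edges[ny, nx]):
                    stack.append((ny, nx))       # continue the curve
    return edges
```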
Hysteresis thresholding
original image
high threshold (strong edges)
low threshold (weak edges)
hysteresis threshold
Source: L. Fei-Fei
Object boundaries vs. edges
Background Texture
Shadows
Edge detection is just the beginning…
Berkeley segmentation database:
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/
image · human segmentation · gradient magnitude
Source: L. Lazebnik
IMAGE PYRAMIDS
Image pyramids
• Gaussian pyramid
• Laplacian pyramid
• Wavelet/QMF pyramid
The Gaussian pyramid
• Smooth with Gaussians, because
• a Gaussian * a Gaussian = another Gaussian
• Gaussians are low-pass filters, so the representation is redundant
http://www-bcs.mit.edu/people/adelson/pub_pdfs/pyramid83.pdf
Convolution and subsampling as a matrix multiply (1-D case)

x2 = G1 x1

G1 =
1 4 6 4 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 4 6 4 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 4 6 4 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 4 6 4 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 4 6 4 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 4 6 4 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 4 6 4 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 4 6 4 1 0

(Normalization constant of 1/16 omitted for visual clarity.)
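The matrix above can be built programmatically and checked against explicit convolution followed by subsampling; the kernel and sizes follow the slide:

```python
# Build the slide's G matrix: each row places the binomial kernel
# [1 4 6 4 1]/16 two columns further right, so multiplying by G
# both blurs and subsamples a 1-D signal.
import numpy as np

def blur_downsample_matrix(n):
    kernel = np.array([1, 4, 6, 4, 1]) / 16.0
    rows = []
    for start in range(0, n - len(kernel) + 1, 2):   # shift by 2 each row
        row = np.zeros(n)
        row[start:start + len(kernel)] = kernel
        rows.append(row)
    return np.array(rows)

G = blur_downsample_matrix(20)       # 8 x 20, matching the slide
x = np.random.rand(20)
# Same result as explicit convolution followed by keeping every other sample:
conv = np.convolve(x, np.array([1, 4, 6, 4, 1]) / 16.0, mode='valid')
assert np.allclose(G @ x, conv[::2])
```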
Next pyramid level

x3 = G2 x2

G2 =
1 4 6 4 1 0 0 0
0 0 1 4 6 4 1 0
0 0 0 0 1 4 6 4
0 0 0 0 0 0 1 4
The combined effect of the two pyramid levels

x3 = G2 G1 x1

G2 G1 =
1 4 10 20 31 40 44 40 31 20 10 4 1 0 0 0 0 0 0 0
0 0 0 0 1 4 10 20 31 40 44 40 31 20 10 4 1 0 0 0
0 0 0 0 0 0 0 0 1 4 10 20 31 40 44 40 30 16 4 0
0 0 0 0 0 0 0 0 0 0 0 0 1 4 10 20 25 16 4 0
http://www-bcs.mit.edu/people/adelson/pub_pdfs/pyramid83.pdf
Gaussian pyramids are used for
• Up- or down-sampling images
• Multi-resolution image analysis
• Look for an object over various spatial scales
• Coarse-to-fine image processing: form a blur estimate or run motion analysis on a very low-resolution image, then upsample and repeat. Often a successful strategy for avoiding local minima in complicated estimation tasks.
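A minimal Gaussian pyramid sketch using the binomial kernel from the earlier slides: blur each axis separably, then keep every other sample in both dimensions:

```python
# Gaussian pyramid sketch: separable binomial blur + 2x subsampling.
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1]) / 16.0

def reduce(img):
    """One pyramid step: blur rows and columns, then take every other pixel."""
    out = np.apply_along_axis(lambda v: np.convolve(v, KERNEL, mode='same'), 0, img)
    out = np.apply_along_axis(lambda v: np.convolve(v, KERNEL, mode='same'), 1, out)
    return out[::2, ::2]

def gaussian_pyramid(img, levels):
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(reduce(pyr[-1]))
    return pyr

pyr = gaussian_pyramid(np.random.rand(64, 64), 4)
# Each level halves each dimension: 64 -> 32 -> 16 -> 8
```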
Image pyramids
• Gaussian
• Laplacian
• Wavelet/QMF
The Laplacian Pyramid
• Synthesis
• Compute the difference between the upsampled Gaussian pyramid level and the Gaussian pyramid level.
• Band-pass filter: each level represents spatial frequencies (largely) unrepresented at other levels.
Laplacian pyramid algorithm

x2 = G1 x1    x3 = G2 x2

L1 = (I – F1 G1) x1
L2 = (I – F2 G2) x2
L3 = (I – F3 G3) x3
Upsampling

y2 = F2 x3

F2 =
6 1 0 0
4 4 0 0
1 6 1 0
0 4 4 0
0 1 6 1
0 0 4 4
0 0 1 6
0 0 0 4
Showing, at full resolution, the information captured at each level of a Gaussian (top) and Laplacian (bottom) pyramid.
http://www-bcs.mit.edu/people/adelson/pub_pdfs/pyramid83.pdf
Laplacian pyramid reconstruction algorithm: recover x1 from L1, L2, L3 and x4

G# is the blur-and-downsample operator at pyramid level #
F# is the blur-and-upsample operator at pyramid level #

Laplacian pyramid elements:
L1 = (I – F1 G1) x1
L2 = (I – F2 G2) x2
L3 = (I – F3 G3) x3

x2 = G1 x1
x3 = G2 x2
x4 = G3 x3

Reconstruction of original image (x1) from Laplacian pyramid elements:
x3 = L3 + F3 x4
x2 = L2 + F2 x3
x1 = L1 + F1 x2
Laplacian pyramid reconstruction algorithm: recover x1 from L1, L2, L3 and g3
[Block diagram: g3 is upsampled and added to L3 to give x3, then to L2 to give x2, then to L1 to give x1.]
Gaussian pyramid
Laplacian pyramid
Image pyramids
• Gaussian
• Laplacian
• Wavelet/QMF
Wavelets/QMFs

F = U f

F: transformed image
f: vectorized image
U: Fourier transform, or wavelet transform, or steerable pyramid transform
The simplest wavelet transform: the Haar transform
U =
1  1
1 -1
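Applying the Haar matrix to pairs of samples gives one low-pass (average) and one high-pass (difference) output per pair. The 1/sqrt(2) factor, which makes U orthonormal, is a common normalization not shown on the slide:

```python
# One level of the 1-D Haar transform using the slide's 2x2 matrix U.
import numpy as np

U = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)       # orthonormal Haar matrix

def haar_1d(signal):
    """One Haar level on an even-length 1-D signal: returns (low, high)."""
    pairs = signal.reshape(-1, 2).T        # shape (2, n/2): sample pairs
    low, high = U @ pairs                  # scaled averages and differences
    return low, high

low, high = haar_1d(np.array([4.0, 4.0, 6.0, 2.0]))
# The constant pair (4, 4) yields zero detail; (6, 2) yields nonzero detail.
```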
In 2 dimensions…
Frequency domain
Horizontal high pass
Horizontal low pass
Apply the wavelet transform separably in both dimensions

Horizontal high pass
Horizontal low pass
Horizontal high pass, vertical high pass
Horizontal low pass, vertical high pass
Horizontal high pass, vertical low pass
Horizontal low pass, vertical low pass
To create 2-D filters, apply the 1-D filters separably in the two spatial dimensions
Simoncelli and Adelson, in “Subband coding”, Kluwer, 1990.
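A sketch of the separable application with the Haar filters: filter the rows, then the columns, yielding the four subbands. The LL/LH/HL/HH names and the low-before-high ordering are labeling conventions assumed here:

```python
# Separable 2-D Haar transform sketch: rows first, then columns.
import numpy as np

INV_SQRT2 = 1 / np.sqrt(2)

def haar_rows(a):
    """One Haar level along the rows of a 2-D array."""
    low = (a[:, 0::2] + a[:, 1::2]) * INV_SQRT2    # horizontal low pass
    high = (a[:, 0::2] - a[:, 1::2]) * INV_SQRT2   # horizontal high pass
    return low, high

def haar_2d(a):
    low, high = haar_rows(a)                       # horizontal pass
    ll, lh = haar_rows(low.T)                      # vertical pass on low band
    hl, hh = haar_rows(high.T)                     # vertical pass on high band
    return ll.T, lh.T, hl.T, hh.T                  # LL, LH, HL, HH subbands

subbands = haar_2d(np.ones((4, 4)))
# A constant image has all its energy in the LL band; the others are zero.
```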
LL1
Horizontal low pass, vertical low pass
Horizontal high pass, vertical low pass
Horizontal low pass, vertical high pass
Horizontal high pass, vertical high pass
LL2
Wavelet/QMF representation
Why use these representations?
• Handle real-world size variations with a constant-size vision algorithm
• Remove noise
• Analyze texture
• Recognize objects
• Label image features
http://web.mit.edu/persci/people/adelson/pub_pdfs/RCA84.pdf
FEATURE DETECTION AND MATCHING
Feature Detection and Matching
• Local features
• Pyramids for invariant feature detection
• Invariant descriptors
• Matching
Image matching
by Diva Sian
by swashford
Harder case
by Diva Sian
by scgbt
Harder still?
NASA Mars Rover images
Answer below (look for tiny colored squares…)
NASA Mars Rover images with SIFT feature matches
Figure by Noah Snavely
Local features and alignment
• Global methods are sensitive to occlusion, lighting, and parallax effects, so look for local features that match well.
• How would you do it by eye?
• We need to match (align) images
[Darya Frolova and Denis Simakov]
Local features and alignment
• Detect feature points in both images
[Darya Frolova and Denis Simakov]
Local features and alignment
• Detect feature points in both images
• Find corresponding pairs
[Darya Frolova and Denis Simakov]
Local features and alignment
• Detect feature points in both images
• Find corresponding pairs
• Use these pairs to align images
[Darya Frolova and Denis Simakov]
Local features and alignment
• Problem 1:
• Detect the same point independently in both images
no chance to match!
We need a repeatable detector
[Darya Frolova and Denis Simakov]
Local features and alignment
• Problem 2:
• For each point, correctly recognize the corresponding one
We need a reliable and distinctive descriptor
[Darya Frolova and Denis Simakov]
Geometric transformations
Photometric transformations
Figure from T. Tuytelaars ECCV 2006 tutorial
And other nuisances…
• Noise
• Blur
• Compression artifacts
• …
Invariant local features
Subset of local feature types designed to be invariant to common geometric and photometric transformations.
Basic steps:
1) Detect distinctive interest points
2) Extract invariant descriptors
Figure: David Lowe
Main questions
• Where will the interest points come from?
• What are salient features that we’ll detect in multiple views?
• How to describe a local region?
• How to establish correspondences, i.e., compute matches?