CPSC 425: Computer Vision
Lecture 34: Review 1
Today’s “fun” Example: Colorful Image Colorization
Final Exam Details
2.5 hours
Closed book, no calculators
— Equations will be given
Format similar to midterm exam
— Part A: Multiple-part true/false
— Part B: Short answer
No coding questions
How to study?
— Look at the lecture notes and assignments and think critically about whether you truly understand the material
— It is easy to look at the slides and think, “This all makes sense”
— For each algorithm or concept, ask:
— what are the properties of the algorithm / concept?
— what does each step do?
— is this step important? can you imagine doing it another way?
— what are the parameters? what would be the effect of changing them?
Course Review: Reading
Lecture slides
Assigned readings from Forsyth & Ponce (2nd ed.)
— Paper “Texture Synthesis by Non-parametric Sampling”
— Paper “Distinctive Image Features from Scale-Invariant Keypoints”
Assignments 1-6
Quiz questions
Check out the recap section for each lecture
Practice problems (with solutions): 5 sets will be on Canvas, with solutions to follow later
Course Review: Cameras and Lenses
Pinhole camera
Projections (and projection equations)
— perspective, weak perspective, orthographic
Lenses (why, how): thin lens, focus
Human eye
Course Review: Filters
Point operations
Filtering: Correlation and convolution
Box, pillbox, Gaussian filters
Separability
Padding, convolving filter with filter
Non-linear filters: median, bilateral
Characterization theorem: convolution
Continuous and discrete images: sampling, quantization, undersampling, aliasing, Nyquist rate
Colour filter arrays – demosaicing
Template matching – correlation, normalized correlation
Course Review: Edge and Corners
Causes of “edges”
Estimating the image gradient
Canny edge detection – max of gradient
Marr/Hildreth edge detection – zero-crossing
Boundary detection
Harris corner detection
Course Review: Texture
Texture representation
Laplacian pyramid, oriented pyramid
Texture synthesis (Efros and Leung paper)
Texture analysis (leads to Bag of Words… textons = words)
Course Review: Colour
Human colour perception
RGB and CIE XYZ colour spaces
Uniform colour space
HSV colour space
Lambertian (matte) and specular reflection
Color Matching Experiments
Forsyth & Ponce (2nd ed.) Figure 3.2
Show a split field to subjects. One side shows the light whose colour one wants to match. The other a weighted mixture of three primaries (fixed lights)
[Figures: Examples 1 and 2 of the color matching experiment, showing the test light and the knobs that adjust the primaries]
Example 2: Color Matching Experiment
We say a “negative” amount of a primary was needed to make the match, because we added it to the test color side
The primary color amount needed to match:
Uniform Colour Spaces
MacAdam ellipses: each ellipse shows colours perceived to be the same
[Figure panels: ellipses drawn at 10 times actual size; ellipses at actual size]
Forsyth & Ponce (2nd ed.) Figure 3.14
Uniform Colour Spaces
MacAdam ellipses demonstrate that differences in (x, y) are a poor guide to differences in perceived colour
A uniform colour space is one in which differences in coordinates are a good guide to differences in perceived colour
— example: CIE LAB
HSV Colour Space
More natural description of colour for human interpretation
Hue: attribute that describes a pure colour — e.g. ’red’, ’blue’
Saturation: measure of the degree to which a pure colour is diluted by white light — pure spectrum colours are fully saturated
Value: intensity or brightness
Hue and saturation together are also referred to as chromaticity.
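The three HSV attributes can be seen on a few concrete colours. The following is a minimal sketch using Python's standard `colorsys` module; the helper name `describe_hsv` and the sample colours are illustrative choices, not part of the lecture.

```python
import colorsys

def describe_hsv(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

# Same hue (red) in all three cases; only saturation or value changes.
pure_red = describe_hsv(255, 0, 0)      # fully saturated, full value
pink = describe_hsv(255, 128, 128)      # red diluted by white: lower saturation
dark_red = describe_hsv(128, 0, 0)      # fully saturated, lower value
```

Pink and dark red keep the hue of pure red: diluting with white lowers only the saturation, and darkening lowers only the value.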
Course Review: Local Invariant Features
Keypoint detection using Difference of Gaussian pyramid
Keypoint orientation assignment
Keypoint descriptor
Matching with nearest and second-nearest neighbors
SIFT and object recognition
Scale Invariant Feature Transform (SIFT)
SIFT describes both a detector and descriptor
1. Multi-scale extrema detection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor
Slide Credit: Ioannis (Yannis) Gkioulekas (CMU)
1. Multi-scale Extrema Detection
[Figure: Gaussian pyramid and its Difference of Gaussian (DoG), showing the first octave and the second octave, which is half the size]
Slide Credit: Ioannis (Yannis) Gkioulekas (CMU)
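1. Multi-scale Extrema Detection
Detect maxima and minima of the Difference of Gaussian (DoG) in scale space
A point is selected if it is larger or smaller than all 26 neighbors (8 at the same scale, 9 at the scale above, 9 at the scale below)
[Figure: DoG levels stacked along the scale axis (scale of Gaussian variance)]
Slide Credit: Ioannis (Yannis) Gkioulekas (CMU)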
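The 26-neighbor test can be sketched as follows. This is a minimal illustration, assuming the DoG levels of one octave are stored as a nested list `dog[scale][row][col]`; the function name and the toy data are hypothetical.

```python
def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly larger, or strictly smaller,
    than all 26 neighbours in the 3 x 3 x 3 scale-space block around it."""
    centre = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    is_max = all(centre > n for n in neighbours)
    is_min = all(centre < n for n in neighbours)
    return is_max or is_min

# Toy stack of three 3 x 3 DoG levels with a single peak in the middle.
dog_stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
dog_stack[1][1][1] = 1.0
peak_found = is_scale_space_extremum(dog_stack, 1, 1, 1)

# A larger value at an adjacent scale removes the extremum.
dog_stack[0][0][0] = 2.0
peak_after_change = is_scale_space_extremum(dog_stack, 1, 1, 1)
```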
2. Keypoint Localization
— After keypoints are detected, we remove those that have low contrast or
are poorly localized along an edge
— Lowe suggests computing the ratio of the eigenvalues of C (recall Harris
corners) and checking if it is greater than a threshold
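The eigenvalue-ratio check can be sketched for a symmetric 2 × 2 matrix [[a, b], [b, c]]. This is an illustrative sketch only: the function names are hypothetical, and the default threshold of 10 is an assumption taken from the value Lowe reports in the SIFT paper.

```python
import math

def eigenvalue_ratio(a, b, c):
    """Ratio (larger / smaller) of the eigenvalues of [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    spread = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    lam_max, lam_min = mean + spread, mean - spread
    if lam_min <= 0:
        return math.inf  # degenerate case: treat as maximally edge-like
    return lam_max / lam_min

def is_edge_like(a, b, c, max_ratio=10.0):
    """Reject a keypoint when one eigenvalue dominates the other."""
    return eigenvalue_ratio(a, b, c) > max_ratio

corner_ratio = eigenvalue_ratio(2.0, 0.0, 1.0)   # eigenvalues 2 and 1
edge_rejected = is_edge_like(100.0, 0.0, 1.0)    # ratio 100: along an edge
corner_rejected = is_edge_like(5.0, 0.0, 4.0)    # ratio 1.25: kept
```

A large ratio means the local structure changes strongly in one direction only, which is the signature of an edge rather than a well-localized point.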
3. Orientation Assignment
— Create histogram of local gradient
directions computed at selected scale
— Assign canonical orientation at peak of smoothed histogram
— Each key specifies stable 2D coordinates (x, y, scale, orientation)
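The histogram step can be sketched as below. This is a simplified illustration: it assumes the gradients around the keypoint are already given as (gx, gy) pairs, uses 36 bins as in Lowe's paper, weights votes by gradient magnitude only, and smooths with a small [0.25, 0.5, 0.25] kernel. The function name and these choices are assumptions, not the lecture's implementation.

```python
import math

def dominant_orientation(gradients, num_bins=36):
    """Return the centre (in degrees) of the peak bin of a smoothed,
    magnitude-weighted histogram of gradient directions."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        hist[int(angle // bin_width) % num_bins] += magnitude
    # Circular smoothing so that a single noisy bin does not decide the peak.
    smoothed = [
        0.25 * hist[i - 1] + 0.5 * hist[i] + 0.25 * hist[(i + 1) % num_bins]
        for i in range(num_bins)
    ]
    peak = max(range(num_bins), key=lambda i: smoothed[i])
    return (peak + 0.5) * bin_width

# Five gradients pointing at 45 degrees and one stray gradient elsewhere.
sample_gradients = [(1.0, 1.0)] * 5 + [(-1.0, -0.5)]
canonical_angle = dominant_orientation(sample_gradients)
```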
4. SIFT Descriptor
— Thresholded image gradients are sampled over a 16 × 16 array of locations in scale space (weighted by a Gaussian with sigma half the size of the window)
— Create array of orientation histograms
— 8 orientations × 4 × 4 histogram array = 128 dimensions
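The layout of the descriptor can be sketched as follows. This is a deliberately simplified illustration: it assumes the 16 × 16 gradient magnitudes and orientations (in degrees) are already sampled, and it omits rotation to the canonical orientation, interpolation between bins, and the thresholding of large values. The function name is hypothetical.

```python
import math

def sift_like_descriptor(magnitude, orientation_deg):
    """Build a 4 x 4 grid of 8-bin orientation histograms (128 values)
    from 16 x 16 nested lists of gradient magnitude and orientation."""
    sigma = 8.0  # half the width of the 16-sample window
    hist = [0.0] * (4 * 4 * 8)
    for y in range(16):
        for x in range(16):
            dy, dx = y - 7.5, x - 7.5  # offset from the window centre
            weight = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            cell = (y // 4) * 4 + (x // 4)  # which of the 4 x 4 cells
            bin_index = int((orientation_deg[y][x] % 360.0) // 45.0) % 8
            hist[cell * 8 + bin_index] += weight * magnitude[y][x]
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Toy input: unit gradients that all point at 10 degrees.
toy_magnitude = [[1.0] * 16 for _ in range(16)]
toy_orientation = [[10.0] * 16 for _ in range(16)]
descriptor = sift_like_descriptor(toy_magnitude, toy_orientation)
```

With every gradient at 10 degrees, only the first orientation bin of each of the 16 cells is non-zero, and cells near the centre get more weight than cells in the corners.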
Course Review: Fitting Data to a Model
RANSAC
Hough transform
RANSAC (RANdom SAmple Consensus)
1. Randomly choose minimal subset of data points necessary to fit model (a sample)
2. Points within some distance threshold, t, of model are a consensus set. Size of consensus set is model’s support
3. Repeat for N samples; model with biggest support is most robust fit
— Points within distance t of best model are inliers
— Fit final model to all inliers
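The three steps can be sketched for the case of fitting a line to 2-D points. This is a minimal sketch under stated assumptions: lines are represented as (a, b, c) with a·x + b·y + c = 0 and a² + b² = 1, the final fit is a total least squares fit, and all function names and the toy data are illustrative.

```python
import math
import random

def fit_line(p, q):
    """Line through two points as (a, b, c), with a*x + b*y + c = 0."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def least_squares_line(points):
    """Total least squares line through a set of points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of the line
    a, b = -math.sin(theta), math.cos(theta)        # unit normal
    return a, b, -(a * mx + b * my)

def ransac_line(points, t, num_samples, rng):
    best_inliers = []
    for _ in range(num_samples):
        p, q = rng.sample(points, 2)        # step 1: minimal sample
        if p == q:
            continue                        # duplicate points define no line
        a, b, c = fit_line(p, q)
        consensus = [(x, y) for x, y in points
                     if abs(a * x + b * y + c) <= t]   # step 2: consensus set
        if len(consensus) > len(best_inliers):         # step 3: keep best support
            best_inliers = consensus
    return least_squares_line(best_inliers), best_inliers

# 20 points on y = 2x + 1 plus 5 outliers.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
data += [(3.0, 40.0), (10.0, -5.0), (15.0, 2.0), (1.0, 30.0), (18.0, 60.0)]
(a, b, c), inliers = ransac_line(data, t=0.1, num_samples=50, rng=random.Random(0))
slope, intercept = -a / b, -c / b
```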
RANSAC: How many samples?
Let $\omega$ be the fraction of inliers (i.e., points on the line)
Let $n$ be the number of points needed to define a hypothesis ($n = 2$ for a line in the plane)
Suppose $k$ samples are chosen
The probability that a single sample of $n$ points is correct (all inliers) is $\omega^n$
The probability that all $k$ samples fail is $(1 - \omega^n)^k$
Choose $k$ large enough (to keep this below a target failure rate)
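As a worked example, the number of samples can be computed by requiring that the probability of all samples failing, (1 − wⁿ)ᵏ, is at most 1 − p, where w is the inlier fraction, n the sample size, k the number of samples and p the desired success probability. The function and parameter names below are illustrative.

```python
import math

def ransac_num_samples(inlier_fraction, points_per_sample, success_prob=0.99):
    """Smallest k with (1 - w**n)**k <= 1 - p."""
    p_good_sample = inlier_fraction ** points_per_sample
    if p_good_sample >= 1.0:
        return 1  # no outliers: a single sample is enough
    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - p_good_sample))

k_half_outliers = ransac_num_samples(0.5, 2)   # line fitting, 50% outliers
k_few_outliers = ransac_num_samples(0.9, 2)    # line fitting, 10% outliers
k_larger_model = ransac_num_samples(0.5, 4)    # 4-point model, 50% outliers
```

For a line (n = 2) with p = 0.99, this gives 17 samples at 50% outliers and 3 samples at 10% outliers; the required number grows quickly as the sample size increases or the inlier fraction drops.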
RANSAC: k Samples Chosen (p = 0.99)
Figure Credit: Hartley & Zisserman
Discussion of RANSAC
Advantages:
— General method suited for a wide range of model fitting problems
— Easy to implement and easy to calculate its failure rate
Disadvantages:
— Only handles a moderate percentage of outliers without cost blowing up
— Many real problems have high rate of outliers (but sometimes selective choice of random subsets can help)
The Hough transform can handle a high percentage of outliers
Idea of Hough transform:
— For each token vote for all models to which the token could belong
— Return models that get many votes
Example: For each point, vote for all lines that could pass through it; the true lines will pass through many points and so receive many votes
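The line-voting example can be sketched with a small accumulator. This is a minimal sketch, assuming the usual parameterization r = x cos θ + y sin θ with θ sampled in 1-degree steps and r quantized to integer bins; the function name, bin sizes and toy points are illustrative.

```python
import math
from collections import Counter

def hough_lines(points, num_theta=180, r_step=1.0):
    """Each point votes for every (theta, r) bin of a line through it."""
    votes = Counter()
    for x, y in points:
        for i in range(num_theta):
            theta = math.pi * i / num_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(r / r_step))] += 1
    return votes

# Ten points on the horizontal line y = 5, plus two stray points.
tokens = [(float(x), 5.0) for x in range(0, 100, 10)]
tokens += [(30.0, 90.0), (70.0, 10.0)]
accumulator = hough_lines(tokens)
(theta_index, r_index), vote_count = accumulator.most_common(1)[0]
```

The winning bin is θ = 90 degrees, r = 5, with one vote from each of the ten collinear points; the stray points spread their votes over other bins.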
Example: Clean Data
[Figure: vote accumulator; horizontal axis is θ, vertical axis is r]
Forsyth & Ponce (2nd ed.) Figure 10.1 (Top)
Example: Some Noise
[Figure: vote accumulator; horizontal axis is θ, vertical axis is r]
Forsyth & Ponce (2nd ed.) Figure 10.1 (Bottom)
Example: Too Much Noise
[Figure: vote accumulator; horizontal axis is θ, vertical axis is r]
Forsyth & Ponce (2nd ed.) Figure 10.2
Sample Question
In his SIFT paper, why did Lowe choose to use a Hough transform rather than
RANSAC to recognize clusters of 3 consistent features?
Note: the clusters can be modeled as resulting from an affine transformation – a warping – which is the “model”
Course Review: Stereo
Epipolar constraint
Rectified images
Computing correspondences
Ordering constraint
The Epipolar Constraint
Matching points lie along corresponding epipolar lines
Reduces correspondence problem to 1D search along conjugate epipolar lines
Greatly reduces cost and ambiguity of matching
Simplest Case: Rectified Images
Image planes of cameras are parallel
Focal points are at same height
Focal lengths same
Then, epipolar lines fall along the horizontal scan lines of the images
We assume images have been rectified so that epipolar lines correspond to scan lines
— Simplifies algorithms
— Improves efficiency
Rectified Stereo Pair
Method: Correlation
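For a rectified pair, correlation-based matching reduces to a 1D search along a scan line. The following is a minimal sketch, assuming a point at column x in the left image appears at column x − d in the right image for some disparity d ≥ 0, and using normalized correlation as the score; the function names, window size and toy rows are illustrative.

```python
import math

def normalized_correlation(a, b):
    """Correlation of two equal-length patches after removing their means."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch: correlation is undefined, treat as no match
    return sum(p * q for p, q in zip(da, db)) / denom

def scanline_disparity(left_row, right_row, x, half_window, max_disparity):
    """Best disparity for pixel x of the left row, searched on the same row."""
    patch = left_row[x - half_window: x + half_window + 1]
    best_d, best_score = 0, -2.0
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_window < 0:
            break  # window would leave the image
        candidate = right_row[xr - half_window: xr + half_window + 1]
        score = normalized_correlation(patch, candidate)
        if score > best_score:
            best_d, best_score = d, score
    return best_d

# Toy scan lines: the left row is the right row shifted by 3 pixels.
right_row = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]
left_row = [0, 0, 0] + right_row[:-3]
disparity = scanline_disparity(left_row, right_row, x=10, half_window=2, max_disparity=6)
```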
Ordering Constraints
[Figure: the ordering constraint … and a failure case]
Forsyth & Ponce (2nd ed.) Figure 7.13
Idea: Use More Cameras
Adding a third camera reduces ambiguity in stereo matching
Forsyth & Ponce (2nd ed.) Figure 7.17
Course Review: Motion and Optical Flow
Motion (geometric), optical flow (radiometric)
Optical flow constraint equation
Lucas-Kanade method
Optical Flow Constraint Equation
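Consider image intensity also to be a function of time, $t$. We write $I(x, y, t)$
Applying the chain rule for differentiation, we obtain
$\frac{dI(x, y, t)}{dt} = I_x \frac{dx}{dt} + I_y \frac{dy}{dt} + I_t$
where subscripts denote partial differentiation
Define $u = \frac{dx}{dt}$ and $v = \frac{dy}{dt}$. Then $[u, v]$ is the 2-D motion and the space of all such $u$ and $v$ is the 2-D velocity space
Suppose $\frac{dI(x, y, t)}{dt} = 0$. Then we obtain the (classic) optical flow constraint equation
$I_x u + I_y v + I_t = 0$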
How do we compute …
— spatial derivatives $I_x$, $I_y$: forward difference, Sobel filter, Scharr filter
— temporal derivative $I_t$: frame differencing
Slide Credit: Ioannis (Yannis) Gkioulekas (CMU)
Frame Differencing: Example
(example of a forward temporal difference)
Slide Credit: Ioannis (Yannis) Gkioulekas (CMU)
How do we compute …
— spatial derivatives $I_x$, $I_y$: forward difference, Sobel filter, Scharr filter
— optical flow $[u, v]$: how do you compute this?
— temporal derivative $I_t$: frame differencing
Slide Credit: Ioannis (Yannis) Gkioulekas (CMU)
Lucas-Kanade Summary
A dense method to compute motion, $[u, v]$, at every location in an image
Key Assumptions:
1. Motion is slow enough and smooth enough that differential methods apply (i.e., that the partial derivatives, $I_x$, $I_y$, $I_t$, are well-defined)
2. The optical flow constraint equation holds (i.e., $\frac{dI(x, y, t)}{dt} = 0$)
3. A window size is chosen so that motion, $[u, v]$, is constant in the window
4. A window size is chosen so that the rank of $A^T A$ is 2 for the window
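The per-window computation can be sketched as a 2 × 2 least squares solve. This is a minimal sketch, assuming the derivatives I_x, I_y, I_t have already been estimated for each pixel of the window and are passed as flat lists; A denotes the matrix whose rows are [I_x, I_y] for the window's pixels, and the function name and toy values are illustrative.

```python
def lucas_kanade_window(ix, iy, it, min_det=1e-9):
    """Solve (A^T A) [u, v]^T = -A^T I_t for one window.
    Returns None when A^T A is (near) singular, i.e. its rank is below 2."""
    sxx = sum(a * a for a in ix)
    sxy = sum(a * b for a, b in zip(ix, iy))
    syy = sum(b * b for b in iy)
    sxt = sum(a * c for a, c in zip(ix, it))
    syt = sum(b * c for b, c in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < min_det:
        return None  # gradients all share one direction: motion is ambiguous
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

# Toy window whose temporal derivatives are consistent with (u, v) = (1, -0.5),
# that is, I_t = -(I_x * u + I_y * v) at every pixel.
ix = [1.0, 0.0, 2.0, 1.0]
iy = [0.0, 1.0, 1.0, 3.0]
it = [-1.0, 0.5, -1.5, 0.5]
flow = lucas_kanade_window(ix, iy, it)

# All gradients parallel: the rank condition fails and no flow is returned.
ambiguous = lucas_kanade_window([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The second call illustrates why assumption 4 matters: when every gradient in the window points the same way, only the motion component along the gradient can be recovered.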
Course Review: Clustering
Two basic approaches: agglomerative and divisive clustering
Dendrograms
Inter-cluster distance measures
K-means clustering
Segmentation by clustering
Course Review: Classification
Bayes’ risk, loss functions
Underfitting, overfitting
Generative vs discriminative models
Cross-validation
Receiver Operating Characteristic (ROC) curve – trade-off between true/false positives
Parametric vs. non-parametric classifiers
— K-nearest neighbour
— Support vector machines
— Decision trees
Course Review: Image Classification
Visual words, codebooks
Bag of words representation – histogram of word occurrences
Spatial pyramid
VLAD – vector of locally aggregated descriptors (residuals)
Sample Question
How do we construct a codebook (vocabulary) of local descriptors, say SIFT?
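One common approach, sketched here under simplifying assumptions, is to cluster a large set of training descriptors with K-means and use the cluster centres as the visual words. The code below uses tiny 2-D vectors in place of 128-D SIFT descriptors; all function names and the toy data are illustrative, and it is not meant as a model exam answer.

```python
import random

def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def build_codebook(descriptors, num_words, num_iterations=20, seed=0):
    """K-means: cluster centres become the visual words."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, num_words)]
    for _ in range(num_iterations):
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for k, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def bag_of_words(descriptors, codebook):
    """Histogram of word occurrences for one image's descriptors."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1
    return hist

# Toy "descriptors" forming two well-separated groups.
training = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
codebook = build_codebook(training, num_words=2)
histogram = bag_of_words(training, codebook)
```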
Course Review: Object Detection
Boosting
Sliding window
Viola-Jones face detection – Haar wavelets, integral image, boosting, cascade of detectors
Object proposals
Course Review: Convolutional Neural Networks
Neuron, activation function, layers, fully connected, weights/bias
Backpropagation (you only need to know properties), Stochastic gradient descent
Convolutional neural network architecture
Convolutional neural network layers, weight sharing, local support
Pooling
Receptive field
Categorization, detection, segmentation, instance segmentation
R-CNN
Hope you enjoyed the course!