EBU7240 Computer Vision
Tracking: Image Alignment
Semester 1, 2021
Changjae Oh
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
Objectives
• To review Lucas-Kanade optical flow in EBU6230
• To understand Lucas-Kanade image alignment
• To understand the relationship between Lucas-Kanade optical flow and image alignment
• To understand Kanade-Lucas-Tomasi tracker
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
Motion Estimation: Gradient method
• Brightness consistency constraint
  H(x, y, t) = I(x + Δx, y + Δy, t + Δt)
• Small motion: Δx and Δy are less than 1 pixel
  – Suppose we take the Taylor series expansion of I:
    I(x + Δx, y + Δy, t + Δt) = I(x, y, t) + (∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt + higher-order terms
  – Dropping the higher-order terms:
    I(x + Δx, y + Δy, t + Δt) ≈ I(x, y, t) + Ix Δx + Iy Δy + It Δt
Gradient method
• Spatio-temporal constraint
  Ix Vx + Iy Vy + It = 0
• This equation introduces one constraint only
– where the motion vector of a pixel has 2 components (parameters)
– a second constraint is necessary to solve the system
Aperture problem
• The aperture problem
– stems from the need to solve one equation with two unknowns, which are the two components of optical flow
– it is not possible to estimate both components of the optical flow from the local spatial and temporal derivatives
• By applying a constraint
  – if the optical flow field changes smoothly in a small neighborhood, it is possible to estimate both components of the optical flow, provided the spatial and temporal derivatives of the image intensity are available
Solving the aperture problem
• How to get more equations for a pixel?
• By applying a constraint
  – if the optical flow field changes smoothly in a small neighborhood, it is possible to estimate both components of the optical flow, provided the spatial and temporal derivatives of the image intensity are available
• Lucas–Kanade method
Gradient method
• The Lucas–Kanade method assumes that the displacement of the image contents between two nearby instants (frames) is small
• Matrix form
  Ix(q1) Vx + Iy(q1) Vy = −It(q1)
  Ix(q2) Vx + Iy(q2) Vy = −It(q2)
  …
  Ix(qn) Vx + Iy(qn) Vy = −It(qn)
  In matrix form, A v = b, with
  A = [ Ix(q1) Iy(q1) ; Ix(q2) Iy(q2) ; … ; Ix(qn) Iy(qn) ],  v = [ Vx ; Vy ],  b = −[ It(q1) ; It(q2) ; … ; It(qn) ]
Gradient method
• Problem: we have more equations than unknowns
  A v = b   →   minimize ‖A v − b‖²
• Solution: solve the least squares problem
  (AᵀA) v = Aᵀb
• The minimum least squares solution is given by the solution of
  [ Σ Ix Ix   Σ Ix Iy ] [ Vx ]      [ Σ Ix It ]
  [ Σ Ix Iy   Σ Iy Iy ] [ Vy ]  = − [ Σ Iy It ]
• The summations are over all n pixels in the K × K window
• This technique was first proposed by Lucas & Kanade (1981)
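The per-window least squares solve described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not part of the lecture: the synthetic sinusoid frames, the 7×7 window, and the use of `np.gradient` for the derivatives are assumptions made for the demo.

```python
import numpy as np

def lucas_kanade_flow(I1, I2, x, y, k=7):
    """Estimate the flow (Vx, Vy) at pixel (x, y) from a k-by-k window.

    Builds one equation Ix*Vx + Iy*Vy = -It per window pixel and solves
    the over-determined system in the least squares sense.
    """
    r = k // 2
    Ix = np.gradient(I1, axis=1)          # spatial gradients (central differences)
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1                          # temporal derivative
    win = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)   # n x 2
    b = -It[win].ravel()
    # lstsq solves the normal equations (A^T A) v = A^T b; it fails to give a
    # unique answer when A^T A is (near-)singular: the aperture problem
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v

# Synthetic frames: a smooth pattern translated right by 1 pixel
yy, xx = np.mgrid[0:64, 0:64].astype(float)
I1 = np.sin(xx * 0.3) + np.cos(yy * 0.4)
I2 = np.sin((xx - 1.0) * 0.3) + np.cos(yy * 0.4)
v = lucas_kanade_flow(I1, I2, x=32, y=32)
print(v)  # close to [1, 0]
```

Because the motion is small and the pattern is smooth, the single linearized solve already recovers the displacement closely.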
Lucas-Kanade flow
  [ Σ Ix Ix   Σ Ix Iy ] [ Vx ]      [ Σ Ix It ]
  [ Σ Ix Iy   Σ Iy Iy ] [ Vy ]  = − [ Σ Iy It ]
• When is this solvable?
• AᵀA should be invertible
• AᵀA should not be too small due to noise
  – eigenvalues λ1 and λ2 of AᵀA should not be too small
• AᵀA should be well-conditioned
  – λ1/λ2 should not be too large (λ1 = larger eigenvalue)
  – Recall the Harris corner detector: M = AᵀA is the second moment matrix
Interpreting the eigenvalues
Classification of image points using eigenvalues of the second moment matrix:
• Uniform ("flat") region: λ1 and λ2 are small
  – gradients have small magnitude: small λ1, small λ2
  – system is ill-conditioned
• Edge region
  – gradients have one dominant direction: large λ1, small λ2
  – system is ill-conditioned
• High-texture or corner region: λ1 and λ2 are large
  – gradients have different directions, large magnitudes: large λ1, large λ2
  – system is well-conditioned
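The eigenvalue-based classification can be sketched directly with NumPy. The threshold `tau` and the three toy patches are assumptions for illustration only; real detectors choose the threshold relative to image noise.

```python
import numpy as np

def classify_patch(patch, tau=1.0):
    """Classify a patch by the eigenvalues of M = A^T A (the second moment matrix)."""
    Ix = np.gradient(patch.astype(float), axis=1)
    Iy = np.gradient(patch.astype(float), axis=0)
    M = np.array([[(Ix * Ix).sum(), (Ix * Iy).sum()],
                  [(Ix * Iy).sum(), (Iy * Iy).sum()]])
    l2, l1 = np.linalg.eigvalsh(M)        # ascending order, so l2 <= l1
    if l1 < tau:                          # both small: ill-conditioned
        return "flat"
    if l2 < tau:                          # one dominant gradient direction
        return "edge"
    return "corner"                       # both large: well-conditioned

flat = np.ones((9, 9))                             # no gradients at all
edge = np.tile(np.linspace(0, 8, 9), (9, 1))       # gradient in one direction only
corner = np.outer(np.arange(9.0), np.arange(9.0))  # gradient direction varies
print(classify_patch(flat), classify_patch(edge), classify_patch(corner))
```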
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
How can I find a given template in the image?
Idea #1: Template Matching
Slow, global solution
Idea #2: Pyramid Template Matching
Faster, locally optimal
Idea #3: Model refinement
(when you have a good initial solution)
Fastest, locally optimal
Some notation before we get into the math…
2D image transformation
2D image coordinate
Parameters of the transformation
Warped image
Pixel value at a coordinate
Translation:
  W(x; p) = [ x + p1 ; y + p2 ]
Affine transform:
  W(x; p) = [ p1·x + p3·y + p5 ; p2·x + p4·y + p6 ]
  – can be written in matrix form when linear in the coordinates:
    W(x; p) = [ p1 p3 p5 ; p2 p4 p6 ] [ x ; y ; 1 ]
  – the affine warp matrix can also be written as 3×3 when the last row is [0 0 1]
Image alignment
• Problem definition: given a warped image I(W(x; p)) and a template image T(x), find the warp parameters p such that the SSD is minimized:
  min over p of Σx [ I(W(x; p)) − T(x) ]²
How could you find a solution to this problem?
Image alignment
This is a non-linear (quadratic) function of a non-parametric function!
(Function I is non-parametric)
Hard to optimize
What can you do to make it easier to solve?
assume good initialization,
linearized objective and update incrementally
(pretty strong assumption)
If you have a good initial guess p, the solution can be written as p + Δp,
where Δp is a small incremental adjustment (this is what we are solving for now)
This is still a non-linear (quadratic) function of
a non-parametric function!
(Function I is non-parametric)
How can we linearize the function I for a really small perturbation of p? Taylor series approximation!
Multivariable Taylor series expansion (first-order approximation):
  I(W(x; p + Δp)) ≈ I(W(x; p)) + ∇I (∂W/∂p) Δp
(by the chain rule; ∇I is the image gradient and ∂W/∂p, a matrix of partial derivatives, is the Jacobian of the warp, both written in short-hand)
By this linear approximation, the function is now a linear function of the unknown Δp
Affine transform
• Rate of change of the warp (partial derivatives of the warp function):
  ∂W/∂p = [ x 0 y 0 1 0 ; 0 x 0 y 0 1 ]
• Dimensions of each term:
  – warp function W(x; p): 2 × 1
  – incremental warp Δp: 6 × 1 (6 warp parameters for affine)
  – pixel coordinate x: 2 × 1
  – image gradient ∇I: 1 × 2
  – template/image intensity: scalar
Summary: Lucas-Kanade image alignment
warped image
template image
Difficult non-linear optimization problem
Assume known approximate solution Solve for increment
Taylor series approximation Linearize
then solve for
OK, so how do we solve this?
• Moving terms around, the linearized objective takes the form of a constant matrix times a vector of variables, minus a vector of constants
• Have you seen this form of optimization problem before? It looks like a least squares approximation:
  min over x of ‖A x − b‖²   is solved by   x = (AᵀA)⁻¹ Aᵀ b
• Applied to our task, Σx [ ∇I (∂W/∂p) Δp − (T(x) − I(W(x; p))) ]² is minimized when
  Δp = H⁻¹ Σx [ ∇I ∂W/∂p ]ᵀ [ T(x) − I(W(x; p)) ]
  after applying H = Σx [ ∇I ∂W/∂p ]ᵀ [ ∇I ∂W/∂p ]
Lucas-Kanade strategy:
• warped image, template image: a difficult non-linear optimization problem
• Assume a known approximate solution; solve for the increment Δp
• Taylor series approximation: linearize, then solve for Δp
• Solution to the least squares approximation:
  Δp = H⁻¹ Σx [ ∇I ∂W/∂p ]ᵀ [ T(x) − I(W(x; p)) ]
• Called Gauss-Newton gradient descent non-linear optimization!

The Lucas-Kanade algorithm:
1. Warp image: compute I(W(x; p))
2. Compute error image: T(x) − I(W(x; p))
3. Compute gradient: ∇I
4. Evaluate Jacobian: ∂W/∂p
5. Compute Hessian: H
6. Compute Δp
7. Update parameters: p ← p + Δp
(x′: coordinates of the warped image; the gradients are computed on the warped image)
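The seven steps can be sketched for the simplest warp, a pure translation, where the Jacobian ∂W/∂p is the identity so the steepest-descent images are just the gradients. This is a minimal sketch under that assumption; the bilinear sampler and the synthetic test image are illustrative choices, not part of the lecture.

```python
import numpy as np

def bilinear(img, ys, xs):
    """Sample img at float coordinates (ys, xs) with bilinear interpolation."""
    y0 = np.clip(np.floor(ys).astype(int), 0, img.shape[0] - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, img.shape[1] - 2)
    wy, wx = ys - y0, xs - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x0 + 1] +
            wy * (1 - wx) * img[y0 + 1, x0] + wy * wx * img[y0 + 1, x0 + 1])

def lk_align(I, T, p, n_iters=30):
    """Gauss-Newton Lucas-Kanade alignment for a translation warp W(x; p) = x + p."""
    yy, xx = np.mgrid[0:T.shape[0], 0:T.shape[1]].astype(float)
    for _ in range(n_iters):
        Iw = bilinear(I, yy + p[1], xx + p[0])   # 1. warp image: I(W(x; p))
        err = (T - Iw).ravel()                   # 2. error image: T(x) - I(W(x; p))
        gx = np.gradient(Iw, axis=1)             # 3. gradient of the warped image
        gy = np.gradient(Iw, axis=0)
        # 4. for translation dW/dp = identity, so steepest-descent images = [Ix, Iy]
        J = np.stack([gx.ravel(), gy.ravel()], axis=1)
        H = J.T @ J                              # 5. Gauss-Newton Hessian
        dp = np.linalg.solve(H, J.T @ err)       # 6. compute the increment
        p = p + dp                               # 7. update parameters
        if np.linalg.norm(dp) < 1e-5:
            break
    return p

# Synthetic check: the template is the image resampled 2 px right and 3 px down
yy, xx = np.mgrid[0:80, 0:80].astype(float)
I = np.sin(xx * 0.25) * np.cos(yy * 0.2)
ty, tx = np.mgrid[0:40, 0:40].astype(float)
T = bilinear(I, ty + 3.0, tx + 2.0)
p = lk_align(I, T, np.zeros(2))
print(p)  # converges near [2, 3]
```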
Lucas-Kanade motion estimation vs. Lucas-Kanade image alignment
• Relationship
  Lucas-Kanade motion estimation (what we learned in EBU6230) can be seen as a special case of Lucas-Kanade image alignment with a translational warp model
  – Translation: W(x; p) = [ x + p1 ; y + p2 ]
  – Affine transform: W(x; p) = [ p1·x + p3·y + p5 ; p2·x + p4·y + p6 ]
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
https://www.youtube.com/watch?v=rwIjkECpY0M
• Up to now, we’ve been aligning entire images
̶ but we can also track small image regions
• Questions to solve tracking
̶ How should we select features?
̶ How should we track them from frame to frame?
Feature-based tracking

KLT tracker: history
• An Iterative Image Registration Technique with an Application to Stereo Vision. (Lucas & Kanade, 1981)
• Detection and Tracking of Feature Points. (Tomasi & Kanade, 1991): the original KLT algorithm
• Good Features to Track. (Shi & Tomasi, 1994)

KLT tracker: two parts
• Lucas-Kanade: a method for aligning (tracking) an image patch. How should we track features from frame to frame?
• Tomasi-Kanade: a method for choosing the best feature (image patch) for tracking. How should we select features?
What are good features for tracking?
Intuitively, we want to avoid smooth regions and edges.
But is there a more principled way to define good features?
What are good features for tracking?
Can be derived from the tracking algorithm
‘A feature is good if it can be tracked well’
Recall: Lucas-Kanade image alignment
• error function (SSD), incremental update, gradient update
• Stability of the gradient descent iterations depends on … inverting the Hessian matrix H
• When does the inversion fail? When H is singular. But what does that mean?
Hessian matrix
• Above the noise level: both eigenvalues are large
• Well-conditioned: both eigenvalues have similar magnitude
• Concrete example: consider the translation model, whose Hessian is the second moment matrix
  H = Σ [ Ix Ix   Ix Iy ; Ix Iy   Iy Iy ]
• When is this singular? How are the eigenvalues related to image content?
Interpreting eigenvalues
What kind of image patch does each region of the (λ1, λ2) plane represent?
• one large and one small eigenvalue: a ‘horizontal’ or ‘vertical’ edge
• both small: a flat region
• both large: a corner
What are good features for tracking?
‘big eigenvalues mean good for tracking’
KLT algorithm
1. Find corners satisfying min(λ1, λ2) > threshold (the Shi-Tomasi criterion)
2. For each corner compute displacement to next frame using the Lucas-Kanade method
3. Store displacement of each corner, update corner position
4. (optional) Add more corner points every M frames using 1
5. Repeat 2 to 3 (4)
6. Returns long trajectories for each corner point
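Steps 1-3 of the KLT loop can be sketched in NumPy: select features by the smaller structure-tensor eigenvalue, then estimate each feature's displacement with a Lucas-Kanade translation step. The blob image, window radius, and 0.8 selection threshold are assumptions for the demo; a real tracker would also use pyramids and iterate the per-feature solve.

```python
import numpy as np

def min_eig_map(I, r=3):
    """Smaller eigenvalue of the structure tensor over a (2r+1)^2 window (Shi-Tomasi)."""
    Ix, Iy = np.gradient(I, axis=1), np.gradient(I, axis=0)
    lam = np.zeros_like(I)
    for y in range(r, I.shape[0] - r):
        for x in range(r, I.shape[1] - r):
            w = np.s_[y - r:y + r + 1, x - r:x + r + 1]
            M = np.array([[(Ix[w] ** 2).sum(), (Ix[w] * Iy[w]).sum()],
                          [(Ix[w] * Iy[w]).sum(), (Iy[w] ** 2).sum()]])
            lam[y, x] = np.linalg.eigvalsh(M)[0]   # smaller eigenvalue
    return lam

def track_point(I1, I2, y, x, r=3):
    """One Lucas-Kanade translation step for the feature at (y, x)."""
    Ix, Iy = np.gradient(I1, axis=1), np.gradient(I1, axis=0)
    It = I2 - I1
    w = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    A = np.stack([Ix[w].ravel(), Iy[w].ravel()], axis=1)
    d, *_ = np.linalg.lstsq(A, -It[w].ravel(), rcond=None)
    return d  # displacement [dx, dy]

# Select strong features in frame 1, then compute each displacement to frame 2
yy, xx = np.mgrid[0:64, 0:64].astype(float)
I1 = np.exp(-((xx - 20) ** 2 + (yy - 30) ** 2) / 40.0)  # a blob-like feature
I2 = np.exp(-((xx - 21) ** 2 + (yy - 30) ** 2) / 40.0)  # moved right by 1 px
lam = min_eig_map(I1)
ys, xs = np.where(lam > 0.8 * lam.max())                # strongest features only
tracks = [(x, y, *track_point(I1, I2, y, x)) for y, x in zip(ys, xs)]
```

Each entry of `tracks` holds a feature position and its estimated displacement, which here should be close to (1, 0) for every selected feature.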
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
Mean Shift Algorithm
A ‘mode seeking’ algorithm, Fukunaga & Hostetler (1975)
Slides credit: K. Kitani

Goal: find the region of highest density
1. Pick a point
2. Draw a window
3. Compute the (weighted) mean
4. Shift the window
5. Repeat steps 3-4 until the window stops moving
Mean Shift Algorithm
Initialize x (the place we start)
Repeat until the shift value becomes really small:
1. Compute the mean-shift:
   – compute the ‘mean’ m(x)
   – compute the ‘shift’ m(x) − x, and update the point: x ← m(x)
Mean Shift
Given a set of points {xs} and a kernel K(x, xs) = g(‖x − xs‖² / h²), find the mean sample point:
  m(x) = Σs K(x, xs) xs / Σs K(x, xs)
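The mean shift loop can be sketched on a set of sample points. This is an illustrative sketch: the Gaussian kernel, the bandwidth h, and the two synthetic clusters are assumptions for the demo.

```python
import numpy as np

def mean_shift(points, x, h=1.0, n_iters=100, tol=1e-5):
    """Repeatedly move x to the kernel-weighted mean of the samples around it."""
    for _ in range(n_iters):
        d2 = np.sum((points - x) ** 2, axis=1)
        k = np.exp(-d2 / (2 * h * h))                     # Gaussian kernel K(x, xs)
        m = (k[:, None] * points).sum(axis=0) / k.sum()   # mean sample point m(x)
        if np.linalg.norm(m - x) < tol:                   # shift became really small
            return m
        x = m                                             # shift the window
    return x

rng = np.random.default_rng(0)
# Two clusters; starting at (1, 1) the procedure climbs to the nearby dense mode
pts = np.vstack([rng.normal([0, 0], 0.3, (200, 2)),
                 rng.normal([5, 5], 0.3, (50, 2))])
mode = mean_shift(pts, x=np.array([1.0, 1.0]))
print(mode)  # close to [0, 0]
```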
Everything up to now has been about distributions over samples…

Dealing with Images
• Pixels form a lattice, so the spatial density is the same everywhere! What can we do?
• Weight each pixel: consider a set of points {xs} with associated weights w(xs)
  Sample mean: m(x) = Σs K(x, xs) w(xs) xs / Σs K(x, xs) w(xs)
  Mean shift: m(x) − x
Mean Shift Algorithm (for images)
Initialize x
1. Compute the mean-shift; for images, each pixel is a point with a weight
2. Update x ← m(x) and repeat until convergence

• Simple mean shift procedure:
  – Compute the mean shift vector m(x)
  – Translate the kernel window by m(x)
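For images, the same loop runs over lattice points carrying per-pixel weights. The sketch below makes illustrative assumptions (Gaussian kernel, bandwidth 5, a single synthetic blob as the weight image); only the weighted-mean update comes from the derivation above.

```python
import numpy as np

def mean_shift_image(weights, x, h=5.0, n_iters=50, tol=1e-3):
    """Mean shift over an image lattice: every pixel is a point with weight w(xs)."""
    yy, xx = np.mgrid[0:weights.shape[0], 0:weights.shape[1]].astype(float)
    pts = np.stack([xx.ravel(), yy.ravel()], axis=1)
    w = weights.ravel()
    for _ in range(n_iters):
        d2 = np.sum((pts - x) ** 2, axis=1)
        k = np.exp(-d2 / (2 * h * h)) * w                 # kernel times pixel weight
        m = (k[:, None] * pts).sum(axis=0) / k.sum()      # weighted sample mean
        if np.linalg.norm(m - x) < tol:
            return m
        x = m                                             # translate the window
    return x

# A weight image with one bright blob: mean shift climbs from (30, 30) to the blob
yy, xx = np.mgrid[0:64, 0:64].astype(float)
w_img = np.exp(-((xx - 40) ** 2 + (yy - 25) ** 2) / 50.0)
peak = mean_shift_image(w_img, x=np.array([30.0, 30.0]))
print(peak)  # close to (x=40, y=25)
```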
Finally… mean shift tracking in video!
Mean shift tracking in video
Goal: find the best candidate location in frame 2
center coordinate of target
center coordinate of candidate
Use the mean shift algorithm
to find the best candidate location
‘candidate’
there are many ‘candidates’ but only one ‘target’
Example: rigid object tracking (and hand tracking)
1. Compute a descriptor for the target
2. Search for a similar descriptor in a neighborhood in the next frame (Target → Candidate)
3. Compute a descriptor for the new target
4. Search again in the next frame, and repeat
How do we model the target and candidate regions?
Modelling the target
M-dimensional target descriptor (centered at the target center):
  q = { q1, …, qM },   qu = C Σs k(‖xs‖²) δ[ b(xs) − u ]
a ‘fancy’ (confusing) way to write a weighted histogram:
  – C: normalization factor
  – Σs: sum over all pixels
  – k(‖xs‖²): kernel weight, a function of (inverse) distance from the center
  – b(xs): quantization function (maps a pixel to a histogram bin)
  – δ: Kronecker delta function
A normalized color histogram, weighted by distance from the center
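The weighted histogram can be sketched for a single-channel patch. Assumptions for the demo: gray values in [0, 1], 16 bins, and an Epanechnikov kernel profile (a common choice here, but any decreasing profile k works).

```python
import numpy as np

def target_histogram(patch, n_bins=16):
    """q_u = C * sum_s k(||x_s||^2) * delta[b(x_s) - u] for a gray patch in [0, 1]."""
    H, W = patch.shape
    yy, xx = np.mgrid[0:H, 0:W].astype(float)
    # squared distance from the patch centre, normalized so r = 1 at the border
    r2 = ((xx - W / 2.0) / (W / 2.0)) ** 2 + ((yy - H / 2.0) / (H / 2.0)) ** 2
    k = np.clip(1.0 - r2, 0.0, None)      # Epanechnikov profile: centre pixels count more
    b = np.minimum((patch * n_bins).astype(int), n_bins - 1)   # quantization b(x_s)
    q = np.bincount(b.ravel(), weights=k.ravel(), minlength=n_bins)
    return q / q.sum()                    # C normalizes the histogram to sum to 1

rng = np.random.default_rng(1)
patch = rng.random((21, 21))              # stand-in for a gray-level target patch
q = target_histogram(patch)
print(q.sum())  # sums to 1 (up to float rounding)
```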
Modelling the candidate
M-dimensional candidate descriptor (centered at location y):
  p(y) = { p1(y), …, pM(y) }
a weighted histogram computed at y
Similarity between the target and candidate
• Distance function: d(y) = √( 1 − ρ(y) )
• Bhattacharyya coefficient: ρ(y) = Σu √( pu(y) qu )
  – just the cosine distance between the two unit vectors √p and √q
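The two formulas translate directly into code; the example histograms below are arbitrary illustrative values.

```python
import numpy as np

def bhattacharyya(p, q):
    """rho(p, q) = sum_u sqrt(p_u * q_u), the cosine between sqrt(p) and sqrt(q)."""
    return float(np.sum(np.sqrt(p * q)))

def hist_distance(p, q):
    """d(p, q) = sqrt(1 - rho(p, q)); zero for identical histograms."""
    return float(np.sqrt(max(0.0, 1.0 - bhattacharyya(p, q))))

q = np.array([0.5, 0.3, 0.2])
print(hist_distance(q, q))                          # ~0: identical histograms
print(hist_distance(q, np.array([0.2, 0.3, 0.5])))  # larger: different histograms
```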
Now we can compute the similarity between a target and multiple candidate regions; plotted over the whole image, this similarity forms a surface, and we want to find its peak
Objective function
• Maximize ρ(p(y), q), assuming a good initial guess y0
• Linearize around the initial guess (Taylor series expansion: the function at the specified value plus its derivative times the change):
  ρ(p(y), q) ≈ (1/2) Σu √( pu(y0) qu ) + (1/2) Σu pu(y) √( qu / pu(y0) )
• Moving terms around, the first term does not depend on the unknown y; remembering the definition of pu(y), the second term is a weighted kernel density estimate:
  (1/2) Σu pu(y) √( qu / pu(y0) ) = (C/2) Σs ws k(‖(y − xs)/h‖²)
  where ws = Σu √( qu / pu(y0) ) δ[ b(xs) − u ]
• The weight ws is bigger when the target value qu is large relative to the candidate value pu(y0)
OK, why are we doing all this math?
• We want to maximize the fully expanded linearized objective, but one term doesn’t depend on the unknown y, so we only need to maximize the weighted kernel density estimate
• What can we use to solve this weighted KDE? The Mean Shift Algorithm!
• The new sample mean of this KDE (the new candidate location) is
  y1 = Σs xs ws g(‖(y0 − xs)/h‖²) / Σs ws g(‖(y0 − xs)/h‖²)
  (this was derived earlier, with g(r) = −k′(r))
Mean Shift Object Tracking
For each frame:
1. Initialize the location y0; compute the candidate histogram p(y0) and similarity ρ(p(y0), q)
2. Derive the weights ws
3. Shift to the new candidate location y1 (mean shift)
4. Compute p(y1) and ρ(p(y1), q)
5. If ‖y1 − y0‖ < ε, return y1; otherwise set y0 ← y1 and go back to step 2
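The per-frame loop can be sketched for a single-channel image. This is a minimal sketch with several assumptions: gray values in [0, 1], 8 bins, an Epanechnikov kernel (whose constant derivative makes g uniform, so the shift is a plain weighted mean of pixel positions), and a synthetic moving blob in place of real frames.

```python
import numpy as np

def histogram(img, cy, cx, r, n_bins=8):
    """Kernel-weighted gray-level histogram of the (2r+1)^2 window at (cy, cx)."""
    win = img[cy - r:cy + r + 1, cx - r:cx + r + 1]
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    k = np.clip(1.0 - (xx ** 2 + yy ** 2) / (r * r), 0.0, None)  # Epanechnikov
    b = np.minimum((win * n_bins).astype(int), n_bins - 1)       # quantization b(x_s)
    h = np.bincount(b.ravel(), weights=k.ravel(), minlength=n_bins)
    return h / h.sum(), b

def mean_shift_track(img, q, cy, cx, r, n_bins=8, n_iters=40):
    """Move (cy, cx) toward the location whose histogram best matches q."""
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    for _ in range(n_iters):
        p, b = histogram(img, int(round(cy)), int(round(cx)), r, n_bins)
        # Weights w_s = sum_u sqrt(q_u / p_u(y0)) * delta[b(x_s) - u]
        w = np.sqrt(q[b] / np.maximum(p[b], 1e-12))
        # New location = weighted mean of pixel positions (uniform g for Epanechnikov)
        dy = (w * yy).sum() / w.sum()
        dx = (w * xx).sum() / w.sum()
        if dy * dy + dx * dx < 1e-4:     # shift became really small
            break
        cy, cx = cy + dy, cx + dx
    return cy, cx

# Synthetic frames: a bright blob moves from (y=30, x=30) to (y=34, x=33)
yy, xx = np.mgrid[0:64, 0:64].astype(float)
f1 = np.exp(-((xx - 30) ** 2 + (yy - 30) ** 2) / 30.0)
f2 = np.exp(-((xx - 33) ** 2 + (yy - 34) ** 2) / 30.0)
q, _ = histogram(f1, 30, 30, r=8)             # target model from frame 1
cy, cx = mean_shift_track(f2, q, 30.0, 30.0, r=8)
print(round(cy), round(cx))  # lands near the blob's new center (34, 33)
```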
Mean Shift Object Tracking: example
1. Compute a descriptor for the target
2. Search for a similar descriptor in a neighborhood in the next frame (Target → Candidate)
3. Compute a descriptor for the new target
4. Search again in the next frame, and repeat
Modern trackers
https://youtu.be/9zoJP2FkpgU?t=7