Changjae Oh
Computer Vision
– Tracking: Image Alignment –
Semester 1, 22/23
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
Objectives
• To review Lucas-Kanade optical flow in EBU6230
• To understand Lucas-Kanade image alignment
• To understand the relationship between Lucas-Kanade optical flow and image alignment
• To understand Kanade-Lucas-Tomasi tracker
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
Motion Estimation: Gradient method
• Brightness constancy constraint: the intensity of a point is preserved between frames,
$$H(x, y, t) = I(x + \Delta x,\ y + \Delta y,\ t + \Delta t)$$
• Small motion: Δx and Δy are less than 1 pixel
– suppose we take the Taylor series expansion of I:
$$I(x + \Delta x,\ y + \Delta y,\ t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t + \text{higher-order terms}$$
Gradient method
• Spatio-temporal constraint: dropping the higher-order terms and using the brightness constancy constraint gives
$$I_x u + I_y v + I_t = 0$$
• This equation introduces only one constraint
– but the motion vector of a pixel has 2 components (parameters), u and v
– a second constraint is necessary to solve the system
Aperture problem
• The aperture problem
– stems from the need to solve one equation with two unknowns,
which are the two components of optical flow
– it is not possible to estimate both components of the optical flow
from the local spatial and temporal derivatives
• By applying a constraint
– that the optical flow field changes smoothly in a small neighborhood,
it becomes possible to estimate both components of the optical flow,
provided the spatial and temporal derivatives of the image intensity are available
Solving the aperture problem
• How to get more equations for a pixel?
• By applying a constraint
– that the optical flow field changes smoothly in a small neighborhood,
both components of the optical flow can be estimated from the spatial and temporal derivatives of the image intensity
• Lucas–Kanade method
Gradient method
• The Lucas–Kanade method assumes that the displacement of the image contents
between two nearby instants (frames) is small
• Matrix form: applying the constraint to all n pixels 𝐩₁, …, 𝐩ₙ in a window gives A𝐯 = 𝐛,
$$\begin{bmatrix} I_x(\mathbf{p}_1) & I_y(\mathbf{p}_1) \\ \vdots & \vdots \\ I_x(\mathbf{p}_n) & I_y(\mathbf{p}_n) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = - \begin{bmatrix} I_t(\mathbf{p}_1) \\ \vdots \\ I_t(\mathbf{p}_n) \end{bmatrix}$$
Gradient method
• Problem: we have more equations than unknowns
• Solution: solve a least squares problem,
$$\min_{\mathbf{v}} \lVert A\mathbf{v} - \mathbf{b} \rVert^2$$
• The minimum least squares solution is given by the solution of the normal equations
$$(A^\top A)\,\mathbf{v} = A^\top \mathbf{b}$$
• The summations are over all n pixels in the K × K window
• This technique was first proposed by Lucas & Kanade (1981)
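A minimal NumPy sketch of this estimator (the function name, window extraction, and conditioning threshold are illustrative assumptions, not from the slides):

```python
import numpy as np

def lucas_kanade_flow(Ix, Iy, It, y, x, K=15):
    """Estimate the flow v = (u, v) at pixel (y, x) by least squares
    over all pixels in a K x K window, given the spatial/temporal
    derivative images Ix, Iy, It."""
    r = K // 2
    # Stack one constraint row [Ix, Iy] per pixel in the window
    A = np.stack([Ix[y-r:y+r+1, x-r:x+r+1].ravel(),
                  Iy[y-r:y+r+1, x-r:x+r+1].ravel()], axis=1)  # n x 2
    b = -It[y-r:y+r+1, x-r:x+r+1].ravel()                     # n
    ATA = A.T @ A
    # Guard against the aperture problem: A^T A must be well-conditioned
    if np.linalg.cond(ATA) > 1e6:
        return None
    return np.linalg.solve(ATA, A.T @ b)  # solve the normal equations
```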
Lucas-Kanade flow
• When is this solvable?
– AᵀA should be invertible
– the eigenvalues λ₁ and λ₂ of AᵀA should not be too small (noise)
– AᵀA should be well-conditioned: λ₁/λ₂ should not be too large (λ₁ = larger eigenvalue)
– Recall the Harris corner detector: M = AᵀA is the second moment matrix
Interpreting the eigenvalues
Classification of image points using the eigenvalues λ₁, λ₂ of the second moment matrix: both small → uniform region, one large and one small → edge, both large → corner.
• Uniform region
– gradients have small magnitude
– small λ₁, small λ₂
– system is ill-conditioned
• Edge region
– gradients have one dominant direction
– large λ₁, small λ₂
– system is ill-conditioned
• High-texture or corner region
– gradients have different directions, large magnitudes
– large λ₁, large λ₂
– system is well-conditioned
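As a sketch, this classification can be read straight off the patch gradients (the thresholds below are arbitrary illustrative values):

```python
import numpy as np

def classify_patch(Ix, Iy, noise_floor=1e-2, max_ratio=10.0):
    """Classify an image patch from the eigenvalues of its
    second moment matrix M = A^T A."""
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    l2, l1 = np.linalg.eigvalsh(M)   # ascending order, so l1 is larger
    if l2 > noise_floor and l1 / l2 < max_ratio:
        return "corner / high texture (well-conditioned)"
    if l1 > noise_floor:
        return "edge (ill-conditioned)"
    return "uniform region (ill-conditioned)"
```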
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
How can I find the template in the image?
Idea #1: Template Matching
Slow, global solution
Idea #2: Pyramid Template Matching
Faster, locally optimal
Idea #3: Model refinement
Fastest, locally optimal
(when you have a good initial solution)
Some notation before we get into the math…
• 2D image transformation: W(x; p)
• 2D image coordinate: x = (x, y)ᵀ
• Parameters of the transformation: p = (p₁, …, p_N)
• Pixel value at a coordinate: I(x)
• Translation transforms the coordinate as W(x; p) = (x + p₁, y + p₂)ᵀ
• An affine transform of the coordinate uses 6 parameters
• Warped image: I(W(x; p))
• The warp can be written in matrix form when it is linear; the affine warp matrix can also be written as 3×3 when its last row is [0 0 1]
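A tiny sketch of the warped-coordinate notation, assuming one common 6-parameter affine convention (the exact parameter ordering used on the original slides is not recoverable, so this ordering is an assumption):

```python
import numpy as np

def warp_affine(x, p):
    """W(x; p): map coordinate x = (x, y) with affine parameters
    p = (p1, ..., p6). Assumed convention:
    W(x; p) = [[1 + p1, p3, p5], [p2, 1 + p4, p6]] @ [x, y, 1]^T,
    which becomes 3x3 when the row [0, 0, 1] is appended."""
    W = np.array([[1 + p[0], p[2], p[4]],
                  [p[1], 1 + p[3], p[5]]])
    return W @ np.array([x[0], x[1], 1.0])
```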
Image alignment
• Problem definition: given the warped image I(W(x; p)) and the template image T(x),
find the warp parameters p such that the SSD is minimized:
$$\min_{\mathbf{p}} \sum_{\mathbf{x}} \big[ I(W(\mathbf{x}; \mathbf{p})) - T(\mathbf{x}) \big]^2$$
How could you find a solution to this problem?
Image alignment
This is a non-linear (quadratic) function of a non-parametric function!
(The function I is non-parametric)
Hard to optimize.
What can you do to make it easier to solve?
Assume a good initialization, linearize the objective, and update incrementally.
Lucas-Kanade alignment
If you have a good initial guess p (a pretty strong assumption), the warp parameters can be written as p + Δp, where Δp is a small incremental adjustment; this is what we are solving for now:
$$\min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ I(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p})) - T(\mathbf{x}) \big]^2$$
Lucas-Kanade alignment
This is still a non-linear (quadratic) function of a non-parametric function (the function I is non-parametric).
How can we linearize the function I for a really small perturbation of p?
Taylor series approximation!
Lucas-Kanade alignment
Multivariable Taylor series expansion (first-order approximation):
$$I(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p})) \approx I(W(\mathbf{x}; \mathbf{p})) + \nabla I \,\frac{\partial W}{\partial \mathbf{p}}\, \Delta\mathbf{p}$$
where, by the chain rule, ∇I = [∂I/∂x, ∂I/∂y] is the image gradient evaluated at W(x; p); ∇I and ∂W/∂p are the short-hands used below.
Lucas-Kanade alignment
By this linear approximation, the objective becomes
$$\min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ I(W(\mathbf{x}; \mathbf{p})) + \nabla I \,\frac{\partial W}{\partial \mathbf{p}}\, \Delta\mathbf{p} - T(\mathbf{x}) \big]^2$$
Now the function is a linear function of the unknowns Δp.
Lucas-Kanade alignment
Affine transform: assuming the common convention W(x; p) = ((1 + p₁)x + p₃y + p₅, p₂x + (1 + p₄)y + p₆), the rate of change of the warp is the Jacobian, a matrix of partial derivatives:
$$\frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} x & 0 & y & 0 & 1 & 0 \\ 0 & x & 0 & y & 0 & 1 \end{bmatrix}$$
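Under the same assumed parameter ordering, the Jacobian at a pixel is a one-liner:

```python
import numpy as np

def affine_jacobian(x, y):
    """dW/dp at pixel (x, y): a 2 x 6 matrix of partial derivatives
    (valid for the affine convention assumed above)."""
    return np.array([[x, 0, y, 0, 1, 0],
                     [0, x, 0, y, 0, 1]], dtype=float)
```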
Lucas-Kanade alignment
• x : pixel coordinate
• I : image intensity
• W : warp function
• p : warp parameters (6 for affine)
• ∇I : image gradient
• ∂W/∂p : partial derivatives of the warp function (the Jacobian)
• Δp : incremental warp
• T : template image intensity
Summary: Lucas-Kanade alignment
• Difficult non-linear optimization problem: minimize the SSD between the warped image I(W(x; p)) and the template image T(x)
• Assume a known approximate solution p
• Solve for the increment Δp using the Taylor series approximation
• Then update p ← p + Δp
Lucas-Kanade alignment – Solver
OK, so how do we solve this?
Lucas-Kanade alignment – Solver
Moving terms around, the linearized objective takes the form
$$\min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \Big[ \nabla I \,\frac{\partial W}{\partial \mathbf{p}}\, \Delta\mathbf{p} - \big( T(\mathbf{x}) - I(W(\mathbf{x}; \mathbf{p})) \big) \Big]^2$$
Have you seen this form of optimization problem before?
Lucas-Kanade alignment – Solver
How do you solve this? It looks like a least squares problem ‖A𝐱 − 𝐛‖², with constant A, variable 𝐱, and constant 𝐛.
Lucas-Kanade alignment – Solver
The least squares approximation, minimize ‖A𝐱 − 𝐛‖², is solved by the normal equations AᵀA𝐱 = Aᵀ𝐛; it is optimized when 𝐱 = (AᵀA)⁻¹Aᵀ𝐛.
Applied to our task, after applying this solution we get
$$\Delta\mathbf{p} = H^{-1} \sum_{\mathbf{x}} \Big[ \nabla I \,\frac{\partial W}{\partial \mathbf{p}} \Big]^{\top} \big[ T(\mathbf{x}) - I(W(\mathbf{x}; \mathbf{p})) \big], \qquad H = \sum_{\mathbf{x}} \Big[ \nabla I \,\frac{\partial W}{\partial \mathbf{p}} \Big]^{\top} \Big[ \nabla I \,\frac{\partial W}{\partial \mathbf{p}} \Big]$$
Lucas-Kanade alignment – Solver
Summary: take the difficult non-linear optimization problem (SSD between the warped image and the template image), assume a known approximate solution, apply the Taylor series approximation, and solve for the increment Δp with the solution to the least squares approximation above.
This is called Gauss-Newton gradient descent non-linear optimization!
Lucas-Kanade alignment – Algorithm
1. Warp image: compute I(W(x; p)), where x′ = W(x; p) are the coordinates of the warped image
2. Compute error image: T(x) − I(W(x; p))
3. Compute gradient: ∇I (gradients of the warped image)
4. Evaluate Jacobian: ∂W/∂p
5. Compute Hessian: H
6. Compute the increment Δp
7. Update parameters: p ← p + Δp
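A compact NumPy sketch of one iteration of this algorithm, under the same assumed affine convention as above (interpolation and boundary handling are simplified, and the helper name is illustrative):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def lk_affine_step(I, T, p):
    """One Gauss-Newton step of Lucas-Kanade alignment with an affine
    warp W(x; p) = ((1+p1)x + p3*y + p5, p2*x + (1+p4)*y + p6),
    where x is the column and y the row. Returns the updated p."""
    h, w = T.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # 1. Warp image: sample I at W(x; p) with bilinear interpolation
    wx = (1 + p[0]) * xs + p[2] * ys + p[4]
    wy = p[1] * xs + (1 + p[3]) * ys + p[5]
    Iw = map_coordinates(I, [wy, wx], order=1)
    # 2. Error image
    err = T - Iw
    # 3. Gradients of the warped image
    gy, gx = np.gradient(Iw)
    # 4./5. Steepest-descent images (gradient times Jacobian) and Hessian
    sd = np.stack([gx * xs, gy * xs, gx * ys, gy * ys, gx, gy],
                  axis=-1).reshape(-1, 6)
    H = sd.T @ sd
    # 6./7. Solve for the increment and update the parameters
    dp = np.linalg.solve(H, sd.T @ err.ravel())
    return p + dp
```

In practice one would start from p = 0 (or a coarse initial guess) and iterate p = lk_affine_step(I, T, p) until ‖Δp‖ falls below a tolerance.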
L-K motion estimation vs L-K image alignment?
• Relationship
– Lucas-Kanade motion estimation (what we learned in EBU6230) can be seen as a special case of Lucas-Kanade image alignment with a translational warp model
– Translation transforms the coordinate as W(x; p) = (x + p₁, y + p₂)ᵀ, so the Jacobian ∂W/∂p is the identity and the increment Δp reduces to the optical flow vector (u, v); the affine transform of the coordinate is the general case
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
https://www.youtube.com/watch?v=rwIjkECpY0M
Feature-based tracking
• Up to now, we have been aligning entire images
– but we can also track just small image regions
• Questions we need to answer for tracking:
– How should we select features?
– How should we track them from frame to frame?
KLT-tracker: history
Lucas & Kanade (1981). An Iterative Image Registration Technique with an Application to Stereo Vision.
Tomasi & Kanade (1991). Detection and Tracking of Feature Points.
Shi & Tomasi (1994). Good Features to Track.
The original KLT algorithm
KLT-tracker: history
Lucas-Kanade: a method for aligning (tracking) an image patch
Kanade-Lucas-Tomasi: a method for choosing the best feature (image patch) for tracking
How should we select features? How should we track them from frame to frame?
What are good features for tracking?
Intuitively, we want to avoid smooth regions
and edges.
But is there a more principled way to define
good features?
It can be derived from the tracking algorithm itself.
What are good features for tracking?
‘A feature is good if it can be tracked well’
Recall: Lucas-Kanade image alignment
• incremental update: p ← p + Δp
• error function (SSD): Σ [I(W(x; p + Δp)) − T(x)]²
• gradient update: Δp = H⁻¹ Σ [∇I ∂W/∂p]ᵀ [T(x) − I(W(x; p))]
• Hessian matrix: H = Σ [∇I ∂W/∂p]ᵀ [∇I ∂W/∂p]
The stability of the gradient descent iterations depends on inverting the Hessian.
When does the inversion fail?
H is singular. But what does that mean?
Hessian matrix
• Above the noise level: both eigenvalues are large
• Well-conditioned: both eigenvalues have similar magnitude
Hessian matrix
Concrete example: consider the translation model, where the Jacobian is the identity, so
$$H = \sum_{\mathbf{x}} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$
How are the eigenvalues related to image content? When is this singular?
Interpreting eigenvalues
What kind of image patch does each region of the (λ₁, λ₂) plane represent? Both eigenvalues small: a flat patch; one eigenvalue much larger than the other: a 'horizontal' or 'vertical' edge; both large: a textured or corner patch.
What are good features for tracking?
'Big eigenvalues mean the feature is good for tracking.'
KLT algorithm
1. Find corners satisfying min(λ₁, λ₂) > threshold
2. For each corner, compute the displacement to the next frame using the Lucas-Kanade method
3. Store the displacement of each corner and update the corner position
4. (optional) Add more corner points every M frames using step 1
5. Repeat steps 2 to 3 (4)
6. Return long trajectories for each corner point
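This pipeline is also available off the shelf: OpenCV implements Shi-Tomasi corner selection and pyramidal Lucas-Kanade. A sketch (the video filename and parameter values are placeholder assumptions):

```python
import cv2

cap = cv2.VideoCapture("video.mp4")          # placeholder input
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Step 1: corners with large min(lambda1, lambda2) (Shi-Tomasi)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                              qualityLevel=0.01, minDistance=10)
tracks = [[tuple(p)] for p in pts.reshape(-1, 2)]

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Step 2: per-corner displacement via pyramidal Lucas-Kanade
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good = status.ravel() == 1
    # Step 3: keep surviving corners, extend their trajectories
    tracks = [t + [tuple(p)]
              for t, p in zip([t for t, g in zip(tracks, good) if g],
                              nxt.reshape(-1, 2)[good])]
    pts, prev_gray = nxt[good].reshape(-1, 1, 2), gray
# 'tracks' now holds one long trajectory per surviving corner (step 6)
```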
Changjae Oh
Computer Vision
– Mean-shift tracking –
Semester 1, 22/23
• Motion Estimation (Review of EBU6230 content)
• Image Alignment
• Kanade-Lucas-Tomasi (KLT) Tracking
• Mean-shift Tracking
Mean Shift Algorithm
Fukunaga & Hostetler (1975): a 'mode seeking' algorithm. Find the region of highest density:
1. Pick a point
2. Draw a window around it
3. Compute the (weighted) mean of the points in the window
4. Shift the window to the mean
5. Repeat steps 3-4 until convergence
Slides credit: K. Kitani
Mean-Shift Algorithm
Initialize x₀ (the place we start)
1. Compute the mean-shift:
– compute the 'mean' m(x) of the samples around x
– compute the 'shift' m(x) − x
– update the point: x ← m(x)
2. Repeat step 1 until the shift values become really small
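A minimal sketch of this procedure, assuming a Gaussian kernel and bandwidth h (both illustrative choices):

```python
import numpy as np

def mean_shift(points, x0, h=1.0, eps=1e-3):
    """Mode seeking: repeatedly move x to the kernel-weighted mean of
    the samples until the shift becomes really small."""
    x = np.asarray(x0, dtype=float)
    while True:
        # compute the 'mean' (Gaussian-kernel-weighted sample average)
        k = np.exp(-np.sum((points - x) ** 2, axis=1) / (2 * h ** 2))
        m = (k[:, None] * points).sum(axis=0) / k.sum()
        # compute the 'shift'; stop when it is tiny, else update the point
        if np.linalg.norm(m - x) < eps:
            return m
        x = m
```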
Mean-Shift Tracking
Given a set of points {x_s} and a kernel K(x, x_s) = g(‖(x − x_s)/h‖²), find the mean sample point:
$$m(\mathbf{x}) = \frac{\sum_s K(\mathbf{x}, \mathbf{x}_s)\, \mathbf{x}_s}{\sum_s K(\mathbf{x}, \mathbf{x}_s)}$$
Everything up to now has been about
distributions over samples…
Dealing with Images
Pixels form a lattice, so the spatial density is the same everywhere!
What can we do?
Dealing with Images
Consider a set of points {x_s} with associated weights w(x_s). The weighted sample mean is
$$m(\mathbf{x}) = \frac{\sum_s K(\mathbf{x}, \mathbf{x}_s)\, w(\mathbf{x}_s)\, \mathbf{x}_s}{\sum_s K(\mathbf{x}, \mathbf{x}_s)\, w(\mathbf{x}_s)}$$
and the mean shift is m(x) − x.
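One step of this weighted version might look like the following sketch (again a Gaussian kernel; the bandwidth is an arbitrary illustrative value):

```python
import numpy as np

def weighted_mean_shift_step(xs, weights, x, h=10.0):
    """One mean-shift step where every sample x_s carries a weight
    w(x_s), as needed for images (pixels form a uniform lattice, so
    the weights carry all the density information)."""
    k = np.exp(-np.sum((xs - x) ** 2, axis=1) / (2 * h ** 2))
    kw = k * weights
    return (kw[:, None] * xs).sum(axis=0) / kw.sum()
```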
Mean-Shift Algorithm (for images)
Initialize x
1. Compute the mean-shift using the weighted sample mean: for images, each pixel is a point with a weight
2. Update x and repeat until convergence
• Simple mean shift procedure:
– Initialize x
– Compute the mean shift vector m(x) − x
– Translate the kernel window by m(x)
– Repeat until convergence
Finally… mean shift tracking in video!
Mean shift tracking in video
Frame 1: center coordinate of the target. Frame 2: center coordinate of the candidate.
Goal: find the best candidate location in frame 2.
Use the mean shift algorithm
to find the best candidate location
‘candidate’
there are many ‘candidates’ but only one ‘target’
Non-rigid object tracking
hand tracking
Non-rigid object tracking
Compute a descriptor for the target
Non-rigid object tracking
Target Candidate
Search for similar descriptor in neighborhood in next frame
Non-rigid object tracking
Compute a descriptor for the new target
Non-rigid object tracking
Target Candidate
Search for similar descriptor in neighborhood in next frame
How do we model the target and candidate regions?
Modelling the target
M-dimensional target descriptor: a normalized color histogram, weighted by distance from the target center,
$$\mathbf{q} = \{q_1, \ldots, q_M\}, \qquad q_m = C \sum_s k\!\left( \lVert \mathbf{x}_s \rVert^2 \right) \delta\!\left[ b(\mathbf{x}_s) - m \right]$$
where the sum runs over all pixels x_s (coordinates centered at the target center), k weights each pixel by its distance from the center, b(x_s) quantizes the pixel's color into one of the M bins, δ is the Kronecker delta function, and C is the inverse of the normalization constant.
In short: a 'fancy' (confusing) way to write a weighted histogram.
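A sketch of such a weighted histogram for a single-channel patch (the Epanechnikov-style profile, bin count, and quantization rule are illustrative assumptions):

```python
import numpy as np

def weighted_histogram(patch, M=16):
    """Normalized histogram q of a patch, with each pixel weighted by
    its distance from the patch center (Epanechnikov-style profile)."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # squared distance from the patch center, normalized to [0, 1]
    d2 = ((ys - h / 2) / (h / 2)) ** 2 + ((xs - w / 2) / (w / 2)) ** 2
    k = np.maximum(1.0 - d2, 0.0)        # pixels near the edge count less
    b = (patch.astype(float) / 256 * M).astype(int)  # quantize into M bins
    q = np.bincount(b.ravel(), weights=k.ravel(), minlength=M)[:M]
    return q / q.sum()                    # normalization constant C
```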
Modelling the candidate
M-dimensional candidate descriptor: a weighted histogram centered at location y,
$$\mathbf{p}(\mathbf{y}) = \{p_1(\mathbf{y}), \ldots, p_M(\mathbf{y})\}, \qquad p_m(\mathbf{y}) = C_h \sum_s k\!\left( \left\lVert \frac{\mathbf{y} - \mathbf{x}_s}{h} \right\rVert^2 \right) \delta\!\left[ b(\mathbf{x}_s) - m \right]$$
Similarity between the target and candidate
Bhattacharyya coefficient:
$$\rho(\mathbf{y}) = \sum_{m=1}^{M} \sqrt{p_m(\mathbf{y})\, q_m}$$
This is just the cosine similarity between the two unit vectors √p and √q.
Distance function:
$$d(\mathbf{y}) = \sqrt{1 - \rho(\mathbf{y})}$$
Now we can compute the similarity between a target and multiple candidate regions: evaluating ρ over the image gives a similarity surface, and we want to find its peak.
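Both quantities are one-liners; a sketch:

```python
import numpy as np

def bhattacharyya(p, q):
    """Similarity rho between candidate histogram p and target q."""
    return np.sum(np.sqrt(p * q))

def distance(p, q):
    """d(y) = sqrt(1 - rho(y)), the distance the tracker minimizes."""
    return np.sqrt(1.0 - bhattacharyya(p, q))
```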
Objective function
Assuming a good initial guess y₀, linearize ρ around it (Taylor series expansion: the function at the specified value plus its derivative times the change):
$$\rho(\mathbf{p}(\mathbf{y}), \mathbf{q}) \approx \frac{1}{2} \sum_{m=1}^{M} \sqrt{p_m(\mathbf{y}_0)\, q_m} + \frac{1}{2} \sum_{m=1}^{M} p_m(\mathbf{y}) \sqrt{\frac{q_m}{p_m(\mathbf{y}_0)}}$$
Substituting the definition of p_m(y) and moving terms around gives the fully expanded linearized objective:
$$\rho(\mathbf{p}(\mathbf{y}), \mathbf{q}) \approx \frac{1}{2} \sum_{m=1}^{M} \sqrt{p_m(\mathbf{y}_0)\, q_m} + \frac{C_h}{2} \sum_s w_s\, k\!\left( \left\lVert \frac{\mathbf{y} - \mathbf{x}_s}{h} \right\rVert^2 \right), \qquad w_s = \sum_{m=1}^{M} \sqrt{\frac{q_m}{p_m(\mathbf{y}_0)}}\, \delta\!\left[ b(\mathbf{x}_s) - m \right]$$
The first term does not depend on the unknown y. The second term is a weighted kernel density estimate whose weight w_s is bigger when the candidate histogram underestimates the target bin (q_m > p_m(y₀)).
OK, why are we doing all this math?
We want to maximize the fully expanded linearized objective. The first term doesn't depend on the unknown y, so we only need to maximize the second term: a weighted KDE.
What can we use to solve this weighted KDE? The Mean Shift Algorithm!
The new sample mean of this KDE, i.e. the new candidate location (as derived earlier), is
$$\mathbf{y}_1 = \frac{\sum_s \mathbf{x}_s\, w_s\, g\!\left( \left\lVert \frac{\mathbf{y}_0 - \mathbf{x}_s}{h} \right\rVert^2 \right)}{\sum_s w_s\, g\!\left( \left\lVert \frac{\mathbf{y}_0 - \mathbf{x}_s}{h} \right\rVert^2 \right)}, \qquad g(u) = -k'(u)$$
Mean-Shift Object Tracking
For each frame:
1. Initialize the location at the previous estimate y₀ and compute p(y₀)
2. Derive the weights w_s
3. Shift to the new candidate location y₁ (mean shift)
4. Compute p(y₁) and ρ(p(y₁), q)
5. If ‖y₁ − y₀‖ < ε, return y₁; otherwise set y₀ ← y₁ and go back to step 2
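Putting the pieces together, one frame of the tracker might look like the following sketch (it reuses the hypothetical weighted_histogram helper above and the constant profile g induced by the Epanechnikov kernel; the window size and thresholds are illustrative):

```python
import numpy as np

def mean_shift_track(frame, q, y0, win=32, eps=1.0, max_iter=20):
    """Track the target with histogram q for one frame, starting from
    location y0 = (row, col). Uses g = const (Epanechnikov profile)."""
    y = np.asarray(y0, dtype=float)
    for _ in range(max_iter):
        r, c = int(y[0]), int(y[1])
        patch = frame[r - win:r + win, c - win:c + win]
        p = weighted_histogram(patch, M=len(q))           # candidate p(y0)
        # 2. Derive weights: larger where p underestimates q
        w_bins = np.sqrt(q / np.maximum(p, 1e-8))
        b = (patch.astype(float) / 256 * len(q)).astype(int)
        w = w_bins[b]                                     # per-pixel w_s
        # 3. Shift to the new candidate location (weighted mean)
        ys, xs = np.mgrid[r - win:r + win, c - win:c + win]
        y1 = np.array([(w * ys).sum(), (w * xs).sum()]) / w.sum()
        # 5. Stop when the shift is small, otherwise repeat from step 2
        if np.linalg.norm(y1 - y) < eps:
            return y1
        y = y1
    return y
```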
Mean-Shift Object Tracking: example
Compute a descriptor for the target
Mean-Shift Object Tracking: example
Target Candidate
Search for similar descriptor in neighborhood in next frame
Mean-Shift Object Tracking: example
Compute a descriptor for the new target
Mean-Shift Object Tracking: example
Target Candidate
Search for similar descriptor in neighborhood in next frame
Modern trackers
From Mid-level to High-level ?
https://youtu.be/9zoJP2FkpgU?t=7