The goals of segmentation
Separate image into coherent “objects”
[Figure: an example image and a human segmentation of it into objects]
Source: Lana Lazebnik
Using slide sets by Bharath Hariharan and Frank Dellaert
[Figure: input image and its intensity histogram (pixel count vs. intensity), with separate peaks for the black, gray, and white pixels]
These intensities define the three groups.
We could label every pixel in the image according to which of these primary intensities it is closest to.
i.e., segment the image based on the intensity feature.
What if the image isn’t quite so simple?
Image segmentation: toy example
Kristen Grauman
[Figure: input image and its intensity histogram (pixel count vs. intensity)]
Kristen Grauman
Now how to determine the three main intensities that define our groups?
We need to cluster.
Kristen Grauman
[Figure: intensity axis marking the three chosen centers, near 0, 190, and 255]
Goal: choose three “centers” as the representative intensities, and label every pixel according to which of these centers it is nearest to.
Best cluster centers are those that minimize the SSD between all points and their nearest cluster center c_i: Σ_i Σ_{p ∈ cluster i} ‖p − c_i‖²
Kristen Grauman
Grouping by clustering
Idea: embed pixels as points in a (possibly high-dimensional) feature space, e.g., 3 dimensions for color
Each pixel is a point
Group together points
Slide by Bharath Hariharan
K-means
Assumption: each group is a Gaussian with different means and same standard deviation
Suppose we know all the means μ_j. Which group should a point x_i belong to?
The j with the highest P(x_i | μ_j)
= the j with the smallest ‖x_i − μ_j‖²
Slide by Bharath Hariharan
K-means
Assumption: each group = a Gaussian with different means and same standard deviation
If means are known, then trivial assignment to groups. How?
Assign data point to nearest mean!
Slide by Bharath Hariharan
K-means
Problem: means are not known
What if we know a set of points from each cluster?
Suppose points x_{k1}, x_{k2}, …, x_{kn} belong to cluster k
What should the mean μ_k be?
Slide by Bharath Hariharan
K-means
Problem: means are not known
If assignment of points to clusters is known, then finding means is easy
How? Compute the mean of every cluster: μ_k = (x_{k1} + x_{k2} + … + x_{kn}) / n
Slide by Bharath Hariharan
K-means
Given means, can assign points to clusters
Given assignments, can compute means
Idea: iterate!
Slide by Bharath Hariharan
K-means
Step 1: Randomly pick k centers
Slide by Bharath Hariharan
K-means
Step 2: Assign each point to nearest center
Slide by Bharath Hariharan
K-means
Step 3: re-estimate centers
Slide by Bharath Hariharan
K-means
Step 4: Repeat until convergence
Slide by Bharath Hariharan
K-means
Ground-truth vs k-means
Ground truth
K-means 100 iterations
Slide by Bharath Hariharan
K-means – another example
Slide by Bharath Hariharan
K-means
Input: set of data points, k
Randomly pick k points as means
For i in [0, maxiters]:
    Assign each point to nearest center
    Re-estimate each center as mean of points assigned to it
Slide by Bharath Hariharan
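A minimal NumPy sketch of this loop (the function name, the convergence check, and the empty-cluster handling are my own choices, not from the slides):

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Plain k-means on an (N, D) array of points."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k data points as the initial means.
    means = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: re-estimate each center as the mean of its assigned points
        # (keep the old center if a cluster ends up empty).
        new_means = np.array([points[labels == j].mean(axis=0)
                              if np.any(labels == j) else means[j]
                              for j in range(k)])
        # Step 4: repeat until the means stop moving.
        if np.allclose(new_means, means):
            break
        means = new_means
    return means, labels
```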
K-means – the math
Input: set of data points {x_i}, k
Randomly pick k points as means
For iteration in [0, maxiters]:
    Assign each point to nearest center: y_i = argmin_j ‖x_i − μ_j‖²
    Re-estimate each center as the mean of the points assigned to it: μ_j = (Σ_{i: y_i = j} x_i) / (Σ_{i: y_i = j} 1)
Slide by Bharath Hariharan
K-means – the math
An objective function that must be minimized:
min_{μ, y} Σ_i ‖x_i − μ_{y_i}‖²
Every iteration of k-means takes a downward step on this objective:
The assignment step fixes μ and sets y to minimize the objective
The update step fixes y and sets μ to minimize the objective
Slide by Bharath Hariharan
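One step worth spelling out: with the assignments y held fixed, setting the gradient of the objective with respect to μ_j to zero recovers the per-cluster mean, which is why the update step is simply averaging (a standard derivation, not shown on the slides):

```latex
\frac{\partial}{\partial \mu_j} \sum_{i:\, y_i = j} \|x_i - \mu_j\|^2
  = -2 \sum_{i:\, y_i = j} (x_i - \mu_j) = 0
\quad\Longrightarrow\quad
\mu_j = \frac{\sum_{i:\, y_i = j} x_i}{\sum_{i:\, y_i = j} 1}
```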
K-means on image pixels
Slide by Bharath Hariharan
K-means on image pixels
Picture courtesy David Forsyth
One of the clusters from k-means
Slide by Bharath Hariharan
K-means on image pixels
What is wrong?
It ignores pixel position:
Nearby pixels are likely to belong to the same object
Far-away pixels are likely to belong to different objects
How do we incorporate pixel position?
Instead of representing each pixel as (r,g,b)
Represent each pixel as (r,g,b,x,y)
Slide by Bharath Hariharan
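A small sketch of building such features; the position weight lam (how strongly x and y count relative to color) is an illustrative knob, not something the slide specifies. It reuses the kmeans sketch from above:

```python
import numpy as np

def color_position_features(image, lam=1.0):
    """Per-pixel (r, g, b, lam*x, lam*y) features for an (H, W, 3) image."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]                      # pixel coordinates
    feats = np.column_stack([image.reshape(-1, 3).astype(float),
                             lam * xs.reshape(-1),   # x, scaled by lam
                             lam * ys.reshape(-1)])  # y, scaled by lam
    return feats                                     # shape (H*W, 5)

# Example: cluster on the 5-D features, then reshape the labels into a mask.
# means, labels = kmeans(color_position_features(img, lam=0.5), k=5)
# segmentation = labels.reshape(img.shape[:2])
```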
K-means on image pixels+position
Slide by Bharath Hariharan
The issues with k-means
Captures pixel similarity but
Doesn’t capture continuity of contours
Captures near/far relationships only weakly
Can merge far away objects together
Requires knowledge of k!
Can it deal with texture?
Slide by Bharath Hariharan
What is texture?
Hard to define, ambiguous concept
Some sort of pattern consisting of repeating elements
That we perceive as a pattern rather than individual elements
Often an indicator of:
Material: fur, sand, grass
Shape
Slide by Bharath Hariharan
Textures
Terrycloth
Rough Plastic
Plaster-b
Sponge
Rug-a
Painted Spheres
Columbia-Utrecht Database (http://www.cs.columbia.edu/CAVE)
Slide by Bharath Hariharan
Textures
A large collection of objects (birds/leaves) can also appear as texture
Slide by Bharath Hariharan
Texture edges
When can we detect texture boundaries?
Texture boundary
Slide by Bharath Hariharan
Julesz’s texton theory
Human vision operates in two distinct modes:
Pre-attentive vision – parallel, instantaneous
Attentive vision – serial search by focusing on individual things
Texture discrimination occurs in the pre-attentive mode
We don’t look at individual patterns but at statistics of the region
What kind of statistics?
Not just average color
But density of certain elements – “textons”
Slide adapted from Jitendra Malik
Julesz’s texton theory
Textons are:
Elongated blobs – e.g. rectangles, ellipses, line segments with specific orientations, widths and lengths
Terminators – ends of line segments
Crossings of line segments
Julesz arrived at these by experimenting on which textures were distinguishable
Slide adapted from Jitendra Malik
Distinguishable textures
Slide adapted from Jitendra Malik
Distinguishable textures
Slide adapted from Jitendra Malik
Indistinguishable textures
Slide adapted from Jitendra Malik
How do we define textons?
Use a filter bank (i.e., a set of filters) to detect oriented edges, spots, etc.
Identify repeated structures
Consider filter bank responses as “features” of a patch
Cluster patches: cluster centers form textons
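A rough sketch of this pipeline, using Laplacian-of-Gaussian ("spot") and oriented Gaussian-derivative ("edge") filters as a stand-in for a full filter bank; the specific filters, scales, and SciPy calls are my assumptions rather than the bank used on the slides:

```python
import numpy as np
from scipy import ndimage

def filter_bank_responses(gray, sigmas=(1.0, 2.0), n_orients=4):
    """Per-pixel responses to spot and oriented-edge filters on a float grayscale image."""
    responses = []
    for sigma in sigmas:
        # Center-surround ("spot") response: Laplacian of Gaussian.
        responses.append(ndimage.gaussian_laplace(gray, sigma))
        # Gaussian derivatives along x and y ...
        dx = ndimage.gaussian_filter(gray, sigma, order=(0, 1))
        dy = ndimage.gaussian_filter(gray, sigma, order=(1, 0))
        # ... combined into oriented edge responses.
        for i in range(n_orients):
            theta = i * np.pi / n_orients
            responses.append(np.cos(theta) * dx + np.sin(theta) * dy)
    return np.stack(responses, axis=-1)              # shape (H, W, n_filters)

# Cluster the response vectors: the cluster centers are the textons, and each
# pixel is then labeled with the index of its nearest texton.
# resp = filter_bank_responses(gray.astype(float))
# textons, texton_map = kmeans(resp.reshape(-1, resp.shape[-1]), k=16)
```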
2D Textons
Goal: find canonical local features in a texture
1) Filter the image with a bank of linear filters
2) Run k-means on the filter outputs
3) The k-means centers are the textons
The spatial distribution of textons defines the texture
Slide adapted from Jitendra Malik
Texton Labeling
Each pixel is labeled with the texton i (1 to K) that it is most similar to in appearance
Similarity is measured by the Euclidean distance between filter responses
K-means: pros and cons
Pros
Simple, fast to compute
Converges to local minimum of within-cluster squared error
Cons/issues
Setting k?
Sensitive to initial centers
Sensitive to outliers
Detects spherical clusters
Assuming means can be computed
Slide credit: Kristen Grauman
The mean shift algorithm seeks modes or local maxima of density in the feature space
Mean shift algorithm
[Figure: an image and its feature space of L*u*v* color values]
Slide credit: Kristen Grauman
Mean shift
[Animation: starting from an initial search window, the window is repeatedly moved to the center of mass of the points it contains; the displacement at each step is the mean shift vector, and the window converges on a mode of the density.]
Slide by Y. Ukrainitz & B. Sarel
Cluster: all data points in the attraction basin of a mode
Attraction basin: the region for which all trajectories lead to the same mode
Mean shift clustering
Slide by Y. Ukrainitz & B. Sarel
Find features (color, gradients, texture, etc)
Initialize windows at individual feature points
Perform mean shift for each window until convergence
Merge windows that end up near the same “peak” or mode
Mean shift clustering/segmentation
Slide credit: Kristen Grauman
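A small sketch of this procedure with a flat (uniform) circular window; the bandwidth, convergence tolerance, and mode-merging radius are illustrative assumptions:

```python
import numpy as np

def mean_shift(points, bandwidth, max_iters=100, tol=1e-3):
    """Move every point uphill to a density mode, then merge nearby modes."""
    points = points.astype(float)
    modes = points.copy()                 # one window initialized per point
    for _ in range(max_iters):
        largest_shift = 0.0
        for i in range(len(modes)):
            # Center of mass of the points inside the search window.
            in_window = np.linalg.norm(points - modes[i], axis=1) < bandwidth
            new_pos = points[in_window].mean(axis=0)
            # The displacement is the mean shift vector.
            largest_shift = max(largest_shift, np.linalg.norm(new_pos - modes[i]))
            modes[i] = new_pos
        if largest_shift < tol:           # every window has stopped moving
            break
    # Merge windows that end up near the same mode: one cluster per mode.
    centers, labels = [], np.zeros(len(points), dtype=int)
    for i, m in enumerate(modes):
        for c, center in enumerate(centers):
            if np.linalg.norm(m - center) < bandwidth / 2:
                labels[i] = c
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels
```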
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean shift segmentation results
Slide credit: Kristen Grauman
Mean shift segmentation results
Slide credit: Kristen Grauman
Mean shift
Pros:
Does not assume any particular shape for the clusters
One parameter choice (window size)
Generic technique
Finds multiple modes
Cons:
Selection of window size
Does not scale well with dimension of feature space
Kristen Grauman
Superpixels
Separate image into coherent “objects”
Group together similar-looking pixels for efficiency of further processing
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
Source: Lana Lazebnik
SLIC Superpixels
Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk, SLIC Superpixels Compared to State-of-the-art Superpixel Methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, num. 11, p. 2274 – 2282, May 2012.
Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk, SLIC Superpixels, EPFL Technical Report no. 149300, June 2010.
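For reference, scikit-image ships an implementation of SLIC; a minimal usage sketch (the parameter values are illustrative):

```python
from skimage import data
from skimage.segmentation import slic, mark_boundaries

img = data.astronaut()                      # any RGB image works here
# n_segments ~ target number of superpixels; compactness trades off color
# similarity against spatial proximity in the SLIC distance.
labels = slic(img, n_segments=250, compactness=10)
outlined = mark_boundaries(img, labels)     # draw superpixel boundaries for display
```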
Graph-based segmentation
Define a node for each pixel (or superpixel) in the graph
Define a similarity metric between neighboring pixels (for example, color similarity) and add the similarities as weighted edges between nodes in the graph
Solve…
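A hedged sketch of the graph construction described above, using a 4-connected pixel grid and Gaussian color-similarity weights (the weight formula and sigma are my assumptions); the "Solve…" step is left out since the slide does not specify a particular solver:

```python
import numpy as np
from scipy.sparse import coo_matrix

def pixel_affinity_graph(image, sigma=0.1):
    """4-connected pixel graph with color-similarity edge weights.

    `image` is assumed to be float RGB in [0, 1]; `sigma` controls how quickly
    the edge weight falls off as neighboring colors differ.
    """
    h, w, _ = image.shape
    idx = np.arange(h * w).reshape(h, w)          # node index of each pixel
    rows, cols, weights = [], [], []
    for di, dj in [(0, 1), (1, 0)]:               # right and down neighbors
        a = idx[: h - di, : w - dj].ravel()
        b = idx[di:, dj:].ravel()
        diff = (image[: h - di, : w - dj] - image[di:, dj:]).reshape(-1, 3)
        wgt = np.exp(-np.sum(diff ** 2, axis=1) / (2 * sigma ** 2))
        rows += [a, b]                            # add each edge in both directions
        cols += [b, a]
        weights += [wgt, wgt]
    rows, cols, weights = map(np.concatenate, (rows, cols, weights))
    return coo_matrix((weights, (rows, cols)), shape=(h * w, h * w))
```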
Required reading
Further MATLAB
https://www.mathworks.com/help/images/ref/imsegkmeans.html
Further Reading
https://docs.opencv.org/3.4/d8/d83/tutorial_py_grabcut.html
Further reading
The introduction section of:
http://yaksoy.github.io/papers/ETH19-PhD-Aksoy.pdf
K-means equations (reference)
P(x_i \mid \mu_j) \propto e^{-\frac{1}{2\sigma^2}\|x_i - \mu_j\|^2}
\mu_k = \frac{x_{k_1} + x_{k_2} + \dots + x_{k_n}}{n} \quad \text{for points } x_{k_1}, x_{k_2}, \dots, x_{k_n} \text{ in cluster } k
y_i = \arg\min_j \|x_i - \mu_j\|^2
\mu_j = \frac{\sum_{i:\, y_i = j} x_i}{\sum_{i:\, y_i = j} 1}
\min_{\mu,\, y} \sum_i \|x_i - \mu_{y_i}\|^2