[06-30213][06-30241][06-25024]
Computer Vision and Imaging & Robot Vision
Dr Hyung Jin Chang (h.j.chang@bham.ac.uk), Dr Yixing Gao (y.gao.8@bham.ac.uk)
School of Computer Science

SEGMENTATION & GROUPING (Szeliski 5.2–5.4)
Features and filters
Transforming and describing images; textures, colors, edges
Slide credit: Kristen Grauman
Grouping and fitting
Clustering, segmentation, fitting; what parts belong together?
[fig from Shi et al]
Slide credit: Kristen Grauman
Outline
• What are grouping problems in vision?
• Bottom-up segmentation via clustering
– Algorithms:
• Mode finding and mean shift: k-means, mean-shift
• Graph-based: normalized cuts
– Features: color, texture, …
• Quantization for texture summaries
Slide credit: Kristen Grauman
Grouping in vision
• Goals:
– Gather features that belong together
– Obtain an intermediate representation that compactly describes key image or video parts
Slide credit: Kristen Grauman
Examples of grouping in vision
• Determine image regions [Figure by J. Shi]
• Group video frames into shots [http://poseidon.csd.auth.gr/LAB_RESEARCH/Latest/imgs/SpeakDepVidIndex_img2.jpg]
• Object-level grouping [Figure by Grauman & Darrell]
• Figure-ground (Fg / Bg) [Figure by Wang & Suter]
Slide credit: Kristen Grauman
Grouping in vision
• Goals:
– Gather features that belong together
– Obtain an intermediate representation that compactly describes key image (video) parts
• Top-down vs. bottom-up segmentation
– Top-down: pixels belong together because they are from the same object
• Hard to measure success
• What is interesting depends on the application
– Bottom-up: pixels belong together because they look similar
Slide credit: Kristen Grauman
Outline
• Bottom-up segmentation via clustering
– Algorithms:
• Mode finding and mean shift: k-means, mean-shift
• Graph-based: normalized cuts
– Features: color, texture, …
• Quantization for texture summaries
Slide credit: Kristen Grauman
The goals of segmentation
Separate image into coherent “objects”
[Figure: input image and human segmentations]
Source: Lana Lazebnik
The goals of segmentation
Separate image into coherent “objects”
Group together similar-looking pixels for efficiency of further processing: “superpixels”
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
Source: Lana Lazebnik
Image segmentation: toy example
[Figure: input image (black, gray, and white regions) and its intensity histogram (pixel count vs. intensity) with three peaks: black pixels, gray pixels, white pixels]
• These intensities define the three groups.
• We could label every pixel in the image according to which of these primary intensities it is.
• i.e., segment the image based on the intensity feature (see the sketch below).
• What if the image isn’t quite so simple?
Slide credit: Kristen Grauman
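To make that labeling step concrete, here is a minimal NumPy sketch that assigns each pixel to the nearest of three given intensity centers; the default center values are illustrative (the peaks sketched on a later slide), not fixed by the slides.

```python
import numpy as np

def label_by_intensity(image, centers=(0, 190, 255)):
    """Label each pixel with the index of its nearest intensity center."""
    pixels = image.reshape(-1, 1).astype(float)
    # Distance from every pixel to every candidate center.
    dists = np.abs(pixels - np.asarray(centers, dtype=float))
    return dists.argmin(axis=1).reshape(image.shape)
```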
[Figures: further input images with their intensity histograms (pixel count vs. intensity)]
Slide credit: Kristen Grauman
• Now how to determine the three main intensities that define our groups?
• We need to cluster.
Slide credit: Kristen Grauman
[Figure: intensity histogram with three marked cluster centers near 0, 190, and 255]
• Goal: choose three “centers” as the representative intensities, and label every pixel according to which of these centers it is nearest to.
• Best cluster centers are those that minimize SSD (sum of squared differences) between all points and their nearest cluster center $c_i$:
$$\sum_{\text{clusters } i}\;\sum_{x \in \text{cluster } i} \|x - c_i\|^2$$
Slide credit: Kristen Grauman
Clustering
• With this objective, it is a “chicken and egg” problem:
– If we knew the cluster centers, we could allocate points to groups by assigning each to its closest center.
– If we knew the group memberships, we could get the centers by computing the mean per group.
Slide credit: Kristen Grauman
K-means clustering
• Basic idea: randomly initialize the k cluster centers, and iterate between the two steps we just saw.
1. Randomly initialize the cluster centers, c1, …, cK
2. Given cluster centers, determine points in each cluster
• For each point p, find the closest ci. Put p into cluster i
3. Given points in each cluster, solve for ci
• Set ci to be the mean of points in cluster i
4. If the ci have changed, repeat Step 2
Properties
• Will always converge to some solution
• Can be a “local minimum”: does not always find the global minimum of the objective function
$$\sum_{\text{clusters } i}\;\sum_{x \in \text{cluster } i} \|x - c_i\|^2$$
(a NumPy sketch follows the illustrations below)
Source: Steve Seitz
[Figures: k-means iterations on example 2-d data. Slides by Andrew Moore]
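As referenced above, a minimal NumPy sketch of the k-means loop; the initialization scheme, iteration cap, and convergence test are illustrative choices rather than anything prescribed by the slides.

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Minimal k-means. points: (N, D) array of feature vectors."""
    rng = np.random.default_rng(seed)
    # 1. Randomly initialize the cluster centers from the data points.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # 2. Assign each point to its closest center (squared distance).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # 3. Recompute each center as the mean of its assigned points.
        new_centers = np.array([
            points[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
            for i in range(k)
        ])
        # 4. Stop once the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

On the toy image, `kmeans(image.reshape(-1, 1).astype(float), k=3)` recovers the three intensity groups.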
K-means clustering
• Demo: http://web.stanford.edu/class/ee103/visualizations/kmeans/kmeans.html
Slide credit: Kristen Grauman
K-means: pros and cons
Pros
• Simple, fast to compute
• Converges to local minimum of within-cluster squared error
Cons/issues
• Setting k?
• Sensitive to initial centers
• Sensitive to outliers
• Only finds spherical clusters
• Assumes means can be computed
Slide credit: Kristen Grauman
An aside: Smoothing out cluster assignments
• Assigning a cluster label per pixel may yield outliers.
• How to ensure the labels are spatially smooth?
[Figure: original image vs. image labeled by each cluster center’s intensity, with isolated mislabeled pixels]
Slide credit: Kristen Grauman
Segmentation as clustering
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on intensity similarity
Feature space: intensity value (1-d)
Slide credit: Kristen Grauman
[Figure: quantization of the feature space; segmentation label maps for K=2 and K=3]
Slide credit: Kristen Grauman
Segmentation as clustering
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on color similarity
[Figure: pixels plotted in R-G-B color space, e.g. (R=255, G=200, B=250), (R=245, G=220, B=248), (R=15, G=189, B=2), (R=3, G=12, B=2)]
Feature space: color value (3-d)
Slide credit: Kristen Grauman
Segmentation as clustering
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on intensity similarity
Clusters based on intensity similarity don’t have to be spatially coherent.
Slide credit: Kristen Grauman
Segmentation as clustering
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on intensity+position similarity
[Figure: feature space with Intensity, X, and Y axes]
Both regions are black, but if we also include position (x, y), we can group the two into distinct segments; this is a way to encode both similarity and proximity (see the sketch below).
Slide credit: Kristen Grauman
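As an illustration of clustering on a joint feature space, a minimal sketch that segments a grayscale image by running k-means (the function sketched earlier) on per-pixel (intensity, x, y) vectors; the position scaling below is an illustrative knob, not a value from the slides.

```python
import numpy as np

def segment_intensity_position(image, k=3, position_weight=0.5):
    """Cluster pixels on (intensity, x, y) features with k-means.

    image: 2-D array of intensities in [0, 1].
    position_weight: trades off spatial proximity vs. intensity similarity.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # One feature vector per pixel: intensity plus scaled, normalized position.
    features = np.stack([
        image.ravel().astype(float),
        position_weight * xs.ravel() / w,
        position_weight * ys.ravel() / h,
    ], axis=1)
    _, labels = kmeans(features, k)  # kmeans as sketched earlier
    return labels.reshape(h, w)
```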
Segmentation as clustering
• Color, brightness, position alone are not enough to distinguish all regions…
Slide credit: Kristen Grauman
Segmentation as clustering
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on texture similarity
[Figure: a filter bank of 24 filters F1, F2, …, F24 applied to the image]
Feature space: filter bank responses (e.g., 24-d)
Slide credit: Kristen Grauman
Recall: texture representation example
[Figure: original image; derivative filter responses, squared; statistics to summarize patterns in small windows]

          mean d/dx value   mean d/dy value
Win. #1          4                10
Win. #2         18                 7
…                …                 …
Win. #9         20                20

Slide credit: Kristen Grauman
Recall: texture representation example
[Scatter plot: windows plotted by Dimension 1 (mean d/dx value) vs. Dimension 2 (mean d/dy value), separating windows with primarily vertical edges, windows with primarily horizontal edges, windows with both, and windows with small gradient in both directions]
statistics to summarize patterns in small windows (see the sketch below)
Slide credit: Kristen Grauman
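A minimal sketch of computing these two per-window statistics; the window size and the simple finite-difference derivative filters are illustrative stand-ins for the larger filter banks above.

```python
import numpy as np

def texture_features(image, win=16):
    """Mean squared d/dx and d/dy response per non-overlapping window."""
    img = np.asarray(image, dtype=float)
    # Simple finite differences as stand-ins for derivative filters.
    dx = np.zeros_like(img); dx[:, 1:] = img[:, 1:] - img[:, :-1]
    dy = np.zeros_like(img); dy[1:, :] = img[1:, :] - img[:-1, :]
    h, w = img.shape
    feats = []
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            block = (slice(r, r + win), slice(c, c + win))
            # 2-d descriptor: mean squared response in each direction.
            feats.append([(dx[block] ** 2).mean(), (dy[block] ** 2).mean()])
    return np.array(feats)
```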
Image segmentation example
Slide credit: Kristen Grauman
Color vs. texture
These look very similar in terms of their color distributions (histograms).
How would their texture distributions compare?
Slide credit: Kristen Grauman
Outline
• What are grouping problems in vision?
• Bottom-up segmentation via clustering
– Algorithms:
• Mode finding and mean shift: k-means, mean-shift
• Graph-based: normalized cuts
– Features: color, texture, …
• Quantization for texture summaries
Slide credit: Kristen Grauman
K-means: pros and cons
Pros
• Simple, fast to compute
• Converges to local minimum of within-cluster squared error
Cons/issues
• Setting k?
• Sensitive to initial centers
• Sensitive to outliers
• Only finds spherical clusters
• Assumes means can be computed
Slide credit: Kristen Grauman
Mean shift algorithm
• The mean shift algorithm seeks modes, or local maxima of density, in the feature space
[Figure: image and its feature space (L*u*v* color values)]
Slide credit: Kristen Grauman
Mean shift
[Animation: the search window repeatedly moves along the mean shift vector to the center of mass of the points inside it, until it converges on a mode]
Slide by Y. Ukrainitz & B. Sarel
Mean shift clustering
• Cluster: all data points in the attraction basin of a mode
• Attraction basin: the region for which all trajectories lead to the same mode
Slide by Y. Ukrainitz & B. Sarel
Mean shift clustering/segmentation
• Find features (color, gradients, texture, etc.)
• Initialize windows at individual feature points
• Perform mean shift for each window until convergence
• Merge windows that end up near the same “peak” or mode (a minimal sketch follows below)
Slide credit: Kristen Grauman
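As referenced above, a minimal sketch of mean shift clustering with a flat (uniform) kernel; the bandwidth, iteration cap, and merge threshold are illustrative parameters.

```python
import numpy as np

def mean_shift(points, bandwidth=1.0, n_iters=50, merge_tol=0.1):
    """Mean shift with a flat kernel. points: (N, D) feature array."""
    modes = points.astype(float)
    for _ in range(n_iters):
        for i, m in enumerate(modes):
            # Center of mass of all points inside the search window.
            in_window = np.linalg.norm(points - m, axis=1) < bandwidth
            if in_window.any():
                modes[i] = points[in_window].mean(axis=0)
    # Merge windows that converged near the same peak (mode).
    centers, labels = [], np.zeros(len(points), dtype=int)
    for i, m in enumerate(modes):
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) < merge_tol:
                labels[i] = j
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels
```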
Mean shift segmentation results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Slide credit: Kristen Grauman
Mean shift segmentation results
Slide credit: Kristen Grauman
Mean shift
• Pros:
– Does not assume a particular shape for the clusters
– Only one parameter choice (window size)
– Generic technique
– Finds multiple modes
• Cons:
– Selection of window size
– Does not scale well with dimension of feature space
Slide credit: Kristen Grauman
Outline
• What are grouping problems in vision?
• Bottom-up segmentation via clustering
– Algorithms:
• Mode finding and mean shift: k-means, mean-shift
• Graph-based: normalized cuts
– Features: color, texture, …
• Quantization for texture summaries
Slide credit: Kristen Grauman
Images as graphs
[Figure: fully-connected graph over pixels p, q with affinity weight wpq]
Fully-connected graph
• node (vertex) for every pixel
• link between every pair of pixels, p, q
• affinity weight wpq for each link (edge)
– wpq measures similarity
– similarity is inversely proportional to difference (in color and position, …)
Source: Steve Seitz
Measuring affinity
• One possibility:
$$w_{pq} = \exp\!\left(-\frac{\operatorname{dist}(p,q)^2}{2\sigma^2}\right)$$
• Small sigma: group only nearby points
• Large sigma: group distant points
Slide credit: Kristen Grauman
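A minimal sketch of building this affinity matrix over a set of per-pixel feature vectors (e.g., stacked color and position); the default sigma is illustrative.

```python
import numpy as np

def affinity_matrix(features, sigma=0.2):
    """Gaussian affinity w_pq = exp(-dist(p, q)^2 / (2 sigma^2))."""
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2 * sigma ** 2))
```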
Measuring affinity
[Figure: data points and their affinity matrices for σ = .1, σ = .2, σ = 1]
Slide credit: Kristen Grauman
Segmentation by Graph Cuts
[Figure: graph over pixels p, q with weights wpq, broken into segments A, B, C]
Break the graph into segments:
• Want to delete links that cross between segments
• Easiest to break links that have low similarity (low weight)
– similar pixels should be in the same segments
– dissimilar pixels should be in different segments
Source: Steve Seitz
Cuts in a graph: Min cut
[Figure: graph partitioned into sets A and B by a link cut]
Link cut
• set of links whose removal makes a graph disconnected
• cost of a cut:
$$\text{cut}(A,B) = \sum_{p \in A,\, q \in B} w_{p,q}$$
Find the minimum cut
• gives you a segmentation
• fast algorithms exist for doing this
Source: Steve Seitz
Minimum cut
• Problem with minimum cut:
Weight of cut proportional to number of edges in the cut; tends to produce small, isolated components.
[Shi & Malik, 2000 PAMI]
Slide credit: Kristen Grauman
Cuts in a graph: Normalized cut
[Figure: graph partitioned into sets A and B]
Normalized cut
• fix the bias of min cut by normalizing for the size of the segments:
$$\text{Ncut}(A,B) = \frac{\text{cut}(A,B)}{\text{assoc}(A,V)} + \frac{\text{cut}(A,B)}{\text{assoc}(B,V)}$$
where assoc(A, V) = sum of weights of all edges that touch A
• Ncut value is small when we get two clusters, each with many high-weight edges inside it and only a few low-weight edges between them
• Approximate solution for minimizing the Ncut value: a generalized eigenvalue problem (see the sketch below)
J. Shi and J. Malik, Normalized Cuts and Image Segmentation, CVPR 1997
Source: Steve Seitz
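As referenced above, a minimal sketch of the two-way normalized-cut approximation: solve the generalized eigenvalue problem (D - W)y = λDy from Shi & Malik and threshold the eigenvector with the second-smallest eigenvalue; thresholding at zero is one common, illustrative choice.

```python
import numpy as np
from scipy.linalg import eigh

def normalized_cut(W):
    """Two-way partition from the affinity matrix W (N, N).

    Relaxed Ncut: solve (D - W) y = lambda * D y and threshold the
    eigenvector with the second-smallest eigenvalue.
    """
    d = W.sum(axis=1)   # degree of each node
    D = np.diag(d)      # assumes no isolated nodes (all d > 0)
    # eigh solves the generalized symmetric problem, eigenvalues ascending.
    _, eigvecs = eigh(D - W, D)
    y = eigvecs[:, 1]   # the first eigenvector is the trivial constant one
    return y > 0        # boolean segment membership
```

Recursively re-partitioning each side (or using further eigenvectors) yields more than two segments.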
Example results
Slide credit: Kristen Grauman
Results: Berkeley Segmentation Engine
Slide credit: Kristen Grauman
Normalized cuts: pros and cons
Pros:
• Generic framework, flexible to choice of function that computes weights (“affinities”) between nodes
• Does not require model of the data distribution
Cons:
• Time complexity can be high
– Dense, highly connected graphs → many affinity computations
– Solving the eigenvalue problem is expensive
• Preference for balanced partitions
Slide credit: Kristen Grauman
Efficient graph-based segmentation
• Runs in time nearly linear in the number of edges
• Easy to control coarseness of segmentations
• Results can be unstable
P. Felzenszwalb and D. Huttenlocher, Efficient Graph-Based Image Segmentation, IJCV 2004
Segmentation for efficiency: “superpixels”
[Felzenszwalb and Huttenlocher 2004]
[Hoiem et al. 2005, Mori 2005]
[Shi and Malik 2001]
Segmentation for object proposals
“Selective Search” [Sande, Uijlings et al. ICCV 2011, IJCV 2013]
[Endres Hoiem ECCV 2010, IJCV 2014]
Segmentation for object proposals
Multiple segmentations
B. Russell et al., “Using Multiple Segmentations to Discover Objects and their Extent in Image Collections,” CVPR 2006
Slide credit: Lana Lazebnik
Segmentation
Motion segmentation
[Figures: input sequences with image segmentation vs. motion segmentation results]
A. Barbu, S.C. Zhu. Generalizing Swendsen-Wang to sampling arbitrary posterior probabilities, IEEE Trans. PAMI, August 2005.
Slide credit: Kristen Grauman
Top-down segmentation
E. Borenstein and S. Ullman, “Class-specific, top-down segmentation,” ECCV 2002
A. Levin and Y. Weiss, “Learning to Combine Bottom-Up and Top-Down Segmentation,” ECCV 2006
Slide credit: Lana Lazebnik

Top-down segmentation
[Figure: normalized cuts vs. top-down segmentation results]
E. Borenstein and S. Ullman, “Class-specific, top-down segmentation,” ECCV 2002
A. Levin and Y. Weiss, “Learning to Combine Bottom-Up and Top-Down Segmentation,” ECCV 2006
Slide credit: Lana Lazebnik
Summary
• Segmentation to find object boundaries or mid-level regions, tokens
• Bottom-up segmentation via clustering
– General choices: features, affinity functions, and clustering algorithms
• Grouping is also useful for quantization; it can create new feature summaries
– Texton histograms for texture within a local region
• Example clustering methods
– K-means
– Mean shift
– Graph cut, normalized cuts
Slide credit: Kristen Grauman