Introduction to Machine Learning: Spectral Clustering
Prof. Kutty
Unsupervised Learning: Clustering
Basic idea: we want to group similar points. Intuitively:
– intra-cluster distances → small
– inter-cluster distances → large
Assign each datapoint to exactly one cluster.
(hard) clustering algorithms
• agglomerative clustering
• k-means clustering
• spectral clustering
agglomerative hierarchical clustering review
Agglomerative Clustering
Notation: $C_i$ = set of all points in cluster $i$
1. Assign each point its own cluster: for each point $i$, $C_i = \{\bar{x}^{(i)}\}$
2. Find the closest pair of clusters and merge them; repeat until convergence:
$\arg\min_{i,j} d(C_i, C_j)$ where $i \neq j$
(find the clusters $i$ and $j$ for which the distance between them is minimized)
Linkage criteria (see the sketch after this list):
Single-linkage: $d(C_i, C_j) = \min_{\bar{x} \in C_i,\, \bar{x}' \in C_j} d(\bar{x}, \bar{x}')$
Complete-linkage: $d(C_i, C_j) = \max_{\bar{x} \in C_i,\, \bar{x}' \in C_j} d(\bar{x}, \bar{x}')$
Average-linkage: $d(C_i, C_j) = \frac{1}{|C_i||C_j|} \sum_{\bar{x} \in C_i,\, \bar{x}' \in C_j} d(\bar{x}, \bar{x}')$
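The three linkage criteria above map directly onto SciPy's hierarchical clustering routines. A minimal sketch (the toy datapoints here are my own, not from the slides):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# toy 2-D datapoints (illustrative only)
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [9.0, 0.0]])

# single-, complete-, and average-linkage agglomerative clustering
for method in ["single", "complete", "average"]:
    Z = linkage(X, method=method, metric="euclidean")  # merge history
    labels = fcluster(Z, t=3, criterion="maxclust")    # cut into 3 clusters
    print(method, labels)
```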
Agglomerative Hierarchical Clustering
Idea: start with each point in its own cluster and build a hierarchy; cutting the resulting dendrogram at a given level yields a clustering (e.g., cut here for 3 clusters).
[Figure: dendrogram over example locations (CSE, Chrysler Center, Mason Hall) with a cut yielding 3 clusters]
in class exercise
For this dataset, with agglomerative clustering and the single-linkage criterion, construct the dendrogram and indicate the cut for 3 clusters.
Assume datapoints in $\mathbb{R}^2$ and Euclidean distance.
1. Assign each point its own cluster: for each point $i$, $C_i = \{\bar{x}^{(i)}\}$
2. Find the closest clusters and merge; repeat until convergence:
$\arg\min_{i,j} d(C_i, C_j)$ where $i \neq j$
in class exercise (solution)
1. Assign each point its own cluster: for each point $i$, $C_i = \{\bar{x}^{(i)}\}$
2. Find the closest clusters and merge; repeat until convergence: $\arg\min_{i,j} d(C_i, C_j)$ where $i \neq j$
Cut here for 3 clusters.
Assume Euclidean distance.
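A sketch of how the dendrogram and the 3-cluster cut could be produced with SciPy; the datapoints and cut height below are placeholders, not the ones from the slide:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# placeholder 2-D datapoints (the slide's dataset is not reproduced here)
X = np.array([[0, 0], [1, 0], [4, 0], [5, 0], [9, 0]], dtype=float)
Z = linkage(X, method="single", metric="euclidean")

dendrogram(Z, labels=["a", "b", "c", "d", "e"])
plt.axhline(y=2.5, linestyle="--")  # cutting here leaves 3 clusters
plt.show()
```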
k-means clustering review
K-Means Clustering (Initialization)
• Select k. Pick random means. – Example with k = 2.
parameter M hyper
K-Means Clustering (Iterations)
1. Assign points to clusters: assign each point to the nearest center.
2. Re-estimate cluster centroids: compute centers for the new cluster assignments.
Repeat: re-assign points to the now-nearest center, then re-estimate the centroids.
K-Means Clustering (Convergence)
Re-assigning points no longer changes anything: the cluster centers have stopped changing.
How do means determine cluster assignments?
By Jahobr – Own work, CC0, https://commons.wikimedia.org/w/index.php?curid=61356414
k-means Clustering convergence
k-means performs coordinate descent on the objective function
$J(\bar{c}, \bar{\mu}) = \sum_{i=1}^{n} \left\| \bar{x}^{(i)} - \bar{\mu}_{c_i} \right\|^2$
k-means is guaranteed to converge*, but not guaranteed to converge to the global minimum.
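A minimal sketch of this coordinate descent, alternately minimizing $J$ over the assignments and over the centers until the centers stop changing (function and variable names are my own):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # random initial means
    for _ in range(n_iters):
        # minimize J over assignments c, holding centers mu fixed
        dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
        c = dists.argmin(axis=1)
        # minimize J over centers mu, holding assignments c fixed
        new_mu = np.array([X[c == j].mean(axis=0) if np.any(c == j) else mu[j]
                           for j in range(k)])
        if np.allclose(new_mu, mu):  # centers stopped changing: converged
            break
        mu = new_mu
    return c, mu
```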
k-means++ Approximation Guarantee. The expected value of the objective returned by K- means++ is never more than O(log K) from optimal and can be as close as O(1) from optimal. Even in the former case, with 2K random restarts, one restart will be O(1) from optimal (with high probability).
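The k-means++ seeding rule behind this guarantee is D² sampling: each new center is drawn with probability proportional to the squared distance to the nearest center chosen so far. A sketch under that description (not a reference implementation):

```python
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]  # first center: uniform at random
    for _ in range(k - 1):
        C = np.array(centers)
        # squared distance from each point to its nearest chosen center
        d2 = np.min(np.sum((X[:, None, :] - C[None, :, :]) ** 2, axis=2), axis=1)
        probs = d2 / d2.sum()            # D^2 weighting
        centers.append(X[rng.choice(len(X), p=probs)])
    return np.array(centers)
```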
k-means assumes globular-shaped clusters
Clustering Algorithms
http://scikit-learn.org/stable/modules/clustering.html
spectral clustering
Spectral Clustering
• doesn’t assume globular-shaped clusters
• Reformulates the data clustering problem as a graph partitioning problem
• Broadly:
– first, convert the data into a weighted graph
– next, partition the graph so that each component has weaker across-partition connections and stronger within-partition connections; ensure similarly sized partitions
• Correspondence:
– datapoint → vertex (node)
– similarity between datapoints → edge weight
Data ⟶ Weighted Graph
Gaussian kernel similarity metric:
$w_{ij} = \exp\left( -\frac{\left\| \bar{x}^{(i)} - \bar{x}^{(j)} \right\|^2}{2\sigma^2} \right)$
where $\sigma$ is a hyperparameter.
The rows and columns of the adjacency matrix are indexed by the datapoints $1, \dots, n$.
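A sketch of building $W$ with the Gaussian kernel above (sigma is the hyperparameter; the function name is my own):

```python
import numpy as np

def gaussian_adjacency(X, sigma=1.0):
    # pairwise squared Euclidean distances between all datapoints
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dists / (2 * sigma ** 2))  # symmetric, w_ij in (0, 1]
    return W
```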
Data → Graph: Adjacency matrix, W
Example: given the datapoints $\{\bar{x}^{(i)}\}_{i=1}^{n}$, compute the similarity between each pair of datapoints, e.g., with the Gaussian kernel similarity metric
$w_{ij} = \exp\left( -\frac{\left\| \bar{x}^{(i)} - \bar{x}^{(j)} \right\|^2}{2\sigma^2} \right)$
How might you get a weight of 0?
[Figure: example graph with edge weights; distant points get near-zero weights, e.g., 0.0015]
$W = \{w_{ij}\}$ for $i = 1, \dots, n$; $j = 1, \dots, n$. $W$ is symmetric and each $w_{ij} \geq 0$.
Definition: Complement
• Suppose $A \subset V$
• Complement of $A$: $\bar{A} = V \setminus A$
• Example: if $A = \{v_1, v_2, v_3\}$ then $\bar{A} = \{v_4, v_5\}$
Definition: Cost of a cut
• Cost of a cut between $A$ and $\bar{A}$:
$\text{cut}(A, \bar{A}) = \sum_{i \in A,\, j \in \bar{A}} w_{ij}$
• Example: if $A = \{v_1, v_2\}$, then $\bar{A} = \{v_3, v_4\}$ and
$\text{cut}(A, \bar{A}) = 0.37 + 0.0015$
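A sketch of this cut cost, with $A$ given as a set of vertex indices into $W$:

```python
import numpy as np

def cut_cost(W, A):
    A = np.asarray(sorted(A))
    A_bar = np.setdiff1d(np.arange(W.shape[0]), A)  # complement of A
    return W[np.ix_(A, A_bar)].sum()                # sum of crossing weights
```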
Spectral Clustering
Goal: Given a graph representing the data, find a minimum cost cut?
Issue: this may not give a reasonable solution; the minimum cut often just isolates a single vertex.
[Figure: example graph with edge weights 0.4, 0.8, 0.9 where the minimum-cost cut separates one vertex instead of giving a balanced partition]
Spectral Clustering
Goal: Given a graph representing the data, find a minimum cost RatioCut → k clustering:
$\arg\min_{A_1, \dots, A_k} \text{RatioCut}(A_1, \dots, A_k)$
where
$\text{RatioCut}(A_1, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{\text{cut}(A_i, \bar{A_i})}{|A_i|}$
is called the "ratio cut"; the $|A_i|$ in the denominator ensures clusters are reasonably large groups.
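A sketch of RatioCut on top of the cut_cost helper above; partition is a list of index sets $A_1, \dots, A_k$ covering all vertices:

```python
def ratio_cut(W, partition):
    # (1/2) * sum_i cut(A_i, complement of A_i) / |A_i|
    return 0.5 * sum(cut_cost(W, A) / len(A) for A in partition)
```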
Idea: Use the eigenvectors and eigenvalues of the graph Laplacian, i.e., the matrix $L = D - W$.
Adjacency Matrix W, Degree Matrix D, and Graph Laplacian L
• Adjacency matrix $W$: $W = \{w_{ij}\}$ for $i = 1, \dots, n$; $j = 1, \dots, n$, where $w_{ij} = \text{sim}(\bar{x}^{(i)}, \bar{x}^{(j)})$
• Degree matrix $D$: diagonal, with $d_{ii} = \sum_{j=1}^{n} w_{ij}$ and $d_{ij} = 0$ for $i \neq j$
• The graph Laplacian is the matrix $L = D - W$
• We are interested in the eigenvectors and eigenvalues of L
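A sketch of $D$ and $L$ from $W$, matching the definitions above:

```python
import numpy as np

def graph_laplacian(W):
    D = np.diag(W.sum(axis=1))  # degree matrix: d_ii = sum_j w_ij
    return D - W                # unnormalized graph Laplacian L
```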
Recall: Eigenvalues and eigenvectors
• A scalar ^ is called an eigenvalue of a matrix I if there is a non-trivial solution O of
• We say that O is the eigenvector corresponding to the eigenvalue ^
Note: an eigenvector cannot be 0., but an eigenvalue can be 0. • Example:
$A = \begin{pmatrix} 3 & 6 \\ 0 & 6 \end{pmatrix}, \quad \lambda = 6, \quad \bar{v} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$
$A\bar{v} = \begin{pmatrix} 3 & 6 \\ 0 & 6 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 12 \\ 6 \end{pmatrix} = 6 \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \lambda\bar{v}$
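A quick numerical check of the definition with NumPy, using the small example matrix above:

```python
import numpy as np

A = np.array([[3.0, 6.0],
              [0.0, 6.0]])
lams, V = np.linalg.eig(A)              # columns of V are eigenvectors
for lam, v in zip(lams, V.T):
    assert np.allclose(A @ v, lam * v)  # A v = lambda v
print(lams)                             # eigenvalues of A
```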
Why does this work? Spectral Clustering for $k$ partitions
$\min_{A_1, \dots, A_k} \text{RatioCut}(A_1, \dots, A_k)$
$= \min_{\bar{h}^{(1)}, \dots, \bar{h}^{(k)} \text{ indicator vectors}} \text{Tr}(H^T L H)$
where column $j$ of $H$ is the indicator vector $\bar{h}^{(j)}$ corresponding to partition $A_j$. Relaxing the indicator vectors to real vectors gives
$\min_{\bar{h}^{(1)}, \dots, \bar{h}^{(k)} \in \mathbb{R}^n} \text{Tr}(H^T L H) \quad \text{subject to } H^T H = I$
The solution to this relaxed problem is the first $k$ eigenvectors of $L$.
Graph Laplacian L
The graph Laplacian is the matrix $L = D - W$.
Properties of $L$:
• $L$ is symmetric and positive semi-definite: $\bar{z}^T L \bar{z} \geq 0$ for all $\bar{z} \in \mathbb{R}^n$
• $L$ has $n$ non-negative real-valued eigenvalues $0 \leq \lambda_1 \leq \dots \leq \lambda_n$
• Fact: the multiplicity of the eigenvalue 0 is the number of connected components in the graph
• Examples
image from: https://towardsdatascience.com/spectral-clustering-aba2640c0d5b
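A sketch verifying the fact above: for a graph with two connected components, eigenvalue 0 of $L$ has multiplicity 2 (the example graph here is my own):

```python
import numpy as np

# two components: {0, 1} and {2, 3}, each joined by a unit-weight edge
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
eigvals = np.linalg.eigvalsh(L)          # L is symmetric: use eigvalsh
print(np.sum(np.isclose(eigvals, 0.0)))  # -> 2, the number of components
```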
Spectral Clustering Example #1
[Graph: vertices A, B, C, D, E; edge A–B with weight 0.5; edge D–E with weight 0.25; C is isolated]
Input: similarity metric, number of clusters k
1. Build adjacency matrix $W$:
$W = \begin{pmatrix} 1 & 0.5 & 0 & 0 & 0 \\ 0.5 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0.25 \\ 0 & 0 & 0 & 0.25 & 1 \end{pmatrix}$
2. Compute graph Laplacian $L = D - W$:
$L = \begin{pmatrix} 0.5 & -0.5 & 0 & 0 & 0 \\ -0.5 & 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.25 & -0.25 \\ 0 & 0 & 0 & -0.25 & 0.25 \end{pmatrix}$
Spectral Clustering Example #1
Input: similarity metric, number of clusters k
1. Build adjacency matrix $W$
2. Compute graph Laplacian $L = D - W$
3. Compute eigenvectors/eigenvalues of $L$
(eigenvector, eigenvalue) pairs:
$\left( \begin{pmatrix} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \\ 0 \\ 0 \\ 0 \end{pmatrix}, 0 \right), \left( \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, 0 \right), \left( \begin{pmatrix} 0 \\ 0 \\ 0 \\ \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{pmatrix}, 0 \right), \left( \begin{pmatrix} 0 \\ 0 \\ 0 \\ \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} \end{pmatrix}, 0.5 \right), \left( \begin{pmatrix} \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} \\ 0 \\ 0 \\ 0 \end{pmatrix}, 1 \right)$
Spectral Clustering Example #1
Input: similarity metric, number of clusters k
1. Build adjacency matrix $W$
2. Compute graph Laplacian $L = D - W$
3. Compute eigenvectors of $L$
4. Build the matrix with the first k eigenvectors (corresponding to the k smallest eigenvalues) as columns; interpret its rows as new data points
5. Apply k-means to the new data representation
Output: cluster assignments
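A minimal end-to-end sketch of steps 1-5 (unnormalized spectral clustering), reusing the kmeans sketch from the k-means section; all names are my own:

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, seed=0):
    # 1. adjacency matrix via the Gaussian kernel
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq / (2 * sigma ** 2))
    # 2. unnormalized graph Laplacian
    L = np.diag(W.sum(axis=1)) - W
    # 3./4. first k eigenvectors (k smallest eigenvalues) as columns
    _, V = np.linalg.eigh(L)         # eigh returns ascending eigenvalues
    U = V[:, :k]                     # each row is a new k-dim datapoint
    # 5. k-means on the rows of the embedding
    c, _ = kmeans(U, k, seed=seed)   # kmeans defined in the earlier sketch
    return c
```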
intuitively
Eigenvectors (as columns, ordered by eigenvalue; rows correspond to A, B, C, D, E):
$\begin{pmatrix} \frac{\sqrt{2}}{2} & 0 & 0 & 0 & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & 0 & 0 & 0 & -\frac{\sqrt{2}}{2} \\ 0 & 1.00 & 0 & 0 & 0 \\ 0 & 0 & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0 \\ 0 & 0 & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 \end{pmatrix}$
Eigenvalues: $[0, 0, 0, 0.5, 1.0]$
The third column is the indicator vector for the 3rd partition, $\{D, E\}$.
Spectral Clustering Example #1
[Graph: vertices A, B, C, D, E; edge A–B with weight 0.5; edge D–E with weight 0.25; C is isolated]
$W = \begin{pmatrix} 1 & 0.5 & 0 & 0 & 0 \\ 0.5 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0.25 \\ 0 & 0 & 0 & 0.25 & 1 \end{pmatrix}$
Eigenvectors (rows correspond to A, B, C, D, E):
$\begin{pmatrix} \frac{\sqrt{2}}{2} & 0 & 0 & 0 & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & 0 & 0 & 0 & -\frac{\sqrt{2}}{2} \\ 0 & 1.00 & 0 & 0 & 0 \\ 0 & 0 & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0 \\ 0 & 0 & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 \end{pmatrix}$
Eigenvalues: $[0, 0, 0, 0.5, 1.0]$
Taking the first k = 3 eigenvector columns and reading off the rows gives a k-dimensional embedding of each datapoint.
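The example can be checked numerically: the eigenvalues come out as [0, 0, 0, 0.5, 1.0], and since every eigenvector with eigenvalue 0 is constant on each connected component, A and B get identical embedding rows, as do D and E:

```python
import numpy as np

W = np.array([[1.0, 0.5, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 0.25],
              [0.0, 0.0, 0.0, 0.25, 1.0]])
L = np.diag(W.sum(axis=1)) - W
eigvals, eigvecs = np.linalg.eigh(L)
print(np.round(eigvals, 4))         # -> [0. 0. 0. 0.5 1.]
print(np.round(eigvecs[:, :3], 2))  # 3-dim embedding rows for A..E
```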