
ANLY-601
Advanced Pattern Recognition

Spring 2018

L17 — Nonparametric Classifiers

Non-Parametric Classifiers

Kernel or Parzen Classifiers

The obvious approach is to use a kernel density estimate for each of the class-conditional densities and proceed with a likelihood ratio test:

$$
\frac{\hat p(x \mid \omega_1)}{\hat p(x \mid \omega_2)}
= \frac{\frac{1}{N_1}\sum_{i=1}^{N_1} k\!\left(x - x_i^{(1)}\right)}
       {\frac{1}{N_2}\sum_{j=1}^{N_2} k\!\left(x - x_j^{(2)}\right)}
\;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\; t
$$

Error estimation: use leave-one-out (see cross-validation) to set the kernel width. Leave-one-out is cheap for kernel estimators, since the estimate at a training point $x_i$ only requires subtracting that point's own contribution $k(0)$ from the full-sample kernel sum and renormalizing:

$$
\hat p_L(x_i) = \frac{1}{N-1}\left[\,\sum_{j=1}^{N} k\!\left(x_i - x_j\right) - k(0)\,\right]
$$
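Below is a minimal Python sketch of this scheme on 1-D data (not from the original notes; the helper names `gauss_kernel`, `parzen`, `parzen_loo`, and `loo_error` are illustrative, and a Gaussian kernel is assumed). It scores candidate kernel widths by the cheap leave-one-out error of the likelihood-ratio rule.

```python
# Sketch: Parzen likelihood-ratio classifier with a Gaussian kernel; the kernel
# width is chosen by the cheap leave-one-out error estimate described above.
import numpy as np

def gauss_kernel(u, h):
    """1-D Gaussian kernel of width h."""
    return np.exp(-0.5 * (u / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)

def parzen(x, train, h):
    """Parzen density estimate at the points x, built from the sample `train`."""
    return gauss_kernel(np.atleast_1d(x) - train[:, None], h).mean(axis=0)

def parzen_loo(train, h):
    """Leave-one-out density at each training point: subtract that point's own
    contribution k(0) from the full kernel sum and renormalize by N - 1."""
    N = len(train)
    total = gauss_kernel(train[:, None] - train[None, :], h).sum(axis=0)
    return (total - gauss_kernel(0.0, h)) / (N - 1)

def loo_error(train1, train2, h, t=1.0):
    """Leave-one-out error rate of the rule: decide w1 if p(x|w1)/p(x|w2) > t."""
    e1 = np.sum(parzen_loo(train1, h) / parzen(train1, train2, h) <= t)  # class-1 misses
    e2 = np.sum(parzen(train2, train1, h) / parzen_loo(train2, h) > t)   # class-2 misses
    return (e1 + e2) / (len(train1) + len(train2))

# Synthetic example: pick the width with the smallest leave-one-out error.
rng = np.random.default_rng(0)
c1, c2 = rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 200)
best_h = min([0.1, 0.2, 0.5, 1.0], key=lambda h: loo_error(c1, c2, h))
print("chosen width:", best_h, "LOO error:", loo_error(c1, c2, best_h))
```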


KNN Classifiers

Two paradigms:

• KNN density estimates followed by likelihood ratio test

• Voting KNN. This is what people usually mean when they refer to a KNN classifier.

For each test sample, select the K nearest neighbors from the training data (all classes). The sample is classified according to the majority class of those K neighbors, as in the sketch below.
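A minimal NumPy sketch of the voting rule (not from the original notes; Euclidean distance is assumed and the helper name `knn_predict` is illustrative):

```python
# Sketch: voting KNN classifier with Euclidean distance.
import numpy as np
from collections import Counter

def knn_predict(x, train_X, train_y, K=5):
    """Classify one test point x by majority vote among its K nearest
    training points (ties go to the first label encountered)."""
    dists = np.linalg.norm(train_X - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:K]               # indices of the K nearest neighbors
    return Counter(train_y[nearest]).most_common(1)[0][0]

# Toy 2-D example.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(knn_predict(np.array([2.5, 2.5]), X, y, K=5))
```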


Asymptotics for Large N
Error Rates for Voting KNN

K = 1: Classify the point $x$ as the class of its nearest neighbor $x_{NN}$.

The conditional risk (the classification error rate given $x$ and $x_{NN}$) is

$$
\begin{aligned}
r_1(x, x_{NN})
&= \mathrm{Prob}(\omega_1 \text{ at } x \text{ and } \omega_2 \text{ at } x_{NN})
 + \mathrm{Prob}(\omega_2 \text{ at } x \text{ and } \omega_1 \text{ at } x_{NN}) \\
&= p(\omega_1 \mid x)\, p(\omega_2 \mid x_{NN}) + p(\omega_2 \mid x)\, p(\omega_1 \mid x_{NN})
\end{aligned}
$$

In the limit $N \to \infty$ the data points become closer and closer together, so $x_{NN} \to x$ and we can replace $p(\omega_i \mid x_{NN})$ with $p(\omega_i \mid x)$, leaving

$$
r_1^* = \lim_{N \to \infty} r_1 = 2\, p(\omega_1 \mid x)\, p(\omega_2 \mid x)
$$
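As a quick numerical illustration (values chosen arbitrarily): if $p(\omega_1 \mid x) = 0.9$ and $p(\omega_2 \mid x) = 0.1$, the Bayes risk at $x$ is $\min(0.9,\,0.1) = 0.1$, while $r_1^* = 2(0.9)(0.1) = 0.18$, larger than the Bayes risk but less than twice it, consistent with the bound derived below.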





For Odd Number of Neighbors in General

Asymptotically, and for an odd number of neighbors $2k+1$, the error rate at $x$ becomes

$$
r_{2k+1}^* \;=\; \lim_{N\to\infty} r_{2k+1}
\;=\; \sum_{i=0}^{k} \binom{2k+1}{i}
\left[\, p(\omega_1 \mid x)^{\,i+1}\, p(\omega_2 \mid x)^{\,2k+1-i}
      + p(\omega_2 \mid x)^{\,i+1}\, p(\omega_1 \mid x)^{\,2k+1-i} \,\right]
$$

The first few cases are

$$
\begin{aligned}
r_1^* &= 2\, p(\omega_1 \mid x)\, p(\omega_2 \mid x) \\
r_3^* &= p(\omega_1 \mid x)\, p(\omega_2 \mid x) + 4\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^2 \\
r_5^* &= p(\omega_1 \mid x)\, p(\omega_2 \mid x) + \bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^2 + 12\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^3 \\
r_7^* &= p(\omega_1 \mid x)\, p(\omega_2 \mid x) + \bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^2 + 2\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^3 + 40\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^4
\end{aligned}
$$
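These closed forms follow from the general sum together with $p(\omega_1 \mid x) + p(\omega_2 \mid x) = 1$. A short Python check (not part of the original notes) that the two agree numerically:

```python
# Sketch: verify the closed forms r1*, r3*, r5*, r7* against the general
# (2k+1)-NN asymptotic error formula for two classes.
from math import comb, isclose

def r_odd(k, p1):
    """Asymptotic (2k+1)-NN conditional error with posteriors p1 and p2 = 1 - p1."""
    p2 = 1.0 - p1
    return sum(comb(2 * k + 1, i)
               * (p1 ** (i + 1) * p2 ** (2 * k + 1 - i)
                  + p2 ** (i + 1) * p1 ** (2 * k + 1 - i))
               for i in range(k + 1))

for p1 in (0.55, 0.7, 0.9, 0.99):
    a = p1 * (1.0 - p1)                          # shorthand for p(w1|x) p(w2|x)
    closed = [2 * a,                             # r1*
              a + 4 * a**2,                      # r3*
              a + a**2 + 12 * a**3,              # r5*
              a + a**2 + 2 * a**3 + 40 * a**4]   # r7*
    assert all(isclose(c, r_odd(k, p1)) for k, c in enumerate(closed))
print("closed forms match the general formula")
```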




















Bayes vs KNN Error

Recall that the Bayes risk at $x$ is

$$
r_B(x) \;=\; \min\!\bigl[\,p(\omega_1 \mid x),\; p(\omega_2 \mid x)\,\bigr]
\;=\; \tfrac{1}{2}\Bigl[\,1 - \sqrt{1 - 4\, p(\omega_1 \mid x)\, p(\omega_2 \mid x)}\,\Bigr]
$$

(the second equality holds because $p(\omega_1 \mid x) + p(\omega_2 \mid x) = 1$, so $1 - 4\,p(\omega_1 \mid x)\,p(\omega_2 \mid x) = \bigl[p(\omega_1 \mid x) - p(\omega_2 \mid x)\bigr]^2$), and the error rate for (odd) KNN for infinitely many training points is

$$
r_{2k+1}^* \;=\; \sum_{i=0}^{k} \binom{2k+1}{i}
\left[\, p(\omega_1 \mid x)^{\,i+1}\, p(\omega_2 \mid x)^{\,2k+1-i}
      + p(\omega_2 \mid x)^{\,i+1}\, p(\omega_1 \mid x)^{\,2k+1-i} \,\right]
$$

One can use these expressions (directly, or by Taylor series expansion) to find the famous bound






$$
r_B \;\le\; \cdots \;\le\; r_5^* \;\le\; r_3^* \;\le\; r_1^* \;\le\; 2\, r_B
$$

Bayes vs KNN Error

The KNN bound is tighter than the Bhattacharyya bound $\sqrt{p(\omega_1 \mid x)\, p(\omega_2 \mid x)}$.
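One line of algebra makes the comparison explicit: since $p(\omega_1 \mid x) + p(\omega_2 \mid x) = 1$, we have $\sqrt{p(\omega_1 \mid x)\, p(\omega_2 \mid x)} \le \tfrac{1}{2}$, and therefore $r_1^* = 2\, p(\omega_1 \mid x)\, p(\omega_2 \mid x) \le \sqrt{p(\omega_1 \mid x)\, p(\omega_2 \mid x)}$, so the asymptotic nearest-neighbor error is the tighter upper bound on the Bayes risk at every $x$.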

[Figure: asymptotic error rates $r_1$, $r_3$, $r_5$, $r_7$ plotted together with the Bayes risk, the Bhattacharyya bound, and $2\times$ the Bayes risk.]


Error Rate for Even Number of Neighbors

With an even number of neighbors we can have a tie (for two classes). When there is a tie, we don't make a choice. This leads to a slightly lower error rate

$$
r_{2k}^* \;=\; \sum_{i=0}^{k-1} \binom{2k}{i}
\left[\, p(\omega_1 \mid x)^{\,i+1}\, p(\omega_2 \mid x)^{\,2k-i}
      + p(\omega_2 \mid x)^{\,i+1}\, p(\omega_1 \mid x)^{\,2k-i} \,\right]
$$

and a sequence of bounds

$$
\tfrac{1}{2}\, r_B \;\le\; r_2^* \;\le\; r_4^* \;\le\; r_6^* \;\le\; \cdots \;\le\; r_B
\;\le\; \cdots \;\le\; r_5^* \;\le\; r_3^* \;\le\; r_1^* \;\le\; 2\, r_B
$$
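A quick numerical sanity check of this ordering (a Python sketch, not part of the original slides), evaluating the odd and even asymptotic rates on a grid of posteriors:

```python
# Sketch: check  (1/2) rB <= r2* <= r4* <= r6* <= rB <= r5* <= r3* <= r1* <= 2 rB
# numerically for two classes over a grid of posterior values.
from math import comb

def r_odd(k, p1):
    """Asymptotic (2k+1)-NN conditional error."""
    p2 = 1.0 - p1
    return sum(comb(2*k + 1, i) * (p1**(i+1) * p2**(2*k+1-i) + p2**(i+1) * p1**(2*k+1-i))
               for i in range(k + 1))

def r_even(k, p1):
    """Asymptotic 2k-NN conditional error when ties lead to no decision."""
    p2 = 1.0 - p1
    return sum(comb(2*k, i) * (p1**(i+1) * p2**(2*k-i) + p2**(i+1) * p1**(2*k-i))
               for i in range(k))

for j in range(1, 100):
    p1 = j / 100.0
    rB = min(p1, 1.0 - p1)
    chain = [0.5 * rB, r_even(1, p1), r_even(2, p1), r_even(3, p1), rB,
             r_odd(2, p1), r_odd(1, p1), r_odd(0, p1), 2.0 * rB]
    assert all(a <= b + 1e-12 for a, b in zip(chain, chain[1:]))
print("ordering verified on the grid")
```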
