
ANLY-601
Advanced Pattern Recognition

Spring 2018

L17 — Nonparametric Classifiers

Non-Parametric Classifiers

Kernel or Parzen Classifiers

The obvious approach is to use a kernel density estimate for each of the class-conditional densities and proceed with a likelihood ratio test:

$$
\frac{\hat p(x \mid \omega_1)}{\hat p(x \mid \omega_2)}
= \frac{\frac{1}{N_1}\sum_{i=1}^{N_1} k\!\left(x - x_i^{(1)}\right)}
       {\frac{1}{N_2}\sum_{j=1}^{N_2} k\!\left(x - x_j^{(2)}\right)}
\;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\; t
$$

Error estimation: use leave-one-out (see cross-validation) to set the kernel width. Leave-one-out is cheap for kernel estimators, since the estimate at a training point $x_i$ only requires subtracting that point's own contribution $k(0)$ from the full-sample kernel sum and renormalizing:

$$
\hat p_L(x_i) = \frac{1}{N-1}\left[\,\sum_{j=1}^{N} k\!\left(x_i - x_j\right) - k(0)\,\right]
$$
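Below is a minimal Python sketch of this scheme on 1-D data (not from the original notes; the helper names `gauss_kernel`, `parzen`, `parzen_loo`, and `loo_error` are illustrative, and a Gaussian kernel is assumed). It scores candidate kernel widths by the cheap leave-one-out error of the likelihood-ratio rule.

```python
# Sketch: Parzen likelihood-ratio classifier with a Gaussian kernel; the kernel
# width is chosen by the cheap leave-one-out error estimate described above.
import numpy as np

def gauss_kernel(u, h):
    """1-D Gaussian kernel of width h."""
    return np.exp(-0.5 * (u / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)

def parzen(x, train, h):
    """Parzen density estimate at the points x, built from the sample `train`."""
    return gauss_kernel(np.atleast_1d(x) - train[:, None], h).mean(axis=0)

def parzen_loo(train, h):
    """Leave-one-out density at each training point: subtract that point's own
    contribution k(0) from the full kernel sum and renormalize by N - 1."""
    N = len(train)
    total = gauss_kernel(train[:, None] - train[None, :], h).sum(axis=0)
    return (total - gauss_kernel(0.0, h)) / (N - 1)

def loo_error(train1, train2, h, t=1.0):
    """Leave-one-out error rate of the rule: decide w1 if p(x|w1)/p(x|w2) > t."""
    e1 = np.sum(parzen_loo(train1, h) / parzen(train1, train2, h) <= t)  # class-1 misses
    e2 = np.sum(parzen(train2, train1, h) / parzen_loo(train2, h) > t)   # class-2 misses
    return (e1 + e2) / (len(train1) + len(train2))

# Synthetic example: pick the width with the smallest leave-one-out error.
rng = np.random.default_rng(0)
c1, c2 = rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 200)
best_h = min([0.1, 0.2, 0.5, 1.0], key=lambda h: loo_error(c1, c2, h))
print("chosen width:", best_h, "LOO error:", loo_error(c1, c2, best_h))
```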


KNN Classifiers

Two paradigms:

• KNN density estimates followed by likelihood ratio test

• Voting KNN. This is what people usually mean when they refer to a KNN classifier.

For each test sample, select the K nearest neighbors from the training data (all classes). The sample is classified according to the majority class of those K neighbors, as in the sketch below.
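A minimal NumPy sketch of the voting rule (not from the original notes; Euclidean distance is assumed and the helper name `knn_predict` is illustrative):

```python
# Sketch: voting KNN classifier with Euclidean distance.
import numpy as np
from collections import Counter

def knn_predict(x, train_X, train_y, K=5):
    """Classify one test point x by majority vote among its K nearest
    training points (ties go to the first label encountered)."""
    dists = np.linalg.norm(train_X - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:K]               # indices of the K nearest neighbors
    return Counter(train_y[nearest]).most_common(1)[0][0]

# Toy 2-D example.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(knn_predict(np.array([2.5, 2.5]), X, y, K=5))
```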


Asymptotics for Large N
Error Rates for Voting KNN

K = 1: Classify the point $x$ as the class of its nearest neighbor $x_{NN}$.

The conditional risk (the classification error rate given $x$ and $x_{NN}$) is

$$
\begin{aligned}
r_1(x, x_{NN})
&= \mathrm{Prob}(\omega_1 \text{ at } x \text{ and } \omega_2 \text{ at } x_{NN})
 + \mathrm{Prob}(\omega_2 \text{ at } x \text{ and } \omega_1 \text{ at } x_{NN}) \\
&= p(\omega_1 \mid x)\, p(\omega_2 \mid x_{NN}) + p(\omega_2 \mid x)\, p(\omega_1 \mid x_{NN})
\end{aligned}
$$

In the limit $N \to \infty$ the data points become closer and closer together, so $x_{NN} \to x$ and we can replace $p(\omega_i \mid x_{NN})$ with $p(\omega_i \mid x)$, leaving

$$
r_1^* = \lim_{N \to \infty} r_1 = 2\, p(\omega_1 \mid x)\, p(\omega_2 \mid x)
$$
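As a quick numerical illustration (values chosen arbitrarily): if $p(\omega_1 \mid x) = 0.9$ and $p(\omega_2 \mid x) = 0.1$, the Bayes risk at $x$ is $\min(0.9,\,0.1) = 0.1$, while $r_1^* = 2(0.9)(0.1) = 0.18$, larger than the Bayes risk but less than twice it, consistent with the bound derived below.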





For Odd Number of Neighbors in General

Asymptotically, and for an odd number of neighbors $2k+1$, the error rate at $x$ becomes

$$
r_{2k+1}^* \;=\; \lim_{N\to\infty} r_{2k+1}
\;=\; \sum_{i=0}^{k} \binom{2k+1}{i}
\left[\, p(\omega_1 \mid x)^{\,i+1}\, p(\omega_2 \mid x)^{\,2k+1-i}
      + p(\omega_2 \mid x)^{\,i+1}\, p(\omega_1 \mid x)^{\,2k+1-i} \,\right]
$$

The first few cases are

$$
\begin{aligned}
r_1^* &= 2\, p(\omega_1 \mid x)\, p(\omega_2 \mid x) \\
r_3^* &= p(\omega_1 \mid x)\, p(\omega_2 \mid x) + 4\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^2 \\
r_5^* &= p(\omega_1 \mid x)\, p(\omega_2 \mid x) + \bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^2 + 12\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^3 \\
r_7^* &= p(\omega_1 \mid x)\, p(\omega_2 \mid x) + \bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^2 + 2\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^3 + 40\,\bigl[p(\omega_1 \mid x)\, p(\omega_2 \mid x)\bigr]^4
\end{aligned}
$$
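These closed forms follow from the general sum together with $p(\omega_1 \mid x) + p(\omega_2 \mid x) = 1$. A short Python check (not part of the original notes) that the two agree numerically:

```python
# Sketch: verify the closed forms r1*, r3*, r5*, r7* against the general
# (2k+1)-NN asymptotic error formula for two classes.
from math import comb, isclose

def r_odd(k, p1):
    """Asymptotic (2k+1)-NN conditional error with posteriors p1 and p2 = 1 - p1."""
    p2 = 1.0 - p1
    return sum(comb(2 * k + 1, i)
               * (p1 ** (i + 1) * p2 ** (2 * k + 1 - i)
                  + p2 ** (i + 1) * p1 ** (2 * k + 1 - i))
               for i in range(k + 1))

for p1 in (0.55, 0.7, 0.9, 0.99):
    a = p1 * (1.0 - p1)                          # shorthand for p(w1|x) p(w2|x)
    closed = [2 * a,                             # r1*
              a + 4 * a**2,                      # r3*
              a + a**2 + 12 * a**3,              # r5*
              a + a**2 + 2 * a**3 + 40 * a**4]   # r7*
    assert all(isclose(c, r_odd(k, p1)) for k, c in enumerate(closed))
print("closed forms match the general formula")
```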




















Bayes vs KNN Error

Recall that the Bayes risk at $x$ is

$$
r_B(x) \;=\; \min\!\bigl[\,p(\omega_1 \mid x),\; p(\omega_2 \mid x)\,\bigr]
\;=\; \tfrac{1}{2}\Bigl[\,1 - \sqrt{1 - 4\, p(\omega_1 \mid x)\, p(\omega_2 \mid x)}\,\Bigr]
$$

(the second equality holds because $p(\omega_1 \mid x) + p(\omega_2 \mid x) = 1$, so $1 - 4\,p(\omega_1 \mid x)\,p(\omega_2 \mid x) = \bigl[p(\omega_1 \mid x) - p(\omega_2 \mid x)\bigr]^2$), and the error rate for (odd) KNN for infinitely many training points is

$$
r_{2k+1}^* \;=\; \sum_{i=0}^{k} \binom{2k+1}{i}
\left[\, p(\omega_1 \mid x)^{\,i+1}\, p(\omega_2 \mid x)^{\,2k+1-i}
      + p(\omega_2 \mid x)^{\,i+1}\, p(\omega_1 \mid x)^{\,2k+1-i} \,\right]
$$

One can use these expressions (directly, or by Taylor series expansion) to find the famous bound






$$
r_B \;\le\; \cdots \;\le\; r_5^* \;\le\; r_3^* \;\le\; r_1^* \;\le\; 2\, r_B
$$

Bayes vs KNN Error

The KNN bound is tighter than the Bhattacharyya bound $\sqrt{p(\omega_1 \mid x)\, p(\omega_2 \mid x)}$.
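One line of algebra makes the comparison explicit: since $p(\omega_1 \mid x) + p(\omega_2 \mid x) = 1$, we have $\sqrt{p(\omega_1 \mid x)\, p(\omega_2 \mid x)} \le \tfrac{1}{2}$, and therefore $r_1^* = 2\, p(\omega_1 \mid x)\, p(\omega_2 \mid x) \le \sqrt{p(\omega_1 \mid x)\, p(\omega_2 \mid x)}$, so the asymptotic nearest-neighbor error is the tighter upper bound on the Bayes risk at every $x$.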

[Figure: asymptotic error rates $r_1$, $r_3$, $r_5$, $r_7$ plotted together with the Bayes risk, the Bhattacharyya bound, and $2\times$ the Bayes risk.]


Error Rate for Even Number of Neighbors

With an even number of neighbors we can have a tie (for two classes). When there is a tie, we don't make a choice. This leads to a slightly lower error rate

$$
r_{2k}^* \;=\; \sum_{i=0}^{k-1} \binom{2k}{i}
\left[\, p(\omega_1 \mid x)^{\,i+1}\, p(\omega_2 \mid x)^{\,2k-i}
      + p(\omega_2 \mid x)^{\,i+1}\, p(\omega_1 \mid x)^{\,2k-i} \,\right]
$$

and a sequence of bounds

$$
\tfrac{1}{2}\, r_B \;\le\; r_2^* \;\le\; r_4^* \;\le\; r_6^* \;\le\; \cdots \;\le\; r_B
\;\le\; \cdots \;\le\; r_5^* \;\le\; r_3^* \;\le\; r_1^* \;\le\; 2\, r_B
$$
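A quick numerical sanity check of this ordering (a Python sketch, not part of the original slides), evaluating the odd and even asymptotic rates on a grid of posteriors:

```python
# Sketch: check  (1/2) rB <= r2* <= r4* <= r6* <= rB <= r5* <= r3* <= r1* <= 2 rB
# numerically for two classes over a grid of posterior values.
from math import comb

def r_odd(k, p1):
    """Asymptotic (2k+1)-NN conditional error."""
    p2 = 1.0 - p1
    return sum(comb(2*k + 1, i) * (p1**(i+1) * p2**(2*k+1-i) + p2**(i+1) * p1**(2*k+1-i))
               for i in range(k + 1))

def r_even(k, p1):
    """Asymptotic 2k-NN conditional error when ties lead to no decision."""
    p2 = 1.0 - p1
    return sum(comb(2*k, i) * (p1**(i+1) * p2**(2*k-i) + p2**(i+1) * p1**(2*k-i))
               for i in range(k))

for j in range(1, 100):
    p1 = j / 100.0
    rB = min(p1, 1.0 - p1)
    chain = [0.5 * rB, r_even(1, p1), r_even(2, p1), r_even(3, p1), rB,
             r_odd(2, p1), r_odd(1, p1), r_odd(0, p1), 2.0 * rB]
    assert all(a <= b + 1e-12 for a, b in zip(chain, chain[1:]))
print("ordering verified on the grid")
```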
