
ANLY-601
Advanced Pattern Recognition

Spring 2018

L5 – Bayes Classifiers (cont’d)

Summary of Dichotomy (Two-Class)
Hypothesis Tests

• Bayes least error rate

  l(x) = \frac{p(x|\omega_1)}{p(x|\omega_2)} \;\overset{\omega_1}{\underset{\omega_2}{\gtrless}}\; \frac{P_2}{P_1}

• Bayes least cost

  l(x) = \frac{p(x|\omega_1)}{p(x|\omega_2)} \;\overset{\omega_1}{\underset{\omega_2}{\gtrless}}\; \frac{(c_{12} - c_{22})\, P_2}{(c_{21} - c_{11})\, P_1}

Summary of Dichotomy Hypothesis Tests

• Neyman-Pearson

  \frac{p(x|\omega_1)}{p(x|\omega_2)} \;\overset{\omega_1}{\underset{\omega_2}{\gtrless}}\; \mu,
  with the threshold \mu chosen so that \mathcal{E}_2 = \int_{L_1} p(x|\omega_2)\, d^n x = \varepsilon_0

• Minimax

  \frac{p(x|\omega_1)}{p(x|\omega_2)} \;\overset{\omega_1}{\underset{\omega_2}{\gtrless}}\; \mu,
  with the threshold \mu chosen so that \mathcal{E}_1 = \mathcal{E}_2
< 1 2 21 ΕΕ  3 Multi-Hypotheses Suppose there are L classes 1, ..., L and decision costs ij for choosing i when j is true. Then the minimal cost decision rule is pick k where )|( minarg )( minarg 1 xpxRk jij L ji i i    When jiand ijii  ,1 0  the cost is just the average error rate, and the decision rule is pick k k  arg max i p(i | x) 4 Reject Option For a 2-class, least error rate problem, when the posteriors are close to 0.5, the error rate will be large One might want to establish a window for rejection within which we refuse to make a judgment 2 ( | )p x  )|(min)( xpx i i E x t 0.5 L1 L2 L(t) Reject region 1 ( | )p x Reject Option Reject rate Error rate    )( )()(Prob tL n xdxptLx   xdxpxpxp n tL )()|(),|(min )( 21 E x t 0.5 L1 L2 L(t) Reject region – lower posterior greater than t p(2|x) p(1|x) 6 Reject Option Error rate     1 2 ( ) 1 1 2 2 1 2 ( ) min ( | ), ( | ) ( ) min ( | ), ( | ) n L t n L t p x p x p x d x P p x P p x d x P P           1 2 E E E x P1 p(x|1) P2 p(x|2) L(t) 7 Reject Option • Reject option lowers error rate by refusing to make decisions on feature values x where the error rate is high (near the crossing of the posterior curves). • A larger reject region (smaller t ) lowers the error, and increases the rate at which we refuse to make a decision. x t 0.5 L1 L2 L(t) Reject region – lower posterior greater than t p(2|x) p(1|x) 8 Sequential Hypothesis Tests Have sequence of observations assumed to be independent and identically distributed (i.i.d.). May be from a timeseries, e.g. speech segments, manufacturing production run … Each sequence is from one of two possible classes. Suppose we want to continue to accrue information from this sequence until we have enough information to make a decision -- e.g. maybe we have a reject threshold to overcome. It seems clear that if we make many measurements (e.g. on consecutive items in a manufacturing production run) that we’ll improve our classification results. n xxx ,...,, 21 9 Sequential Hypothesis Tests - log likelihood ratio How does H behave relative to hi ? Let’s look at its mean and variance             m i i m i i i m m m xh xp xp xxxp xxxp xxxH 11 2 1 221 121 21 )( )|( )|( ln )|...,,,( )|...,,,( ln)...,,,(             2 1 |var)(var|var || iii m i ii iii mhmxhH mhEmHE             10 Sequential Hypothesis Test Conditional mean Can bound i even for arbitrary density by appeal to the inequality ln z <= z-1 This gives   xdxp xp xp hE n iii )|( )|( )|( ln| 2 1              .0, 0 011)|(1 )|( )|( )|( )|( )|( ln 2 1 1 1 2 1 1 2 1                              Similarly So xdxp xp xp xdxp xp xp n n 11 Sequential Hypothesis Test 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 -20 -15 -10 -5 0 5 10 15 20 h p(h|w2)p(h|w1) Separation increases with increasing number of observations as m1/2. We have     2 21 |var | 0,0 ii ii mH mHE       m 1 m 2 p(H|1) p(H|2) A convenient measure of separation between the two classes is         2 2 2 1 12 12 12 |var|var ||          m HH HEHE 12 Sequential Hypothesis Tests m=1 m=10 m=50 13 Wald Test for Sequential Observations Terminate sequence of observations when H reaches some threshold -- e.g. when otherwise, continue gathering measurements. 
    0| 0|)( 22 1 1 1      mHE mHExhH m k km 2 1   choosebH or chooseaH m m   14 Wald Test • Wald showed that – Error rates: When h(x) is small – Average sequence length to reach threshold is ba eBeA BA B BA AB                    , 1 , )1( 1 2EE     2 22 2 1 11 1 )1( | )1( |     EE EE     ba mE ba mE 15