
Exercises for the course
Machine Learning 1
Winter semester 2020/21
Abteilung Maschinelles Lernen, Institut für Softwaretechnik und theoretische Informatik, Fakultät IV, Technische Universität Berlin
Prof. Dr. Klaus-Robert Müller
Email: klaus-robert.mueller@tu-berlin.de
Exercise Sheet 8
Exercise 1: Boosted Classifiers (15 + 15 P)
We consider a two-dimensional dataset $x_1, \dots, x_8 \in \mathbb{R}^2$ with binary labels $y_1, \dots, y_8 \in \{-1, +1\}$.
[Figure: the eight data points plotted in the plane, with $x_1, x_2, x_3, x_4$ on the upper row (second coordinate 1) and $x_5, x_6, x_7, x_8$ on the lower row (second coordinate 0); the first coordinate ranges from 0 to 3.]
Red circles denote the first class ($y_i = +1$) and white squares denote the second class ($y_i = -1$). We decide to classify this data with a boosted classifier and use the nearest mean classifier as a weak classifier. The boosted classifier is given by

$$f(x) = \mathrm{sign}\Big(\alpha_0 + \sum_{t=1}^{T} \alpha_t h_t(x)\Big)$$

where $\alpha_0, \dots, \alpha_T \in \mathbb{R}$ are the boosting coefficients.
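To make the combination above concrete, here is a minimal Python sketch (not part of the exercise) of how such a boosted decision function could be evaluated, assuming the weak classifiers are available as functions returning ±1; the names `boosted_classifier`, `alpha0`, `alphas` and `weak_classifiers` are illustrative and do not come from the exercise sheet.

```python
import numpy as np

def boosted_classifier(x, alpha0, alphas, weak_classifiers):
    """Evaluate f(x) = sign(alpha0 + sum_t alpha_t * h_t(x)).

    x                : input point, e.g. a length-2 NumPy array
    alpha0, alphas   : bias term and list of boosting coefficients
    weak_classifiers : list of callables h_t mapping x to +1 or -1
    """
    score = alpha0 + sum(a * h(x) for a, h in zip(alphas, weak_classifiers))
    return int(np.sign(score))  # convention: sign(0) = 0
```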
The $t$-th nearest mean classifier is given by

$$h_t(x) = \begin{cases} +1 & \text{if } \|x - \mu_t^+\| < \|x - \mu_t^-\| \\ -1 & \text{else} \end{cases}$$

with

$$\mu_t^+ = \frac{\sum_{i:\, y_i = +1} p_i^{(t)} x_i}{\sum_{i:\, y_i = +1} p_i^{(t)}} \qquad \text{and} \qquad \mu_t^- = \frac{\sum_{i:\, y_i = -1} p_i^{(t)} x_i}{\sum_{i:\, y_i = -1} p_i^{(t)}}$$

where $p_1^{(t)}, \dots, p_N^{(t)}$ are the data weighting terms for this classifier.

(a) Draw by hand a possible boosted classifier that classifies the dataset above, i.e. draw the decision boundary of the weak classifiers $h_t(x)$ and of the final boosted classifier $f(x)$. We use the convention $\mathrm{sign}(0) = 0$.

(b) Write down the weighting terms $p^{(t)}$ and the coefficients $\alpha_0, \dots, \alpha_T$ associated with the classifiers you have drawn.

(Note: In this exercise, the boosted classifier does not need to derive from a particular algorithm. Instead, the number of weak classifiers, the coefficients and the weighting terms can be picked by hand with the sole constraint that the final classifier implements the desired decision boundary.)

Exercise 2: AdaBoost as an Optimization Problem (15 + 15 P)

Consider AdaBoost for binary classification applied to some dataset $\mathcal{D} = \{(x_1, y_1), \dots, (x_N, y_N)\}$. The algorithm starts with uniform weighting ($\forall_{i=1}^{N}: p_i^{(1)} = 1/N$) and performs the following iteration:

for $t = 1 \dots T$:

  Step 1: $\mathcal{D}, p^{(t)} \mapsto h_t$   (learn the $t$-th weak classifier using the weighting $p^{(t)}$)
  Step 2: $\varepsilon_t = \mathbb{E}_{p^{(t)}}\big[1(h_t(x) \neq y)\big]$   (compute the weighted error of the classifier)
  Step 3: $\alpha_t = \frac{1}{2}\log\Big(\frac{1-\varepsilon_t}{\varepsilon_t}\Big)$   (set its contribution to the boosted classifier)
  Step 4: $\forall_{i=1}^{N}: \; p_i^{(t+1)} = Z_t^{-1}\, p_i^{(t)} \exp(-\alpha_t y_i h_t(x_i))$   (set a new weighting for the data)

The term $\mathbb{E}_{p^{(t)}}[\cdot]$ denotes the expectation under the data weighting $p^{(t)}$, and $Z_t$ is a normalization term. An interesting property of AdaBoost is that it can be shown to minimize the objective function

$$G(\alpha) = \sum_{i=1}^{N} \exp(-y_i f_{\alpha,t}(x_i))$$

where $f_{\alpha,t}(x) = \sum_{\tau=1}^{t} \alpha_\tau h_\tau(x)$ is the output score of the boosted classifier after $t$ iterations.

(a) Show that the objective can be rewritten as $G(\alpha) = N \cdot \Big(\prod_{\tau=1}^{t-1} Z_\tau\Big) \cdot \sum_{i=1}^{N} p_i^{(t)} \exp(-y_i \alpha_t h_t(x_i))$.

(b) Show that Step 3 of the AdaBoost procedure above is equivalent to computing $\alpha_t = \arg\min_{\alpha_t} G(\alpha)$.

Exercise 3: Programming (40 P)

Download the programming files on ISIS and follow the instructions.
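For intuition (and independent of the programming files on ISIS), the following is a compact NumPy sketch of the AdaBoost iteration from Exercise 2, using the weighted nearest mean classifier of Exercise 1 as the weak learner. It is a sketch under the stated assumptions, not a reference implementation; the function and variable names (`nearest_mean_classifier`, `adaboost`, `T`, ...) are illustrative, and the early-stopping guard on $\varepsilon_t$ is a practical addition that is not part of the algorithm as stated above.

```python
import numpy as np

def nearest_mean_classifier(X, y, p):
    """Fit the weighted nearest mean classifier h_t of Exercise 1.

    X : (N, d) data matrix, y : (N,) labels in {-1, +1}, p : (N,) data weights.
    Returns a function mapping a batch of points Z of shape (M, d) to labels in {-1, +1}.
    """
    w_pos, w_neg = p[y == +1], p[y == -1]
    mu_pos = (w_pos[:, None] * X[y == +1]).sum(axis=0) / w_pos.sum()
    mu_neg = (w_neg[:, None] * X[y == -1]).sum(axis=0) / w_neg.sum()

    def h(Z):
        d_pos = np.linalg.norm(Z - mu_pos, axis=1)
        d_neg = np.linalg.norm(Z - mu_neg, axis=1)
        return np.where(d_pos < d_neg, +1, -1)

    return h

def adaboost(X, y, T=10):
    """Run Steps 1-4 of the AdaBoost iteration and return (alphas, classifiers)."""
    N = len(y)
    p = np.full(N, 1.0 / N)                    # uniform initial weighting p^(1)
    alphas, classifiers = [], []
    for t in range(T):
        h = nearest_mean_classifier(X, y, p)   # Step 1: learn h_t from D and p^(t)
        eps = p[h(X) != y].sum()               # Step 2: weighted error E_{p^(t)}[1(h_t(x) != y)]
        if eps == 0.0 or eps >= 0.5:           # guard: alpha_t would be undefined or non-positive
            break
        alpha = 0.5 * np.log((1 - eps) / eps)  # Step 3: contribution alpha_t
        p = p * np.exp(-alpha * y * h(X))      # Step 4: reweight the data ...
        p = p / p.sum()                        # ... and normalize (the Z_t term)
        alphas.append(alpha)
        classifiers.append(h)
    return alphas, classifiers
```

On the toy data of Exercise 1 one would call `adaboost(X, y)` with `X` the 8×2 matrix of points and `y` the corresponding ±1 labels; the boosted prediction at a point $x$ is then $\mathrm{sign}\big(\sum_t \alpha_t h_t(x)\big)$, optionally with an additional bias $\alpha_0$ as in Exercise 1.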