Department of Informatics, King’s College London
Pattern Recognition (6CCS3PRE/7CCSMPNN)
Assignment: Support Vector Machines (SVMs) and Ensemble Methods
This coursework is assessed. A type-written report must be submitted online through KEATS by the deadline specified on the module’s KEATS webpage. Questions Q1–Q7 consider a classification problem with 3 classes: a multi-class SVM-based classifier formed by multiple SVMs is designed to deal with this classification problem. Questions from Q8 onwards consider your own created dataset, used to investigate classification performance with the techniques of Bagging and Boosting: some simple “weak” classifiers are designed and combined to achieve an improved classification performance for a two-class classification problem.
Q1. Write down your 7-digit student ID denoted as s1s2s3s4s5s6s7. (5 Marks)
Q2. Find R1, which is the remainder when the sum of your student ID digits is divided by 4, i.e., R1 = (s1 + s2 + s3 + s4 + s5 + s6 + s7) mod 4. Table 1 shows the multi-class method to be used corresponding to the value of R1 obtained. (5 Marks)
Table 1: R1 and its corresponding multi-class method.

    R1    Method
    0     One against one
    1     One against all
    2     Binary decision tree
    3     Binary coded
Q3. Create a linearly separable two-dimensional dataset of your own, which consists of 3 classes. List the dataset in the format as shown in Table 2. Each class should contain at least 10 samples and all three classes have the same number of samples. Note: This is your own created dataset. The chance of having the same dataset in other submissions is slim. Do not share your dataset with others to avoid any plagiarism/collusion issues.
(10 Marks)
Table 2: Samples of three classes.
Q4. Plot the dataset in Q3 to show that the samples are linearly separable. Explain why your dataset is linearly separable. Hint: the Matlab built-in function plot can be used; show some example hyperplanes which can linearly separate the classes, and identify which hyperplane separates which classes. (20 Marks)
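For illustration only, a minimal Matlab sketch of such a plot is given below; the sample coordinates and the two separating lines are placeholder values, not a model answer, and should be replaced by your own dataset and hyperplanes:

    % Placeholder 3-class dataset (replace with your own samples from Table 2)
    X1 = [1 1; 2 1; 1 2];          % class 1 samples, rows are [x1 x2]
    X2 = [6 1; 7 2; 6 2];          % class 2 samples
    X3 = [1 7; 2 6; 2 7];          % class 3 samples
    figure; hold on;
    plot(X1(:,1), X1(:,2), 'rx'); % class 1 as red crosses
    plot(X2(:,1), X2(:,2), 'bo'); % class 2 as blue circles
    plot(X3(:,1), X3(:,2), 'g+'); % class 3 as green pluses
    % Two example separating hyperplanes (placeholders): x1 = 4 separates
    % class 2 from classes 1 and 3; x2 = 4 separates class 3 from classes 1 and 2.
    plot([4 4], [0 8], 'k--');
    plot([0 8], [4 4], 'k-.');
    xlabel('x_1'); ylabel('x_2');
    legend('class 1', 'class 2', 'class 3', 'x_1 = 4', 'x_2 = 4');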
Q5. According to the method obtained in Q2, draw a block diagram at SVM level to show the structure of the multi-class classifier constructed by linear SVMs. Explain the design (e.g., number of inputs, number of outputs, number of SVMs used, class label assignment, etc.) and describe how this multi-class classifier works.
Remark: A block diagram is a diagram used to show, say, a concept or a structure. Here, the diagram is used to show the structure of the multi-class SVM classifier, i.e., how the binary SVM classifiers are put together to work as a multi-class SVM classifier. For example, Q5 of tutorial 9 is an example of a block diagram at SVM level. A neural network diagram is a kind of diagram showing a network’s structure at neuron level, and the block diagrams in lecture 9 show the architecture of ensemble classifiers. (20 Marks)
Q6. According to your dataset in Q3 and the design of your multi-class classifier in Q5, identify the support vectors of the linear SVMs by “inspection” and design their hyperplanes by hand. Show the calculations and explain the details of your design.
(20 Marks)
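For illustration, consider the special case of a binary SVM whose margin is fixed by exactly one support vector x+ from the positive class and one x− from the negative class. In that case the maximum-margin hyperplane can be written down directly as w = 2(x+ − x−)/||x+ − x−||^2 and b = −w'(x+ + x−)/2, which gives w'x+ + b = +1 and w'x− + b = −1 as required. The Matlab lines below simply evaluate these formulas with placeholder support vectors; your own design must use the support vectors identified by inspection from your dataset:

    xp = [2; 1];                  % placeholder support vector, class +1
    xm = [6; 1];                  % placeholder support vector, class -1
    d  = xp - xm;
    w  = 2 * d / (d' * d);        % w = 2(x+ - x-)/||x+ - x-||^2
    b  = -w' * (xp + xm) / 2;     % places the hyperplane mid-way
    % sanity check: should print +1 and -1
    disp(w' * xp + b); disp(w' * xm + b);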
Q7. Produce a test dataset by averaging the samples for each row in Table 2, i.e., (sample of class 1 + sample of class 2 + sample of class 3)/3. Summarise the results in the form of Table 3, where N is the number of SVMs in your design and “Classification” is the class determined by your multi-class classifier. Explain how to get the “Classification” column using one test sample. Show the calculations for one or two samples to demonstrate how to get the contents in the table. (20 Marks)
Table 3: Summary of classification accuracy.
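For illustration, a minimal Matlab sketch of this evaluation is given below, assuming a one-against-all design with N = 3 SVMs; the per-class samples, the weight matrix W (one column of w per SVM) and the bias vector bvec are all placeholders, and the combination rule must match the method you obtained in Q2:

    % Placeholder per-class samples (rows of Table 2); replace with your own.
    C1 = [1 1; 2 1];  C2 = [6 1; 7 2];  C3 = [1 7; 2 6];
    T  = (C1 + C2 + C3) / 3;          % test samples: row-wise average
    % Placeholder one-against-all SVMs: column k of W and bvec(k) give
    % hyperplane k, which outputs +1 for class k and -1 otherwise.
    W  = [-0.5  0.5  0.0;
           0.0  0.0  0.5];
    bvec = [2 -2 -2];
    for i = 1:size(T, 1)
        g = W' * T(i, :)' + bvec';    % N discriminant values g_k(x)
        [~, label] = max(g);          % one-against-all: largest g wins
        fprintf('test sample %d -> class %d\n', i, label);
    end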
Marking: The learning outcomes of this assignment are that the student understands the fundamental principles and theory of the support vector machine (SVM) classifier; is able to design a multi-class SVM classifier for a linearly separable dataset; and knows how to determine the classification of test samples with the designed classifier. The assessment will look into the knowledge and understanding of the topic. When answering the questions, show/explain/describe clearly the steps/design/concepts with reference to the equations/theory/algorithms (stated in the lecture slides). When making comments (if necessary), provide statements supported by the results obtained.
Purposes of Assignment: This assignment provides the overall classification workflow from samples, to design, to classification. It helps you to clarify the concepts, working principles, theory, classification of samples, design procedure and multi-class classification techniques for SVMs.
Q8. Create a non-linearly separable dataset consisting of at least 20 two-dimensional data points. Each data point is characterised by two coordinates x1 ∈ [−10, 10] and x2 ∈ [−10, 10] and is associated with a class y ∈ {−1, +1}. List the data in a table in the format shown in Table 1, where the first column is for the data points of class “−1” and the second column is for the data points of class “+1”. (20 Marks)
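For illustration, one way to create such data is sketched in Matlab below; the XOR-like labelling rule (class given by the sign of x1·x2), which no single line can separate, is an assumption made here purely for the example, and you should construct and list your own dataset:

    rng(7);                          % fix the random seed for repeatability
    n = 20;
    X = 20 * rand(n, 2) - 10;        % n points uniformly in [-10,10]^2
    y = sign(X(:,1) .* X(:,2));      % XOR-like labels in {-1,+1}
    y(y == 0) = 1;                   % guard against points landing on an axis
    disp([X y]);                     % list as rows [x1 x2 y] for the table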
Q9. Plot the dataset (x axis is x1 and y axis is x2) and show that the dataset is non-linearly separable. Represent class “−1” and class “+1” using “×” and “◦”, respectively. Explain why your dataset is non-linearly separable. Hint: the Matlab built-in function plot can be used. (20 Marks)
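The corresponding plot can be produced as in the sketch below, which regenerates the same placeholder data as the Q8 sketch; replace X and y with your own dataset:

    rng(7); X = 20 * rand(20, 2) - 10;    % same placeholder data as above
    y = sign(X(:,1) .* X(:,2)); y(y == 0) = 1;
    figure; hold on;
    plot(X(y == -1, 1), X(y == -1, 2), 'kx');  % class -1 as crosses
    plot(X(y == +1, 1), X(y == +1, 2), 'ko');  % class +1 as circles
    xlabel('x_1'); ylabel('x_2');
    legend('class -1', 'class +1');
    axis([-10 10 -10 10]);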
Q10. Design Bagging classifiers consisting of 3, 4 and 5 weak classifiers using the steps shown in Appendix 1. A linear classifier should be used as the weak classifier. Explain and show the design of the hyperplanes of the weak classifiers. List the parameters of the designed hyperplanes.
After designing the weak classifiers, apply the designed weak classifiers and the bagging classifier to all the samples in Table 1. Present the classification results in a table as shown in Table 2. The columns “Weak classifier 1” to “Weak classifier n” list the output class ({−1, +1}) of the corresponding weak classifiers. The column “Overall classifier” lists the output class ({−1, +1}) of the bagging classifier. The last row lists the classification accuracy in percentage for all classifiers, i.e., accuracy = (number of correctly classified samples / total number of samples) × 100%. Explain how to determine the class (for each weak classifier and the overall classifier) using one test sample. You will have 3 tables (for 3, 4 and 5 weak classifiers) for this question. Comment on the results in terms of classification performance when different numbers of weak classifiers are used. (30 Marks)
Table 2: Classification results using the Bagging technique combining n weak classifiers. The first row “Data” lists the samples (both class “−1” and class “+1”) in Table 1.
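For illustration, a minimal Matlab sketch of the majority vote and the accuracy row is given below, assuming n = 3 weak linear classifiers of the form sign(w'x + b); the data and the classifier parameters Wk (one column of w per weak classifier) and bk are placeholders:

    % Placeholder: 20 samples X (rows [x1 x2]) with true labels y in {-1,+1}.
    rng(7); X = 20 * rand(20, 2) - 10;
    y = sign(X(:,1) .* X(:,2)); y(y == 0) = 1;
    % Placeholder weak linear classifiers: column k of Wk and bk(k) give
    % hyperplane k; each weak classifier outputs sign(w'*x + b).
    Wk = [1 0 1; 0 1 1];                     % n = 3 weak classifiers
    bk = [0 0 0];
    votes = sign(X * Wk + repmat(bk, size(X, 1), 1));  % one column per classifier
    overall = sign(sum(votes, 2));           % majority vote (odd n avoids ties)
    % Accuracy row in percent: one entry per weak classifier plus the overall
    % classifier (the 4 columns here match n = 3; adjust for n = 4 or 5).
    acc = 100 * mean([votes overall] == repmat(y, 1, 4));
    disp(acc);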
Q11. Design a Boosting classifier consisting of 3 weak classifiers using the steps shown in Appendix 2. A linear classifier should be used as the weak classifier. Explain and show the design of the hyperplanes of the weak classifiers. List the parameters of the designed hyperplanes. After designing the weak classifiers, apply the designed weak classifiers and the boosting classifier to all the samples in Table 1. Present the classification results in a table as shown in Table 2. Explain how to determine the class (for each weak classifier and the boosting classifier) using one test sample. Comment on the results of the overall classifier in terms of classification performance when comparing with the 1st, 2nd and 3rd weak classifiers, and with the bagging classifier with 3 weak classifiers in Q10. (30 Marks)
Appendix 1: Bagging (details can be found in Section “Bagging” in the Lecture notes)
Step 1: Start with dataset D containing n samples.
Step 2: Generate M datasets D1, D2, . . ., DM.
• Each dataset is created by drawing n′ < n samples from D with replacement.
• Some samples can appear more than once while others do not appear at all.
Step 3: Learn a weak classifier for each dataset:
• weak classifier fi(x) for dataset Di, i = 1, 2, . . ., M.
Step 4: Combine all weak classifiers using a majority voting scheme (a Matlab sketch of these steps is given below).
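The following is a minimal Matlab sketch of the resampling in Steps 1 and 2, using placeholder data (the XOR-like labelling assumed purely for illustration) and placeholder values n′ = 15 and M = 3:

    rng(7); X = 20 * rand(20, 2) - 10;
    y = sign(X(:,1) .* X(:,2)); y(y == 0) = 1;
    D = [X y];                           % placeholder dataset D, rows [x1 x2 y]
    n = size(D, 1); nprime = 15; M = 3;  % draw n' < n samples, M datasets
    Dm = cell(M, 1);
    for m = 1:M
        idx = randi(n, nprime, 1);       % indices drawn WITH replacement
        Dm{m} = D(idx, :);               % some rows repeat, others are left out
    end
    % Step 3 trains one weak classifier f_m(x) on each Dm{m}; Step 4 combines
    % the M outputs by majority vote, e.g. sign(f_1(x) + ... + f_M(x)).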
Appendix 2: Boosting (details can be found in Section “Boosting” in the Lecture notes)
• Dataset D with n patterns
• Training procedure:
Step 1: Randomly select a set of n1 ≤ n patterns (without replacement) from D to create dataset D1. Train a weak classifier C1 using D1 (C1 should have at least 50% classification accuracy).
Step 2: Create an “informative” dataset D2 (n2 ≤ n) from D, of which roughly half of the patterns should be correctly classified by C1 and the rest wrongly classified. Train a weak classifier C2 using D2.
Step 3: Create an “informative” dataset D3 from D of which the patterns are not well classified by C1 and C2 (C1 and C2 disagree). Train a weak classifier C3 using D3.
• The final decision of classification is based on the votes of the weak classifiers, e.g., the decision is made by the first two weak classifiers if they agree, and by the third weak classifier if the first two disagree (a Matlab sketch of this rule is given below).
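A minimal Matlab sketch of this voting rule, assuming the three trained weak classifiers are linear with placeholder parameters (w1, b1), (w2, b2), (w3, b3):

    % Boosting decision for one test sample x (column vector), per the rule above:
    % take the first two classifiers' answer when they agree, else the third's.
    w1 = [1; 0]; b1 = 0;  w2 = [0; 1]; b2 = 0;  w3 = [1; 1]; b3 = 0;  % placeholders
    x  = [3; -2];
    c1 = sign(w1' * x + b1); c2 = sign(w2' * x + b2); c3 = sign(w3' * x + b3);
    if c1 == c2
        label = c1;    % first two weak classifiers agree
    else
        label = c3;    % disagreement: third classifier decides
    end
    fprintf('boosting classifier output: %+d\n', label);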
Marking: The learning outcomes of this assignment are that the student understands the fundamental principles and concepts of ensemble methods (Bagging and Boosting); is able to design weak classifiers; knows how to form a Bagging/Boosting classifier; and knows how to determine the classification of test samples with the designed Bagging/Boosting classifiers. The assessment will look into the knowledge and understanding of the topic. When answering the questions, show/explain/describe clearly the steps/design/concepts with reference to the equations/theory/algorithms (stated in the lecture slides). When making comments, provide statements supported by the results obtained.