
Exam : Introduction to Machine Learning (COMP90049_2020_SM2)
Started: Nov 17 at 15:00
Quiz Instructions
Please read this message carefully to the end

Course: Comp90049 S2 2020
Identical examination papers: None
Exam duration: 2 hours + 30 minutes buffer time
Reading time: 15 minutes
Calculators: Permitted
Total marks: 120 (1 mark ≈ 1 minute of work)
Number of Questions: 19
Instructions to students:
Students must attempt all questions.
This is an open book exam. While you are undertaking this exam you are permitted to:
make use of the lecture slides and suggested readings for any lecture (including soft-copy versions)
make use of the example programs or reports that have been part of this subject, including any code or analyses that you have written in the context of Assignment 1 and Assignment 2 and workshop materials, provided that you include suitable attribution
While you are undertaking this assessment you must not:
make use of any messaging or communications technology
make use of any world-wide web or internet-based resources such as wikipedia, stackoverflow, or google and other search services
act in any manner that could be regarded as providing assistance to another student who is undertaking this assessment, or will in the future be undertaking this assessment.
The work you submit must be based on your own knowledge and skills, without assistance from any other person.

Academic Integrity Declaration
By commencing and/or submitting this assessment I agree that I have read and understood the
University's policy on academic integrity. (https://academicintegrity.unimelb.edu.au/#online-exams)
I also agree that:
1. Unless paragraph 2 applies, the work I submit will be original and solely my own work (cheating);
2. I will not seek or receive any assistance from any other person (collusion) except where the work is for a designated collaborative task, in which case the individual contributions will be indicated; and,
3. I will not use any sources without proper acknowledgment or referencing (plagiarism).
4. Where the work I submit is a computer program or code, I will ensure that:
a. any code I have copied is clearly noted by identifying the source of that code at the start of the program or in a header file or, that comments inline identify the start and end of the copied code; and
b. any modifications to code sourced from elsewhere will be commented upon to show the nature of the modification.
Section A: Multiple-Choice Questions (MCQ)
These questions are designed to demonstrate your conceptual understanding of the methods we have studied in this class. Please select one or more correct answers from the options (by ticking the appropriate boxes).
Question 1 3 pts
ML basics [MCQ]
Which of the following statement(s) is/are TRUE (select all that apply)
We cannot use supervised learning if we do not have the labels.
In the regression task, none of the attributes should be categorical.

Question 2 5 pts
Evaluation [MCQ]
Consider evaluating a classifier on 100 test instances, with the resulting confusion matrix shown below. Which of the following statement(s) is/are TRUE? (Select all that apply.) The class labels refer to different levels of risk.
True labels
Predicted labels
     A   B   C   D
A   45   3   0   2
B    9  10   0   1
C    2   5  15   3
D    0   0   0   5
Micro-averaged recall is equal to accuracy.
Class A achieves the highest class precision.
Class D has the lowest class recall.
Class C has the lowest class F1.
Assume class labels A, B, C and D mean risk levels “none”, “low”, “medium” and “high”, respectively. The classifier is safe to use to avoid high risk situations.
We can use patients’ sex in a machine learning model to diagnose their disease.
Identifying the type of a fossilised bone based on its genetic information is clustering.
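The per-class and micro-averaged measures named in Question 2 can be checked with a short script. This is a minimal sketch: the flattened table does not show which axis holds the true labels, so the code assumes rows are predicted labels and columns are true labels. If the original layout was the other way round, the precision and recall columns swap. The names `MATRIX`, `LABELS` and `evaluate` are illustrative only.

```python
# Confusion matrix from Question 2, classes in the order A, B, C, D.
# ASSUMPTION: rows are predicted labels and columns are true labels.
LABELS = ["A", "B", "C", "D"]
MATRIX = [
    [45, 3, 0, 2],
    [9, 10, 0, 1],
    [2, 5, 15, 3],
    [0, 0, 0, 5],
]


def evaluate(matrix, labels):
    """Return accuracy, micro-averaged recall and per-class (P, R, F1)."""
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    true_positives = [matrix[i][i] for i in range(n)]
    predicted_counts = [sum(matrix[i]) for i in range(n)]  # row sums
    actual_counts = [sum(matrix[r][i] for r in range(n)) for i in range(n)]  # column sums

    per_class = {}
    for i, label in enumerate(labels):
        precision = true_positives[i] / predicted_counts[i] if predicted_counts[i] else 0.0
        recall = true_positives[i] / actual_counts[i] if actual_counts[i] else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        per_class[label] = (precision, recall, f1)

    accuracy = sum(true_positives) / total
    # Micro-averaging pools true positives and false negatives over all classes.
    micro_recall = sum(true_positives) / sum(actual_counts)
    return accuracy, micro_recall, per_class


accuracy, micro_recall, per_class = evaluate(MATRIX, LABELS)
print(f"accuracy={accuracy:.2f} micro-recall={micro_recall:.2f}")
for label, (p, r, f1) in per_class.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Accuracy and micro-averaged recall both divide the diagonal sum by the total number of instances, so they do not depend on the orientation assumption.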
Question 3 3 pts
Optimisation [MCQ]
Which of the statement(s) is/are TRUE? (You may select more than one answer).
To predict the shoe price using a linear combination of its material, market demand and its type, we need to optimise the coefficients.
We prefer to use iterative optimisation over closed-form (exact) optimisation when setting the first derivative of the objective function to zero does not give the optimal values.
Logistic regression uses gradient descent through an exact optimisation procedure to find the optimal parameters.
We always have global and local minimum values for the objective functions.

Question 4 3 pts
Considering the given data set below, select the TRUE statement(s) (select all that apply).
Instance #  X1  X2  X3  X4  Class
1           0   1   1   1   C
2           0   1   0   1   B
3           0   0   1   1   A
4           1   1   1   1   B
5           0   0   1   0   C
6           1   1   0   0   B
7           1   0   1   0   A
8           0   0   0   0   ?
The predicted class label for instance #8 using Zero-R is B.
The predicted class label for instance #8 using 1-Nearest Neighbour is B.
If we define the weighting strategy as the inverse of the squared value of the Euclidean distance, the predicted class label for instance #8 using 5-Nearest Neighbour is C.
Calculating the distance using Euclidean and Manhattan gives different values.
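The Zero-R and nearest-neighbour statements in Question 4 can be explored with the sketch below. It assumes the flattened table reads as seven training rows of four binary attributes plus a class, with instance 8 = (0, 0, 0, 0) as the query. All function names are illustrative. Squared distances are computed directly as integers so that the inverse-squared-distance weights are exact.

```python
import math
from collections import Counter, defaultdict

# Training instances 1-7 from Question 4: ((X1, X2, X3, X4), class).
TRAIN = [
    ((0, 1, 1, 1), "C"),
    ((0, 1, 0, 1), "B"),
    ((0, 0, 1, 1), "A"),
    ((1, 1, 1, 1), "B"),
    ((0, 0, 1, 0), "C"),
    ((1, 1, 0, 0), "B"),
    ((1, 0, 1, 0), "A"),
]
QUERY = (0, 0, 0, 0)  # instance 8


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(squared_euclidean(a, b))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def zero_r(train):
    """Zero-R ignores the attributes and predicts the majority class."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def nearest_neighbours(train, query, k):
    return sorted(train, key=lambda item: squared_euclidean(item[0], query))[:k]


def weighted_votes(train, query, k):
    """Sum 1 / d^2 per class over the k nearest neighbours."""
    votes = defaultdict(float)
    for features, label in nearest_neighbours(train, query, k):
        # ASSUMPTION: no neighbour coincides with the query, so d^2 > 0.
        votes[label] += 1.0 / squared_euclidean(features, query)
    return dict(votes)


zero_r_label = zero_r(TRAIN)
one_nn_label = nearest_neighbours(TRAIN, QUERY, 1)[0][1]
five_nn_votes = weighted_votes(TRAIN, QUERY, 5)
print(zero_r_label, one_nn_label, five_nn_votes)
print(euclidean(TRAIN[1][0], QUERY), manhattan(TRAIN[1][0], QUERY))
```

Under this reading of the table, the three weighted vote totals for 5-NN come out equal, so the predicted label depends on whatever tie-breaking rule is adopted.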
Question 5 3 pts

Neural Network [MCQ]
Which of the following statement(s) is/are TRUE (select all that apply)
By increasing the number of layers in a multi-layer perceptron, the complexity of the model also increases.
Neural networks are as interpretable as logistic regression.
For a data set with few observations and many attributes, logistic regression can outperform neural networks.
Backpropagation provides a way to estimate the error for neurons in hidden layers in perceptron.
Question 6 4 pts
Feature selection [MCQ]
We are given the following dataset with three attributes A1, A2, A3, and Class Label. Assume that this table represents the full distributions of the data. By referring to the definition of Mutual Information (MI) and PointWise Mutual Information (PMI), which of the following statements are TRUE? (select all that apply)
[If computation is needed, use log2 with 3 decimal point precision]
Class Label
False Medium
False Medium
False Medium
True Medium

Question 7 4 pts
Ensemble Learning [MCQ]
For a binary classification problem, your task is to classify 10 instances using a combination of 3 base classifiers, each of which has an accuracy of 80%. You can assume the classifier performance is stable over different subsets of instances.
If the majority voting method is used, what are the maximum and minimum accuracies you can obtain on the 10 test instances? [The answers are shown as minimum accuracy, maximum accuracy, to three decimal places.]
Minimum: 0.642, Maximum: 0.982
Minimum: 0.642, Maximum: 0.896
Minimum: 0.8, Maximum: 0.982
Minimum: 0.8, Maximum: 0.896
By considering the MI, we can derive that A1 and Class are independent.
By considering the MI, we can derive that knowing A2 gives more information about Class than knowing A3.
PMI is always non-negative.
A3 is negatively correlated with Class.
MI is always non-negative.
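One reference point for Question 7 is the case where the three base classifiers make their errors independently of one another. The sketch below computes the probability that a majority of such classifiers is correct. Independence is an assumption made here, not something the question states, and `majority_vote_accuracy` is an illustrative name.

```python
from math import comb


def majority_vote_accuracy(n_classifiers, accuracy):
    """P(majority is correct) for independent classifiers of equal accuracy.

    ASSUMPTION: n_classifiers is odd, so a vote can never be tied.
    """
    needed = n_classifiers // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_classifiers, k) * accuracy**k * (1 - accuracy) ** (n_classifiers - k)
        for k in range(needed, n_classifiers + 1)
    )


independent_case = majority_vote_accuracy(3, 0.8)
print(round(independent_case, 3))
```

With three classifiers the sum has two terms: all three correct (0.8^3) and exactly two correct (3 x 0.8^2 x 0.2). How the base classifiers' errors overlap on the same instances determines how far the ensemble can move from this figure.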
Question 8 4 pts
Association Rule Mining [MCQ]

Given the following transaction dataset, which of the following statements are TRUE? (select all that apply)
Transaction ID Items Bought
Chips, Cookies, Regular Soda, Ham
Chips, Ham, Boneless Chicken, Diet Soda
Ham, Bacon, Whole Chicken, Regular Soda
Chips, Ham, Boneless Chicken, Diet Soda
Chips, Bacon, Boneless Chicken
Chips, Ham, Bacon, Whole Chicken, Regular Soda
Chips, Cookies, Boneless Chicken, Diet Soda
If minsup = 0.12, the maximum size of a frequent item-set from this data is 5.
If {Chips}->{Diet Soda, Ham} is not a confident rule, using Apriori we prune {Chips, Diet Soda} -> {Ham}.
The total number of rules that can be generated from this data is 254.
The number of rules that we can generate from the item-set {Ham, Chips, Boneless Chicken} is 6.
A highly confident rule with low support is interesting.
The support of {Ham, Bacon} is 2/7.
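Support and confidence for the transactions in Question 8 can be computed as in the sketch below. It uses the seven transactions exactly as listed and keeps the arithmetic exact with `Fraction`. The helper names (`support`, `confidence`, `rules_from_itemset`) are illustrative.

```python
from fractions import Fraction

# The seven transactions listed in Question 8.
TRANSACTIONS = [
    {"Chips", "Cookies", "Regular Soda", "Ham"},
    {"Chips", "Ham", "Boneless Chicken", "Diet Soda"},
    {"Ham", "Bacon", "Whole Chicken", "Regular Soda"},
    {"Chips", "Ham", "Boneless Chicken", "Diet Soda"},
    {"Chips", "Bacon", "Boneless Chicken"},
    {"Chips", "Ham", "Bacon", "Whole Chicken", "Regular Soda"},
    {"Chips", "Cookies", "Boneless Chicken", "Diet Soda"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the item-set."""
    hits = sum(1 for transaction in TRANSACTIONS if set(itemset) <= transaction)
    return Fraction(hits, len(TRANSACTIONS))


def confidence(antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def rules_from_itemset(size):
    """Every non-empty proper subset of the item-set can be an antecedent."""
    return 2**size - 2


print(support({"Ham", "Bacon"}))
print(confidence({"Chips"}, {"Diet Soda", "Ham"}))
print(rules_from_itemset(3))
```

The same `support` helper can be used to test candidate item-sets against a minimum support threshold such as the minsup value given in the question.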
Question 9 2 pts
Recommendation Systems [MCQ]
You want to design a recommendation system for an online bookshop where you have a list of 100,000 available books and the users’ ratings for each book they have purchased. The total number of ratings is 1000 and we have relatively few users. No other metadata is available to us. Select all of the TRUE statements. (Please note: CF is short for Collaborative Filtering.)
For this dataset, Content-based Recommendation is appropriate.
For this dataset, user-based CF is appropriate.
By using a user-based CF, we won’t be able to recommend many of our books.
For this dataset, we don’t have the problem of cold-start.

Section B: Method Questions (METHOD)
Conceptual and numeric (short-answer) questions. These questions are also designed to demonstrate your conceptual understanding of the methods we have studied in this class, by applying them to small example problems, or discussing them in the context of a problem.
You will answer these questions in a text box, with the option to include images of your hand-written solution (e.g., of formulas or diagrams).
If you combine images with typed text, all information must be presented in logical order and be easy for the marker to follow. You are welcome to upload only an image of your hand-written solution (no typing).
Question 10 5 pts
ML basics [Method]
Use the given data set to find the label for the test instance (denoted ‘?’). Show your calculation by building a linear model. Apply a transformation if needed.

X1  X2  Y
2   2   8
1   5   27
2   3   13
3   6   ?
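A linear model on transformed inputs for Question 10 can be sketched as follows. Two things here are assumptions rather than part of the question: that the flattened table reads as rows (X1, X2, Y) = (2, 2, 8), (1, 5, 27), (2, 3, 13) with test instance (3, 6), and that squaring X2 is a suitable transformation. The code fits two weights without an intercept by solving the 2x2 normal equations exactly.

```python
from fractions import Fraction

# ASSUMPTION: the flattened table reads as rows (X1, X2, Y).
TRAIN = [(2, 2, 8), (1, 5, 27), (2, 3, 13)]
TEST = (3, 6)


def transform(x1, x2):
    # ASSUMPTION: squaring X2 is the transformation that makes Y linear.
    return Fraction(x1), Fraction(x2) ** 2


def fit_two_weights(rows):
    """Least squares for y = w1*z1 + w2*z2 via the 2x2 normal equations."""
    s11 = s12 = s22 = t1 = t2 = Fraction(0)
    for x1, x2, y in rows:
        z1, z2 = transform(x1, x2)
        s11 += z1 * z1
        s12 += z1 * z2
        s22 += z2 * z2
        t1 += z1 * y
        t2 += z2 * y
    determinant = s11 * s22 - s12 * s12
    # Cramer's rule on [[s11, s12], [s12, s22]] @ [w1, w2] = [t1, t2].
    w1 = (t1 * s22 - s12 * t2) / determinant
    w2 = (s11 * t2 - s12 * t1) / determinant
    return w1, w2


def predict(weights, x1, x2):
    z1, z2 = transform(x1, x2)
    return weights[0] * z1 + weights[1] * z2


weights = fit_two_weights(TRAIN)
residuals = [y - predict(weights, x1, x2) for x1, x2, y in TRAIN]
prediction = predict(weights, *TEST)
print(weights, residuals, prediction)
```

Checking that the residuals on the training rows are all zero is a quick way to confirm that the chosen transformation makes the relationship exactly linear.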
Question 11 5 pts
Probability [Method]
We have a data set of 100 cancer patients. Among them, 80 have lung cancer and all others have brain cancer. We also have genetic information whether a patient has gene X or not. In the data, 75 patients have gene X and 72 patients have lung cancer and gene X.
1. Predict the likelihood of having lung cancer if a patient has gene X based on this data set. [2 marks]
2. Predict the likelihood of having brain cancer if a patient does not have gene X based on this data set. [3 marks]

Note: Round your answers to two decimal places.
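The two conditional probabilities in Question 11 follow from filling in the 2x2 contingency table of cancer type against gene X. The sketch below derives the missing cells from the four counts given in the question. Variable names are illustrative.

```python
from fractions import Fraction

# Counts stated in Question 11.
total = 100
lung = 80
gene_x = 75
lung_and_x = 72

# Remaining cells of the contingency table.
no_x = total - gene_x                   # patients without gene X
lung_and_no_x = lung - lung_and_x       # lung cancer, no gene X
brain_and_no_x = no_x - lung_and_no_x   # brain cancer, no gene X

p_lung_given_x = Fraction(lung_and_x, gene_x)
p_brain_given_no_x = Fraction(brain_and_no_x, no_x)
print(f"P(lung | X) = {float(p_lung_given_x):.2f}")
print(f"P(brain | not X) = {float(p_brain_given_no_x):.2f}")
```

Both results are conditional relative frequencies: the joint count divided by the count of the conditioning event.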
Question 12 13 pts
Naive Bayes [Method]
Answer the following three questions:
1. What are the parameters that need to be trained in a Naive Bayes model? [2 marks]
2. Considering the given data set, calculate the parameters to train a Naive Bayes model. Use epsilon smoothing where needed, with an appropriate epsilon value. [6 marks]
X1  X2  Class
A   Y   1
B   N   1
C   Y   1
A   N   1
B   N   1
C   N   1
A   Y   0
B   Y   0
C   N   0
A   Y   0
3. Use the model trained in part 2 to predict the label for the test instance below using the Bayes formulation. [5 marks]
Instance #  X1  X2  Class
1           A   N   ?
Note: round your calculations to two decimal places.
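The Naive Bayes parameters in Question 12 are the class priors and the per-class likelihood of each attribute value. The sketch below estimates them by counting and then scores the test instance (A, N). It assumes the flattened table reads as ten rows of (X1, X2, Class). The epsilon value is hypothetical and is used only when an attribute value was never seen with a class.

```python
from collections import Counter
from fractions import Fraction

# ASSUMPTION: the flattened table reads as rows (X1, X2, Class).
ROWS = [
    ("A", "Y", 1), ("B", "N", 1), ("C", "Y", 1), ("A", "N", 1), ("B", "N", 1),
    ("C", "N", 1), ("A", "Y", 0), ("B", "Y", 0), ("C", "N", 0), ("A", "Y", 0),
]
EPSILON = Fraction(1, 1000)  # hypothetical value, only used for unseen events


def train(rows):
    """Count classes and (attribute index, value, class) co-occurrences."""
    class_counts = Counter(label for *_, label in rows)
    value_counts = Counter()
    for *features, label in rows:
        for index, value in enumerate(features):
            value_counts[(index, value, label)] += 1
    priors = {c: Fraction(n, len(rows)) for c, n in class_counts.items()}
    return priors, value_counts, class_counts


def score(instance, label, priors, value_counts, class_counts):
    """Prior times the product of per-attribute likelihoods."""
    result = priors[label]
    for index, value in enumerate(instance):
        count = value_counts[(index, value, label)]
        # Epsilon smoothing: replace a zero likelihood by a small constant.
        result *= Fraction(count, class_counts[label]) if count else EPSILON
    return result


priors, value_counts, class_counts = train(ROWS)
scores = {c: score(("A", "N"), c, priors, value_counts, class_counts) for c in priors}
prediction = max(scores, key=scores.get)
print({c: round(float(s), 2) for c, s in scores.items()}, prediction)
```

For this particular test instance every attribute value has been seen with both classes, so the epsilon branch is not exercised.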
Question 13 9 pts
Logistic regression [Method]

Considering the given training data,
X1  X2  class
1   1   1
2   1   1
0   1   0
1   0   0
1. Estimate the parameters of the logistic regression model after one iteration. The initial value of all parameters is zero and the learning rate is 0.5. [4.5 marks]
2. Use the estimated parameters to predict the labels for the given test instances. [3.5 marks]
Instance  X1  X2  class
1         1   0   ?
2         0   3   ?
3. List two differences between logistic regression and the Perceptron. [1 mark]
Note: round your answers to two decimal places.
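One iteration of gradient descent for Question 13 can be sketched as below. Several choices are assumptions, since the question does not fix them: the model includes a bias parameter, the update is a batch step that sums (rather than averages) the per-instance gradients, and a probability of at least 0.5 is mapped to class 1. The table reading (four training rows, two test rows) is also an assumption.

```python
import math

# ASSUMPTION: training rows read as ((X1, X2), class).
TRAIN = [((1, 1), 1), ((2, 1), 1), ((0, 1), 0), ((1, 0), 0)]
TEST = [(1, 0), (0, 3)]
LEARNING_RATE = 0.5


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta, features):
    # theta[0] is the bias; the remaining entries pair with the features.
    z = theta[0] + sum(w * x for w, x in zip(theta[1:], features))
    return sigmoid(z)


def gradient_step(theta, data, learning_rate):
    """One batch update: theta_j -= lr * sum_i (sigmoid(z_i) - y_i) * x_ij."""
    gradient = [0.0] * len(theta)
    for features, label in data:
        error = predict_proba(theta, features) - label
        for j, x in enumerate((1,) + tuple(features)):  # x_0 = 1 for the bias
            gradient[j] += error * x
    return [t - learning_rate * g for t, g in zip(theta, gradient)]


theta = gradient_step([0.0, 0.0, 0.0], TRAIN, LEARNING_RATE)
probabilities = [predict_proba(theta, x) for x in TEST]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]
print([round(t, 2) for t in theta], [round(p, 2) for p in probabilities], predictions)
```

With all parameters starting at zero, every training instance has a predicted probability of 0.5 in the first iteration, which makes the gradient easy to verify by hand.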

Question 14 11 pts
Neural Network [Method]
Consider the following data set to answer the three questions below.
            X1    X2    X3    Class
Train data  0.3   0.2   0.1   1
            -0.2  -0.6  -0.5  -1
1. Construct a multi-layer perceptron (MLP) consisting of an input layer, a hidden layer of width 2, and an output layer to predict the class label. Define all necessary parameters, including output functions. Assume a constant bias of 1.0, which is referred to as neuron 0. Assume the “hyperbolic tan” (tanh) activation function. Draw your MLP. [2 marks]
2. Initialize all MLP parameters according to the formula . (For example, in weight layer 2 the weight connecting incoming node 1 to outgoing node 2 is ). The learning rate is 1. The first derivative of the tanh activation function is: 1 - tanh^2(x). Perform one epoch of backpropagation for your MLP. Compute the activations of the hidden and output neurons, and the error of the network for the training data. [7.5 marks]
3. Use the estimated parameters (after one epoch in question 2) to predict the class label of the test instance (use an appropriate decision boundary for the tanh activation function). [1.5 marks]
Note: round your calculations to two decimal places.
Test instance: X1 = -0.1, X2 = -0.2, X3 = 0.2, Class = ?
Question 15 12 pts
Decision Trees [Method]
In the following table, we have 8 instances with 3 attributes Gender, Suburb, and Car Type, and a Class Label. Each row is showing an instance.
(Calculations up to two decimal points)
Gender Suburb
Car Type Class Family 1

1. Calculate the information gain and gain ratio of the ‘Gender’ feature on the training dataset. (Note: you need to provide the results of each step to get full marks. You may need to use the following results: log2(1/2)=-1, log2(1/4)=-2, log2(1/8)=-3, log2(3/8)=-1.42, log2(5/8)=-0.68, log2(2/3)=-0.58, log2(1/3)=-1.58, log2(1/5)=-2.32, log2(2/5)=-1.32, log2(3/5)=-0.74, log2(1)=0) [7 marks]
2. Does a decision tree exist which can perfectly classify the given instances? If yes, draw that decision tree; otherwise, explain why not, by referring to the data. [2 marks]
3. If we use ‘Gender’ to build a decision stump, what is the accuracy of the stump on the dataset? [1 mark]
4. If we use ‘Suburb’ to build a decision stump, what would you expect to see for the accuracy of the decision stump given an evaluation dataset that you have not seen before? Explain why the stump has good/bad accuracy. [2 marks]
Question 16 5 pts
Evaluation II [Method]
Given the following learning curve for a KNN model, where the x-axis is the number of samples used in the training set, and y-axis is the accuracy of the model, answer the following questions:

1. What problem does the KNN model suffer from? [1 mark]
2. (i) If you can only tune parameters of the model, what will you do to solve the problem? (ii) Explain what the previous model parameter was and why it caused the problem. (Relate your answer to the decision boundary, bias and variance.) [4 marks]
Question 17 4 pts
Unsupervised Learning [Method]

Suppose you have been given a labelled dataset for a classification problem. To learn more about the data, you first try an unsupervised clustering method (ignoring the labels) and evaluate the results. You find that the clusters have high separation and low purity. (Answer in about 4 sentences)
1. What does it mean for a clustering result to have high separation and low purity?
2. Will this dataset be easy or difficult for a nearest-neighbour algorithm to classify? Explain qualitatively, without using mathematical formulas. [2 marks]
Question 18 5 pts
Anomaly Detection [Method]
Imagine, we are given the following dataset:
Dataset = {1,1.05, 1.1, 1.15, 1.2, 1.21, 1.3, 1.4, 1.45, 1.5, 3.2, 4.25, 5.45, 6.23, 7.25, 8.35, 8.95, 10.05, 10.95, 12.15}

1. Which unsupervised anomaly detection technique(s) are appropriate here? Briefly explain why. (Refer to the properties of the data.) [3 marks]
2. Imagine that, after applying a clustering algorithm, we come up with the following two clusters. Use cluster-based anomaly detection with relative distance to the closest cluster to calculate the outlier score of the new test instance x1 = 0.5. (Use Manhattan distance and calculations up to 3 decimal points.) [2 marks]
C1 = {1, 1.05, 1.1, 1.15, 1.2, 1.21, 1.3, 1.4, 1.45, 1.5}
C2 = {3.2, 4.25, 5.45, 6.23, 7.25, 8.35, 8.95, 10.05, 10.95, 12.15}
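A relative-distance outlier score for Question 18 can be sketched as follows. The definition used is an assumption, since the question does not spell it out: the score is the distance from the test point to the centroid (mean) of its closest cluster, divided by the median distance of that cluster's members to the same centroid. In one dimension the Manhattan distance is simply the absolute difference.

```python
from statistics import mean, median

C1 = [1.0, 1.05, 1.1, 1.15, 1.2, 1.21, 1.3, 1.4, 1.45, 1.5]
C2 = [3.2, 4.25, 5.45, 6.23, 7.25, 8.35, 8.95, 10.05, 10.95, 12.15]


def relative_distance_score(x, clusters):
    """Distance to the closest centroid over that cluster's median distance.

    ASSUMPTIONS: centroid = cluster mean; normaliser = median member distance.
    """
    centroids = [mean(cluster) for cluster in clusters]
    closest = min(range(len(clusters)), key=lambda i: abs(x - centroids[i]))
    centroid = centroids[closest]
    typical = median(abs(point - centroid) for point in clusters[closest])
    return abs(x - centroid) / typical


score = relative_distance_score(0.5, [C1, C2])
print(round(score, 3))
```

A score well above 1 means the point lies much farther from the centroid than a typical member of the cluster. Using the mean distance instead of the median as the normaliser would give a different value.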
Section C: Design and Application Questions (LONG_A)
The question(s) in this section are designed to demonstrate you have gained a high-level understanding of the methods and algorithms covered in this subject, and can apply that understanding to new scenarios. Expect your answer to each question to be from one third of a page to one full page in length (in handwriting). These questions will require significantly more thought than MCQ and METHOD questions, so you might want to attempt them last.

You will answer these questions in a text box, without the option to include images of your hand-written solution.
Question 19 20 pts
You want to start a business to help individuals find their dream job. You receive hundreds of CVs (curriculum vitaes, or resumes) that are uploaded by applicants for you to consider. For each submission you receive the following features:
Explanation of their dream job
Degree Title
Degree Major
Date of degree completion
Name of the applicant
Home address of the applicant
Gender of the applicant
All submitted CVs belong to three primary categories of interest: “Technology and Engineering”, “Advertising, Arts, and Media”, and “Retail and Consumer Products”. You want to build a machine learning model that assigns incoming CVs to a job category. You do not have access to any labelled data to begin with.
1. (i) What machine learning algorithm is appropriate for this task in the beginning? Justify your choice. (ii) Explain each step of this algorithm in the context of this task. [5 marks]
2. Assume now you have access to an additional small set of CVs which are labelled with their dream job category. You build a multi-layer perceptron classifier to assign incoming CVs to a category. (i) Choose an appropriate evaluation strategy and justify your choice. (ii) Describe each of the steps you would follow in evaluating your model under this strategy in the context of the given task and data set. [6 marks]
3. After evaluation, you find that the performance of the model is not satisfactory. Discuss two reasons why this may be the case. [2 marks]
4. Now you want to improve the performance of your model, by also using CVs for which the true categories are unknown. Assume you have access to an expert who can distinguish job categories based on CVs. You want to improve the performance of the model by leveraging the expert’s knowledge efficiently. (i) Select an appropriate machine learning algorithm and justify your choice. (ii) Explain the algorithm in the context of this data set. (iii) Justify any settings of the algorithm you may need to decide on. [7 marks]
Question 20 0 pts
Select the response that applies to your attempt to this exam.
I confirm that all of the following points apply. While undertaking this exam
I have not made use of any communications technology (including, but not limited to: mobile phones, WeChat or email, …); and
I have not made use of any internet resources (including, but not limited to: Stackoverflow, Google or other search engines, Wikipedia, …); and
I have not made use of any print or electronic resources except for the materials distributed in the context of this course COMP90049; and
I have not taken any actions that would encourage, permit, or support other enrolled students to violate the Academic Honesty expectations that apply to this exam; and
The answers I am submitting are my own unassisted work;
and thereby complied with the Academic Honesty expectations that apply to this exam.

I have not complied with the Academic Honesty expecta