
RMIT Classification: Trusted
Evaluating Hypotheses & Bayesian Learning
COSC 2673-2793 | Semester 1 2021 (computational) Machine Learning
Image: Freepik.com

Assignment 1

Key Dates and Important Information
Ø Assignment Type: Individual
Ø Due: Friday 16th April 2021 (Week 6)
Ø Marks: 30%
Ø Specifications: on canvas – Specification PDF & Marking rubric
Ø Late policy: After the due date, you will have 5 days to submit your assignment as a late submission. Late submissions will incur a penalty of 10% per day. After these five days, Canvas will be closed and you will lose ALL the assignment marks.
Clarifications/updates may be made via announcements/relevant discussion forums.
COSC2673 | COSC2793 Week 4: Evaluating Hypotheses 3

Task
The machine learning task we are interested in is:
“Predict if a given patient (i.e. newborn child) will be discharged from the hospital within 3 days (class 0) or will stay in hospital beyond that – 4 days or more (class 1).”

Dataset
The data set for this assignment is available on Canvas.
Ø README.md: Description of dataset.
Ø train_data.csv: This data is to be used in developing the models.
Ø test_data.csv: You need to make predictions for this data and submit the prediction via canvas.
Ø s1234567_predictions.csv: Expected format for your predictions on the unseen test data. Any deviation from this format will result in zero marks for the results part.

Restrictions
ØYou must NOT explicitly perform manual feature selection. That is, your models should have all features (attributes) as input (except the “ID” and “Health Service Area” fields which are not attributes).
ØYou are only allowed to use techniques taught in class up to week 5 (inclusive) for this assignment. That is, you are NOT allowed to use ML techniques such as: Neural networks or SVM for this task.

Deliverables
Ø The PDF version of the python notebook used for the model development including critical analysis of your approach and ultimate judgement.
Ø A set of predictions from your ultimate judgement. Should be in CSV format. If your model predicts the patient will be discharged from the hospital within 3 days, the associated “LengthOfStay” value in CSV should be 0 (1 otherwise).
Ø Your code (Jupyter notebooks) used to perform your analysis. Should be a ZIP file containing all the supporting files. Will be used for plagiarism checking – the notebook should match the PDF.

Marking Guidelines
Ø Approach 50%
Ø Ultimate Judgment & Analysis 20%
Ø Performance on test set (unseen data) 20%
Ø Implementation 10%
Rubric attached on Canvas
Practice the typical machine learning process which includes:
• Selecting the appropriate ML techniques and applying them to solve a real-world ML problem.
• Analysing the output of the algorithm(s).
• Researching how to extend the modelling techniques that are taught in class.
• Providing an ultimate judgement of the final trained model that you would use in a real-world setting.

Academic Integrity and Plagiarism
Your code and report will be screened using plagiarism-checking software.
Ø PDF Notebook: Turnitin
Ø Code: CodeQuiry
See section 6 of the assignment specifications for more details.

Evaluating Hypotheses

Practical Methodology

Practical Methodology
A good machine learning practitioner:
Ø Knows how to choose an algorithm for a particular application.
Ø Knows how to set up experiments and use the results to improve a machine learning system.
• Gather more data?
• Increase or decrease model capacity?
• Add or remove regularizing features?
• Improve the optimization of a model?
• Debug the software implementation of the model?
All these operations are at the very least time consuming to try out, so it is important to be able to determine the right course of action rather than blindly guessing.

Typical Procedure: Model development
Ø Determine your goals: Performance metric and target value. Problem dependent.
Ø Setup the experiment: Setup the test/validation data, visualizers and debuggers needed to determine bottlenecks in performance (overfitting/underfitting, feature importance).
Ø Default Baseline Model: Identify the components of end-to-end pipeline including – Baseline Models, cost functions, optimization.
Ø Make incremental changes: Repeatedly make incremental changes such as gathering new data, adjusting hyperparameters, or changing algorithms, based on specific findings from your instrumentation.

Measuring Performance

Model Development
[Table: car price data. The unknown target function 𝑓(𝑥) maps a car's attributes to its price; a learned hypothesis h∗(𝑥) predicts a price, e.g. $9,750, for a new input.]

Km       M. Year  Fuel C.  Price
102,000  2005     7.8      10,000
25,000   2010     5.2      23,500
256,000  2008     9.9      12,250
12,000   2018     11.2     40,100
23,000   2000     12.7     5,000
55,000   2015     12.4     19,200
121,000  2012     21.0     12,500
Assumption:
Nature of the relationship (e.g. linear, polynomial)
h𝜃(𝑥) = 𝜃₀ + 𝜃₁𝑥₁ + 𝜃₂𝑥₂ + 𝜃₃𝑥₃
Training:
Find the "best hypothesis from the hypothesis space H", e.g.:
h∗𝜃(𝑥) = 0.1 + 2.3𝑥₁ + 1.1𝑥₂ + 0.05𝑥₃

How good is the hypothesis (model)?
• Did we use suitable attributes (features) for the problem?
• Are the assumptions made valid?
• E.g. Is the relationship between the attributes and the target variable linear?
• Will the derived hypothesis (model) generalize well to new data?
• Has it overfitted to the training data?
• Is our assumption too limiting (underfit – bias)?

How good is the hypothesis (model)?
We need an evaluation framework to measure performance of a hypothesis or model:
ØA set of data to measure the performance on.
ØA measure of “goodness” of a hypothesis – performance metric.

Independent Test Data
Ø To measure performance we require independent test data:
• Data that “simulates” unseen data.
• It mimics the process of using a hypothesis “for real”
• Data which has not been used for training (or testing!)
Ø We will explore two mechanisms to generate a test set:
• Hold-out validation
• Cross-validation

Hold-Out Validation

Constructing Training/Testing Data
Data sets for Training and Testing are constructed by sub-dividing the experience.
Typically an 80% – 20% split is used
Training Data
Testing Data
However, this doesn’t entirely help us train and test effectively
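As a sketch of hold-out validation, scikit-learn's train_test_split (assumed available, as in the course's Python notebooks) produces the 80% – 20% split on toy data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy feature matrix (100 examples, 3 features) and labels.
X = np.arange(300).reshape(100, 3)
y = np.arange(100)

# 80% training / 20% testing hold-out split;
# random_state fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(len(X_train), len(X_test))  # 80 20
```

Shuffling before splitting matters when the raw file is ordered (e.g. by class or date).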

ML Process
[Diagram: the ML process]
Machine Learning Algorithm/Program with tuneable parameters and a loss.
Stage 1.1: Find the optimal hypothesis from the hypothesis space given a 𝝀 value (uses Training Data).
Stage 1.2: Find the best hypothesis amongst the hypotheses for the different 𝝀 values.
Stage 2: Test the best hypothesis on "simulated" unseen data (uses Test Data).
What data should we use in Stage 1.2?
What about hyper-parameters? How should we set them?

Hyper Parameter Tuning
What data should we use? Training or Testing?
Ø Training
Ø Will end up with a trivial answer: we can always choose a very high-capacity model to fit the data and then set 𝜆 to zero. This will give the best value for 𝐽(𝜃):
𝐽(𝜃) = (1/𝑛) ∑ᵢ₌₁ⁿ (h𝜃(𝐱⁽ⁱ⁾) − 𝑦⁽ⁱ⁾)² + 𝜆 ∑ⱼ 𝜃ⱼ²
Ø Testing
Ø Will overfit to the test data and select the hypothesis that does well on the test data.
Ø Now our test data is no longer independent.
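A minimal numpy sketch of the regularized cost above (the `cost` helper is hypothetical, and the bias term 𝜃₀ is assumed to be excluded from the penalty, a common convention):

```python
import numpy as np

def cost(theta, X, y, lam):
    """Regularized squared-error cost from the slide:
    (1/n) * sum((h(x) - y)^2) + lam * sum(theta_j^2),
    where h(x) = X @ theta; theta_0 (bias) is not penalized."""
    n = len(y)
    residual = X @ theta - y
    penalty = lam * np.sum(theta[1:] ** 2)  # skip theta_0
    return residual @ residual / n + penalty

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # bias + one feature
y = np.array([1.0, 2.0, 3.0])
theta = np.array([1.0, 1.0])  # fits this data exactly

# With lambda = 0 a perfect fit drives J(theta) to zero...
assert cost(theta, X, y, lam=0.0) == 0.0
# ...while a larger lambda penalizes the same fit more heavily.
assert cost(theta, X, y, lam=1.0) > cost(theta, X, y, lam=0.1)
```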

ML Process
[Diagram: the ML process with a validation set]
Machine Learning Algorithm/Program with tuneable parameters and a loss.
Stage 1.1: Find the optimal hypothesis from the hypothesis space given a 𝝀 value (uses Training Data).
Stage 1.2: Find the best hypothesis amongst the hypotheses for the different 𝝀 values (uses Validation Data).
Stage 2: Test the best hypothesis on "simulated" unseen data (uses Test Data).

Constructing Training/Testing Data
Typically, a data set is sub-divided into three sets:
Training data – for training a hypothesis
Validation data – for "testing" and tuning parameters of the ML algorithm
Testing data – for evaluating and comparing final hypotheses
Typically a 60% – 20% – 20% split
Training Data
Validation Data
Testing Data
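One way to produce the 60% – 20% – 20% split is two successive calls to scikit-learn's train_test_split (a sketch on toy data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(200).reshape(100, 2), np.arange(100)

# First carve off 20% as the final test set...
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# ...then split the remainder 75/25, giving 60/20/20 overall.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```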

Important Considerations
Ø No overlap between the partitions (Train, Validation and Test)
Ø E.g. Assume that you want to build a model that predicts if a patient has diabetes using two attributes: BMI and blood glucose level.
Ø Given the data in the table, can we do random splitting?
Patient ID   BMI  Glucose  Diabetes
P1 Visit 1    .      .        .
P1 Visit 2    .      .        .
P2 Visit 1    .      .        .
P3 Visit 1    .      .        .
…             .      .        .
Random splitting might not always work. Be careful.

Important Considerations
Ø Reasonable representation (sampling) of unknown function
Ø E.g. Assume that you want to build a model that predicts if a patient has diabetes using two attributes: BMI and blood glucose level.
Ø We are planning to deploy this model to all hospitals in Victoria
Ø Given the data from only 5 hospitals in Victoria, is random splitting the best option?
Hospital  BMI  Glucose  Diabetes
1          .      .        .
1          .      .        .
2          .      .        .
3          .      .        .
…          .      .        .
Random splitting might not always work. Be careful.
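When rows share a patient or hospital ID, a group-aware splitter keeps each group entirely on one side of the split. A sketch using scikit-learn's GroupShuffleSplit with hypothetical IDs:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical group IDs (e.g. patient or hospital): rows sharing an
# ID must not be split across train and test.
groups = np.array([1, 1, 2, 3, 3, 4, 5, 5, 6, 7])
X = np.arange(20).reshape(10, 2)
y = np.zeros(10)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=groups))

# No group appears on both sides of the split.
assert set(groups[train_idx]).isdisjoint(set(groups[test_idx]))
```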

Cross Validation

K-fold Cross Validation
• Divide the data into 𝑘 partitions
• One partition is assigned as the test set; all other 𝑘 − 1 partitions form the training set
• Evaluate, then repeat with a different partition as the test set
• Average the results
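The steps above are what scikit-learn's cross_val_score automates. A sketch on toy linear data (assumed noiseless, so each fold's R² is close to 1):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Toy regression data: y is an exact linear function of x.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
y = 3.0 * X[:, 0] + 1.0

# 5-fold CV: each partition is held out once, then scores are averaged.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print(scores.mean())
```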

Evaluation Metric

Evaluation Metrics
Ø Why not use the loss function?
Ø The loss function is selected to make the optimization process easy. The value of the loss function may not be very intuitive to us.
𝐽(𝜃) = (1/2𝑛) ∑ᵢ₌₁ⁿ [ −𝑦⁽ⁱ⁾ log h𝜃(𝑋⁽ⁱ⁾) − (1 − 𝑦⁽ⁱ⁾) log(1 − h𝜃(𝑋⁽ⁱ⁾)) ]
Ø Evaluation metrics are selected so that they are intuitive.
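To see why the raw loss value is unintuitive, a sketch computing the cross-entropy loss above (with the slide's 1/2𝑛 scaling; the `logistic_loss` helper is hypothetical) for two sets of predictions:

```python
import numpy as np

def logistic_loss(h, y):
    """Cross-entropy loss from the slide, with its 1/(2n) scaling:
    J = (1/2n) * sum(-y*log(h) - (1-y)*log(1-h))."""
    n = len(y)
    return np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / (2 * n)

y = np.array([1.0, 0.0, 1.0])
confident = np.array([0.99, 0.01, 0.99])  # near-certain, correct
hesitant = np.array([0.6, 0.4, 0.6])      # correct but unsure

# The loss is lower for confident correct predictions, but its raw
# value tells us little about, say, how many patients were misclassified.
assert logistic_loss(confident, y) < logistic_loss(hesitant, y)
```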

Regression Evaluation Measures

Regression Evaluation Metrics
Ø Evaluation Measures:
• Mean Absolute Error (MAE)
• Mean Squared Error (MSE)
  o Root-Mean Squared Error (RMSE)
• 𝑅² – R squared, coefficient of determination
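These measures are available in scikit-learn (a sketch with made-up predictions; RMSE is taken as the square root of MSE):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 18.0, 33.0])

mae = mean_absolute_error(y_true, y_pred)   # mean |error|
mse = mean_squared_error(y_true, y_pred)    # mean error^2
rmse = np.sqrt(mse)                         # back in the target's units
r2 = r2_score(y_true, y_pred)               # 1.0 = perfect fit

print(mae, rmse, r2)  # ≈ 2.33, 2.38, 0.915
```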

Classification Evaluation Measures

Classification Evaluation
Ø Given a classification problem, we want to evaluate how well our classifier performs in comparison to actual classes
Ø With classification, we can discuss types of errors
• Confusion Matrix
• Accuracy, Precision, Recall, F1-Score
• ROC curve
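A sketch of accuracy, precision, recall and F1 on a hypothetical set of labels and predictions, using scikit-learn:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # classifier output: 1 FN, 1 FP

print(accuracy_score(y_true, y_pred))   # fraction of correct predictions
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of the two
```

Here TP=3, FP=1, FN=1, TN=3, so all four metrics happen to equal 0.75.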

Example: Screening COVID-19 Airport
[Figure: screening-test scores on a 0–1 axis with a decision threshold – scores above it lead to Quarantine, scores below it to No Action]
Screening done at airport:
Ø False positive (Type I): Non-COVID patients detected as COVID by the test.
Ø False negative (Type II): True COVID patients not detected by the system.

Example: Screening COVID-19 Airport
[Figure: Model A and Model B screening-test thresholds on the same score axis – each splits travellers into Quarantine vs No Action]

Example: Screening COVID-19 Airport
[Figure: Models A and B with True Positives, False Positives, True Negatives and False Negatives marked on the screening-test score axis]

Example: Screening COVID-19
Hospital Admission
[Figure: Model A and Model B decision thresholds for the hospital-admission scenario]

Classification Errors (Type 1 vs Type 2)
Consider binary class problems
Assume there are some classification errors
Predict one class, but actually the test data had the other class
Not all classification errors are the same
From the COVID-19 example:
Person “does not have COVID-19” (negative) but the test result is positive
• Type 1 error, False Positive
Person “has COVID-19” (positive) but the test result is negative
• Type 2 error, False Negative

Types of Classification Outputs
True Positive (TP): class 1 predicted as class 1
False Positive (FP): class 0 predicted as class 1 (type 1 error)
True Negative (TN): class 0 predicted as class 0
False Negative (FN): class 1 predicted as class 0 (type 2 error)
Total number of instances m = TP + FP + TN + FN

                 Predicted
                 T (1)   F (0)
Actual  T (1)     TP      FN
        F (0)     FP      TN

Confusion Matrix
The confusion matrix summarises the four types of classification outcome. For example, for the 14 labelled examples below, classified by two hypotheses h(X)-A and h(X)-B:

y   h(X)-A   h(X)-B
1     0        1
0     0        0
0     1        1
0     0        0
1     1        1
0     0        0
0     0        0
1     1        1
0     0        0
0     0        0
0     0        1
0     0        0
0     0        0
0     0        0

                 Predicted
                 T (1)   F (0)
Actual  T (1)
        F (0)
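The counts for the example above can be checked with scikit-learn's confusion_matrix (note its [[TN, FP], [FN, TP]] layout for labels [0, 1]):

```python
from sklearn.metrics import confusion_matrix

# The 14 labelled examples from the slide, with predictions from
# hypotheses A and B.
y   = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
h_a = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
h_b = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0]

# sklearn orders the matrix [[TN, FP], [FN, TP]] for labels [0, 1].
tn_a, fp_a, fn_a, tp_a = confusion_matrix(y, h_a).ravel()
tn_b, fp_b, fn_b, tp_b = confusion_matrix(y, h_b).ravel()

print(tp_a, fp_a, fn_a, tn_a)  # 2 1 1 10
print(tp_b, fp_b, fn_b, tn_b)  # 3 2 0 9
```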

Hyper Parameter Tuning
Regularization parameter

Problem of Overfitting and Underfitting
[Figure: three plots of power consumption vs temperature, fitted with models of increasing complexity]
𝜃₀ + 𝜃₁𝑥₁        𝜃₀ + 𝜃₁𝑥₁ + 𝜃₂𝑥₁²        𝜃₀ + 𝜃₁𝑥₁ + 𝜃₂𝑥₁² + 𝜃₃𝑥₁³ + 𝜃₄𝑥₁⁴ …

How to Identify Overfitting/Underfitting
Assume data split into train and validation sets
𝑃train(𝑋) = ∑ᵢ∈training 𝐿(h(𝑋⁽ⁱ⁾), 𝑌⁽ⁱ⁾)
𝑃valid(𝑋) = ∑ᵢ∈validation 𝐿(h(𝑋⁽ⁱ⁾), 𝑌⁽ⁱ⁾)
[Figure: training and validation performance plotted against model complexity – a growing gap between the two curves signals overfitting]

Tuning the regularization parameter
• Change 𝜆 and observe the training and validation errors.
• Pick the 𝜆∗ value that has the smallest gap and the best validation performance.
• The best hypothesis will be the optimal hypothesis at 𝜆∗.
Never use test data to do this. Always use validation data or cross-validation.
[Figure: training and validation error plotted against 𝜆]
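A sketch of this procedure using scikit-learn's validation_curve, with Ridge regression's `alpha` standing in for 𝜆 (toy data; only validation scores, never the test set, drive the choice):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import validation_curve

# Noisy linear toy data (stand-in for a real training set).
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=80)

# Ridge's `alpha` plays the role of lambda on the slide.
lambdas = [0.001, 0.01, 0.1, 1.0, 10.0]
train_scores, val_scores = validation_curve(
    Ridge(), X, y, param_name="alpha", param_range=lambdas, cv=5)

# Pick the lambda with the best mean cross-validated score.
best = lambdas[int(np.argmax(val_scores.mean(axis=1)))]
print(best)
```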

How to choose values for Hyper-parameters
Ø Grid search
Ø Random search
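A sketch of both strategies with scikit-learn, using Ridge's `alpha` as the hyper-parameter (the grid values and the loguniform sampling range are illustrative):

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Noisy linear toy data.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=60)

# Grid search: try every value in a fixed grid.
grid = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5)
grid.fit(X, y)

# Random search: sample candidate values from a distribution instead.
rand = RandomizedSearchCV(Ridge(), {"alpha": loguniform(1e-3, 1e2)},
                          n_iter=10, cv=5, random_state=0)
rand.fit(X, y)

print(grid.best_params_, rand.best_params_)
```

Random search tends to explore a continuous range more efficiently when only a few hyper-parameters actually matter.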