Tutorial Questions | Week 2
COSC2779 – Deep Learning
This tutorial reviews basic machine learning concepts. Please attempt the questions
before you join the session.
1. A computer program is said to learn from experience E with respect to some task T and some performance
measure P if its performance on T, as measured by P, improves with experience E. Suppose we feed a
learning algorithm a lot of historical weather data, and have it learn to predict weather.
(a) What would be a reasonable choice for P?
(b) What is T?
2. Suppose you are working on stock market prediction, and you would like to predict the price of a particular
stock tomorrow (measured in dollars). You want to use a learning algorithm for this. Would you treat this
as a classification or a regression problem?
3. In which one of the following figures do you think the hypothesis has overfit the training set?
(a) (b) (c) (d) [figure panels not reproduced in this text]
4. What’s the trade-off between bias and variance?
5. How do you ensure that a model is not overfitting?
6. We can regularize a regression model by introducing a ridge penalty as shown in the equation below:
$$L = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 + \lambda w^\top w$$
What happens if we tune λ by looking at the performance on the training set?
7. The breast cancer dataset is a standard machine learning dataset. It contains 9 attributes describing 286
women who suffered and survived breast cancer, and records whether or not the cancer recurred within 5
years. It is a binary classification problem. Of the 286 women, 201 did not suffer a recurrence of breast
cancer and the remaining 85 did. How would you handle an imbalanced dataset such as this one?
8. What validation technique would you use on a time series dataset?
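As a aid for question 6, the ridge-penalized loss can be evaluated by hand on a tiny example. The sketch below is only illustrative: it assumes a linear model (prediction = w·x, which the question does not state), and the function name ridge_loss and the toy data are made up for this note. It shows how the two terms of the loss are computed, and may be useful to experiment with when thinking about how the loss changes as λ varies.

```python
def ridge_loss(X, y, w, lam):
    """Mean squared error plus the ridge penalty lam * w^T w.

    Assumes a linear model: the prediction for a row x is the dot product w . x.
    """
    n = len(y)
    preds = [sum(w_j * x_j for w_j, x_j in zip(w, row)) for row in X]
    mse = sum((y_i - p_i) ** 2 for y_i, p_i in zip(y, preds)) / n
    penalty = lam * sum(w_j ** 2 for w_j in w)
    return mse + penalty


# Toy data, invented for illustration: y = 2 * x exactly.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
w = [2.0]  # weights that fit the toy data perfectly

loss_no_penalty = ridge_loss(X, y, w, lam=0.0)    # squared-error term only
loss_with_penalty = ridge_loss(X, y, w, lam=0.5)  # adds 0.5 * (2.0 ** 2)

print(loss_no_penalty, loss_with_penalty)
```

Try a few other values of lam and w and note which term of the loss each one affects.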
Breast cancer dataset (question 7): http://archive.ics.uci.edu/ml/datasets/Breast+Cancer
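Before answering question 7, it can help to quantify the imbalance. The short sketch below uses only the counts given in the question (286 women, 201 without recurrence); the variable names are illustrative. It computes the class proportions and the accuracy that a trivial classifier would reach by always predicting the majority class, which is a useful reference point when judging any model on this dataset.

```python
# Class counts stated in question 7.
n_total = 286
n_no_recurrence = 201
n_recurrence = n_total - n_no_recurrence  # 85

share_no_recurrence = n_no_recurrence / n_total
share_recurrence = n_recurrence / n_total

# A trivial classifier that always predicts "no recurrence" is right
# exactly as often as the majority class occurs.
majority_baseline_accuracy = share_no_recurrence

# The same classifier never predicts "recurrence", so it finds
# none of the 85 recurrence cases.
majority_baseline_recall = 0 / n_recurrence

print(f"no recurrence: {share_no_recurrence:.1%}")
print(f"recurrence:    {share_recurrence:.1%}")
print(f"always-majority accuracy: {majority_baseline_accuracy:.1%}")
print(f"always-majority recall on recurrence: {majority_baseline_recall:.1%}")
```

Comparing the last two numbers suggests why accuracy alone may not be an adequate performance measure here.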