Beacon Conference of Undergraduate Research

Introduction to Statistical Machine Learning: Review

Lingqiao Liu
University of Adelaide

Overview of Machine Learning

• Types of machine learning systems

• Basic math skills
– The same set of skills you will need to use in the exam

Classification, KNN, Overfitting

• What is a classification system?
– Describe steps in building a classification system

• Nearest neighbour classifier
– 1 nearest neighbour

– K nearest neighbours

– The effect of K

• Model selection problem
– The model selection problem is introduced through the example of
choosing k in KNN classifiers.

– Concepts of overfitting and generalization

– Validation set

– K-fold cross-validation and its special case, leave-one-out
cross-validation
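The model selection idea above can be sketched in a few lines: score each candidate k by leave-one-out cross-validation and keep the best. This is a minimal illustration, not the course's reference code; the toy data set, the function names `knn_predict` and `loo_accuracy`, and the candidate values of k are all assumptions made for the example.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # train is a list of (features, label) pairs; squared Euclidean distance is enough for ranking
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out cross-validation: hold out each point once and classify it with the rest."""
    correct = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        correct += int(knn_predict(rest, x, k) == y)
    return correct / len(data)

# Toy data (made up): two well-separated clusters plus one mislabelled point.
data = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.3, 0.3), "a"),
    ((2.0, 2.0), "b"), ((2.1, 2.2), "b"), ((2.2, 2.1), "b"), ((2.3, 2.3), "b"),
    ((0.15, 0.15), "b"),  # label noise sitting inside cluster "a"
]

scores = {k: loo_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)  # first k with the highest validation accuracy
```

On this data, k = 1 lets the single noisy point mislabel all of its neighbours (overfitting), while larger k votes it down. The held-out accuracy, not the training accuracy, is what reveals the difference.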

Linear Classifier, Linear SVM

• Linear discriminant function
– Know basic concepts, like separating hyperplanes

– Linear and non-linear classifiers

• Basic idea of linear SVM
– Concepts: support vectors, margin

• Hard margin and soft margin SVM
– What’s the difference?

– Motivation for soft-margin SVM

• Primal and Dual problems of linear SVMs
– Formulation, the relationship between the variables of the primal
and dual problems, and the meaning of each term (objective terms
and constraints)

– How to derive the dual from the primal problem
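For reference, the standard textbook form of the soft-margin linear SVM is written out below. The symbols (C for the slack penalty, ξ for slack variables, α for dual variables) follow common convention and are an assumption here; the lecture's own notation may differ.

```latex
% Soft-margin primal (the hard-margin case forces every \xi_i = 0)
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0

% Dual, obtained by forming the Lagrangian and eliminating w, b, \xi
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
  - \tfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\,y_i y_j\,x_i^\top x_j
\quad \text{s.t.}\quad 0 \le \alpha_i \le C,\ \ \sum_{i=1}^{n}\alpha_i y_i = 0

% Link between primal and dual variables
w = \sum_{i=1}^{n}\alpha_i\,y_i\,x_i
```

Training points with α_i > 0 are the support vectors. The first primal term controls the margin, and the second penalises margin violations.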

Regression

• What is a regression problem?

• Linear Regression
– Regression to scalar and vector-valued targets

– Closed-form solution

• Regularized linear regression
– p-norm

– L2-regularized linear regression (ridge regression) and
L1-regularized linear regression (Lasso)

– Benefit of ridge regression and its closed-form solution

– Benefit of Lasso

• Support vector regression
– Motivation and intuitive idea

– The primal problem and dual problem (optional)
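As a small worked instance of the closed-form ridge solution: in matrix form the minimiser of ||y − Xw||² + λ||w||² is w = (XᵀX + λI)⁻¹Xᵀy, and with a single feature and no intercept this reduces to a ratio of sums. The sketch below assumes that unscaled objective (no 1/2 or 1/n factor); the function name `ridge_1d` and the data are made up.

```python
def ridge_1d(xs, ys, lam):
    """Closed-form minimiser of sum((y - w*x)^2) + lam * w^2 for one feature, no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))  # X^T y in the one-feature case
    sxx = sum(x * x for x in xs)              # X^T X in the one-feature case
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # exactly y = 2x

w_ols = ridge_1d(xs, ys, 0.0)     # lam = 0 recovers ordinary least squares
w_ridge = ridge_1d(xs, ys, 10.0)  # lam > 0 shrinks the weight towards zero
```

Here `w_ols` is 60/30 and `w_ridge` is 60/40, which shows the shrinkage directly. The added λ also keeps the denominator (in general, the matrix XᵀX + λI) away from singularity.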

Ensemble methods

• Basic concepts

– Why use ensemble methods, and what they are

– General idea or workflow

• Bagging

– Algorithm

• Random forest

– Decision tree (optional)

– How does random forest randomize decision trees to make a
random forest

• Adaboost

– Concepts: weak learner, when it works

– Algorithm: the update of each component

PCA and LDA

• Concept of dimensionality reduction
– Benefit, why it is possible, applications

• PCA
– Motivation and understanding of PCA

– How PCA is derived, i.e., the relationship between PCA and the
covariance matrix

– How to perform PCA

– Eigenface model: how to solve the issue of calculating
eigenvectors for high-dimensional data

– Roles of eigenvectors: the face reconstruction experiment

• LDA
– Motivation and intuitive idea of LDA (binary-class case)

– Solution of LDA and multi-class case (optional)
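The link between PCA and the covariance matrix can be made concrete: the first principal component is the leading eigenvector of the covariance of the centred data. The sketch below finds it with power iteration on 2-D points using only the standard library. Power iteration is a choice made for this illustration (a library eigensolver would normally be used), and the function name and data are assumptions.

```python
import math

def top_principal_component(points, iters=200):
    """Return (direction, variance) of the first principal component of 2-D points."""
    n = len(points)
    mean = [sum(p[d] for p in points) / n for d in range(2)]
    centred = [[p[0] - mean[0], p[1] - mean[1]] for p in points]
    # 2x2 covariance matrix (dividing by n; dividing by n-1 gives the same directions)
    cov = [[sum(c[i] * c[j] for c in centred) / n for j in range(2)] for i in range(2)]
    # Power iteration: repeatedly apply cov and renormalise to approach the leading eigenvector
    v = [1.0, 0.0]
    for _ in range(iters):
        w = [cov[0][0] * v[0] + cov[0][1] * v[1],
             cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    # Variance captured along v is the Rayleigh quotient v^T cov v (v has unit length)
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(2) for j in range(2))
    return v, variance

# Made-up points scattered close to the line y = x
points = [(-2.0, -1.8), (-1.0, -1.2), (0.0, 0.1), (1.0, 0.9), (2.0, 2.0)]
direction, variance = top_principal_component(points)
```

The returned direction is close to the diagonal (1, 1)/√2, and the captured variance is nearly the total variance of the data, which is why one component suffices for this example.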

Unsupervised learning

• K-means clustering

– Steps, objective function

– Advantages and disadvantages

• Gaussian mixture model (GMM)

– Advantages over k-means

– Interpret GMM from the viewpoint of clustering, e.g. class
membership.

– EM algorithm (optional)
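The k-means steps and objective listed above can be sketched as follows. This is a plain version of Lloyd's algorithm with caller-supplied initial centres; the function name `kmeans`, the handling of empty clusters, and the toy points are assumptions for illustration.

```python
def kmeans(points, centres, iters=100):
    """Lloyd's algorithm: alternate assignment and update steps until the centres stop moving."""
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre (squared Euclidean distance)
        clusters = [[] for _ in centres]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centre to the mean of its points (unchanged if the cluster is empty)
        new_centres = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    # Objective: total squared distance from each point to its assigned centre
    sse = sum(
        sum((a - b) ** 2 for a, b in zip(p, c))
        for cl, c in zip(clusters, centres) for p in cl
    )
    return centres, clusters, sse

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]
# Deliberately poor start: both initial centres lie in the same group
centres, clusters, sse = kmeans(points, [(0.0, 0.0), (1.0, 1.0)])
```

Each of the two steps can only lower or keep the objective, which is why the loop terminates. The result still depends on the initial centres, one of the usual disadvantages of k-means.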

Kernel Methods

• Basic concepts
– Benefit of using kernel

– How to prove one function is a valid kernel function

– Commonly used kernels

• Kernelizing algorithms
– Kernel SVM

– How to kernelize algorithms: Euclidean distance

– Kernel k-means

– Kernel PCA (optional)

– Kernel regression: representing w as a weighted combination of
features
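Two of the ideas above can be checked numerically: a kernel evaluates an inner product in feature space without building the features, and Euclidean distance in feature space can be written purely in terms of kernel values. The sketch uses the homogeneous degree-2 polynomial kernel on 2-D inputs as the example; the function names are made up for this illustration.

```python
import math

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel k(x, z) = (x . z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2

def phi(x):
    """Explicit feature map matching poly2_kernel for 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def kernel_sq_distance(x, z, k):
    """Squared feature-space distance ||phi(x) - phi(z)||^2 from kernel evaluations only."""
    return k(x, x) - 2 * k(x, z) + k(z, z)

x, z = (1.0, 2.0), (3.0, -1.0)

# Same quantities computed the long way, through the explicit feature map
explicit_inner = sum(a * b for a, b in zip(phi(x), phi(z)))
explicit_sq_dist = sum((a - b) ** 2 for a, b in zip(phi(x), phi(z)))

kernel_inner = poly2_kernel(x, z)
kernel_dist = kernel_sq_distance(x, z, poly2_kernel)
```

Both pairs agree up to rounding. Any algorithm that touches the data only through inner products or distances (SVM, k-means, PCA) can therefore be kernelized by swapping in k.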

Neural Networks and Deep Learning

• Multi-layer perceptron

– Structure and benefit

• Convolutional neural networks

– Structure and benefit

– Convolution operator

– Pooling operator

– How many parameters, how many activations

• Optimization in deep learning: stochastic gradient
descent (SGD)

– Relationship between gradient descent and SGD, and why we should
use SGD

– Concepts such as learning rate and batch size
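The relationship between gradient descent and SGD can be seen on a one-parameter least-squares problem: both use the same gradient formula, but gradient descent averages it over the full data set for each update, while SGD averages over a small random batch. This is a minimal sketch; the model, the noise-free data, the learning rate, and the batch size are all assumptions chosen so that both methods converge.

```python
import random

def gradient(w, batch):
    """Gradient w.r.t. w of the loss 1/(2m) * sum((w*x - y)^2) over a batch of m pairs."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def gradient_descent(data, lr, steps):
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, data)  # one update per pass over the whole data set
    return w

def sgd(data, lr, epochs, batch_size, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w -= lr * gradient(w, batch)  # several cheap updates per pass
    return w

# Noise-free data from y = 3x, with x = 0.1, 0.2, ..., 2.0
data = [(k / 10, 3.0 * k / 10) for k in range(1, 21)]

w_gd = gradient_descent(data, lr=0.1, steps=200)
w_sgd = sgd(data, lr=0.1, epochs=50, batch_size=4)
```

With a batch size of 4, SGD makes five updates per pass over the 20 examples where gradient descent makes one. That cheaper, more frequent update is the main reason SGD is preferred on large data sets. The learning rate scales every step, and the batch size trades gradient noise against cost per update.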

Semi-supervised Learning

• Concepts and basic setting

• Pseudo-labelling

– Assumption, algorithm

– Advantages and disadvantages

• Co-training

– Basic idea

– Advantages and disadvantages

• S3SVM and graph-based semi-supervised learning

– Assumption

– Loss functions

• Deep semi-supervised learning (optional)

Generative model

• Autoregressive model

– Theoretical foundation and key idea

– How to generate (sample) from an autoregressive model

– How to train an autoregressive model

• Generative Adversarial Networks (GAN)

– Basic idea

– Components in GAN and their roles

– Loss function

– Applications (optional)
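For reference, the two formulas most commonly associated with these topics are given below: the chain-rule factorisation behind autoregressive models, and the minimax objective from the original GAN formulation. That these are the forms used in the lecture is an assumption; the notation (D for the discriminator, G for the generator, p_z for the noise prior) follows common convention.

```latex
% Autoregressive factorisation (chain rule of probability)
p(x_1,\dots,x_n) = \prod_{i=1}^{n} p(x_i \mid x_1,\dots,x_{i-1})

% GAN minimax objective: D tries to tell real from generated samples, G tries to fool D
\min_{G}\max_{D}\ \mathbb{E}_{x\sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z\sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

Sampling from an autoregressive model follows the factorisation directly: draw x_1, then each x_i conditioned on the values already drawn. Training maximises the log of the same product on the data.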