LECTURE 1 TERM 2:
MSIN0097
Predictive Analytics
A P MOORE

INTRODUCTION TO AI
Why do they call it intelligence?

MACHINE LEARNING
Data + model → prediction

MACHINE LEARNING DATA DRIVEN AI
Assume there is enough data to find statistical associations to solve specific tasks
Data + model → prediction
Define how well the model solves the task and adapt the parameters to maximize performance
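
A minimal sketch of "define how well the model solves the task and adapt the parameters", assuming a one-variable linear model, mean squared error as the measure, and plain gradient descent as the adaptation rule. Minimising the error plays the role of maximising performance. The toy data, learning rate and step count are illustrative choices, not part of the lecture.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, lr=0.05, steps=2000):
    """Adapt w and b by gradient descent to minimise the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, mse={mse(w, b, xs, ys):.6f}")
```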

LEARNING A FUNCTION
𝑥 → 𝑦
𝑥 → 𝑓(𝑥) → 𝑦
𝑥 → 𝑥′ → 𝑓(𝑥) = 𝑦′ → 𝑦
𝑥: true initial value (world state)
𝑥′: measured data (features)
𝑓: learned/fitted function, from n observations
𝑦′: inferred/predicted/estimated value
𝑦: true target value (world state)
input 𝑥 → 𝑓(𝑥) → 𝑦 output
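
The chain from world state to measurement to fitted function can be sketched as follows. This is an illustration under stated assumptions: the "true" relationship `true_target`, the Gaussian measurement noise and the closed-form least-squares fit are all choices made for the demo and do not come from the slides.

```python
import random

random.seed(0)


def true_target(x):
    """Hypothetical world-state relationship (an assumption for the demo)."""
    return 3.0 * x - 2.0


# n observations: the true state x is only seen through a noisy measurement x'.
n = 200
x_true = [random.uniform(0.0, 10.0) for _ in range(n)]
x_meas = [x + random.gauss(0.0, 0.1) for x in x_true]  # x'
y_true = [true_target(x) for x in x_true]              # y


def fit_least_squares(xs, ys):
    """Closed-form least-squares fit of y' = w * x' + b."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    w = sxy / sxx
    return w, my - w * mx


w, b = fit_least_squares(x_meas, y_true)


def f(x_prime):
    """Learned function: maps a measurement x' to a prediction y'."""
    return w * x_prime + b


print(f"f(x') = {w:.2f} * x' + {b:.2f}")
```

The fitted `f` never sees the world state directly; it is estimated from the measured features only, which is why the prediction 𝑦′ can differ from the true target 𝑦.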

MACHINE LEARNING DATA DRIVEN AI
Source: https://twitter.com/Kpaxs/status/1163058544402411520

MACHINE LEARNING DATA DRIVEN AI
𝑥 → 𝑥′ → 𝑓(𝑥) = 𝑦′ → 𝑦
{𝑥ᵢ, 𝑦ᵢ}: labelled training data
Source: https://twitter.com/Kpaxs/status/1163058544402411520

INTRODUCTION TO AI
Learning the rules

MATURITY OF APPROACHES ML
“Classical” Machine learning
“Modern” Machine learning
Source: hazyresearch.github.io/snorkel/blog/snorkel_programming_training_data.html

PARADIGMS IN ML
Source: https://twitter.com/IntuitMachine/status/1200796873495318528/photo/1

TASKS IN MACHINE LEARNING

MACHINE LEARNING BRANCHES
We know what the right answer is
We don’t know what the right answer is – but we can recognize a good answer if we find it
We have a way to measure how good our current best answer is, and a method to improve it
Source: Introduction to Reinforcement Learning, David Silver

BUILDING BLOCKS OF ML

A – B – C – D
A TAXONOMY OF PROBLEMS
A. Classification
B. Regression
C. Clustering
D. Decomposition

A – B – C – D ALGORITHMIC APPROACHES
A. Classification
– Support vector machines
– Neural networks
– Random Forests
– Maximum entropy classifiers
– …
B. Regression
– Logistic regression
– Support vector regression
– SGD regressor
– …
C. Clustering
– K-means
– KD Trees
– Spectral clustering
– Density estimation
– …
D. Decomposition
– PCA
– LDA
– t-SNE
– UMAP
– VAE
– …

A – B – C – D ALGORITHMIC APPROACHES
Supervised
A. Classification
– Support vector machines
– Neural networks
– Random Forests
– Maximum entropy classifiers
– …
B. Regression
– Logistic regression
– Support vector regression
– SGD regressor
– …
Unsupervised
C. Clustering
– K-means
– KD Trees
– Spectral clustering
– Density estimation
– …
D. Decomposition
– PCA
– LDA
– t-SNE
– UMAP
– VAE
– …

A – B – C – D ALGORITHMIC APPROACHES
Supervised
A. Classification
B. Regression
Unsupervised
C. Clustering
D. Decomposition

A – B – C – D ALGORITHMIC APPROACHES
Supervised: we know what the right answer is
A. Classification
B. Regression
Unsupervised
C. Clustering
D. Decomposition

A – B – C – D ALGORITHMIC APPROACHES
Supervised
A. Classification
B. Regression
Unsupervised: we don’t know what the right answer is, but we can recognize a good answer if we find it
C. Clustering
D. Decomposition

A – B – C – D ALGORITHMIC APPROACHES
Supervised
A. Classification
B. Regression
Reinforcement Learning: we have a way to measure how good our current best answer is
Unsupervised
C. Clustering
D. Decomposition

MACHINE LEARNING
B. Regression

B. REGRESSION REAL VALUED VARIABLE

LINEAR REGRESSION

REGRESSION BY MODELING PROBABILITIES

B. REGRESSION REAL VALUED VARIABLE

MULTIPLE DIMENSIONS

DEVELOPING MORE COMPLEX ALGORITHMS

MACHINE LEARNING
A. Classification

A. CLASSIFICATION CATEGORICAL VARIABLE

LOGISTIC REGRESSION
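
As an illustration of classification by modelling a probability, here is a minimal logistic regression sketch. It assumes a single feature, a sigmoid link, and gradient descent on the mean log loss; the data, learning rate and step count are hypothetical and chosen only so that the example runs quickly.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, lr=0.1, steps=5000):
    """Gradient descent on the mean log loss for p(y=1|x) = sigmoid(w*x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, labels)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, labels)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical 1-D data; the classes overlap, so the weights stay finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, labels)


def predict(x):
    """Threshold the modelled probability at 0.5 to get a class label."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


print(f"decision boundary near x = {-b / w:.2f}")
```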

DEVELOPING MORE COMPLEX ALGORITHMS

CONFUSION MATRIX BINARY FORCED CHOICE

A. CLASSIFICATION CATEGORICAL VARIABLE
[Figure: confusion matrices for Model 1 and Model 2, each showing predicted against actual class]
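
A small sketch of how a binary confusion matrix is counted from actual and predicted labels. The labels and the two sets of predictions below are hypothetical and are not the models shown on the slide.

```python
def confusion_matrix(actual, predicted):
    """Return counts (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical labels and predictions from two models (not from the lecture).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
model_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

for name, predicted in [("Model 1", model_1), ("Model 2", model_2)]:
    tp, fp, fn, tn = confusion_matrix(actual, predicted)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    print(f"{name}: TP={tp} FP={fp} FN={fn} TN={tn} "
          f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the second model has the higher accuracy but misses half of the positives, which is the kind of trade-off a single accuracy figure hides.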

CLASSIFICATION MNIST DATASET

CONFUSION MATRIX

MACHINE LEARNING
C. Clustering

CLASSIFICATION VS CLUSTERING CATEGORICAL VARIABLE

CLASSIFICATION VS CLUSTERING

C. CLUSTERING

C. CLUSTERING 1. AGGLOMERATIVE
Dendrogram
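
A minimal sketch of agglomerative clustering, assuming 1-D points and single linkage (the distance between two clusters is the distance between their closest members). Other linkage rules exist; this one is chosen only for brevity. The sequence of merges and their distances is the information a dendrogram draws.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points; returns the list of merges."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical 1-D data with two visible groups.
merges = single_linkage([1.0, 1.5, 2.5, 8.0, 9.0])
for left, right, dist in merges:
    print(f"merge {left} + {right} at distance {dist}")
```

The last merge happens at a much larger distance than the earlier ones; cutting the dendrogram below that height would leave the two groups.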

C. CLUSTERING 2. DIVISIVE

C. CLUSTERING 3. PARTITIONAL
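
K-means is one common partitional method. The sketch below assumes 1-D data, a fixed number of iterations and hand-picked starting centroids; all three are simplifications for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, groups


# Hypothetical data and starting centroids.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centroids, groups = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, groups)
```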

EXPECTATION MAXIMISATION

MACHINE LEARNING
D. Decomposition

D. DECOMPOSITION 1. PROJECTION METHODS
Dimensionality reduction
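
A sketch of dimensionality reduction by projection, using PCA as the example. It assumes 2-D data and finds the first principal component by power iteration on the covariance matrix, which avoids any external library; the data points are hypothetical.

```python
import math


def first_principal_component(data, iterations=100):
    """Top eigenvector of the 2x2 covariance matrix, found by power iteration."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    centred = [(x - mx, y - my) for x, y in data]
    cxx = sum(x * x for x, _ in centred) / (n - 1)
    cyy = sum(y * y for _, y in centred) / (n - 1)
    cxy = sum(x * y for x, y in centred) / (n - 1)
    v = (1.0, 0.0)
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise.
        vx = cxx * v[0] + cxy * v[1]
        vy = cxy * v[0] + cyy * v[1]
        norm = math.hypot(vx, vy)
        v = (vx / norm, vy / norm)
    return v, centred


# Hypothetical 2-D points lying close to the line y = x.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
direction, centred = first_principal_component(data)
# Project each centred point onto the component: 2-D becomes 1-D.
scores = [x * direction[0] + y * direction[1] for x, y in centred]
print(direction, scores)
```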

D. DECOMPOSITION 2. KERNEL METHODS

D. DECOMPOSITION 3. MANIFOLD LEARNING

A – B – C – D ALGORITHMIC APPROACHES
A. Classification
B. Regression
C. Clustering
D. Decomposition

TAXONOMY
A.
B.
C.
D.

A – B – C – D ALGORITHMIC APPROACHES
A. Classification
B. Regression
Source: Computer Vision: Learning, Models and Inference

DISCRIMINATIVE VS GENERATIVE A SIMPLE EXAMPLE

PARAMETRIC VS NON-PARAMETRIC
— With data gathered from uncontrolled observations on complex systems involving unknown [physical, chemical, biological, social, economic] mechanisms, the a priori assumption that nature would generate the data through a parametric model selected by the statistician can result in questionable conclusions that cannot be substantiated by appeal to goodness-of-fit tests and residual analysis.
— Usually, simple parametric models imposed on data generated by complex systems, for example, medical data, financial data, result in a loss of accuracy and information as compared to algorithmic models
Source: Leo Breiman, Statistical Modeling: The Two Cultures, Statistical Science 2001, Vol. 16, No. 3, 199–231
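
The contrast can be made concrete with a small sketch. It assumes a straight line as the parametric model and k-nearest-neighbour averaging as the non-parametric one, fitted to data from a nonlinear process; the sine function, the grid spacing and k = 3 are illustrative assumptions.

```python
import math


def fit_line(xs, ys):
    """Parametric model: two parameters (slope, intercept), whatever the data size."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: w * x + (my - w * mx)


def fit_knn(xs, ys, k=3):
    """Non-parametric model: keeps the data and averages the k nearest targets."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict


# Hypothetical data from a nonlinear process, y = sin(x).
xs = [i * 0.25 for i in range(26)]  # 0.0 .. 6.25
ys = [math.sin(x) for x in xs]

# Evaluate halfway between the training points, so neither model has seen them.
test_xs = [x + 0.125 for x in xs[:-1]]
test_ys = [math.sin(x) for x in test_xs]


def mean_sq_error(model):
    return sum((model(x) - y) ** 2 for x, y in zip(test_xs, test_ys)) / len(test_xs)


line_error = mean_sq_error(fit_line(xs, ys))
knn_error = mean_sq_error(fit_knn(xs, ys))
print(f"line: {line_error:.3f}  k-NN: {knn_error:.3f}")
```

The line is not a bad model in general; it is simply the wrong shape for this data, which is the situation the quotation describes.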

REGULARIZATION IMPOSING ADDITIONAL CONSTRAINTS
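
One way to impose an additional constraint is to penalise large parameter values. The sketch below assumes an L2 (ridge) penalty on a one-parameter linear model without intercept, where the minimiser has a simple closed form; the data and the values of the penalty weight `lam` are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Slope w minimising sum((w*x - y)^2) + lam * w^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

for lam in [0.0, 1.0, 10.0, 100.0]:
    print(f"lambda={lam:>5}: w={ridge_slope(xs, ys, lam):.3f}")
```

As `lam` grows, the fitted slope shrinks towards zero: the model trades some fit to the training data for a smaller parameter.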

ASSESSING GOODNESS OF FIT

ML PIPELINES
Source: https://epistasislab.github.io/tpot/

ML PIPELINES
FEATURE SELECTION AND AUTOMATION
Source: https://epistasislab.github.io/tpot/

HOMEWORK
Hands-on Machine Learning
Chapter 2: End-to-End Machine Learning Project
Try reading the chapter from start to finish. We will work through the problem in class, but please come prepared to discuss the case study.
It is easier to understand the different stages of an ML project if you follow one from start to finish.

END TO END

TESTING AND VALIDATION
– Generalization of data
– Generalization of feature representation
– Generalization of the ML model
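
Generalization is usually estimated by holding data out of training. The sketch below assumes k-fold cross-validation with contiguous folds and a deliberately trivial "model" that predicts the training mean; both are illustrative choices, and in practice the data would normally be shuffled first.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds; yield (train, test) index lists."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Hypothetical targets; the "model" simply predicts the training mean.
ys = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 10.0, 12.0, 11.0]

scores = []
for train, test in k_fold_indices(len(ys), k=5):
    prediction = sum(ys[i] for i in train) / len(train)
    scores.append(sum((ys[i] - prediction) ** 2 for i in test) / len(test))

print(f"cross-validated MSE: {sum(scores) / len(scores):.2f}")
```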

TOY VS REAL DATA
– Toy data is useful for exploring the behaviour of algorithms
– It helps demonstrate the advantages and disadvantages of an algorithm
– However, it is best not to rely on toy datasets alone
– Use real datasets as well

Source: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

BOOKS

THINKING ABOUT BUSINESS

WORKING WITH DATA

DESIGNING PREDICTIVE MODELS

PYTHON PROGRAMMING

A – B – C – D
A TAXONOMY OF PROBLEMS
A. Classification and B. Regression
Week 2 – Classification and Regression
Week 3 – Trees and Ensembles
C. Clustering
Week 5 – Clustering
D. Decomposition
Week 4 – Kernel spaces and Decomposition
