
Foundations of ML
COSC 2673-2793 | Semester 1 2021 (Computational) Machine Learning

Definitions

What is Machine Learning?
“Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.” – Arthur Samuel (1959)
[Diagram: predicting house prices with an explicit program]
Input: Distance from city, Floor Area, Number of Rooms, Avg Income, Rank of Schools, …
Explicit Program
Output: Price ($)
If distance > 10 km and floor area < 500 m² then: price = $100,000
Else if ...
...
Explicit program: the human expert decides the criteria and implements them in code.

House Price Prediction Example
[Diagram: the same inputs (Distance from city, Floor Area, Number of Rooms, Avg Income, Rank of Schools, ...) feed a Machine Learning Algorithm or Program with tuneable parameters, which is fitted to data and outputs the Price ($).]

What is Machine Learning?
Machine learning is programming computers to optimise a performance criterion / perform a particular task by generalising from examples of past experience(s) to predict what will occur in future experience(s).
More technically, "A computer program is said to learn:
• some class of tasks T
• from experience E, and
• performance measure P
if its performance at tasks in T, as measured by P, improves with experience E." – Tom Mitchell (1998)

Task: Unknown Target Function
The Task can be expressed as an unknown target function:
𝐲 = 𝑓(𝐱)
• Attributes of the task (input): 𝐱
• Unknown function: 𝑓(𝐱)
• Output of the function (target value): 𝐲
ML finds a Hypothesis, h (from a hypothesis space 𝐻), which is a function (or model) that approximates the unknown target function:
ŷ = h(𝐱) ≈ 𝑓(𝐱)
• The hypothesis is learnt from the Experience
• A good hypothesis has a high evaluation from the Performance measure
• The hypothesis generalises to predict the output of instances from outside of the Experience

Experience
The Experience is typically a data set, 𝒟, of values:
𝒟 = { (x⁽ⁱ⁾, 𝑓(x⁽ⁱ⁾)) }, i = 1, …, n
• Attributes of the task: x⁽ⁱ⁾
• Output of the unknown function (target value): 𝑓(x⁽ⁱ⁾)
A data instance (or data point) 𝑖 is a tuple**: (x⁽ⁱ⁾, 𝑓(x⁽ⁱ⁾))
We do not know 𝑓(𝐱), but we can obtain samples (input-output pairs) from it – observations, or the input/output of a black-box phenomenon.
** Assumes supervised learning

Performance
What does success look like? To evaluate the abilities of a machine learning algorithm, we must design a quantitative measure of its performance.
We would like to measure: h∗(𝐱) ≈ 𝑓(𝐱)
The Performance is typically a numerical measure that determines how well the hypothesis matches the experience.
Note: the performance is measured against the experience, NOT the unknown target function!
Usually we are interested in how well the machine learning algorithm performs when deployed in the real world – on unseen data. We therefore evaluate these performance measures using a test set of data that is separate from the data used for training the machine learning system.

Revision
• Assume you have to develop a machine learning model to do spam classification. What would the task, experience and performance measure be?

Simple Example
House price prediction

Example: House Price Prediction
Experience: historical house data – Distance from city (d), Floor Area (a), Over $1m (or not) 𝑓(𝐱)

  i    d     a     𝑓(𝐱)
  1    25    150   N
  2    10    100   Y
  ...  ...   ...   ...
  n    32    450   Y

ML Model: what form should h(𝐱) take?
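One possible form for h(𝐱), sketched below in Python as a linear decision rule over the two attributes. The weights and threshold here are illustrative assumptions only – a learning algorithm would tune such parameters from the experience.

```python
# A minimal sketch of one possible hypothesis h(x) for the "over $1m" task.
# The parameter values are illustrative assumptions, not learnt values.

def h(d, a, w_d=-0.02, w_a=0.01, b=0.0):
    """Linear decision rule: predict 'Y' (over $1m) when the weighted
    combination of distance d (km) and floor area a (m^2) is positive."""
    score = w_d * d + w_a * a + b
    return "Y" if score > 0 else "N"

# Instances (x, f(x)) in the style of the table above.
data = [((25, 150), "N"), ((10, 100), "Y"), ((32, 450), "Y")]
for (d, a), target in data:
    print(f"d={d}, a={a}: predicted {h(d, a)}, actual {target}")
```

With these untuned parameters the first instance is misclassified, which motivates the question addressed next: how do we pick the best hypothesis from a whole space of candidates?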
[Plot: floor area (a) vs distance (d) for the historical house data, with a candidate decision boundary h(𝐱).]

Experience: historical house data – Distance from city (d), Floor Area (a), Over $1m (or not) 𝑓(𝐱)
ML Model: we can select a hypothesis space that is "reasonably" capable of representing the unknown target function, e.g., linear decision boundaries. Usually, the hypothesis space is determined by the ML technique.
[Plots: several candidate linear decision boundaries in the (d, a) plane.]
How can we select the best hypothesis h∗(𝐱) from the hypothesis space?

We need a performance measure that quantifies what "best" means. Let's use the count of how many errors we make**.
Now we combine the data and the hypothesis space, and select the hypothesis that makes the least number of errors as our optimal hypothesis h∗(𝐱). This is done automatically through the optimization procedure.
Once we have h∗(𝐱), we can use it to predict the target value for future data points.
** May not be the best measure. We will learn better ones later in the course.

Building a Machine Learning Algorithm
Nearly all ML algorithms can be described with a fairly simple recipe:
• Dataset (experience)
• Model (hypothesis space)
• Cost function (objective, loss)
• Optimization procedure
The first step in solving a ML problem is to analyse the data and task to identify the above components.
[Diagram: design choices for a checkers-learning program – Determine Type of Training Experience (games against experts, games against self, table of correct moves, ...), Determine Target Function (Board ➝ move, Board ➝ value, ...), Determine Representation of Learned Function (polynomial, linear function of six features, artificial neural network, ...), Determine Learning Algorithm (gradient descent, linear programming, ...), Completed Design. Image: Tom Mitchell, "Machine Learning", 1997.]

Revision
• What is the difference between the hypothesis space and the optimal hypothesis?
• What are the key ingredients of a general ML recipe?
• Devise the task, performance measure and experience for a "spam email classification" program.

True Error & Generalization
Can the model be "effectively" used to predict for unseen data?

True Hypothesis & True Error
Recall, machine learning uses past experience to predict future experience(s).
• We really want to know the performance of a hypothesis against the target function, known as the true error: h∗(𝐱) ≈ 𝑓(𝐱)
• However, this cannot be known.
• That is because ML uses experience (data sets), which is a limited sample of the "true" problem, that is, the unknown target function.
All algorithms for machine learning make a significant assumption:
The experience is a reasonable representation (or reasonable sample) of the true but unknown target function.

True Hypothesis & True Error
[Figure: the instance space X, with the regions where the true concept and the hypothesis h disagree shaded and five labelled data points (+/–) marked.]
What is the error of the hypothesis h on the five data points provided?
• Is the true error zero?
• How about the performance measured using the data?
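To make the distinction concrete, here is a minimal Python sketch (the rectangle hypothesis and the five labelled points are made-up assumptions): the only error we can actually compute is the error over the observed data; the true error over the whole instance space remains unknown because 𝑓 is unknown.

```python
# Minimal sketch: error measured on the experience vs. the unknowable true error.
# The hypothesis and the five labelled points below are illustrative assumptions.

def h(x1, x2):
    """Candidate hypothesis: predict '+' inside a simple rectangle."""
    return "+" if (0 <= x1 <= 5 and 0 <= x2 <= 5) else "-"

# Five observed points (x, f(x)) sampled from the unknown target function.
data = [((1, 1), "+"), ((4, 3), "+"), ((6, 2), "-"), ((2, 7), "-"), ((8, 8), "-")]

errors = sum(1 for (x1, x2), y in data if h(x1, x2) != y)
print(f"errors on the observed data: {errors} of {len(data)}")

# The true error would require comparing h with f over the whole instance space,
# which we cannot do: zero error on the data does not imply zero true error.
```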
Generalization
The central challenge in machine learning is that our algorithm must perform well on new, previously unseen inputs (not just those on which our model was trained). The ability to perform well on previously unseen inputs is called generalization.
• Generalization error is related to the true error of a hypothesis (it cannot be measured directly).
• The generalization error of a machine learning model is typically estimated by measuring its performance on a test set collected separately from the training set: select the hypothesis with the training data, h∗(𝐱) ≈ 𝑓(𝐱), and measure performance with the test data.
How can an algorithm influence the performance on an unseen dataset, if it only sees training data?
Assumption in ML: the test set is independent and identically distributed with respect to the training set.

Performance of a ML algorithm
The factors determining how well a machine learning algorithm will perform are its ability to:
• make the training set performance high,
• make the gap between training and test performance small (generalization).
[Plot: training set and test set performance, with the difference marked as the generalization gap.]
The above concepts are related to model capacity and under- or over-fitting. We will discuss these in detail in future lectures.

Revision
• Is the performance evaluated over training examples? Why?
• What would be the true error and the training error for the below data and the given hypothesis?
[Figure: a small labelled data set with a candidate hypothesis drawn over it.]

Model complexity
Which hypothesis should be chosen?

Hypothesis Space
The hypothesis space, 𝐻, is the set of all hypotheses over the state space of a given problem that a given algorithm is capable of learning.
The great ML question is: which hypothesis should be learnt?
In this example we want to separate the points marked (red x) from the rest. Our hypothesis space is all the possible rectangles.

Which hypothesis to learn?
Consider questions such as:
• Should the experience be matched?
• Should the performance be maximized?
• What happens if there is noise?
• Which hypothesis should be learnt if multiple hypotheses all have the same performance?
• Can a good hypothesis be found?

Ockham's Razor
The principle of Ockham's Razor is to prefer the simplest hypothesis that is ("reasonably") consistent with the experience.

Example: Ockham's Razor
Let's say student A has received a low grade in ML assignment 1. The student and the teacher have produced two hypotheses that explain the result.
• Teacher's hypothesis: "The student has not spent enough time on the assignment and therefore has done a suboptimal job."
• Student's hypothesis: "A foreign hacker has infiltrated the RMIT Canvas site and removed some parts of (only) student A's submission."
Which hypothesis would you pick?
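The same preference applies to models of different complexity. Below is a minimal sketch, assuming NumPy is available and using made-up toy data: a high-degree polynomial can drive the training error to (almost) zero, yet the simpler hypothesis that is reasonably consistent with the data is usually the better choice.

```python
import numpy as np

# Made-up toy data: a roughly linear trend with a little noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 1.2, 1.9, 3.2, 3.8, 5.1])

# A simple hypothesis (degree-1 polynomial) and a complex one (degree-5,
# which can pass exactly through all six points).
simple_h = np.polyfit(x, y, deg=1)
complex_h = np.polyfit(x, y, deg=5)

def training_error(coeffs):
    """Mean squared error of a polynomial hypothesis on the training data."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

print("training error, simple hypothesis :", training_error(simple_h))
print("training error, complex hypothesis:", training_error(complex_h))
```

Both training errors are small, so the training data alone cannot separate the two; Ockham's razor breaks the tie in favour of the simpler hypothesis.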
Ockham's Razor
[Figure: three hypotheses of increasing complexity (Hypothesis 1, Hypothesis 2, Hypothesis 3) that all separate the training data.]
Given that all three hypotheses have zero training error, Ockham's razor says that we should choose the "simplest" hypothesis. We will cover how to measure "simplest" later in the course.

What is the best model?
Is the additional data point correct, an outlier, or noise? What model should be learnt?

Ultimate Judgement
The core challenge of ML is NOT:
• Collecting data
• Running algorithms
• Maximising performance
The core challenge is in:
• Deciding what data to use
• Deciding if an algorithm is suitable
• Deciding the most suitable performance measure
• Deciding which hypothesis is "the best" to use for a task
• Making an ultimate judgement of how to approximate the unknown target function

Ultimate Judgement
The core challenge is in analysis and evaluation. This is the focus of the course.
Running algorithms is necessary and important, but not the top priority.
This balance is best thought of as: we are not looking for the "best hypothesis"; we are looking for the "best hypothesis for the task that you can justify".

Can Machine Learning Pick Your Next Winning Lottery Number?
https://www.youtube.com/watch?v=isTNnwk5SqE

Main Machine Learning Paradigms
What are the common types of problems in ML?

Types of Machine Learning Problems
• Supervised learning
  - Classification
  - Regression
• Unsupervised learning
• Reinforcement learning

Supervised Learning
In supervised learning, the output is known: 𝑦 = 𝑓(𝐱)
Experience: examples of input-output pairs
Task: learn a model that maps input to the desired output; predict the output for new "unseen" inputs
Performance: an error measure of how closely the hypothesis predicts the target output
This is the most typical of learning tasks. There are two main types of supervised learning:
• Classification
• Regression

Supervised Learning – Examples
[Figure: a classifier (ML algorithm) labelling images as cat, dog, dog.]

Examples: Computer Vision
Toshiba Advanced Driver Assistance Systems.
Axial slices of two 3D-CT images with (left) and without (right) emphysema. Tennakoon, R., et al. "Classification of Volumetric Images Using Multi-Instance Learning and Extreme Value Theorem", IEEE Transactions on Medical Imaging, 2019.
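A minimal end-to-end sketch of the supervised setting, assuming scikit-learn is installed; the feature values and labels below are made up and simply stand in for the experience of input-output pairs.

```python
# Supervised classification in miniature: experience (X, y), a hypothesis space
# (logistic regression = linear decision boundaries), a cost function and an
# optimisation procedure hidden inside fit(), and prediction for unseen input.
from sklearn.linear_model import LogisticRegression

# Made-up experience: [distance from city (km), floor area (m^2)] -> over $1m?
X = [[25, 150], [10, 100], [32, 450], [5, 80], [40, 200], [8, 300]]
y = ["N", "Y", "Y", "Y", "N", "Y"]

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

print("training accuracy:", model.score(X, y))
print("prediction for an unseen house:", model.predict([[15, 120]])[0])
```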
Examples: Fraud Detection
Source: http://businessdaily.co.zw/index-id-business-zk-34108.html

Examples: Speech
Source: http://www.scmp.com/magazines/post-magazine/article/1925784/why-baidus-breakthrough-speech-recognition-may-be-game

Unsupervised Learning
In unsupervised learning, the output is unknown: ? = 𝑓(𝐱)
Experience: a data set with values for some or all attributes
Task: "invent" a suitable output; identify trends and patterns between the data points
Performance: how well the "invented" output matches the data set

Examples: Simple Clustering
[Scatter plot: patient blood glucose level vs patient weight.]
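A sketch of how such groups could be found automatically, assuming scikit-learn is available; the patient measurements below are made-up values for illustration only.

```python
# Minimal unsupervised learning sketch: k-means "invents" cluster labels for
# unlabelled data - no target values are provided in the experience.
from sklearn.cluster import KMeans

# Made-up data: [blood glucose level, weight (kg)] per patient.
X = [[5.0, 62], [5.4, 70], [4.8, 58],     # one apparent group
     [9.5, 95], [10.1, 102], [9.0, 88]]   # another apparent group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("cluster label per patient:", labels)
print("cluster centres:", kmeans.cluster_centers_)
```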
Examples: Recommendation
Source: https://github.com/watfood/worldofdance_app

Examples: Computer Vision
Smart paint brush with a Generative Adversarial Network (GAN). Image courtesy of Nvidia.

Reinforcement Learning
In reinforcement learning, the target function to learn is an optimal policy, which gives the best "action" for a dynamic agent to perform at any point in time:
𝑎 = 𝜋∗(𝐬)
Experience: a transition function – the result of performing any action in a state
Task: learn the optimal actions required for the agent to achieve a goal
Performance: a reward (or reinforcement) for performing certain action(s)

Conceptually, reinforcement learning shares similarities with supervised and unsupervised learning:
• The output (action, 𝑎) is unknown; however,
• the experience gives an "output" of performing actions in states: 𝑠, 𝑎 → 𝑠′
• the performance measures the "worth"/reward of each experience instance: 𝑅(𝑠, 𝑎)
The performance acts as a proxy for the "actual" output, since in simple terms the output is the best "reward" accumulated over time as the agent conducts actions.

Examples: Robotics

Examples: Game AI

Types of Machine Learning Problems
Others:
• Semi-supervised learning
• Active learning
• Transfer learning
• ...

Summary & Todos
Introduced machine learning:
• Definition
• Applications
• Types of ML (supervised, unsupervised, reinforcement learning)
For next week:
• Get familiar with Python
• Revise maths
No matter what, do not fall behind – this course builds week by week.

Example for Q&A session

Income and happiness?
• Is there a relationship between income and happiness?
• Can we predict the happiness of a person if we know their income?
• Let's try to build a model that takes income (x) as input and produces the output happiness (y).
Task: an unknown target function exists, 𝑦 = 𝑓(𝑥)
Hypothesis: a model that approximates 𝑓(𝑥), ŷ = h(𝑥)

We observe the data points one at a time and plot them (income, in units of $10k, on the x-axis; happiness on the y-axis):

  x      y
  10k    2
  40k    5
  25k    3

• What is an assumption we are making here?
• Can we use a more complex model to get the (training) error to 0?

Will the model we developed generalize?
• What would be the training error and the true error?
Additional observations:

  x      y
  10k    2
  40k    5
  25k    3
  80k    7
  100k   7
  30k    4
  20k    2
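To close the loop on this example, here is a minimal sketch assuming NumPy, with income taken from the table above in units of $10k: fit a straight-line hypothesis h(x) = w·x + b by least squares, report its training error, and predict happiness for an unseen income.

```python
import numpy as np

# Observations from the table above; income is expressed in units of $10k.
income = np.array([1.0, 4.0, 2.5, 8.0, 10.0, 3.0, 2.0])   # 10k ... 100k
happiness = np.array([2, 5, 3, 7, 7, 4, 2])

# Hypothesis space: straight lines h(x) = w*x + b; least squares selects h*.
w, b = np.polyfit(income, happiness, deg=1)
print(f"learnt hypothesis: h(x) = {w:.2f}*x + {b:.2f}")

# Training error (mean squared error on the experience).
predictions = w * income + b
print("training MSE:", np.mean((predictions - happiness) ** 2))

# Predict happiness for an unseen income of $60k (x = 6.0).
print("predicted happiness at $60k:", round(w * 6.0 + b, 2))
```

Whether this straight line generalizes depends on the assumption that the sampled incomes are a reasonable representation of the incomes we will see in the future.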