Introduction to
Artificial Intelligence
with Python
Learning
Supervised Learning
supervised learning
given a data set of input-output pairs, learn a function to map inputs to outputs
classification
supervised learning task of learning a function mapping an input point to a discrete category
Date | Humidity (relative humidity) | Pressure (sea level, mb) | Rain
January 1 | 93% | 999.7 | Rain
January 2 | 49% | 1015.5 | No Rain
January 3 | 79% | 1031.1 | No Rain
January 4 | 65% | 984.9 | Rain
January 5 | 90% | 975.2 | Rain
f(humidity, pressure)
f(93, 999.7) = Rain
f(49, 1015.5) = No Rain
f(79, 1031.1) = No Rain
h(humidity, pressure)
[scatter plot: pressure vs. humidity, rainy and rainless days plotted as points]
nearest-neighbor classification
algorithm that, given an input, chooses the class of the nearest data point to that input
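The idea can be sketched in a few lines of Python; the data points and function names here are illustrative, not from any particular library.

```python
import math

def nearest_neighbor(data, point):
    """Return the class of the training point closest to `point`.

    `data` is a list of ((humidity, pressure), label) pairs.
    """
    # math.dist computes the Euclidean distance between two points.
    closest = min(data, key=lambda item: math.dist(item[0], point))
    return closest[1]

# Observations from the table above.
data = [((93, 999.7), "Rain"),
        ((49, 1015.5), "No Rain"),
        ((79, 1031.1), "No Rain")]

nearest_neighbor(data, (90, 1000.0))  # class of the closest observed day
```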
[scatter plots: pressure vs. humidity, with the nearest neighbor of a new point highlighted]
k-nearest-neighbor classification
algorithm that, given an input, chooses the most common class out of the k nearest data points to that input
[scatter plot: pressure vs. humidity, with the k nearest neighbors of a new point highlighted]
x1 = Humidity, x2 = Pressure

h(x1, x2) = Rain if w0 + w1x1 + w2x2 ≥ 0
            No Rain otherwise

Weight Vector w: (w0, w1, w2)
Input Vector x: (1, x1, x2)
w · x = w0 + w1x1 + w2x2

hw(x) = 1 if w · x ≥ 0
        0 otherwise
perceptron learning rule
Given data point (x, y), update each weight according to:
wi = wi + α(y – hw(x)) × xi
that is, wi = wi + α(actual value – estimate) × xi
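A single update step of the rule might be sketched as follows (pure Python; the function name and α value are illustrative):

```python
def perceptron_update(weights, x, y, alpha=0.1):
    """One step of the perceptron learning rule:
        w_i <- w_i + alpha * (y - h_w(x)) * x_i
    `x` includes the bias term, i.e. x = (1, x1, x2), and y is 0 or 1.
    """
    # Hypothesis output: 1 if w . x >= 0, else 0 (hard threshold).
    estimate = 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0
    # Shift each weight in proportion to the error and the input.
    return [w + alpha * (y - estimate) * xi for w, xi in zip(weights, x)]
```

When the estimate is already correct, y – hw(x) is 0 and the weights are unchanged; otherwise they move so the hypothesis is less wrong on this point.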
[scatter plot: pressure vs. humidity with a linear decision boundary]

hard threshold
[plot: output jumps from 0 to 1 where w · x crosses 0]

soft threshold
[plot: output rises smoothly from 0 to 1 as w · x increases]
Support Vector Machines
maximum margin separator
boundary that maximizes the distance between itself and any of the data points
regression
supervised learning task of learning a function mapping an input point to a continuous value
f(advertising)
f(1200) = 5800
f(2800) = 13400
f(1800) = 8400
h(advertising)
[plot: sales vs. advertising, with a line fit to the data points]
Evaluating Hypotheses
loss function
function that expresses how poorly our hypothesis performs
0-1 loss function
L(actual, predicted) =
  0 if actual = predicted
  1 otherwise
[scatter plot: pressure vs. humidity, with the 0-1 loss value marked at each data point]
L1 loss function
L(actual, predicted) = | actual – predicted |
[plots: sales vs. advertising, with the vertical distance from each point to the fitted line]
L2 loss function
L(actual, predicted) = (actual – predicted)²
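The three loss functions translate directly into Python (function names are illustrative):

```python
def zero_one_loss(actual, predicted):
    """0-1 loss: penalize any misclassification equally."""
    return 0 if actual == predicted else 1

def l1_loss(actual, predicted):
    """L1 loss: penalty grows linearly with the error."""
    return abs(actual - predicted)

def l2_loss(actual, predicted):
    """L2 loss: penalty grows quadratically, so outliers cost much more."""
    return (actual - predicted) ** 2
```

0-1 loss suits classification (discrete categories); L1 and L2 suit regression, with L2 punishing large errors more harshly than L1.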
overfitting
a model that fits too closely to a particular data set and therefore may fail to generalize to future data
[plots: classification (pressure vs. humidity) and regression (sales vs. advertising) hypotheses that fit the training data too closely]
regularization
penalizing hypotheses that are more complex to favor simpler, more general hypotheses:
cost(h) = loss(h) + λ complexity(h)
where the parameter λ controls how strongly complexity is penalized.
holdout cross-validation
splitting data into a training set and a
test set, such that learning happens on the
training set and is evaluated on the test set
k-fold cross-validation
splitting data into k sets, and experimenting k times, using each set as a test set once, and using remaining data as training set
scikit-learn
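Since the course's tool is scikit-learn, a holdout evaluation of a classifier might look like the sketch below. The first five rows come from the table earlier; the remaining rows are made up purely for illustration, and scikit-learn is assumed to be installed.

```python
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X = [[93, 999.7], [49, 1015.5], [79, 1031.1], [65, 984.9], [90, 975.2],
     [88, 990.3], [52, 1012.1], [71, 1025.0]]  # last three rows: made-up examples
y = ["Rain", "No Rain", "No Rain", "Rain", "Rain",
     "Rain", "No Rain", "No Rain"]

# Holdout cross-validation: learn on the training set, evaluate on the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # fraction of test points classified correctly

# k-fold cross-validation: every point serves in a test set exactly once.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=2)
```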
Reinforcement Learning
reinforcement learning
given a set of rewards or punishments, learn what actions to take in the future
Environment
action
state reward
Agent
Markov Decision Process
model for decision-making, representing states, actions, and their rewards
Markov Chain
[diagram: states X0 → X1 → X2 → X3 → X4, each transition taken with some probability]
Markov Decision Process
• Set of states S
• Set of actions ACTIONS(s)
• Transition model P(s' | s, a)
• Reward function R(s, a, s')
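These four components can be encoded as plain Python data structures. The toy states, actions, and numbers below are hypothetical, chosen only to show the shape of the model:

```python
# Set of states S
states = ["sunny", "rainy"]

# Set of actions ACTIONS(s), per state
actions = {"sunny": ["walk", "drive"], "rainy": ["drive"]}

# Transition model P(s' | s, a): for each (state, action),
# a probability distribution over next states.
transition = {
    ("sunny", "walk"):  {"sunny": 0.9, "rainy": 0.1},
    ("sunny", "drive"): {"sunny": 0.8, "rainy": 0.2},
    ("rainy", "drive"): {"sunny": 0.3, "rainy": 0.7},
}

# Reward function R(s, a, s')
reward = {
    ("sunny", "walk", "sunny"): 2,  ("sunny", "walk", "rainy"): -1,
    ("sunny", "drive", "sunny"): 1, ("sunny", "drive", "rainy"): 0,
    ("rainy", "drive", "sunny"): 1, ("rainy", "drive", "rainy"): -1,
}
```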
Q-learning
method for learning a function Q(s, a), estimate of the value of performing action a in state s
Q-learning Overview
• Start with Q(s, a) = 0 for all s, a
• When we take an action and receive a reward:
  • Estimate the value of Q(s, a) based on current reward and expected future rewards
  • Update Q(s, a) to take into account the old estimate as well as our new estimate
Q-learning
• Start with Q(s, a) = 0 for all s, a
• Every time we take an action a in state s, observe a reward r, and end up in state s', we update:
  Q(s, a) ← Q(s, a) + α(new value estimate – old value estimate)
• The old estimate is our current Q(s, a); the new value estimate is the reward plus the (discounted) best expected future reward, giving:
  Q(s, a) ← Q(s, a) + α((r + γ max_a' Q(s', a')) – Q(s, a))
Greedy Decision-Making
• When in state s, choose action a with highest Q(s, a)
Explore vs. Exploit
ε-greedy
• Set ε equal to how often we want to move randomly.
• With probability 1 – ε, choose the estimated best move.
• With probability ε, choose a random move.
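The ε-greedy strategy above might be sketched as (names illustrative; `Q` is the same (state, action) → value dictionary as in Q-learning):

```python
import random

def epsilon_greedy(Q, state, actions, epsilon=0.1):
    """With probability epsilon explore (random action);
    otherwise exploit (best-known action)."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0))
```

Setting ε = 0 recovers pure greedy decision-making; larger ε trades off more exploration against exploiting current estimates.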
Nim
function approximation
approximating Q(s, a), often by a function combining various features, rather than storing one value for every state-action pair
Unsupervised Learning
unsupervised learning
given input data without any additional feedback, learn patterns
Clustering
clustering
organizing a set of objects into groups in such a way that similar objects tend to be in the same group
Some Clustering Applications
• Genetic research
• Image segmentation
• Market research
• Medical imaging
• Social network analysis
k-means clustering
algorithm for clustering data based on repeatedly assigning points to clusters and updating those clusters’ centers
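The assign-then-update loop can be sketched in pure Python (illustrative names; a fixed iteration count stands in for a proper convergence check):

```python
import math
import random

def k_means(points, k, iterations=10):
    """Cluster 2D (or n-D) points by repeatedly assigning each point to the
    nearest center, then moving each center to the mean of its cluster."""
    centers = random.sample(points, k)  # initialize centers at random points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its assigned points.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old center if its cluster is empty
                centers[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centers, clusters
```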
Learning
• Supervised Learning
• Reinforcement Learning
• Unsupervised Learning