SGN-13006 Introduction to Pattern Recognition and Machine Learning (5 cr) – Concept Learning
Joni-Kristian Kämäräinen
November 2017
Laboratory of Signal Processing
Tampere University of Technology
Material
• Lecturer’s slides and blackboard notes
• T.M. Mitchell. Machine Learning. McGraw-Hill, 1997, Chapter 2
Contents
General and Specific Concepts
FIND-S Algorithm
Candidate Elimination Algorithm
Version spaces
General and Specific Concepts
Concepts
Positive and negative examples: a training set
Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     High      Strong  Warm   Same      Yes
Rainy  Cold     High      Strong  Warm   Change    No
Sunny  Warm     High      Strong  Cool   Change    Yes
The inductive learning hypothesis: Any hypothesis found to
approximate the target function well over a sufficiently large set of
training examples will also approximate the target function well
over other unobserved examples.
Representing Hypotheses
• Many possible representations
• Here, h is a conjunction of constraints on attributes
• Each constraint can be
  • a specific value (e.g., Water = Warm)
  • don’t care (e.g., Water = ?)
  • no value allowed (e.g., Water = ∅)
For example,
 Sky     AirTemp  Humidity  Wind     Water  Forecast
〈Sunny,  ?,       ?,        Strong,  ?,     Same〉
FIND-S Algorithm
Find-S algorithm
Initialize h to the most specific hypothesis in H
for each positive training instance x do
    for each attribute constraint a_i in h do
        if the constraint a_i in h is satisfied by x then
            do nothing
        else
            replace a_i in h by the next more general constraint that is satisfied by x
        end if
    end for
end for
Output hypothesis h
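The pseudocode above can be sketched in Python as follows. This is one possible rendering under stated assumptions: hypotheses are tuples, `None` encodes ∅, `"?"` encodes don't care, and the function name `find_s` and the `(instance, label)` example format are chosen for illustration only.

```python
def find_s(examples):
    """FIND-S for conjunctive hypotheses.

    examples: list of (instance, label) pairs, where instance is a tuple of
    attribute values and label is True for a positive example.
    """
    n_attributes = len(examples[0][0])
    h = [None] * n_attributes          # most specific hypothesis: nothing allowed
    for x, is_positive in examples:
        if not is_positive:
            continue                   # negative examples are ignored
        for i, value in enumerate(x):
            if h[i] is None:
                h[i] = value           # next more general constraint: this value
            elif h[i] != "?" and h[i] != value:
                h[i] = "?"             # values disagree: generalize to don't care
    return tuple(h)

# The training set from the "Concepts" slide.
TRAINING_SET = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

print(find_s(TRAINING_SET))
# ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Note that the negative example never influences the result, which is the root of several of the weaknesses of FIND-S.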
Complaints about FIND-S
1. Can’t tell whether it has learned the concept
2. Can’t tell when the training data are inconsistent
3. Picks a maximally specific h (why?)
4. Depending on H, there might be several maximally specific hypotheses!
Candidate Elimination Algorithm
Version Spaces
A hypothesis h is consistent with a set of training examples D of
target concept c if and only if h(x) = c(x) for each training
example 〈x, c(x)〉 in D.
$$\mathit{Consistent}(h, D) \equiv (\forall \langle x, c(x) \rangle \in D)\; h(x) = c(x)$$
The version space, $VS_{H,D}$, with respect to hypothesis space H
and training examples D, is the subset of hypotheses from H
consistent with all training examples in D.
$$VS_{H,D} \equiv \{h \in H \mid \mathit{Consistent}(h, D)\}$$
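The consistency predicate translates almost directly into code. The sketch below assumes the tuple encoding of hypotheses (`"?"` for don't care, `None` for ∅); the names `predict` and `consistent` are illustrative.

```python
def predict(h, x):
    """h(x): True if the conjunctive hypothesis h accepts instance x."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def consistent(h, D):
    """Consistent(h, D): h(x) equals c(x) for every training example in D."""
    return all(predict(h, x) == c_x for x, c_x in D)

# The four training examples from the "Concepts" slide.
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

# Accepts all positives and rejects the negative: consistent.
print(consistent(("Sunny", "?", "?", "?", "?", "?"), D))
# Also accepts the negative example (its Wind is Strong): not consistent.
print(consistent(("?", "?", "?", "Strong", "?", "?"), D))
```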
The List-Then-Eliminate Algorithm
VersionSpace ← a list containing every hypothesis in H
for each training example 〈x, c(x)〉 do
    remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
end for
Output the list of hypotheses in VersionSpace
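For a hypothesis space this small, the algorithm can be run by brute force. The sketch below is an illustration under assumptions: the attribute domains are not listed on the slides and are taken from the EnjoySport example in Mitchell, Chapter 2; the single all-`None` tuple stands in for every hypothesis containing ∅, since they all reject every instance.

```python
from itertools import product

# Assumed attribute domains (from Mitchell, Chapter 2; not given on the slides).
DOMAINS = (
    ("Sunny", "Cloudy", "Rainy"),  # Sky
    ("Warm", "Cold"),              # AirTemp
    ("Normal", "High"),            # Humidity
    ("Strong", "Weak"),            # Wind
    ("Warm", "Cool"),              # Water
    ("Same", "Change"),            # Forecast
)

EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

def matches(h, x):
    """h(x) for a conjunctive hypothesis; a None constraint never matches."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def list_then_eliminate(domains, examples):
    # Every hypothesis built from a specific value or "?" per attribute ...
    version_space = list(product(*[values + ("?",) for values in domains]))
    # ... plus one representative of the hypotheses that accept nothing.
    version_space.append((None,) * len(domains))
    for x, c_x in examples:
        version_space = [h for h in version_space if matches(h, x) == c_x]
    return version_space

for h in list_then_eliminate(DOMAINS, EXAMPLES):
    print(h)
```

Enumerating H is only feasible for tiny attribute sets, which motivates the boundary-set representation of the version space.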
Representing Version Spaces
• The general boundary, G, of version space $VS_{H,D}$ is the set
of its maximally general members
• The specific boundary, S, of version space $VS_{H,D}$ is the set
of its maximally specific members
• Every member of the version space lies between these
boundaries
$$VS_{H,D} = \{h \in H \mid (\exists s \in S)(\exists g \in G)\,(g \geq h \geq s)\}$$
where x ≥ y means x is more general than or equal to y
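For conjunctive hypotheses the ≥ relation can be checked attribute by attribute, which gives a direct membership test for the version space. The sketch below assumes the tuple encoding with `"?"` and `None`; the function names are illustrative, and the boundary sets S and G are the ones Mitchell, Chapter 2 reports for the four EnjoySport training examples, taken here as given.

```python
def more_general_or_equal(g, h):
    """True if g >= h, i.e. every instance accepted by h is also accepted by g."""
    if any(c is None for c in h):
        return True  # h accepts no instance, so any hypothesis is at least as general
    return all(gc == "?" or gc == hc for gc, hc in zip(g, h))

def in_version_space(h, S, G):
    """h lies between the boundaries: some s in S with h >= s, some g in G with g >= h."""
    return (any(more_general_or_equal(h, s) for s in S)
            and any(more_general_or_equal(g, h) for g in G))

# Boundary sets for the four training examples (as given in Mitchell, Chapter 2).
S = [("Sunny", "Warm", "?", "Strong", "?", "?")]
G = [("Sunny", "?", "?", "?", "?", "?"), ("?", "Warm", "?", "?", "?", "?")]

print(in_version_space(("Sunny", "?", "?", "Strong", "?", "?"), S, G))  # between S and G
print(in_version_space(("?", "?", "?", "Strong", "?", "?"), S, G))      # above G
```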
Candidate Elimination Algorithm
G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
for each training example d do
    if d is a positive example then
        Remove from G any hypothesis inconsistent with d
        for each hypothesis s in S that is not consistent with d do
            Remove s from S
            Add to S all minimal generalizations h of s such that h is consistent with d, and some member of G is more general than h
            Remove from S any hypothesis that is more general than another hypothesis in S
        end for
    end if
    if d is a negative example then
        Remove from S any hypothesis inconsistent with d
        for each hypothesis g in G that is not consistent with d do
            Remove g from G
            Add to G all minimal specializations h of g such that h is consistent with d, and some member of S is more specific than h
            Remove from G any hypothesis that is less general than another hypothesis in G
        end for
    end if
end for
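A compact Python sketch of the algorithm for conjunctive hypotheses follows. It is an illustration under assumptions, not a reference implementation: hypotheses are tuples with `"?"` and `None`, the attribute domains come from Mitchell, Chapter 2 rather than the slides, all helper names are made up for this example, and the pruning of non-minimal or non-maximal members is done once after the inner loop instead of inside it.

```python
DOMAINS = (("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change"))  # assumed

EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

def matches(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(g, h):
    if any(c is None for c in h):
        return True  # h accepts nothing
    return all(gc == "?" or gc == hc for gc, hc in zip(g, h))

def min_generalization(s, x):
    # For conjunctions, the minimal generalization of s that covers x is unique.
    return tuple(v if c is None else (c if c == v else "?") for c, v in zip(s, x))

def min_specializations(g, x, domains):
    # Replace one "?" by a value that differs from x, so that x is excluded.
    return [g[:i] + (value,) + g[i + 1:]
            for i, c in enumerate(g) if c == "?"
            for value in domains[i] if value != x[i]]

def candidate_elimination(examples, domains):
    n = len(domains)
    G = {("?",) * n}   # maximally general boundary
    S = {(None,) * n}  # maximally specific boundary
    for x, is_positive in examples:
        if is_positive:
            G = {g for g in G if matches(g, x)}
            for s in [s for s in S if not matches(s, x)]:
                S.remove(s)
                h = min_generalization(s, x)
                if any(more_general_or_equal(g, h) for g in G):
                    S.add(h)
            # Keep only the maximally specific members of S.
            S = {s for s in S
                 if not any(s != t and more_general_or_equal(s, t) for t in S)}
        else:
            S = {s for s in S if not matches(s, x)}
            for g in [g for g in G if matches(g, x)]:
                G.remove(g)
                for h in min_specializations(g, x, domains):
                    if any(more_general_or_equal(h, s) for s in S):
                        G.add(h)
            # Keep only the maximally general members of G.
            G = {g for g in G
                 if not any(g != t and more_general_or_equal(t, g) for t in G)}
    return S, G

S, G = candidate_elimination(EXAMPLES, DOMAINS)
print("S:", S)
print("G:", G)
```

On the four training examples this yields a single hypothesis in S and two in G, and the remaining members of the version space lie between them.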
Summary
1. A concept: Definition and representation
2. Concept learning as search
3. General and specific hypotheses
4. FIND-S algorithm: Finding maximally specific hypothesis
5. The CANDIDATE-ELIMINATION algorithm: Version spaces