
Announcements
• Reminder: pset5 self-grading form and pset6 are out, due today (11/19) at 11:59pm Boston time
• Class challenge out today (will discuss in class)

Semi-Supervised Learning
Slides credit: Jerry Zhu, Aarti Singh

Supervised Learning
Setup: a feature space and a label space. Goal: learn a predictor mapping features to labels. The optimal predictor (the Bayes rule) depends on the unknown joint distribution P_XY, so instead we learn a good prediction rule from labeled training data.
[Figure: labeled training data → learning algorithm → prediction rule.]
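As a concrete baseline, here is a minimal sketch (assuming scikit-learn and a hypothetical two-Gaussian toy problem) of learning a prediction rule from a handful of labeled examples:

```python
# Supervised baseline: learn a prediction rule from labeled pairs only.
# The toy data below is a hypothetical stand-in for draws from the unknown P_XY.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[+2.0, 0.0], scale=1.0, size=(5, 2))  # 5 positive points
X_neg = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(5, 2))  # 5 negative points
X_train = np.vstack([X_pos, X_neg])
y_train = np.array([1] * 5 + [0] * 5)

clf = LogisticRegression().fit(X_train, y_train)
print(clf.predict([[1.5, 0.3], [-1.0, -0.5]]))  # labels for two new points
```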

Labeled and Unlabeled Data
Labels come from a human expert, special equipment, or an experiment: e.g., image classes (“Crystal”, “Needle”, “Empty”), handwritten digits (“0”, “1”, “2”, …), or document topics (“Sports”, “News”, “Science”).
Labeled data: expensive and scarce! Unlabeled data: cheap and abundant!

Free-of-cost labels?
Luis von Ahn: games with a purpose (reCAPTCHA). When a word is too challenging for OCR (Optical Character Recognition), the human who deciphers it provides a free label!

Semi-Supervised Learning
In supervised learning (SL), the learning algorithm sees only labeled examples (e.g., images labeled “Crystal”). In semi-supervised learning (SSL), it additionally sees unlabeled examples.
Goal: learn a better prediction rule than is possible from labeled data alone.

Semi-Supervised Learning in Humans

Can unlabeled data help?
[Figure: positive and negative labeled points plus many unlabeled points; the supervised decision boundary vs. the semi-supervised decision boundary.]
Assume each class is a coherent group (e.g., a Gaussian). Then unlabeled data can help identify the boundary more accurately.
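One standard way to exploit this cluster assumption is a generative mixture model. A minimal sketch (assuming scikit-learn; the toy data and the one-labeled-point-per-class setup are illustrative):

```python
# Fit a 2-component Gaussian mixture to ALL points (labeled + unlabeled),
# then use the few labeled points only to name the components.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X_unlab = np.vstack([rng.normal([+2.0, 0.0], 1.0, (200, 2)),
                     rng.normal([-2.0, 0.0], 1.0, (200, 2))])
X_lab = np.array([[+2.5, 0.0], [-2.5, 0.0]])  # one labeled point per class
y_lab = np.array([1, 0])

gmm = GaussianMixture(n_components=2, random_state=0)
gmm.fit(np.vstack([X_unlab, X_lab]))

# Name each component after the labeled point it captured (this assumes the
# two labeled points fall into different components, as they do here).
comp_to_class = dict(zip(gmm.predict(X_lab), y_lab))
y_pred = np.array([comp_to_class[c] for c in gmm.predict(X_unlab)])
```

The unlabeled points pin the two clusters down, so the implied decision boundary falls in the low-density region between them, which the two labels alone could not determine.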

Can unlabeled data help?
[Figure: 2-D embedding of handwritten digit images (“0”, “1”, “2”, …); images of the same digit form coherent clusters.]
This embedding can be produced by manifold learning algorithms, e.g., t-SNE. “Similar” data points have “similar” labels.
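A minimal sketch of such an embedding, using scikit-learn's bundled digits dataset:

```python
# Embed 8x8 handwritten-digit images in 2-D with t-SNE; points that land
# close together tend to be images of the same digit.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)           # 1797 images, 64 pixels each
emb = TSNE(n_components=2, random_state=0).fit_transform(X)
print(emb.shape)                              # (1797, 2)
```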

Semi-Supervised Learning: Algorithms

Some SSL Algorithms
• Self-training
• Generative methods, mixture models
• Graph-based methods
• Co-training
• Semi-supervised SVMs
• Many others

Notation
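Standard SSL notation (following Zhu's tutorial): labeled data (x_1, y_1), …, (x_l, y_l); unlabeled data x_{l+1}, …, x_{l+u}; n = l + u points in total, typically with u ≫ l.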

Self-training
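Self-training wraps any supervised learner in a loop: train on the labeled data, predict on the unlabeled data, promote the most confident predictions to pseudo-labels, and retrain. A minimal sketch (assuming scikit-learn; the logistic-regression base learner and the 0.95 confidence threshold are illustrative choices):

```python
# Self-training: repeatedly promote the classifier's most confident
# predictions on unlabeled data to pseudo-labels and retrain.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, n_rounds=10, threshold=0.95):
    clf = LogisticRegression().fit(X_lab, y_lab)
    for _ in range(n_rounds):
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break                                    # nothing confident enough
        pseudo = clf.classes_[proba.argmax(axis=1)]  # predicted labels
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pseudo[confident]])
        X_unlab = X_unlab[~confident]
        clf = LogisticRegression().fit(X_lab, y_lab)  # retrain on enlarged set
    return clf
```

scikit-learn ships this wrapper as sklearn.semi_supervised.SelfTrainingClassifier. The known failure mode: early mistakes get treated as ground truth and reinforced in later rounds.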

Self-training Example: Propagating 1-NN
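This example instantiates self-training with a 1-nearest-neighbor learner: at each step, the unlabeled point closest to the current labeled set receives the label of its nearest labeled neighbor. A minimal sketch (assuming NumPy/SciPy):

```python
# Propagating 1-NN: repeatedly give the unlabeled point nearest to the
# labeled set the label of its nearest labeled neighbor.
import numpy as np
from scipy.spatial.distance import cdist

def propagate_1nn(X_lab, y_lab, X_unlab):
    X_lab, y_lab, X_unlab = list(X_lab), list(y_lab), list(X_unlab)
    while X_unlab:
        D = cdist(np.array(X_unlab), np.array(X_lab))  # pairwise distances
        i, j = np.unravel_index(D.argmin(), D.shape)   # closest (unlab, lab) pair
        X_lab.append(X_unlab.pop(i))                   # promote the point...
        y_lab.append(y_lab[j])                         # ...with its neighbor's label
    return np.array(X_lab), np.array(y_lab)
```

The usual caveat applies: a single early mistake (e.g., an outlier absorbed into the wrong class) propagates to everything labeled after it.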

Related: Cluster and Label
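Cluster-and-label runs an ordinary clustering algorithm on all points, then labels each cluster by majority vote over the labeled points it contains. A minimal sketch (assuming scikit-learn, k-means as the clusterer, and non-negative integer class labels):

```python
# Cluster-and-label: cluster ALL points, then give every point in a cluster
# the majority label of the labeled points that fell into it.
import numpy as np
from sklearn.cluster import KMeans

def cluster_and_label(X_lab, y_lab, X_unlab, n_clusters=2):
    y_lab = np.asarray(y_lab)
    assign = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(np.vstack([X_lab, X_unlab]))
    lab_assign, unlab_assign = assign[:len(X_lab)], assign[len(X_lab):]
    y_unlab = np.empty(len(X_unlab), dtype=y_lab.dtype)
    for c in range(n_clusters):
        in_c = lab_assign == c
        # Majority vote among labeled members (fall back to 0 if none landed here).
        majority = np.bincount(y_lab[in_c]).argmax() if in_c.any() else 0
        y_unlab[unlab_assign == c] = majority
    return y_unlab
```

This works exactly when clusters align with classes; if the cluster assumption fails, the propagated labels are systematically wrong.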

Co-training

Co-training Algorithm
Co-training (Blum & Mitchell, 1998; Mitchell, 1999) assumes that
(i) the features can be split into two sets;
(ii) each sub-feature set is sufficient to train a good classifier.
• Initially, two separate classifiers are trained on the labeled data, one on each sub-feature set.
• Each classifier then classifies the unlabeled data and ‘teaches’ the other classifier with the few unlabeled examples (and predicted labels) about which it is most confident.
• Each classifier is retrained with the additional training examples given by the other classifier, and the process repeats.

Co-training Algorithm (Blum & Mitchell, 1998)
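A minimal sketch of the loop just described (assuming scikit-learn; the logistic-regression base learners and one confident example handed over per classifier per round are illustrative choices, not the paper's exact settings):

```python
# Co-training: two classifiers, one per feature view, teach each other their
# most confident predictions on the unlabeled pool.
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(XL1, XL2, yL, XU1, XU2, n_rounds=5):
    """XL1/XL2 (and XU1/XU2) hold the two feature views of the same labeled
    (resp. unlabeled) examples."""
    yL = np.asarray(yL)
    for _ in range(n_rounds):
        h1 = LogisticRegression().fit(XL1, yL)
        h2 = LogisticRegression().fit(XL2, yL)
        if len(XU1) == 0:
            break
        new_idx, new_y = [], []
        for h, XU in ((h1, XU1), (h2, XU2)):        # each teacher in turn
            proba = h.predict_proba(XU)
            i = int(proba.max(axis=1).argmax())     # its most confident example
            if i not in new_idx:
                new_idx.append(i)
                new_y.append(h.classes_[proba[i].argmax()])
        XL1 = np.vstack([XL1, XU1[new_idx]])        # grow the shared labeled pool
        XL2 = np.vstack([XL2, XU2[new_idx]])
        yL = np.concatenate([yL, new_y])
        keep = np.setdiff1d(np.arange(len(XU1)), new_idx)
        XU1, XU2 = XU1[keep], XU2[keep]
    return h1, h2
```

Blum & Mitchell's motivating example classifies university web pages with two natural views: the words on the page itself, and the words in hyperlinks pointing to it.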

Semi-Supervised Learning
• Generative methods
• Graph-based methods
• Co-training
• Semi-supervised SVMs
• Many other methods
SSL algorithms can use unlabeled data to improve prediction accuracy, provided the data satisfies the appropriate assumptions.

Next Class
Practical Advice for Applying ML
Machine learning system design; feature engineering; feature preprocessing; learning with large datasets; SGD and mini-batch gradient descent