

Changjae Oh


Computer Vision
– Machine learning basics and classification –

Semester 1, 22/23

Today’s lecture: Objectives

• To review the past recording

̶ with quizzes

• More details about

̶ K-NN classification

̶ SVM classification

Machine learning problems

The machine learning framework

• Apply a prediction function to a feature representation of the image to
get the desired output:

Slide credit: L. Lazebnik

f( ) = “apple”

f( ) = “tomato”

f( ) = “cow”

Machine learning framework

• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)},
estimate the prediction function f by minimizing the prediction error
on the training set

• Testing: apply f to a never before seen test example x and output the
predicted value y = f(x)

y = f(x), where y is the output, f is the prediction function, and x is the image feature

Slide credit: L. Lazebnik
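A minimal sketch of this train/test loop, assuming scikit-learn and toy data (the feature dimensionality, labels, and choice of classifier here are illustrative, not prescribed by the slides):

```python
# Minimal supervised-learning sketch (illustrative data and classifier choice).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training set {(x1, y1), ..., (xN, yN)}: N feature vectors with class labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 16))        # 100 examples, 16-D features
y_train = (X_train[:, 0] > 0).astype(int)   # toy labels

# Training: estimate the prediction function f by minimising error on the training set.
f = LogisticRegression().fit(X_train, y_train)

# Testing: apply f to a never-before-seen example x and output y = f(x).
x_test = rng.normal(size=(1, 16))
y_pred = f.predict(x_test)
print(y_pred)
```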

Machine learning framework

[Pipeline diagram: training images and labels → image features → classifier training; test image → image features → trained classifier → prediction]

Image features

• Raw pixels

• Histograms

• GIST descriptors

Learning a classifier

• Given some set of features with corresponding labels, learn a function to predict the labels from the features

Example: Image Classification by K-NN

[Pipeline diagram: image → feature extraction → K-NN classifier → prediction]

Spectrum of supervision

Supervised

Semi-Supervised

Unsupervised

Reinforcement

Computer vision

Spectrum of supervision

Slide credit: S. Lazebnik

Spectrum of supervision

• Which type of machine learning is this?

https://www.enjoyalgorithms.com/blogs/supervised-unsupervised-and-semisupervised-learning


Spectrum of supervision

• Using reinforcement learning, you can train an AI agent to move and act in an environment

Changjae Oh

Computer Vision
– Classification –

Semester 1, 22/23

• Overview of recognition tasks

• A statistical learning approach

• “Classic” or “shallow” classification pipeline

• “Bag of features” representation

• Classifiers: nearest neighbor, linear, SVM

Object recognition

• A collection of related tasks for identifying objects in digital photographs.

• Consists of recognizing, identifying, and locating objects within a picture with a given degree of confidence.

[Figure: examples of image classification, object detection, semantic segmentation, and instance segmentation]

Image source

https://arxiv.org/pdf/1405.0312.pdf

Image classification vs Object detection

• Image classification

̶ Identifying what is in the image and the associated level of confidence.

̶ Can be binary-label or multi-label classification

• Object detection

̶ Localising and classifying one or more objects in an image

̶ Object localisation and image classification

Semantic segmentation vs Instance segmentation

• Semantic segmentation

̶ Assigning a label to every pixel in the image.

̶ Treating multiple objects of the same class as a single entity

• Instance segmentation

̶ Similar process to semantic segmentation, but identifies, for each pixel, the object instance it belongs to.

̶ Treating multiple objects of the same class as distinct individual objects (or instances)

̶ typically, instance segmentation is harder than semantic segmentation

The machine learning framework

• Apply a prediction function to a feature representation of the image to
get the desired output:

Slide credit: L. Lazebnik

f( ) = “apple”

f( ) = “tomato”

f( ) = “cow”

Machine learning framework

• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)},
estimate the prediction function f by minimizing the prediction error
on the training set

• Testing: apply f to a never before seen test example x and output the
predicted value y = f(x)

y = f(x), where y is the output, f is the prediction function, and x is the image feature

Slide credit: S. Lazebnik

Machine learning framework

[Pipeline diagram: training images and labels → image features → classifier training; test image → image features → trained classifier → prediction]

“Classic” recognition pipeline

• Hand-crafted feature representation

• Off-the-shelf trainable classifier


“Classic” representation: Bag of features

• Representing images as orderless collections of local features

Motivation 1: Part-based models

• Various parts of the image are used separately to determine if and where an object of interest exists

Weber, Welling & Perona (2000), Fergus, Perona & Zisserman (2003)

Motivation 2: Texture models

• Texture is characterised by the repetition of basic elements or textons

Texton histogram

Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

“Texton dictionary”

Motivation 3: Bags of words

• Orderless document representation:

̶ Frequencies of words from a dictionary Salton & McGill (1983)


Bag of features: Outline

1. Extract local features

2. Learn “visual vocabulary”

3. Quantize local features using visual vocabulary

4. Represent images by frequencies of “visual words”

1. Local feature extraction

• Sample patches and extract descriptors
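A hedged sketch of step 1 using OpenCV's SIFT as the local descriptor (SIFT and grayscale input are illustrative assumptions; the slides do not prescribe a particular descriptor):

```python
# Sketch: extract local descriptors from an image (SIFT is an illustrative choice).
import cv2

def extract_descriptors(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    # Detect interest points and compute a 128-D descriptor for each patch around them.
    keypoints, descriptors = sift.detectAndCompute(img, None)
    return descriptors  # shape: (num_patches, 128), or None if nothing was detected
```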

2. Learning the visual vocabulary


Extracted descriptors from
the training set

Clustering the extracted descriptors yields the visual vocabulary (the cluster centres become the visual words).

Recall: K-means clustering

• Want to minimize sum of squared Euclidean distances between features xi
and their nearest cluster centers mk

• Algorithm:

̶ Randomly initialize K cluster centers

̶ Iterate until convergence:

1. Assign each feature to the nearest center

2. Recompute each cluster center as the mean of all features assigned to it
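A minimal NumPy sketch of this algorithm (random initialisation from the data and a fixed iteration cap are simplifying assumptions):

```python
import numpy as np

def kmeans(X, K, num_iters=100, seed=0):
    """Cluster rows of X into K clusters by minimising squared Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Randomly initialise K cluster centres from the data points.
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(num_iters):
        # 1. Assign each feature to the nearest centre.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # 2. Recompute each centre as the mean of all features assigned to it.
        new_centers = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                                else centers[k] for k in range(K)])
        if np.allclose(new_centers, centers):   # converged
            break
        centers = new_centers
    return centers, labels
```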

Visual vocabularies

Source: B. Leibe

Appearance codebook

Bag of features: Outline

1. Extract local features

2. Learn “visual vocabulary”

3. Quantize local features using visual vocabulary

4. Represent images by frequencies of “visual words”
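Putting steps 3 and 4 together, a sketch of quantising an image's descriptors against a learned vocabulary (e.g., the cluster centres from the k-means sketch above) and building the visual-word histogram; the function name is mine, not from the slides:

```python
import numpy as np

def bag_of_features(descriptors, vocabulary):
    """Quantise local descriptors to their nearest visual word and return a
    normalised histogram of visual-word frequencies."""
    # Nearest visual word for each descriptor (hard assignment / vector quantisation).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    # Histogram of visual-word frequencies = the image representation.
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()
```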

Advanced approach: Spatial pyramids

[Figure: visual-word histograms computed over spatial grids at level 0 (whole image), level 1, and level 2]

Lazebnik, Schmid & Ponce (CVPR 2006)

Advanced approach: Spatial pyramids

• Caltech101 classification results
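A rough sketch of the spatial pyramid idea from Lazebnik, Schmid & Ponce: compute a visual-word histogram per cell of increasingly fine grids and concatenate them. The 2^l × 2^l grid and the simplified per-level weighting below are assumptions, not the paper's exact formulation; `word_ids` are integer visual-word indices and `positions` the (x, y) location of each local feature.

```python
import numpy as np

def spatial_pyramid(word_ids, positions, image_size, vocab_size, levels=2):
    """Concatenate visual-word histograms over grids of 2^l x 2^l cells, l = 0..levels."""
    h, w = image_size
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level                    # level 0: 1x1, level 1: 2x2, level 2: 4x4
        weight = 1.0 / 2 ** (levels - level)  # finer levels weighted more heavily
        cell_w, cell_h = w / cells, h / cells
        for cy in range(cells):
            for cx in range(cells):
                # Features falling inside this grid cell.
                in_cell = ((positions[:, 0] // cell_w).astype(int) == cx) & \
                          ((positions[:, 1] // cell_h).astype(int) == cy)
                hist = np.bincount(word_ids[in_cell], minlength=vocab_size).astype(float)
                pyramid.append(weight * hist)
    return np.concatenate(pyramid)
```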

“Classic” recognition pipeline

• Hand-crafted feature representation

• Trainable classifier

̶ Nearest Neighbor classifiers

̶ Support Vector machines


Classifiers: Nearest neighbor

f(x) = label of the training example nearest to x

• All we need is a distance or similarity function for our inputs

• No training required!

[Figure: training examples from two classes]

Functions for comparing histograms

• L1 distance: D(h1, h2) = Σi |h1(i) − h2(i)|

• χ2 distance: D(h1, h2) = Σi (h1(i) − h2(i))² / (h1(i) + h2(i))

• Quadratic distance (cross-bin distance): D(h1, h2) = Σij Aij (h1(i) − h2(j))²

• Histogram intersection (similarity function): I(h1, h2) = Σi min(h1(i), h2(i))
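NumPy sketches of these comparison functions (the bin-similarity matrix A for the quadratic distance is application-specific and assumed given):

```python
import numpy as np

def l1_distance(h1, h2):
    # D(h1, h2) = sum_i |h1(i) - h2(i)|
    return np.abs(h1 - h2).sum()

def chi2_distance(h1, h2, eps=1e-10):
    # D(h1, h2) = sum_i (h1(i) - h2(i))^2 / (h1(i) + h2(i)); eps guards empty bins.
    return ((h1 - h2) ** 2 / (h1 + h2 + eps)).sum()

def quadratic_distance(h1, h2, A):
    # Cross-bin distance: D(h1, h2) = sum_ij A_ij (h1(i) - h2(j))^2
    diff = h1[:, None] - h2[None, :]
    return (A * diff ** 2).sum()

def histogram_intersection(h1, h2):
    # Similarity (not a distance): I(h1, h2) = sum_i min(h1(i), h2(i))
    return np.minimum(h1, h2).sum()
```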

K-nearest neighbor classifier

• For a new point, find the k closest points from training data

• Vote for class label with labels of the k points
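A minimal NumPy sketch of k-NN classification with Euclidean distance (the distance choice and tie handling are simplifications):

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=5):
    """Label x_query by majority vote among its k nearest training examples."""
    # Distances from the query to every training point.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training points.
    nearest = np.argsort(dists)[:k]
    # Vote for the class label among those k neighbours.
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[counts.argmax()]
```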

K-nearest neighbor classifier

• Which classifier is more robust to outliers?

Credit: http://cs231n.github.io/classification/

K-nearest neighbor classifier

Credit: http://cs231n.github.io/classification/

Example: Image Classification by K-NN

[Pipeline diagram: image → feature extraction → K-NN classifier → prediction]

Where else can we use K-NN ?

• Image Matting (Soft segmentation)

̶ Instead of fixed-window based filtering, we can search K-NN for aggregation

[Chen 2012]

Where else can we use K-NN ?

• Deep learning based 3D Computer Vision

[Qiu 2021]

Linear classifiers

• Find a linear function to separate the classes:

f(x) = sgn(w · x + b)

Visualizing linear classifiers

Credit: http://cs231n.github.io/classification/

Example learned weights at the end of learning for CIFAR-10.


Nearest neighbor vs. linear classifiers

• NN pros:
̶ Simple to implement

̶ Decision boundaries not necessarily linear

̶ Works for any number of classes

̶ Nonparametric method

• NN cons:
̶ Need good distance function

̶ Slow at test time

• Linear pros:
̶ Low-dimensional parametric representation

̶ Very fast at test time

• Linear cons:
̶ Works only for two classes

̶ How to train the linear function?

̶ What if data is not linearly separable?

Nonparametric methods are good when you have a lot of
data and no prior knowledge, and when you don’t want to
worry too much about choosing just the right features.

[Artificial Intelligence: A Modern Approach]

A learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples) is called a parametric model. No matter how much data you throw at a parametric model, it won’t change its mind about how many parameters it needs.

[Artificial Intelligence: A Modern Approach]

Linear classifiers

• When the data is linearly separable, there may be more than one separator (hyperplane)

Which separator
is the best?

Support vector machines

• Find a hyperplane that maximizes the margin between the positive and
negative examples

xi positive (yi = 1):  w · xi + b ≥ 1
xi negative (yi = −1):  w · xi + b ≤ −1

Support vectors: the training points with w · xi + b = ±1

Distance between a point xi and the hyperplane: |w · xi + b| / ||w||

For support vectors, |w · xi + b| = 1

Therefore, the margin is 2 / ||w||

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

http://www.umiacs.umd.edu/~joseph/support-vector-machines4.pdf

1. Maximize margin 2 / ||w||

2. Correctly classify all training data:
   xi positive (yi = 1):  w · xi + b ≥ 1
   xi negative (yi = −1):  w · xi + b ≤ −1

• Quadratic optimization problem:

   minimize (1/2) ||w||²  subject to  yi (w · xi + b) ≥ 1 for all i

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin hyperplane

http://www.umiacs.umd.edu/~joseph/support-vector-machines4.pdf

SVM parameter learning

• Separable data:

   minimize (1/2) ||w||²  subject to  yi (w · xi + b) ≥ 1
   (maximize margin)      (classify training data correctly)

• Non-separable data:

   minimize (1/2) ||w||² + C Σi max(0, 1 − yi (w · xi + b))
   (maximize margin)      (minimize classification mistakes)
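A minimal sketch of minimising the non-separable (soft-margin) objective above by subgradient descent; the learning rate, iteration count, and plain gradient loop are illustrative choices, and in practice a dedicated solver such as sklearn.svm.LinearSVC would be used:

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=1e-3, num_iters=1000):
    """Minimise 0.5*||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b)) by subgradient descent.
    Labels y must be in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(num_iters):
        margins = y * (X @ w + b)
        viol = margins < 1                      # points inside the margin or misclassified
        # Subgradient of the objective with respect to w and b.
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def svm_predict(w, b, X):
    # Prediction: sign of the decision function f(x) = w.x + b.
    return np.sign(X @ w + b)
```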

SVM parameter learning

Demo: http://cs.stanford.edu/people/karpathy/svmjs/demo

minimize (1/2) ||w||² + C Σi max(0, 1 − yi (w · xi + b))

http://cs.stanford.edu/people/karpathy/svmjs/demo

Nonlinear SVMs

• General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable

Φ: x→ φ(x)

Nonlinear SVMs

• Linearly separable dataset in 1D:

• Non-separable dataset in 1D:

• We can map the data to a higher-dimensional space:
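A sketch of this lifting for the 1D case; the particular map x → (x, x²) is a common textbook choice, assumed here rather than taken from the slide:

```python
import numpy as np

# 1D dataset that is not linearly separable: class +1 in the middle, class -1 outside.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([-1,   -1,    1,   1,   1,  -1,  -1])

# Lift to 2D with phi(x) = (x, x^2): the +1 points have x^2 <= 1 and the -1 points
# have x^2 >= 4, so a horizontal line (e.g. x2 = 2.5) now separates the classes.
phi = np.stack([x, x ** 2], axis=1)
print(phi)
```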

The kernel trick

• General idea:

• The original input space can always be mapped to some higher-dimensional feature space where the training set is separable

• The kernel trick:

• Instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that

K(x,y) = φ(x) · φ(y)

The kernel trick

• Linear SVM decision function:

+=+  xxxw 

The kernel trick

• Linear SVM decision function: w · x + b = Σi αi yi (xi · x) + b

• Kernel SVM decision function: Σi αi yi φ(xi) · φ(x) + b = Σi αi yi K(xi, x) + b

• This gives a nonlinear decision boundary in the original feature space

Polynomial kernel:

K(x, y) = (x · y + c)^d

Gaussian kernel

• Also known as the radial basis function (RBF) kernel:

K(x, y) = exp(−||x − y||² / (2σ²))
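Sketches of these kernels and of the kernel decision function from the previous slide; the coefficients `alphas` and `b` are placeholders for values a trained SVM (e.g., sklearn.svm.SVC) would provide:

```python
import numpy as np

def polynomial_kernel(x, y, c=1.0, d=2):
    # K(x, y) = (x . y + c)^d
    return (x @ y + c) ** d

def gaussian_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def kernel_decision(x, support_vectors, alphas, labels, b, kernel=gaussian_kernel):
    # f(x) = sum_i alpha_i y_i K(x_i, x) + b  -- nonlinear boundary in the original space.
    return sum(a * yi * kernel(xi, x)
               for a, yi, xi in zip(alphas, labels, support_vectors)) + b
```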


Kernels for bags of features

• Histogram intersection: I(h1, h2) = Σi min(h1(i), h2(i))

• Square root (Bhattacharyya kernel): K(h1, h2) = Σi √(h1(i) h2(i))

• Generalized Gaussian kernel: K(h1, h2) = exp(−(1/A) D(h1, h2)²)

̶ D can be L1 distance, Euclidean distance, χ2 distance, etc.
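A sketch of plugging one of these histogram kernels into an off-the-shelf SVM via a precomputed Gram matrix: scikit-learn's SVC(kernel='precomputed') takes the train×train kernel at fit time and the test×train kernel at prediction time. The χ²-based generalised Gaussian below follows the definition above; the variable names and the parameter A are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def chi2_distance_matrix(H1, H2, eps=1e-10):
    # Pairwise chi-squared distances between rows of H1 and rows of H2.
    diff = H1[:, None, :] - H2[None, :, :]
    summ = H1[:, None, :] + H2[None, :, :] + eps
    return (diff ** 2 / summ).sum(axis=2)

def generalized_gaussian_kernel(H1, H2, A=1.0):
    # K(h1, h2) = exp(-(1/A) * D(h1, h2)^2), with D = chi-squared distance.
    D = chi2_distance_matrix(H1, H2)
    return np.exp(-(D ** 2) / A)

# Usage sketch: H_train, H_test are bag-of-features histograms, y_train the labels.
# clf = SVC(kernel='precomputed').fit(generalized_gaussian_kernel(H_train, H_train), y_train)
# y_pred = clf.predict(generalized_gaussian_kernel(H_test, H_train))
```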

Another way?

• Use good features

R. Girshick et al.”Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.” CVPR 2014

SVMs: Pros and cons

• Pros:

̶ Non-linear SVM framework is very powerful and flexible

̶ Training is convex optimization, so a globally optimal solution can be found

̶ SVMs work very well in practice, even with very small training sample sizes

• Cons:

̶ No “direct” multi-class SVM; two-class SVMs must be combined (e.g., with one-vs-others)

̶ Computation and memory cost (especially for nonlinear SVMs)

SVM for image processing?

C. Oh et al., Sparse Edit Propagation for High Resolution Image using Support Vector Machines, ICIP 2015
