CSE 473/573
Introduction to Computer Vision and Image Processing

OBJECT DETECTION
Recap – Image Classification with Bags of Local Features
• Bag of Features models were the state of the art for image classification for a decade
• BoF may still be the state of the art for instance retrieval
• We saw numerous strategies to fight back against lost spatial information (spatial pyramid) and lost feature detail due to quantization
• Food for thought: doesn't the spatial pyramid seem kind of recursive / hierarchical? Like a SIFT feature on top of SIFT features?
SIFT vector formation
• 4×4 array of gradient orientation histograms, weighted by gradient magnitude
• 8 orientations × 4×4 array = 128 dimensions
• Motivation: some sensitivity to spatial layout, but not too much
(Figure shows only a 2×2 array, but the descriptor uses 4×4)
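To make that structure concrete, here is a minimal numpy sketch of a SIFT-like descriptor: a 4×4 grid of 8-bin orientation histograms, each weighted by gradient magnitude. It is an illustration only, omitting Lowe's Gaussian weighting, trilinear interpolation, and rotation normalization; all names and parameters are our own.

```python
import numpy as np

def sift_like_descriptor(patch, n_cells=4, n_bins=8):
    """SIFT-like descriptor for a square patch (e.g. 16x16): an
    n_cells x n_cells grid of n_bins-bin orientation histograms,
    each weighted by gradient magnitude -> 4*4*8 = 128-D."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)                      # image gradients
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)           # orientation in [0, 2pi)
    bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins

    cell = patch.shape[0] // n_cells                 # pixels per cell side
    desc = np.zeros((n_cells, n_cells, n_bins))
    for i in range(n_cells):
        for j in range(n_cells):
            sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            # magnitude-weighted orientation histogram for this cell
            desc[i, j] = np.bincount(bins[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=n_bins)
    desc = desc.ravel()
    return desc / (np.linalg.norm(desc) + 1e-12)     # normalize

print(sift_like_descriptor(np.random.rand(16, 16)).shape)   # (128,)
```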

Spatial pyramid representation
• Extension of a bag of features
• Locally orderless representation at several levels of resolution
(Figure: pyramid levels 0, 1, and 2)

Recap – Image Classification with Bags of Local Features
• Food for thought: doesn't the spatial pyramid seem kind of recursive / hierarchical? Like a SIFT feature on top of SIFT features?
• There seems to be a tendency for features to involve convolution, spatial pooling, and non-linearities.
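A rough sketch of the pyramid pooling step (hypothetical inputs: quantized codeword ids and their positions; level weights follow the usual pyramid-match scheme of Lazebnik et al.):

```python
import numpy as np

def spatial_pyramid_histogram(codes, xy, img_w, img_h, vocab_size, levels=3):
    """Histogram the codeword id of each local feature over a 2^l x 2^l
    grid at each level l, weight the levels, and concatenate."""
    parts = []
    for level in range(levels):
        n = 2 ** level                                # grid cells per side
        # coarse levels count less: 1/2^(L) for level 0, 1/2^(L-l) above
        weight = 1.0 / 2 ** (levels - 1) if level == 0 \
                 else 1.0 / 2 ** (levels - level)
        cx = np.minimum((xy[:, 0] * n / img_w).astype(int), n - 1)
        cy = np.minimum((xy[:, 1] * n / img_h).astype(int), n - 1)
        for i in range(n):
            for j in range(n):
                in_cell = (cx == i) & (cy == j)
                hist = np.bincount(codes[in_cell], minlength=vocab_size)
                parts.append(weight * hist)
    return np.concatenate(parts)   # length = vocab_size * (1 + 4 + 16)

codes = np.random.randint(0, 200, size=500)        # quantized SIFT ids
xy = np.random.rand(500, 2) * [640, 480]           # feature positions
print(spatial_pyramid_histogram(codes, xy, 640, 480, 200).shape)  # (4200,)
```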

Object Detection
• Overview
• Viola-Jones
• Dalal-Triggs
• Later classes:
  • Deformable models
  • Deep learning

Person detection with HOGs & linear SVMs
• Histograms of Oriented Gradients for Human Detection, Navneet Dalal and Bill Triggs, International Conference on Computer Vision & Pattern Recognition, June 2005
• http://lear.inrialpes.fr/pubs/2005/DT05/
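The details come later in the course; purely as a sketch of the recipe (random arrays stand in for real 64×128 person/background crops; the parameters are the Dalal-Triggs defaults as exposed by scikit-image):

```python
import numpy as np
from skimage.feature import hog          # scikit-image
from sklearn.svm import LinearSVC        # scikit-learn

# Stand-ins for 64x128 training crops, as in Dalal-Triggs.
pos = [np.random.rand(128, 64) for _ in range(10)]   # "person" crops
neg = [np.random.rand(128, 64) for _ in range(10)]   # "background" crops

def describe(img):
    # 9 orientation bins, 8x8-pixel cells, 2x2-cell blocks
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

X = np.array([describe(im) for im in pos + neg])
y = np.array([1] * len(pos) + [0] * len(neg))

clf = LinearSVC(C=0.01).fit(X, y)        # linear SVM on HOG descriptors
score = clf.decision_function(describe(np.random.rand(128, 64))[None])
print(score)                              # > 0 -> "person" for this toy model
```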

Object detection vs. Scene Recognition
• What's the difference?
• Objects (even if deformable and articulated) probably have more consistent shapes than scenes
• Scenes can be defined by a distribution of "stuff" – materials and surfaces with arbitrary shape
• Objects are "things" that own their boundaries
• Bag of words models were less popular for object detection because they throw away shape info

Object Category Detection
• Focus on object search: "Where is it?"
• Build templates that quickly differentiate object patch from background patch
(Figure: a dog template model scanned over patches, asking "Object or Non-Object?")

Challenges in modeling the object class
• Illumination
• Occlusions
• Object pose
• Intra-class appearance
• Clutter
• Viewpoint
Slide from K. Grauman, B. Leibe

Challenges in modeling the non-object class
• True detections
• Bad localization
• Confused with similar object
• Confused with dissimilar objects
• Misc. background

General Process of Object Recognition
Specify Object Model → Generate Hypotheses → Score Hypotheses → Resolve Detections
(This stage: what are the object parameters?)

Specifying an object model
1. Statistical template in bounding box
• Object is some (x, y, w, h) in the image
• Features defined w.r.t. bounding box coordinates
(Figure: image, template, and visualization; images from Felzenszwalb)

Specifying an object model
2. Articulated parts model
• Object is a configuration of parts
• Each part is detectable
(Images from Felzenszwalb)

Specifying an object model
3. Hybrid template/parts model
(Figure: template visualizations and detections; Felzenszwalb et al. 2008)

Specifying an object model
4. 3D-ish model
• Object is a collection of 3D planar patches under affine transformation

General Process of Object Recognition
Specify Object Model → Generate Hypotheses → Score Hypotheses → Resolve Detections
(This stage: propose an alignment of the model to the image)

Generating hypotheses
1. Sliding window
• Test a patch at each location and scale
• Note: the template did not change size (the image is resized instead)
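A minimal sketch of this hypothesis generator; the nearest-neighbor resize and all parameters here are simplifications of our own:

```python
import numpy as np

def resize_nn(img, factor):
    """Nearest-neighbor downsample by `factor` (> 1)."""
    ys = (np.arange(int(img.shape[0] / factor)) * factor).astype(int)
    xs = (np.arange(int(img.shape[1] / factor)) * factor).astype(int)
    return img[ys[:, None], xs]

def sliding_windows(image, win=24, step=4, scale=1.25):
    """Yield (x, y, s, patch) for every location at every scale.
    The template stays win x win; the *image* is repeatedly shrunk,
    which is equivalent to scanning progressively larger windows.
    Multiply (x, y, win) by s to map a box back to original coords."""
    img, s = image, 1.0
    while min(img.shape[:2]) >= win:
        H, W = img.shape
        for y in range(0, H - win + 1, step):
            for x in range(0, W - win + 1, step):
                yield x, y, s, img[y:y + win, x:x + win]
        img, s = resize_nn(img, scale), s * scale

image = np.random.rand(120, 160)
n = sum(1 for _ in sliding_windows(image))
print(n, "candidate windows")     # ~1900 even for this tiny image
```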

Each window is separately classified

Generating hypotheses
2. Voting from patches/keypoints
• Interest points → matched codebook entries → probabilistic voting in a continuous 3D voting space (x, y, s)
• ISM model by Leibe et al.

Generating hypotheses
3. Region-based proposal
(Endres and Hoiem 2010)

General Process of Object Recognition
Specify Object Model → Generate Hypotheses → Score Hypotheses → Resolve Detections
(This stage: mainly gradient-based features, usually based on a summary representation; many classifiers)

General Process of Object Recognition
Specify Object Model → Generate Hypotheses → Score Hypotheses → Resolve Detections
(This stage: rescore each proposed object based on the whole set)

Resolving detection scores
1. Non-max suppression
(Figure: overlapping boxes with scores 0.8, 0.1, 0.8)

Resolving detection scores
1. Non-max suppression
• A detection survives only if its "overlap" score with every higher-scoring detection is below some threshold
(Figure: of the overlapping boxes scored 0.8 and 0.1, only the 0.8 boxes survive)

Resolving detection scores
2. Context/reasoning
(Figure: scene geometry, with distances in meters, constrains plausible detections; Hoiem et al. 2006)
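A sketch of greedy non-max suppression on (x1, y1, x2, y2) boxes, using intersection-over-union as the "overlap" score (a common choice; the slide does not fix one):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring box and discard
    the remaining boxes that overlap it with IoU > thresh."""
    order = list(np.argsort(scores)[::-1])      # indices, best first
    keep = []
    while order:
        i = order.pop(0)
        keep.append(int(i))
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep

boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]])
scores = np.array([0.8, 0.1, 0.8])
print(nms(boxes, scores))   # [2, 0]: the overlapping 0.1 box is suppressed
```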

Basic Steps of Category Detection
1. Align
• E.g., choose position, scale, orientation
• How to make this tractable?
2. Compare
• Compute similarity to an example object or to a summary representation
• Which differences in appearance are important?
(Figure: aligned possible objects, compared against an exemplar or a summary)
Influential Works in Detection
• Sung-Poggio (1994, 1998): ~2000 citations
  • Basic idea of statistical template detection, bootstrapping to get "face-like" negative examples, multiple whole-face prototypes (in 1994)
• Rowley-Baluja-Kanade (1996–1998): ~3600 citations
  • "Parts" at fixed position, non-maxima suppression, simple cascade, rotation, pretty good accuracy, fast
• Schneiderman-Kanade (1998–2000, 2004): ~1700 citations
  • Careful feature engineering, excellent results, cascade
• Viola-Jones (2001, 2004): ~18,000 citations
  • Haar-like features, AdaBoost as feature selection, hyper-cascade, very fast
• Dalal-Triggs (2005): ~24,000 citations
  • Careful feature engineering, excellent results, HOG feature, easy to implement
• Felzenszwalb-McAllester-Ramanan (2008): ~7,300 citations
  • Template/parts-based blend
• Girshick et al. (2013): ~6500 citations
  • R-CNN / Fast R-CNN / Faster R-CNN. Deep-learned models on object proposals.

Sliding Window Face Detection with Viola-Jones
(Many slides from Lana Lazebnik)

Face detection and recognition
• Detection – find the faces
• Recognition – name them ("Sally")

Consumer application
• Can be trained to recognize pets!
• http://www.maclife.com/article/news/iphotos_faces_recognizes_cats

Consumer application
• Things iPhoto thinks are faces

Nikon ads

Challenges of face detection
• A sliding window detector must evaluate tens of thousands of location/scale combinations
• Faces are rare: 0–10 per image
  • For computational efficiency, we should spend as little time as possible on the non-face windows
  • A megapixel image has ~10^6 pixels and a comparable number of candidate face locations
  • To avoid having a false positive in every image, the false positive rate has to be less than 10^-6

The Viola/Jones Face Detector
• A seminal approach to real-time object detection
• Training is slow, but detection is very fast
• Key ideas:
  • Integral images for fast feature evaluation
  • Boosting for feature selection
  • Attentional cascade for fast rejection of non-face windows
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004.

Key ideas
• Integral images for fast feature evaluation
• Boosting for feature selection
• Attentional cascade for fast rejection of non-face windows

Image Features
• Simple features that measure a difference in intensity: "rectangle filters"
• Value = ∑ (pixels in white area) − ∑ (pixels in black area)
(Figure: example source image and filter response)

Fast computation with integral images
• The integral image computes a value at each pixel (x, y) that is the sum of the pixel values above and to the left of (x, y), inclusive
• This can quickly be computed in one pass through the image

Computing the integral image
• Cumulative row sum: s(x, y) = s(x−1, y) + i(x, y)
• Integral image: ii(x, y) = ii(x, y−1) + s(x, y)
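In numpy the two recurrences collapse into two cumulative sums; a quick sketch:

```python
import numpy as np

def integral_image(img):
    """One pass over the image: ii[y, x] = sum of img[:y+1, :x+1].
    Equivalent to the s/ii recurrences above."""
    return img.cumsum(axis=0).cumsum(axis=1)

img = np.arange(12, dtype=float).reshape(3, 4)
ii = integral_image(img)
assert ii[2, 3] == img.sum()          # bottom-right cell holds the total
```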
Computing sum within a rectangle
• Let A, B, C, D be the values of the integral image at the corners of a rectangle
• Then the sum of the original image values within the rectangle can be computed as:
  sum = A − B − C + D
• Only 3 additions are required for any size of rectangle!
(Figure: corners labeled D top-left, B top-right, C bottom-left, A bottom-right)
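A sketch of the four-corner lookup, with the convention that B, C, D sit just outside the rectangle (so lookups off the image read as 0):

```python
import numpy as np

def rect_sum(ii, x1, y1, x2, y2):
    """Sum of img[y1:y2+1, x1:x2+1] via four integral-image lookups:
    A (bottom-right) - B (above) - C (left) + D (above-left)."""
    A = ii[y2, x2]
    B = ii[y1 - 1, x2] if y1 > 0 else 0
    C = ii[y2, x1 - 1] if x1 > 0 else 0
    D = ii[y1 - 1, x1 - 1] if min(x1, y1) > 0 else 0
    return A - B - C + D

img = np.arange(12, dtype=float).reshape(3, 4)
ii = img.cumsum(axis=0).cumsum(axis=1)               # integral image
assert rect_sum(ii, 1, 1, 3, 2) == img[1:3, 1:4].sum()   # brute-force check
```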

Computing a rectangle feature
• Claim: the feature value is a combination of the integral image values at the corner points, with scale factors of ±1 and ±2
• Exercise (10 minutes): verify it

How do we get this?
• Feature value = White area − Black area
• Sum of entire block = A − B − C + D
• White block = A − F − C + E
• Black block = F − E − B + D
• White − Black = A − 2F − C + 2E − D + B
(Figure: a two-rectangle feature on the integral image, with corner points D, B at the top, E, F in the middle, C, A at the bottom, and scale factors −1, +1, +2, −2, −1, +1)
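A sketch that checks the six-lookup formula numerically against a brute-force sum. The coordinates are hypothetical, and white is taken as the lower rectangle, which is what the derivation above implies:

```python
import numpy as np

def two_rect_feature(ii, x1, y1, x2, y2, ym):
    """Two-rectangle Haar feature over cols x1..x2, rows y1..y2, split
    after row ym (black on top, white below), via the 6-lookup formula:
    white - black = A - 2F - C + 2E - D + B."""
    def at(x, y):                       # ii value, 0 outside the image
        return ii[y, x] if x >= 0 and y >= 0 else 0
    A = at(x2, y2)                      # bottom-right
    B = at(x2, y1 - 1)                  # top-right
    C = at(x1 - 1, y2)                  # bottom-left
    D = at(x1 - 1, y1 - 1)              # top-left
    E = at(x1 - 1, ym)                  # middle-left
    F = at(x2, ym)                      # middle-right
    return A - 2 * F - C + 2 * E - D + B

img = np.random.rand(24, 24)
ii = img.cumsum(0).cumsum(1)
# black = rows 4..9 (top), white = rows 10..15 (bottom), cols 6..17
val = two_rect_feature(ii, 6, 4, 17, 15, 9)
ref = img[10:16, 6:18].sum() - img[4:10, 6:18].sum()
assert np.isclose(val, ref)
```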

Feature selection
• For a 24×24 detection region, the number of possible rectangle features is ~160,000!
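Where does ~160,000 come from? A quick enumeration sketch, assuming the usual five feature shapes (two-rectangle horizontal/vertical, three-rectangle horizontal/vertical, and four-rectangle) at every size and position:

```python
def count_haar_features(W=24, H=24):
    """Count all placements of the five standard Haar-like shapes.
    (dw, dh) is the smallest legal shape; widths/heights grow in those
    multiples so each sub-rectangle stays equal-sized."""
    total = 0
    for dw, dh in [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]:
        for w in range(dw, W + 1, dw):
            for h in range(dh, H + 1, dh):
                total += (W - w + 1) * (H - h + 1)
    return total

print(count_haar_features())   # 162336 -- the "~160,000" on the slide
```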
Key ideas
• Integral images for fast feature evaluation
• Boosting for feature selection
• Attentional cascade for fast rejection of non-face windows

Feature selection
• At test time, it is impractical to evaluate the entire feature set
• Can we create a good classifier using just a small subset of all possible features?
• How do we select such a subset?

Boosting
• Boosting is a learning scheme that combines weak learners into a more accurate ensemble classifier
• Weak learners based on rectangle filters:

  h_t(x) = 1 if p_t f_t(x) < p_t θ_t, and 0 otherwise

  where f_t(x) is the value of the rectangle feature on window x, θ_t is a threshold, and p_t is a parity (±1)

• Ensemble classification function, with learned weights α_t:

  C(x) = 1 if Σ_{t=1}^T α_t h_t(x) ≥ (1/2) Σ_{t=1}^T α_t, and 0 otherwise
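In code, h_t is just a thresholded feature response, with a parity that can flip the inequality; a toy sketch:

```python
import numpy as np

def weak_classifier(f_values, theta, p):
    """h_t(x) = 1 if p * f_t(x) < p * theta else 0, vectorized over windows.
    `f_values` holds the rectangle-feature value f_t(x) for each window."""
    return (p * f_values < p * theta).astype(int)

# toy feature responses for 6 windows
f = np.array([5.0, 2.0, 8.0, -1.0, 4.0, 7.0])
print(weak_classifier(f, theta=3.0, p=-1))  # p=-1 flips the inequality:
                                            # predicts 1 where f > 3
```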

Training procedure
• Initially, weight each training example equally
• In each boosting round:
  • Find the weak learner that achieves the lowest weighted training error
  • Raise the weights of training examples misclassified by the current weak learner
• Compute the final classifier as a linear combination of all weak learners (the weight of each learner is directly proportional to its accuracy)
• Exact formulas for re-weighting and combining weak learners depend on the particular boosting scheme (e.g., AdaBoost); a minimal sketch follows below
Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September 1999.
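A compact sketch of discrete AdaBoost over threshold stumps on toy data. Viola-Jones uses a variant of this with a vastly larger feature pool; the exhaustive stump search below is the "find the lowest weighted error" step in miniature:

```python
import numpy as np

def adaboost_train(F, y, T=10):
    """Minimal discrete AdaBoost over threshold stumps (a sketch, not
    Viola-Jones' exact variant).  F is (n_windows, n_features) of
    rectangle-feature values; y is 0/1 labels.
    Returns [(feature, theta, parity, alpha)]."""
    n, d = F.shape
    w = np.full(n, 1.0 / n)                       # uniform example weights
    ensemble = []
    for _ in range(T):
        best = None
        for j in range(d):                        # lowest weighted error
            for theta in np.unique(F[:, j]):
                for p in (1, -1):
                    h = (p * F[:, j] < p * theta).astype(int)
                    err = w[h != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, theta, p, h)
        err, j, theta, p, h = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)     # learner weight ~ accuracy
        w *= np.exp(alpha * (h != y))             # raise misclassified weights
        w /= w.sum()
        ensemble.append((j, theta, p, alpha))
    return ensemble

def adaboost_predict(F, ensemble):
    """C(x) = 1 iff sum_t alpha_t h_t(x) >= 0.5 * sum_t alpha_t."""
    score = sum(a * (p * F[:, j] < p * t).astype(int)
                for j, t, p, a in ensemble)
    half = 0.5 * sum(a for *_, a in ensemble)
    return (score >= half).astype(int)

F = np.random.rand(40, 5); y = (F[:, 2] > 0.5).astype(int)  # toy data
model = adaboost_train(F, y, T=5)
print((adaboost_predict(F, model) == y).mean())             # ~1.0 on toy data
```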
Boosting intuition
(Figure: weak classifier 1 splits the training data; slide credit: Paul Viola)

Boosting illustration
(Figure: weights of misclassified examples increased)

Boosting illustration
(Figure: weak classifier 2)

Boosting illustration
(Figure: weights increased again)

Boosting illustration
(Figure: weak classifier 3)

Boosting illustration
(Figure: the final classifier is a combination of the weak classifiers)

Boosting for face detection
• First two features selected by boosting
• This feature combination can yield 100% recall and a 50% false positive rate

Boosting vs. SVM
• Advantages of boosting
  • Integrates classifier training with feature selection
  • Complexity of training is linear rather than quadratic in the number of training examples
  • Flexibility in the choice of weak learners and boosting scheme
  • Testing is fast
• Disadvantages
  • Needs many training examples
  • Training is slow
  • Often doesn't work as well as an SVM (especially for many-class problems)

Boosting for face detection
• A 200-feature classifier can yield a 95% detection rate at a false positive rate of 1 in 14,084
• Not good enough!
(Figure: receiver operating characteristic (ROC) curve)

Key ideas
• Integral images for fast feature evaluation
• Boosting for feature selection
• Attentional cascade for fast rejection of non-face windows

Attentional cascade
• We start with simple classifiers which reject many of the negative sub-windows while detecting almost all positive sub-windows
• A positive response from the first classifier triggers the evaluation of a second (more complex) classifier, and so on
• A negative outcome at any point leads to the immediate rejection of the sub-window

  IMAGE SUB-WINDOW → Classifier 1 (T) → Classifier 2 (T) → Classifier 3 (T) → FACE
  (F at any stage → NON-FACE)
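Structurally the cascade is early-exit evaluation; a toy sketch in which the lambda stages stand in for boosted classifiers of increasing size:

```python
import numpy as np

def cascade_classify(window, stages):
    """Early-exit evaluation: reject on the first negative stage,
    accept only if every stage passes."""
    for stage in stages:
        if not stage(window):
            return False         # immediate rejection -> NON-FACE
    return True                  # survived all stages -> FACE

# Hypothetical stages of increasing cost (stand-ins for boosted
# classifiers with, say, 2, 10, and 50 features):
stages = [
    lambda w: w.mean() > 0.2,                                # very cheap
    lambda w: w.std() > 0.05,                                # a bit more work
    lambda w: abs(w[:288].mean() - w[288:].mean()) > 0.001,  # most expensive
]

window = np.random.rand(24 * 24)       # a flattened 24x24 sub-window
print(cascade_classify(window, stages))
```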

Attentional cascade
• Chain classifiers that are progressively more complex and have lower false positive rates
(Figure: ROC curves; each stage's operating point trades % detection against % false positives)

  IMAGE SUB-WINDOW → Classifier 1 (T) → Classifier 2 (T) → Classifier 3 (T) → FACE
  (F at any stage → NON-FACE)

Attentional cascade
• The detection rate and the false positive rate of the cascade are found by multiplying the respective rates of the individual stages
• A detection rate of 0.9 and a false positive rate on the order of 10^-6 can be achieved by a 10-stage cascade if each stage has a detection rate of 0.99 (0.99^10 ≈ 0.9) and a false positive rate of about 0.30 (0.3^10 ≈ 6×10^-6)

Training the cascade
• Set target detection and false positive rates for each stage
• Keep adding features to the current stage until its target rates have been met
  • Need to lower the AdaBoost threshold to maximize detection (as opposed to minimizing total classification error)
  • Test on a validation set
• If the overall false positive rate is not low enough, then add another stage
• Use false positives from the current stage as the negative training examples for the next stage
• A sketch of this training loop follows below
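A sketch of the stage-adding loop; `train_stage` here is a stand-in (a random threshold test over toy scalar "windows") for "add AdaBoost features until the stage meets its targets":

```python
import random

def train_stage(pos, neg, d_target, f_target):
    """Stand-in for 'add AdaBoost features until the stage meets its
    targets'.  Here: a random threshold test, only to make this runnable."""
    t = random.random() * 0.5
    return lambda w: w > t

def train_cascade(pos, neg, f_target=0.30, f_overall=1e-6, d_target=0.99):
    """Stage-adding loop: train a stage, multiply the running false
    positive rate, and bootstrap the next stage's negatives from the
    current cascade's surviving false positives."""
    stages, f_current = [], 1.0
    while f_current > f_overall and neg:
        stages.append(train_stage(pos, neg, d_target, f_target))
        f_current *= f_target            # rates multiply across stages
        neg = [w for w in neg if all(s(w) for s in stages)]  # false positives
    return stages

pos = [random.uniform(0.5, 1.0) for _ in range(100)]   # toy "face" scores
neg = [random.uniform(0.0, 0.7) for _ in range(100)]   # toy "non-face" scores
print(len(train_cascade(pos, neg)), "stages")
```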
The implemented system
• Training data
  • 5000 faces: all frontal, rescaled to 24×24 pixels
  • 300 million non-face sub-windows, drawn from 9500 non-face images
• Faces are normalized for scale and translation
• Many variations: across individuals, illumination, pose

System performance
• Training time: "weeks" on a 466 MHz Sun workstation
• 38 layers, 6061 features in total
• An average of 10 features evaluated per window on the test set
• "On a 700 MHz Pentium III processor, the face detector can process a 384 by 288 pixel image in about .067 seconds"
  • 15 Hz
  • 15 times faster than the previous detector of comparable accuracy (Rowley et al., 1998)

Output of Face Detector on Test Images

Other detection tasks
• Facial feature localization
• Profile detection
• Male vs. female classification

Profile Features
(Figure: rectangle features adapted to profile views)

Summary: Viola/Jones detector
• Rectangle features
• Integral images for fast computation
• Boosting for feature selection
• Attentional cascade for fast rejection of negative windows