
2812ICT Perceptual Computing
Introduction

Recommended Texts
• Rick Szeliski’s draft Computer Vision: Algorithms and Applications; we will use an online copy of the September 3, 2010 draft (http://szeliski.org/Book/)
• Forsyth and Ponce, Computer Vision: A Modern Approach (2nd Edition).

Outline
• Overview of machine perception
• Computer vision
• Machine perception & Pattern Recognition

What is machine perception?
“Machine perception” is a term that is used to identify the capability of a computer system to interpret data in a manner that is similar to the way humans use their senses to relate to the world around them. Considered a form of artificial intelligence, the goal of machine perception is to equip the computer system with the necessary hardware and software to recognize images, sounds, and even touch in a manner that enhances the interactivity between human operators and the machines.
– https://www.wisegeek.com/what-is-machine-perception.htm

https://ece.vt.edu/research/area/perception

Types of machine perception
• Computer Vision
• Computer audition
• Touch…

Why vision?
• Images and video are everywhere!
Personal photo albums
Movies, news, sports
Surveillance and security
Medical and scientific images
Slide credit: L. Lazebnik

What is computer vision?
• Computer vision studies the tools and theories that enable the design of machines that can extract useful information from imagery data (images and videos) toward the goal of interpreting the world.
• The image data can take many forms, such as a video sequence, depth images, views from multiple cameras, or multi-dimensional data from a medical scanner.

Computer Vision
Make computers understand images and videos.
What kind of scene? Where are the cars?
How far is the building?

Slide credit: J. Chai

Vision for measurement
Real-time stereo
Structure from motion
Multi-view stereo for community photo collections
Goesele et al.
Slide credit: L. Lazebnik
NASA Mars Rover
Pollefeys et al.

Vision for perception, interpretation
Objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions…
[Annotated photo of the Cedar Point amusement park: labels include sky, water (Lake Erie), trees, rides (The Wicked Twister, maxair, Ferris wheel, carousel), people waiting in line, people sitting on a ride, pedestrians, umbrellas, a bench, a deck]
Slide credit: K. Grauman

Digital images
Think of a grayscale image as a matrix read off the camera's CCD array: im[i][j] holds the intensity at row i, column j, with values in [0, 255]. The example image is 520 pixels wide and 500 pixels high; e.g., im[176][201] has value 164 and im[194][203] has value 37.
Slide credit: K. Grauman
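The matrix view above can be sketched in NumPy; the 500 × 520 image here is synthetic, with the two pixel values set by hand to match the slide's numbers:

```python
import numpy as np

# A grayscale image is just a 2-D array of intensities in [0, 255].
# Rows index height, columns index width, as on the slide.
height, width = 500, 520
im = np.zeros((height, width), dtype=np.uint8)

# im[i, j] (equivalently im[i][j]) addresses one pixel's intensity.
im[176, 201] = 164
im[194, 203] = 37

print(im.shape)      # (500, 520)
print(im[176, 201])  # 164
```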

Color images, RGB color space
RGB
Slide credit: K. Grauman
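A minimal sketch of the RGB layout, assuming the usual height × width × 3 array convention; the tiny image and the BT.601 luma weights are just for illustration:

```python
import numpy as np

# A color image adds a third axis: height x width x 3 (R, G, B channels).
h, w = 4, 6
rgb = np.zeros((h, w, 3), dtype=np.uint8)
rgb[..., 0] = 255                                # make every pixel pure red

r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]  # split the three channels
gray = 0.299 * r + 0.587 * g + 0.114 * b         # standard luma conversion
```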

The goal of computer vision
• To bridge the gap between pixels and “meaning”
What we see vs. what a computer sees

Related disciplines
Computer vision overlaps with:
• Artificial intelligence
• Algorithms
• Machine learning
• Cognitive science
• Graphics
• Image processing
Slide credit: T. Darrell

Brief history of computer vision
• 1966: Minsky assigns computer vision as an undergrad summer project
• 1960’s: interpretation of synthetic worlds
• 1970’s: some progress on interpreting selected images
• 1980’s: ANNs come and go; shift toward geometry and increased mathematical rigor
• 1990’s: face recognition; statistical analysis in vogue
• 2000’s: broader recognition; large annotated datasets available; video processing starts

Origins of computer vision
• An MIT undergraduate summer project

Why is this hard?

Computer vision can be divided into:
• Low-level vision
• e.g., edges, textures, color, shadow, …
• Mid-level vision
• e.g., segmentation, shape-from-shading, alignment, planar region extraction, …
• High-level vision
• e.g., face recognition, action recognition, scene labelling/understanding, …
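As a concrete example of a low-level operation, here is a tiny Sobel edge detector written with plain NumPy loops; a sketch for clarity, not an efficient implementation:

```python
import numpy as np

def sobel_edges(im):
    """Low-level vision: estimate edge strength with Sobel gradients.
    im is a 2-D float array; returns gradient magnitude over the valid region."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = im.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = im[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)  # horizontal gradient
            gy[i, j] = np.sum(patch * ky)  # vertical gradient
    return np.hypot(gx, gy)

# A vertical step edge: left half dark, right half bright.
img = np.zeros((8, 8))
img[:, 4:] = 255.0
mag = sobel_edges(img)   # strong response along the step, zero in flat regions
```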

What can computer vision do?
• Task: classification on one of the largest image datasets: 14M+ images, 21k+ categories, with many sub-classes

Object labelling

Scene understanding/semantic labelling
Goal: Classify every single pixel in the 2D projection, for the real 3D world

Self-driving car
Full-Resolution Residual Networks (FRRNs) for Semantic Image Segmentation in Street Scenes

(Basic) Techniques/theories in computer vision
• Image Formation
• Image Filtering
• Feature Detection and Matching
• Geometric Alignment
• Stereo
• Optic Flow
• Dense Motion Models
• Shape from Silhouettes
• Shape from Shading and Texture
• Surface Models
• Segmentation
• Recognition
AND Machine Learning!

Machine perception & Pattern Recognition
• What is Pattern Recognition?
• Definitions from the literature
• “The assignment of a physical object or event to one of several pre-specified categories” – Duda and Hart
• “A problem of estimating density functions in a high-dimensional space and dividing the space into the regions of categories or classes” – Fukunaga
• “Given some examples of complex signals and the correct decisions for them, make decisions automatically for a stream of future examples” –Ripley
• “The science that concerns the description or classification (recognition) of measurements” – Schalkoff
• Pattern Recognition is concerned with answering the question “What is this?” –Morse

Examples of pattern recognition problems
Machine vision
• Visual inspection
• Imaging device detects ground target
• Classification into “friend” or “foe”
Character recognition
• Automated mail sorting, processing bank checks
• Scanner captures an image of the text
Computer aided diagnosis
• Medical imaging, EEG, ECG signal analysis
Speech recognition
• Human Computer Interaction, Universal Access
• Microphone records acoustic signal
• Speech signal is classified into phonemes and/or words

Components of a pattern recognition system
A basic pattern classification system contains
• A sensor
• A preprocessing mechanism
• A feature extraction mechanism (manual or automated)
• A classification algorithm
• A set of examples (training set) already classified or described

Types of prediction problem
Classification
• The PR problem of assigning an object to a class
• The output of the PR system is a label
• eg. classifying a product as “good” or “bad” in a quality control test
Regression
• A generalization of a classification task
• The output of the PR system is a real-valued number
• e.g. predicting the share value of a firm based on past performance and stock market indicators
Clustering
• The problem of organizing objects into meaningful groups
• The system returns a (sometimes hierarchical) grouping of objects
Description
• The problem of representing an object in terms of a series of primitives
• The PR system produces a structural or linguistic description
• e.g. labeling an ECG signal in terms of P, QRS and T complexes

Features and patterns
Feature
• A feature is any distinctive aspect or characteristic: it may be symbolic (e.g., color) or numeric (e.g., height)
• The combination of 𝑑 features is a 𝑑-dim column vector called a feature vector
• The 𝑑-dimensional space defined by the feature vector is called the feature space
• Objects are represented as points in feature space; the result is a scatter plot
Pattern
• A pattern is a composite of traits or features characteristic of an individual
• In classification tasks, a pattern is a pair of variables {𝑥,𝜔} where
• 𝑥 is a collection of observations or features (feature vector)
• 𝜔 is the concept behind the observation (label)
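In code, the pair {𝑥, 𝜔} is just a vector of measurements plus a label; the feature names and values below are made up for illustration:

```python
import numpy as np

# Two illustrative features per fish: (length in cm, average scale intensity).
x = np.array([42.0, 118.0])   # feature vector: a point in 2-D feature space
omega = "salmon"              # label: the concept behind the observation

pattern = (x, omega)          # a pattern is the pair {x, ω}
d = x.shape[0]                # dimensionality of the feature space
```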

What makes a “good” feature vector?
The quality of a feature vector is related to its ability to discriminate examples from different classes
• Examples from the same class should have similar feature values
• Examples from different classes should have different feature values
More feature properties

Classifiers
The task of a classifier is to partition feature space into class-labeled decision regions
• Borders between decision regions are called decision boundaries
• The classification of a feature vector 𝑥 consists of determining which decision region it belongs to, and assigning 𝑥 to that class
A classifier can be represented as a set of discriminant functions
• The classifier assigns a feature vector 𝑥 to class 𝜔𝑖 if 𝑔𝑖(𝑥) > 𝑔𝑗(𝑥) ∀𝑗 ≠ 𝑖
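A minimal sketch of the discriminant-function view: here each 𝑔𝑖(𝑥) is the negative squared distance to a hand-picked class prototype (a nearest-prototype toy example, not any specific textbook classifier), and classification is an argmax over the 𝑔𝑖:

```python
import numpy as np

# Hand-picked class prototypes over (length, avg scale intensity);
# the numbers are illustrative, not from real data.
prototypes = {
    "salmon":   np.array([60.0, 120.0]),
    "sea bass": np.array([90.0,  80.0]),
}

def classify(x):
    # g_i(x) = negative squared distance to prototype i;
    # assign x to the class whose discriminant is largest.
    g = {w: -np.sum((x - mu) ** 2) for w, mu in prototypes.items()}
    return max(g, key=g.get)
```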

Example: neural, statistical and structural OCR

The pattern recognition design cycle
• Data collection
• Probably the most time-intensive component of a PR project
• How many examples are enough?
• Feature choice
• Critical to the success of the PR problem
• “Garbage in, garbage out”
• Requires basic prior knowledge
• Model choice
• Statistical, neural and structural approaches
• Parameter settings
• Training
• Given a feature set and a “blank” model, adapt the model to explain the data
• Supervised, unsupervised and reinforcement learning
• Evaluation
• How well does the trained model do?
• Overfitting vs. generalization

Consider the following scenario
• A fish processing plant wants to automate the process of sorting incoming fish according to species (salmon or sea bass)
• The automation system consists of
• a conveyor belt for incoming products
• two conveyor belts for sorted products
• a pick-and-place robotic arm
• a vision system with an overhead CCD camera
• a computer to analyze images and control the robot arm

• Sensor
• The vision system captures an image as a new fish enters the sorting area
• Preprocessing
• Image processing algorithms, e.g., adjustments for average intensity levels, segmentation to separate fish from background
• Feature extraction
• Suppose we know that, on average, sea bass are larger than salmon
• From the segmented image we estimate the length of the fish
• Classification
• Collect a set of examples from both species
• Compute the distribution of lengths for both classes
• Determine a decision boundary (threshold) that minimizes the classification error
• We estimate the classifier’s probability of error and obtain a discouraging result of 40%
• What do we do now?
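The threshold step can be sketched directly: given labeled lengths (the samples below are made up for illustration), count the misclassifications at each candidate threshold and keep the best:

```python
import numpy as np

# Made-up training lengths (cm) for the two species.
salmon   = np.array([35, 40, 42, 45, 48, 50, 55, 60])
sea_bass = np.array([45, 50, 52, 58, 62, 65, 70, 75])

def error_at(t):
    # Rule: length < t -> salmon, length >= t -> sea bass.
    # Count salmon at/above t plus sea bass below t.
    return np.sum(salmon >= t) + np.sum(sea_bass < t)

# Try every observed length as a candidate threshold.
candidates = np.unique(np.concatenate([salmon, sea_bass]))
best_t = min(candidates, key=error_at)
```

Because the two length distributions overlap, even the best threshold leaves some fish misclassified; that residual error is what motivates looking for additional features.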

Improving the performance of our PR system
• Determined to achieve a recognition rate of 95%, we try a number of features
• Width, area, position of the eyes w.r.t. mouth…
• only to find out that these features contain no discriminatory information
• Finally we find a “good” feature: average intensity of the scales
• We combine “length” and “average intensity of the scales” to improve class separability
• We compute a linear discriminant function to separate the two classes, and obtain a classification rate of 95.7%
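Combining the two features with a linear discriminant 𝑔(𝑥) = 𝑤·𝑥 + 𝑏 looks like this; the weights below are hand-set for illustration, whereas a real system would learn them from the training examples:

```python
import numpy as np

# Linear discriminant over (length, average scale intensity).
# Hand-picked weights for illustration, not learned from data.
w = np.array([0.5, -0.3])
b = -10.0

def g(x):
    return float(w @ x) + b            # signed score: the decision boundary is g(x) = 0

def predict(x):
    return "sea bass" if g(x) > 0 else "salmon"
```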

Cost vs. classification rate
• Our linear classifier was designed to minimize the overall misclassification rate
• Is this the best objective function for our fish processing plant?
• The cost of misclassifying salmon as sea bass is that the end customer will occasionally find a tasty piece of salmon when he purchases sea bass
• The cost of misclassifying sea bass as salmon is an upset end customer who finds a piece of sea bass purchased at the price of salmon
• Intuitively, we could adjust the decision boundary to minimize this cost function

The issue of generalization
• The recognition rate of our linear classifier (95.7%) met the design specs, but we still think we can improve the performance of the system
• We then design an ANN with five hidden layers, a combination of logistic and hyperbolic tangent activation functions, train it with the BP algorithm and obtain an impressive classification rate of 99.9975% with the following decision boundary
• Satisfied with our classifier, we integrate the system and deploy it to the fish processing plant
• After a few days, the plant manager calls to complain that the system is misclassifying an average of 25% of the fish
• What went wrong?
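The failure mode above is overfitting: the flexible model memorized the training fish, noise included, instead of learning the underlying rule. A synthetic sketch with polynomials, assuming truly linear data plus noise:

```python
import numpy as np

# Synthetic data: the true relationship is linear, plus Gaussian noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(0.0, 0.3, 10)

# A degree-9 polynomial has enough parameters to pass through all
# 10 training points, so it fits the noise, not the rule.
flexible = np.polyfit(x_train, y_train, 9)
simple = np.polyfit(x_train, y_train, 1)

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Near-zero training error does NOT imply good generalization:
# performance must be judged on data the model has never seen.
train_err_flexible = mse(flexible, x_train, y_train)
train_err_simple = mse(simple, x_train, y_train)
```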

Next week:
• Image formation: from world objects to images