
COMP 9517 Final Recap
2021 T1
Topics
• Image formation
• Image processing
• Feature representation
• Segmentation
• Pattern recognition
• Motion and Tracking
• Deep Learning

Image Formation
Geometry of Image Formation
Mapping between image and world coordinates
• Pinhole camera model
• Projective geometry
• Projection matrix

Pinhole Camera
Projective Geometry

Vanishing point and lines
Perspective Projection

Three colour spaces
• RGB (Red, Green, Blue)
The default colour space, but not well suited for colour-based processing
• HSV (Hue, Saturation, Value, a.k.a. Lightness)
Very useful in colour segmentation
• YCbCr (a.k.a YUV)
Y means the luma (brightness)
Very useful in video processing and digital camera systems
Spatial Resolution
• Spatial Resolution: number of pixels per unit of length
• Human faces can be recognized at 64 x 64 pixels per face.
• Appropriate resolution is essential:
• Too little resolution, poor recognition
• Too much resolution, slow and wastes memory

Quantisation
• Quantisation digitises the intensity or amplitude values, i.e., F(x, y)
• Called intensity or gray level quantisation
• Gray-Level resolution:
• Usually 16, 32, 64, …, 128, or 256 levels
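A minimal sketch of grey-level quantisation in NumPy, assuming an 8-bit input image; the number of levels is a parameter.
```python
import numpy as np

def quantise(image, levels):
    """Reduce an 8-bit grayscale image to the given number of grey levels."""
    step = 256 // levels               # width of each quantisation bin
    return (image // step) * step      # map each pixel to its bin's base value

# Example: quantise a random 8-bit image to 16 grey levels
img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
img16 = quantise(img, 16)
```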
Image Processing

Image Processing
• Two types of image processing
• Spatial domain
• Frequency domain
• Two principal categories in spatial processing
• Intensity transformation
• Spatial filtering
Image processing on spatial domain
• Some basic gray-level transformation functions
• Histogram processing
• Spatial filtering
• Smooth spatial filter
• Sharpen spatial filter

Basic grey-level transformation
• Image reversal: s = L − 1 − r
• Log transformation: s = c log(1 + r)
• Power transformation: s = c r^γ
• Contrast Stretching
Image reversal
s = L − 1 − r
• r and s represent the pixel values before and after processing, respectively
• Image reversal is particularly suitable for processing white or gray details embedded in the dark areas of an image.

Log transformation
s = c log(1 + r)
• c is a constant
• Expands the values of dark pixels and compresses the higher grey-level values in the image.
Power transformation
s = c r^γ
• Similar in effect to the log transformation, mapping input to output intensities
• Family of possible transforms obtained by varying 𝛾
• Useful in displaying an image accurately on a computer screen (for example on web sites!) by pre-processing images appropriately before display.
• Also useful for general-purpose contrast manipulation
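A minimal NumPy sketch of the three point transformations above, assuming an 8-bit image with L = 256; the scaling constants are chosen here to map outputs back to [0, 255].
```python
import numpy as np

def negative(img):                      # s = L - 1 - r, with L = 256
    return 255 - img

def log_transform(img):                 # s = c * log(1 + r)
    c = 255.0 / np.log(256.0)           # scale the output back to [0, 255]
    return (c * np.log1p(img.astype(np.float64))).astype(np.uint8)

def gamma_transform(img, gamma):        # s = c * r^gamma (power-law)
    r = img / 255.0                     # normalise to [0, 1] before raising to gamma
    return (255.0 * np.power(r, gamma)).astype(np.uint8)
```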

Contrast Stretching
• One of the simplest piecewise linear transformations
• To increase the dynamic range of grey levels in image
• Produces images of higher contrast
• Puts values below L in the input to black in the output
• Puts values above H in the input to white in the output
• Linearly scales values between L and H in the input to the maximum range in the output
• Used in display devices or recording mediums to span the full intensity range
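A minimal sketch of the piecewise-linear stretch described above; low and high play the role of the cut-offs L and H.
```python
import numpy as np

def contrast_stretch(img, low, high):
    """Map [low, high] linearly to [0, 255]; clip values outside the range."""
    img = img.astype(np.float64)
    out = (img - low) / (high - low) * 255.0   # linear scaling of the [low, high] band
    return np.clip(out, 0, 255).astype(np.uint8)
```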
Contrast Stretching

Grey-Level Slicing
• Highlighting of specific range of grey levels
• Displaying a high value for all grey levels in the range of interest, and a low value for all others, produces a binary image
• Brighten the desired range of grey levels, while preserving background and other grey-scale tones of image
Grey-Level Slicing

Histogram Processing
• Histogram Equalization
• To get an image with equally distributed brightness levels over the whole brightness scale
• Histogram Matching
• To get an image with a specified histogram (brightness distribution)
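A minimal sketch of histogram equalization for an 8-bit grayscale image, using the cumulative distribution function as T(r); OpenCV's cv2.equalizeHist performs the same operation in one call.
```python
import numpy as np

def equalise_histogram(img):
    """Histogram equalization of an 8-bit grayscale image via the CDF as T(r)."""
    hist = np.bincount(img.ravel(), minlength=256)   # grey-level histogram
    cdf = hist.cumsum() / img.size                   # cumulative distribution in [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)       # T(r): map each level via the CDF
    return lut[img]                                  # apply the lookup table
```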
Histogram Equalization

Example (Histogram Equalization)

Histogram Matching
Example (Histogram Matching)

The difference
• Histogram equalization automatically derives the transformation 𝑇(𝑟) from the image itself
• In histogram matching, the target 𝑇(𝑟) (i.e. the desired histogram) is given
Smooth spatial filter
• Neighbourhood Averaging
• Gaussian filter
• Median filter (non-linear filter)
• Max filter (non-linear filter)
• Min filter (non-linear filter)


Neighbourhood Averaging
Gaussian Filter

Non-linear spatial filters
Sharpening spatial filters
• Sharpening spatial filters are based on spatial derivatives (differentiation)
• Gradient Operator
• The Laplacian
• The Sobel
• Unsharp masking

Gradient Operator
Basic idea – Derivatives
• Horizontal scan of the image
• Edge modelled as a ramp, to represent blurring due to sampling
• First derivative is
• Non-zero along ramp
• Zero in regions of constant intensity
• Constant during an intensity transition
• Second derivative is
• Nonzero at onset and end of ramp
• Stronger response at isolated noise point
• Zero everywhere except at onset and termination of intensity transition
• Thus, magnitude of first derivative can be used to detect the presence of an edge, and sign of second derivative to determine whether a pixel lies on dark or light side of an edge.
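A minimal OpenCV sketch illustrating these ideas: first-derivative (Sobel) gradient magnitude and a second-derivative (Laplacian) response; the input filename is hypothetical.
```python
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image

# First derivatives: Sobel gradients in x and y, combined into a magnitude image
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
magnitude = np.sqrt(gx**2 + gy**2)                     # strong along edges (ramps)

# Second derivative: Laplacian, nonzero at the onset and end of intensity transitions
laplacian = cv2.Laplacian(img, cv2.CV_64F, ksize=3)
```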

Basic idea – Derivatives
The Sobel

The Laplacian
Unsharp masking (sharpening process)
• The procedure (see the sketch below):
• Blur the original image
• Obtain the mask by subtracting the blurred image from the original
• Add the (weighted) mask to the original
g_mask(x, y) = f(x, y) − f_blur(x, y)
g(x, y) = f(x, y) + k × g_mask(x, y)
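A minimal sketch of this procedure, assuming a Gaussian blur for the smoothed image f_blur; k = 1 gives standard unsharp masking, k > 1 gives highboost filtering.
```python
import cv2
import numpy as np

def unsharp_mask(img, k=1.0, sigma=3):
    f = img.astype(np.float64)
    f_blur = cv2.GaussianBlur(f, (0, 0), sigma)   # blurred version of the original
    g_mask = f - f_blur                           # mask = original - blurred
    g = f + k * g_mask                            # add the weighted mask to the original
    return np.clip(g, 0, 255).astype(np.uint8)
```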

Padding
• When we apply a spatial filter to pixels on the boundary of an image, we do not have enough neighbours.
• Padding is used to get an output image with the same size as the input (see the sketch after this list)
• Zero: set all pixels outside the source image to 0
• Constant: set all pixels outside the source image to a specified border value
• Clamp: repeat edge pixels indefinitely
• Wrap: copy pixels from opposite side of the image
• Mirror: reflect pixels across the image edge
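The padding modes above correspond roughly to NumPy's np.pad modes; the mapping of names below is an assumption, shown as a minimal sketch.
```python
import numpy as np

a = np.arange(9).reshape(3, 3)

zero     = np.pad(a, 1, mode="constant", constant_values=0)   # Zero padding
constant = np.pad(a, 1, mode="constant", constant_values=7)   # Constant border value
clamp    = np.pad(a, 1, mode="edge")                          # Clamp: repeat edge pixels
wrap     = np.pad(a, 1, mode="wrap")                          # Wrap: copy from opposite side
mirror   = np.pad(a, 1, mode="reflect")                       # Mirror: reflect across the edge
```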
Image Processing on Frequency domain
• Fourier Transform
• Frequency Domain Filtering
• Notch Filter
• Gaussian Filter
• DoG Filter

One-Dim Discrete Fourier Transform and its Inverse

Frequency Domain Filtering
• Frequency is directly related to rate of change, so frequencies in the Fourier transform may be related to patterns of intensity variations in the image.
• Slowest varying frequency at u=v=0 corresponds to average grey level of the image.
• Low frequencies correspond to slowly varying components in the image- for example, large areas of similar grey levels.
• Higher frequencies correspond to faster grey level changes such as edges, noise etc.
Procedure for Filtering in the Frequency Domain
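A minimal NumPy sketch of the frequency-domain filtering procedure (forward DFT, multiply by a filter transfer function, inverse DFT), here with a Gaussian low-pass filter as an illustrative choice.
```python
import numpy as np

def gaussian_lowpass_filter(img, sigma=20):
    """Filter an image in the frequency domain with a Gaussian low-pass filter."""
    F = np.fft.fftshift(np.fft.fft2(img))              # forward DFT, centred
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    V, U = np.meshgrid(v, u)
    D2 = U**2 + V**2                                    # squared distance from the centre
    H = np.exp(-D2 / (2.0 * sigma**2))                  # Gaussian low-pass transfer function
    G = F * H                                           # filtering = multiplication in frequency
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))   # inverse DFT back to spatial domain
```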

Notch Filter
Gaussian Filter

DoG Filter
Image Pyramids
• Multiple resolutions may be useful
• Local statistics such as intensity averages can vary in different parts of an image

Feature Representation
Feature Representation
• Colour features
• Colour histogram
• Colour moments
• Texture features
• Haralick texture features
• Local binary patterns
• Scale-invariant feature transform (SIFT)
• Texture feature encoding
• Shape features
• Basic shape features
• Histogram of oriented gradients (HOG)

Colour Features
• Represent the global distribution of pixel colours in an image
• Step 1: Construct a histogram for each colour channel (R, G, B)
• Step 2: Concatenate the histogram (vectors) of all channels as the final feature vector
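A minimal sketch of the two steps above, assuming an 8-bit RGB image; the bin count (32 per channel) is an arbitrary choice.
```python
import numpy as np

def colour_histogram(img, bins=32):
    """Concatenate per-channel histograms of an RGB image into one feature vector."""
    feats = []
    for c in range(3):                                             # Step 1: one histogram per channel
        hist, _ = np.histogram(img[..., c], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())                            # normalise so image size does not matter
    return np.concatenate(feats)                                   # Step 2: concatenate into the final vector
```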
Colour Moments

Haralick Features
• Haralick features give an array of statistical descriptors of image patterns to capture the spatial relationship between neighbouring pixels, that is, textures
• Step 1: Construct the gray-level co-occurrence matrix (GLCM)
• Step 2: Compute the Haralick feature descriptors from the GLCM
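A minimal sketch of the two steps using scikit-image (spelled greycomatrix/greycoprops in older versions); the distances, angles, and the subset of descriptors are illustrative, not the full Haralick set.
```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def haralick_like_features(gray_img):
    """Step 1: build the GLCM; Step 2: compute texture descriptors from it."""
    glcm = graycomatrix(gray_img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]   # a subset of Haralick-style stats
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])
```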
Local Binary Patterns


SIFT
SIFT Extrema Detection

SIFT Keypoints Localization
SIFT Orientation Assignment

SIFT Keypoint Descriptor
SIFT procedure
• Find SIFT keypoints
• Describe the keypoints
• Match the descriptors between images to find the best correspondences (see the sketch below)
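A minimal sketch of this procedure with OpenCV's SIFT implementation (available in recent opencv-python builds); the input filenames and the 0.75 ratio-test threshold are illustrative assumptions.
```python
import cv2

img1 = cv2.imread("scene1.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input images
img2 = cv2.imread("scene2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)            # find and describe keypoints
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher()                                 # brute-force descriptor matching
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]   # Lowe's ratio test
```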

Descriptor matching
Feature Encoding
• The most popular method: Bag-of-words (BoW)
• Local image features are encoded into a histogram to represent the overall image feature
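A minimal bag-of-words sketch with k-means from scikit-learn, assuming local descriptors (e.g. SIFT) have already been extracted; the vocabulary size is arbitrary.
```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, k=100):
    """Cluster local descriptors from the training set into k visual words."""
    return KMeans(n_clusters=k, n_init=10).fit(np.vstack(all_descriptors))

def bow_encode(descriptors, vocabulary):
    """Assign each local descriptor to its nearest visual word and histogram the counts."""
    words = vocabulary.predict(descriptors)
    hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
    return hist / hist.sum()               # normalised image-level feature vector
```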


Application Example
Feature Encoding
• Local features can be other types of features, not just SIFT
• LBP, SURF, BRIEF, ORB
• There are also more advanced techniques than BoW
• VLAD, Fisher Vector

Shape Features
• Shape is an essential feature of material objects that can be used to identify and classify them
• Example: object recognition
Shape Features
• Human perception of an object or region involves capturing prominent / salient aspects of shape
• Shape features in an image are normally extracted after the image has been segmented into object regions

Boundary Descriptors
• Chain code descriptor
• The shape of a region can be represented by labelling the relative position of consecutive points on its boundary
• A chain code consists of a list of directions from a starting point and provides a compact boundary representation
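A minimal sketch of an 8-connectivity chain code; the direction numbering used here is an assumed convention.
```python
# 8-connectivity direction labels, indexed by (dx, dy) between consecutive boundary points
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(boundary):
    """Encode an ordered list of (x, y) boundary points as a chain code."""
    code = []
    for (x0, y0), (x1, y1) in zip(boundary, boundary[1:]):
        code.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return code

# Example: a small closed square traversed point by point
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(chain_code(square))    # [0, 2, 4, 6]
```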
Boundary Descriptors

Application Example
Histogram of Oriented Gradients
• HOG describes the distributions of gradient orientations in localized areas and does not require initial segmentation

Histogram of Oriented Gradients
• Step 1: Calculate gradient magnitude and orientation at each pixel with a gradient operator => gradient vector
Histogram of Oriented Gradients
• Step 2: Divide orientations into N bins and assign the gradient magnitude of each pixel to the bin corresponding to its orientation => cell histogram

Histogram of Oriented Gradients
• Step 3: Concatenate and block-normalise cell histograms to generate detection-window level HOG descriptor
Histogram of Oriented Gradients
• Detection via sliding window on the image
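The three steps above are essentially what scikit-image's hog function performs; a minimal sketch with illustrative cell and block sizes.
```python
from skimage.feature import hog
from skimage import data

img = data.camera()                       # sample grayscale image shipped with scikit-image
descriptor = hog(img,
                 orientations=9,          # Step 2: 9 orientation bins per cell
                 pixels_per_cell=(8, 8),  # Steps 1-2: gradients accumulated per 8x8 cell
                 cells_per_block=(2, 2),  # Step 3: block-normalise groups of 2x2 cells
                 block_norm="L2-Hys")
```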

Image Segmentation
• The partition of an image into a set of regions
• Meaningful areas
• Border pixels grouped into structures
• Groups of pixels with shapes
• Foreground and background
Segmentation approaches
• Region-based
• Curve-based
• Early techniques tend to use region splitting and/or merging
• Recent algorithms optimize some global criterion

Segmentation approach
• Region Split and Merge
• Watershed
• Mean Shift
• Superpixel Segmentation
• Conditional Random Field
• Active Contours
Connected Components
• Number of components depends on the chosen connectivity

Region split and Merge
• The simplest possible techniques
• Use a threshold and then compute connected components
• Rarely sufficient due to lighting and intra-object statistical variations
Region Splitting
• One of the oldest techniques in computer vision
• First computes a histogram for the whole image
• Then finds a threshold that best separates the large peaks in the histogram

Region splitting
• Otsu’s method
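A minimal sketch of region splitting by histogram thresholding, using Otsu's method and connected-component labelling from scikit-image on a sample image.
```python
from skimage import data, filters, measure

img = data.coins()                               # sample grayscale image
t = filters.threshold_otsu(img)                  # threshold that best separates the histogram peaks
binary = img > t                                 # split into foreground / background
labels = measure.label(binary, connectivity=2)   # connected components (8-connectivity)
num_regions = labels.max()
```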
Region Merging

Watershed Segmentation
Mean Shift
• Mean shift is a variant of the iterative steepest-ascent method for seeking stationary points (i.e. peaks) in a density function, applicable in many areas of multi-dimensional data analysis
• Attempts to find all possible cluster centers in feature space (without needing to know the number of clusters in advance, unlike k-means)
• K-means clustering has limitations:
• Needs to choose K
• Sensitive to outliers
• Prone to local minima

Mean Shift
• Iterative mode searching
1. Initialize a random seed point x and window N
2. Calculate the mean (center of gravity) m(x) within N
3. Shift the search window to the mean
4. Repeat Steps 2 and 3 until convergence
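A minimal sketch of this mode-search loop for points in a feature space, using a flat window of fixed radius; production implementations (e.g. sklearn.cluster.MeanShift) add binning and merging of nearby modes.
```python
import numpy as np

def mean_shift_mode(points, seed, radius, max_iter=100, tol=1e-3):
    """Shift a search window from `seed` to the nearest density mode of `points`."""
    x = seed.astype(float)                       # Step 1: initialise the window at the seed
    for _ in range(max_iter):
        in_window = points[np.linalg.norm(points - x, axis=1) < radius]
        if len(in_window) == 0:                  # degenerate seed: no points in the window
            return x
        m = in_window.mean(axis=0)               # Step 2: mean (centre of gravity) within the window
        if np.linalg.norm(m - x) < tol:          # Step 4: stop when the shift is negligible
            return m
        x = m                                    # Step 3: shift the window to the mean
    return x
```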
Mean Shift
• Advantages:
• Model-free, does not assume any prior shape on data clusters
• Just a single parameter (window size)
• Finds variable number of nodes (clusters)
• Robust to outliers
• Limitations:
• Computationally expensive (need to shift many windows)
• Output depends on window size
• Window size (bandwidth) selection is not trivial
• Does not scale well with dimensions of feature space

Superpixel Segmentation
• Superpixel-based segmentation improves efficiency
• Group similar pixels into one superpixel
• Segmentation (classification) performed on superpixels
• Also called over-segmentation
• The method: Simple linear iterative clustering (SLIC)
• A popular superpixel generation algorithm
• Pros: preserves image boundaries, fast, and memory efficient
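A minimal sketch using scikit-image's SLIC implementation; the number of segments and the compactness value are illustrative.
```python
from skimage import data, segmentation, color

img = data.astronaut()                                     # sample RGB image
segments = segmentation.slic(img, n_segments=250,          # target number of superpixels
                             compactness=10, start_label=1)

# Visualise by replacing each superpixel with its mean colour
mean_colour_img = color.label2rgb(segments, img, kind="avg")
```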
Conditional Random Field
• An undirected graphical structure
• Nodes: superpixels (feature representation of superpixels)
• Edges: adjacent superpixels (similarity between superpixels)

Conditional Random Field
Active Contours
• Aim
• To locate boundary curves in images
• How
• Boundary detectors iteratively move towards their final solution under the combination of image, smoothness, and optional user-guidance forces.

Active Contours
• Active contours / Snakes are parametric models
• Level-set methods have become more popular
• Level sets evolve to fit and track objects of interest by modifying the underlying embedding function instead of the curve function
Mathematical morphology
• Erosion
• Dilation

Basic set operations
Dilation of binary images

Erosion of binary images
Opening of binary images

Closing of binary images
Morphological edge detection

Reconstruction of binary objects

Distance transform of binary images
Ultimate erosion and reconstruction

Dilation of grey-scale images
Erosion of grey-scale images

Opening of grey-scale images
Closing of grey-scale images

Summary of mathematical morphology
Motion
• Change detection
• Using image subtraction to detect changes in scenes
• Sparse motion estimation
• Using template matching to estimate local displacements
• Dense motion estimation
• Using optical flow to compute a dense motion vector field
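A minimal OpenCV sketch of change detection by subtraction and dense motion estimation with Farnebäck optical flow; the frame filenames and parameter values are illustrative.
```python
import cv2

prev = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical consecutive frames
curr = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE)

# Change detection by image subtraction
diff = cv2.absdiff(curr, prev)

# Dense motion estimation: one (dx, dy) vector per pixel
# arguments after None: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
```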

Tracking
• Bayesian inference
• Using probabilistic models to perform tracking
• Kalman filtering
• Using linear model assumptions for tracking
• Particle filtering
• Using nonlinear models for tracking
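A minimal constant-velocity Kalman filter sketch for 1-D position tracking, illustrating the linear-model assumption; all matrices, noise covariances, and measurements below are illustrative.
```python
import numpy as np

# State: [position, velocity]; constant-velocity linear model with time step dt = 1
F = np.array([[1.0, 1.0], [0.0, 1.0]])    # state transition matrix
H = np.array([[1.0, 0.0]])                # measurement matrix (position only)
Q = 0.01 * np.eye(2)                      # process noise covariance (assumed)
R = np.array([[1.0]])                     # measurement noise covariance (assumed)

x = np.zeros((2, 1))                      # initial state estimate
P = np.eye(2)                             # initial state covariance

measurements = [1.0, 2.1, 2.9, 4.2, 5.0]  # toy 1-D position observations

for z in measurements:
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the new measurement
    y = np.array([[z]]) - H @ x           # innovation
    S = H @ P @ H.T + R                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
```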