Multimedia Software Systems CS4551
Vector Quantization
Scalar Quantization (SQ)
• SQ treats each input symbol (e.g. pixel) independently from other input symbols in producing the quantized output.
Copyright By PowCoder代写 加微信 powcoder
• Examples
– Real to integer conversion
• Input: real numbers in the range [–10.0, 10.0] • Quantizer: 𝑄 𝑥 = floor 𝑥 + 0.5
[–10.0, –10.0] → { –10, –9, …, –1, 0, 1, 2, …, 9, 10} – 8-bit sample to 2-bit sample
• Input: a integer number in the range [0, 255] • Quantizer : 𝑄 𝑥 = floor 𝑥/64
[0, 255] → {0, 1, 2, 3}
* Inputs to the Quantizer are scalar values.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 2
Scalar Quantization (SQ) • Scalar Quantization
– Uniform Quantization (the quantization levels are uniformly spaced.)
– Non-uniform Quantization (quantization levels are unequal and mostly the relation between them is logarithmic.)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 3
Vector Quantization (VQ)
• The quantizer takes in vectors, not scalars, as inputs. – Inputs (scalars values) should be formed as vectors.
• It divides a set of input vectors into groups having approximately the same number of vectors closest to them.
– Each group is represented by its centroid point, as in k- means and some other clustering algorithms.
• Vector quantization is also called “block quantization” often used in lossy data compression.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 4
Forming Vectors (1)
• For time signals, we usually form vectors from temporally-sequential samples (non-overlapping).
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 5
Vectors not limited to 2-D
Forming Vectors (2)
• For images, we usually form vectors from spatially- sequential samples.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 6
Example of Vector Space
Form 2D vectors 𝑔1, 𝑔2 by pairing
the values of adjacent pixels
Input “Lena” Image
• Plot the vector space showing the distribution of pairs of adjacent pixels from grayscale Lena.
• X (or Y) axis plots 𝑔1 (or 𝑔2) ranging [0, 255].
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Illustration of Vector Space
Input “Lena” Image
Dots in diagonal represent pairs of pixels having similar gray scale values.
Instead of quantizing the gray scale value, VQ quantize vectors.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 8
Why Vector Quantization?
• Assume that we want to quantize Lena image to 2 bits, that means to reduce the pixels to four possible values.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
SQ interpreted as VQ
• If we interpret this as VQ on the 2D vector distribution diagram, we get a picture on the right.
• Red dots (codebook vectors) – representing 16 evenly space possible values of pairs of pixels.
– Everypairwouldbemappedtoone 𝑔2 of these dots. All vectors inside a
cell would get quantized to the same codebook vector.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Scalar quantization to 2 bits/pixel interpreted as 2D VQ.
SQ interpreted as VQ
• This quantization is very inefficient.
• Two of the cells are completely empty and four other cells are very sparsely populated.
• The codebook vectors in the six cells
adjacent to the x = y diagonal are
shifted away from the density maxima
in their cells, which means that the
average quantization error in these cells 𝑔2 will be unnecessarily high.
• Six of the 16 possible pairs of pixel values are wasted, six more are not used efficiently and only four are O.K.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Vector Quantization (1)
• 2 bits/pixel quantization implies that we can allocate 4 bits per 2D vector.
• In VQ, we take the freedom to place the 16 vectors (red dots) of the codebook anywhere in the vector space.
• The right image shows that the vector
space are divided into cells such that they minimize the mean quantization error (i.e. distance between the vector in the cell 𝑔2 and the codebook vector). More cells are
used for the dense cloud around the x = y diagonal.
• The codebook vectors are represented as big red dots.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Vector quantization to 4 bits per 2-D vector.
Vector Quantization (2)
• In this example, the vector space is partitioned into a Voronoi diagram. The cells are called Voronoi cells.
• In the case of VQ, the cells are smaller (that is, the quantization introduces smaller errors) where it matters the most—in the areas of the vector space where the input vectors are dense.
• No codebook vectors are wasted on 𝑔2 unpopulated regions, and inside each cell
the codebook vector is optimally spaced
with regard to the local input vector
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Vector quantization to 4 bits per 2-D vector.
Vector Quantization (3) • Input vectors
– 𝑔1, 𝑔2 2D vectors
• Goal: 2 bits/pixel quantization
(implying 16 codebook vectors)
this codebook
has 16 entries
for 4-bit 𝑔2 indices.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Vector quantization to 4 bits per 2-D vector.
Codewords (red dots)
VQ Algorithm – 2 Main Tasks
• Codebook generation
– Choosing the vectors for the codebook.
– The algorithm should choose to minimize the mean quantization error.
• Mapping input vectors to an optimal codebook
– For each input vector, search codebook to find the closest codebook vector.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 15
VQ Algorithms – Codebook Generation
• Generalize Lloyd Algorithm (a.k.a. K-means clustering)
• ’s NeuQuant (introduced for image color quantization)
– See the article here if you are interested.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 16
Generalize Lloyd Algorithm (K-means clustering)
• The algorithm inputs are the number of clusters 𝐾 and the data points (input vectors in the vector space). In VQ, 𝐾 is determined by the number of bits per pixel and the number of pixels per vector.
• Assign initial estimates for the Κ centroids (K codebook vectors), which can either be randomly generated or randomly selected from the data points.
• Repeat/iterate between two steps until convergence:
– Each data point is assigned to its nearest centroid, based on the distance (squared Euclidean distance.)
– Thecentroidsarerecomputed.Thisisdonebytakingthemeanofalldata points assigned to that centroid’s cluster.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 17
Generalize Lloyd Algorithm (K-means clustering)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 18
Generalize Lloyd Algorithm (K-means clustering)
• There are two problems with GLA:
– How to choose the initial codebook/centroid – What to do with “empty cells”
• Initial codebook/centroid values
– You could start out with random vectors. GLA will eventually move them into position and eliminate improper ones. This, however, can take more iterations – work which would be spared with a more careful starting position.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 19
Generalize Lloyd Algorithm (K-means clustering)
• In case of empty cell(s), you should remove it from the codebook. Need some kind of heuristic to come up with a prospective replacement.
– For example, you can split the codebook vector with the greatest number of assigned input vectors into two close vectors, and let several iterations pull them apart; or you could split the one whose assigned vectors are most distant. The first heuristic aims to minimize the mean error, while the second minimizes maximum error.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 20
Image Compression Using VQ
• Texture images are a prime candidate for VQ.
– They are often of limited color gamut (for example, in a grass texture you might have hundreds of shades of green, but only a few different reds, blues, and whites).
– They have a lot of similar, same-frequency features.
– Several hardware vendors have recognized the suitability of textures for VQ compression.
• For example, the Dreamcast’s video chip supports rendering directly from textures compressed to 2 bits/pixel. The vectors in this case are 2×2 blocks of RGB pixels (or 12-dimensional); the codebook has 256 entries for single-byte indices.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 21
Image Compression Example
A grass texture VQ-compressed grass texture, (uncompressed.) codebook size 256, compression
time 2 minutes, 2.1 bits/pixel.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 22
VQ Encoder/Decoder Operations
– Generate the codebook
– Find an optimal codebook entry per input vector and assign a binary codeword (index of the codebook)
– Use received binary codeword (the “index”) as an address into the codebook (i.e., Look-Up Table) and recover codebook vector.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 23
VQ Complexity
• VQ Complexity is Asymmetrical.
• Encoder is Computationally Complex. May not work well for Real-Time Encoding
• Decoder is Easy & Fast. Real-Time Decoding is very easy.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 24
Vector Dimension (Vector Size)
• Going to higher dimension: vectors concentrated in smaller % of whole space => Improved Performance
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 25
Optimal Vector Size
• How to determine the optimal vector size for a given set of input data is a rather complicated question beyond the scope of this class.
– In general, its is based on the autocorrelation properties of the data.
• It suffices to say that for images of the type and resolution commonly used in games, four is a good choice for the vector size.
• For other applications, such as voice compression, vectors of size 40-50 are used.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang 26
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com