CS代考 CS4551

Multimedia Software Systems CS4551
Basics of Lossy Compression Algorithms
Multimedia Software Systems by Eun-Young Kang
Lossy Compression

Copyright By PowCoder代写 加微信 powcoder

• Decompressedsignalisnotlikeoriginalsignal–data loss
• Objective: minimize the distortion for a given compression ratio
– Ideally, we would optimize the system based on perceptual distortion (difficult to compute)
– We’ll need a few more concepts from statistics…
Multimedia Software Systems by Eun-Young Kang

Distortion Measurements
• 𝑦 : the original value of a sample (e.g. 𝐼 𝑖, 𝑗 the image pixel at
• 𝑦ො : the sample value after compression/decompression (e.g.
𝐾 𝑖, 𝑗 the approximated image pixel at 𝑖, 𝑗 )
• 1. Measuring differences (errors)
– 𝑓 𝑦 − 𝑦ො : error (or noise or distortion) where f is a distance function (e.g. MSE that is Mean Squared Error)
𝑀𝑆𝐸= 1 ෍෍𝐼𝑖,𝑗−𝐾𝑖,𝑗 2 𝑀𝑁 𝑖=0 𝑗=0
Multimedia Software Systems by Eun-Young Kang
Distortion Measurements (2)
• 2. SNR (Signal to Noise Ratio) or PSNR (Peak SNR) in dB
𝑀𝐴𝑋2 𝑃𝑆𝑁𝑅 = 10 ∙ log 𝐼 10 𝑀𝑆𝐸
= 20 ∙ log10
=20∙log10 𝑀𝐴𝑋𝐼 −10∙log10 𝑀𝑆𝐸
• 𝑀𝐴𝑋𝐼 – the maximum possible value of the sample. For example, when the pixels are represented using 8 bits per sample, this is 255.
• Typical values for the PSNR in lossy image and video compression
are between 30 and 50 dB for 8-bit depth image, where higher is
• In the absence of noise, the two images 𝐼 and 𝐾 are identical, and
thus the MSE is zero. In this case the PSNR is undefined. Multimedia Software Systems by Eun-Young Kang

https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio
Multimedia Software Systems by Eun-Young Kang
Lossy Compression – Simple Examples
• Subsampling:
– Retain only samples on a subsampling grid (spatial/temporal).
– See examples in previous lecture
– Compression achieved by reducing the number of samples
– Has fundamental limitations: see sampling theorem
• Quantization: quantize with fewer bits
– Compression achieved by reducing the number of bits per sample
– Different quantization methods: Scalar uniform/non-uniform quantization, Vector quantization
– As Quantization Interval size increases – compression increases (so does error!)
• Can we do better than simple subsampling and quantization? Multimedia Software Systems by Eun-Young Kang

Transform Coding
• The rational behind the transform coding is that if Y is the result of a transform T of the input X in such a way that the components (a.k.a. channels) of Y are much less correlated, the Y can be coded more efficiently than X.
• E.g. Let’s assume T: RGB -> YCbCr color space conversion
– T transforms a RGB image into a YCbCr image.
– Y (Luminance) component is little related to CbCr (Chrominance).
– Luminance component can be compressed differently from color components.
– Human visual system is more sensitive to luminance. So we can apply sub-sampling (a simple compression) to color components or use a bigger quantization interval (less number of bits for the quantization) for color components.
• In general, the transform T itself does not compress any data. The compression comes from the processing and quantization of the components of Y.
Multimedia Software Systems by Eun-Young Kang
Transform Coding (2)
• In Transform Coding, a segment of information undergoes an
invertible mathematical transformation.
• Segments of information can be samples (assume 𝑀 samples) such as
– 8×8pixelblock(𝑀=64)ofanimageframe
– A segment of speech
– A chunk of data in any other format
• Transform Coding works as follows:
– Let𝐗= 𝑥 1 ,𝑥 2 ,…,𝑥 𝑀 isasegmentofinformation
– Apply a suitable invertible transformation multiplication) to 𝐗
T (typically a matrix
𝐓𝐗 = 𝑦1,𝑦2,…,𝑦𝑀
– Y is the result of the transformation 𝐘 = ෠෠
– Quantize, get Y = 𝑦ො 1 ,𝑦ො 2 ,…,𝑦ො 𝑀 and transmit Y Multimedia Software Systems by Eun-Young Kang

Transform Coding (3)
Multimedia Software Systems by Eun-Young Kang
Transform Coding (4)
• Quantizers in different channels may have different numbers of quantization levels (different quantization interval size).➔Each channel ultimately might yield a different number of bits.
• Bit budget 𝐵: if we quantize 𝑦ො 1 ,𝑦ො 2 ,…,𝑦ො 𝑀 using 𝐵 1 ,𝐵 2 ,…,𝐵 𝑀 bits respectively, then
• Optimal bit allocation: allocate more bits to those channels that have the highest variance
Multimedia Software Systems by Eun-Young Kang

Transform Groups
• Frequency transforms – Discrete Fourier transforms, Hadamard transforms, Lapped Orthogonal transforms, Discrete Cosine transforms
– Involves converting the signal from the sample domain (e.g. 1D time domain for audio and 2D spatial domain for images) to the frequency domain.
• Statistical transforms – Karhunen-Loève Transforms (a.k.a. Hotelling Transform and Eigenvector Transform)
• Wavelet transforms – While similar to frequency transforms, these transforms work more efficiently because the input is transformed to a multi-resolution frequency representation.
Multimedia Software Systems by Eun-Young Kang
Discrete Cosine Transform (DCT)
• The DCT is a widely used transform coding technique. It is able to perform de-correlation of the input signal in a data independent manner.
• Ithasthepropertythatdifferentchannelsrepresentthe signal power along different (spatial or temporal) frequencies similarly to the (discrete) Fourier transform
Multimedia Software Systems by Eun-Young Kang

Background Before DCT
• DC (Direct Current) : An electrical signal with constant magnitude (e.g. A battery carrying 1.5 volts DC)
• AC(AlternatingCurrent):anelectricalsignalthat changes its magnitude periodically at a certain frequency (e.g. 110 volts AC, 60 Hz)
• Mostrealsignaliscomplex.However,anysignalcan be expressed as a sum of multiple sinusoidal waveform (a.k.a. Fourier Analysis).
Multimedia Software Systems by Eun-Young Kang
Multimedia Software Systems by Eun-Young Kang

Temporal to Frequency
Multimedia Software Systems by Eun-Young Kang
Frequency to Temporal
Multimedia Software Systems by Eun-Young Kang

Discrete Cosine Transform (DCT)
• Let’s start with 1D digital signal 𝑓 𝑖 .
• Definition of 1D DCT
ard Transform
• 1D DCT transforms the signal 𝑓 𝑖 in the time domain to 𝐹 𝑢
in the frequency domain. Multimedia Software Systems by Eun-Young Kang
𝑪(𝒖)𝑴−𝟏 𝟐𝒊+𝟏 𝒖𝝅 𝑭(𝒖)= 𝟐 ෍ 𝒄𝒐𝒔 𝟐𝑴
where u=0, …, M-1
M is the number of input samples
𝟐 , 𝒇𝒐𝒓 𝒖 = 𝟎 Forw 𝑪(𝒖)=൦ 𝟐
𝟏, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆
Discrete Cosine Transform (DCT)
𝑪(𝒖)𝑴−𝟏 𝟐𝒊+𝟏 𝒖𝝅
𝑭(𝒖)= 𝟐 ෍ 𝒄𝒐𝒔 𝟐𝑴 ×𝒇(𝒊)
where u=0, …, M-1
M is the number of input samples
𝟐 , 𝒇𝒐𝒓 𝒖 = 𝟎 𝑪(𝒖)=൦ 𝟐
𝟏, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆
Plot cos graphs with different frequencies at https://www.desmos.com/calculator/nqfu5lxaij
Multimedia Software Systems by Eun-Young Kang

Inverse Discrete Cosine Transform
• Definition of 1D IDCT
• 1D IDCT transforms back the coefficients 𝐹 𝑢 in frequency domain into the original signal 𝑓 𝑖 in the time domain.
Multimedia Software Systems by Eun-Young Kang
𝑴−𝟏 𝑪(𝒖) 𝟐𝒊+𝟏𝒖𝝅 𝒇(𝒊) = ෍ 𝟐 × 𝒄𝒐𝒔 𝟐𝑴
where i=0, …, M-1
M is the number of input samples
𝟐 , 𝒇𝒐𝒓 𝒖 = 𝟎 𝑪(𝒖)=൦ 𝟐
𝟏, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆
Discrete Cosine Transform (DCT)
• TheroleofDCTistouseCosinefunctionto decompose the original signal into its one DC component and several AC components of the original signal.
• When 𝑢 = 0, DCT yields the DC coefficient. When
𝑢 = 1, … , 𝑀 − 1, it yields the first, second, … (M-1)th AC coefficients.
• TheroleofIDCTistoreconstruct(recompose)the signal using DC and ACs.
Multimedia Software Systems by Eun-Young Kang

Plotting 1D DCT Basis Functions u=0 u=1 u=2
u=6 u=7 Multimedia Software Systems by Eun-Young Kang
1D DCT Basis Functions for M=8 • Let’s consider M=8, then we have following basis cosine functions.
Multimedia Software Systems by Eun-Young Kang

1D DCT Basis Functions for M=8 (cont’d)
• Let’s consider M=8, then we have following basis cosine functions.
Multimedia Software Systems by Eun-Young Kang
Discrete Cosine Transform (DCT) – Example 1 with M=8
• If the original signal values are 𝑓 𝑖 = 100, 𝑖 = 0,…,7 1
• Then DCT gives 𝐹 0 ≈ 283, 𝐹 1 = 𝐹 2 = ⋯ = 𝐹 1111
Multimedia Software Systems by Eun-Young Kang

Discrete Cosine Transform (DCT) – Example 2 with M=8
• If the original signal 𝑓 𝑖 has the same frequency and phase as 2
the second cosine basis function and its amplitude is 100, • Then DCT gives 𝐹 0 = 0, 𝐹 1 = 0, 𝐹 2 = 200, 𝐹
Multimedia Software Systems by Eun-Young Kang
Discrete Cosine Transform (DCT) –
Example 3 with M=8
• If the original signal 𝑓 𝑖 is the sum of the previous two signals 3
(𝑓 𝑖 = 𝑓 𝑖 + 𝑓 𝑖 ) 212
• Then DCT gives 𝐹 0 = 283, 𝐹 1 = 0, 𝐹 2 333
𝐹3=⋯=𝐹7=0 33
Multimedia Software Systems by Eun-Young Kang

Discrete Cosine Transform (DCT) –
Example 4 with M=8
• If the original signal 𝑓 𝑖 has 𝑓 0 = 85, 𝑓 1 = −65, 𝑓 2 = 15,𝑓 3 =30,𝑓 4 =−56,𝑓 5 =35,𝑓 6 =90,𝑓 7 =60
• Then DCT gives 𝐹 0 = 69, 𝐹 1 = −49, 𝐹 2 = 74, 𝐹 3 = 11,𝐹 4 =16,𝐹 5 =117,𝐹 6 =44,𝐹 7 =−5
Multimedia Software Systems by Eun-Young Kang
IDCT Example
• F(u):69,-49,74,11,16,117,44,-5whereu=0,…,7
Multimedia Software Systems by Eun-Young Kang

IDCT Example
• F(u):69,-49,74,11,16,117,44,-5whereu=0,…,7
Multimedia Software Systems by Eun-Young Kang
Orthonormal Basis Functions • Functions 𝐵 𝑖 and 𝐵 𝑖 are orthonormal if
෍𝐵𝑝𝑖∙𝐵𝑞𝑖 =0,𝑝≠𝑞 𝑖
෍𝐵𝑝𝑖∙𝐵𝑞𝑖 =1,𝑝=𝑞 𝑖
• DCT basis functions (with the help of 𝐶 𝑢 ) are orthonormal.
With this property, the signal is not amplified during the
transformation. We get the same signal back.
෍ 2 cos 16 ∙ 2 cos 16 =1,𝑝=𝑞
Multimedia Software Systems by Eun-Young Kang
16 =0,𝑝≠𝑞

1D Discrete Cosine Transform (DCT)
• The 0th DCT coefficient 𝐹 0
– 𝐹 0 corresponds to the DC component of the signal 𝑓 𝑖 . – 𝐹 0 is equal to the average magnitude of the signal.
• TheothersevenDCTcoefficients
– They reflect the various changing (AC) components at
different frequencies.
– If we denote 𝐹 1 as AC1, 𝐹 2 as AC2, … 𝐹 7 as AC7, then AC1 completes half a cycle as a cosine function over [0..7] (low frequency); AC2 completes a full cycle; … ; and AC7 completes three and a half cycle (high frequency).
Multimedia Software Systems by Eun-Young Kang
2D DCT Formula • 2D Discrete Cosine Transform (DCT)
𝑭(𝒖,𝒗)=𝑪 𝒖 𝑪(𝒗)෍ ෍ 𝒄𝒐𝒔 𝟐𝒊+𝟏 𝒖𝝅 𝒄𝒐𝒔 𝟐𝒋+𝟏 𝒗𝝅 𝒇(𝒊,𝒋) 𝟒 𝟏𝟔 𝟏𝟔
where u=0, …, 7 and v=0,…,7 𝟐 , 𝒊𝒇 𝒙 = 𝟎
Forward Transform
𝟏, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆
Multimedia Software Systems by Eun-Young Kang

2D DCT Formula • 2D Inverse Discrete Cosine Transform (IDCT)
𝒇(𝒊,𝒋)= ෍ ෍ 𝑪 𝒖 𝑪(𝒗)𝒄𝒐𝒔 𝟐𝒊+𝟏 𝒖𝝅 𝒄𝒐𝒔 𝟐𝒋+𝟏 𝒗𝝅 𝑭(𝒖,𝒗)
where i=0, …, 7 and j=0,…,7 𝟐 , 𝒊𝒇 𝒙 = 𝟎
Inverse Transform
𝟏, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆
CSULA Multimedia Software Systems by Eun-Young Kang
8×8 2D DCT Basis Functions
Multimedia Software Systems by Eun-Young Kang

8×8 2D DCT i
Input 𝟖 × 𝟖 image
CSULA Multimedia Software Systems by Eun-Young Kang
𝟖 × 𝟖 DCT Coefficients
Examples of 8×8 DCTs
Multimedia Software Systems by Eun-Young Kang

DCT for Compression
• How can we use this transformation for compression?
– Each component is little dependent to each other.
– Human visual system is less sensitive to high frequency components.
– So we apply different quantization interval size to each components. i.e. More bits for DC component and less bits for AC components.
Multimedia Software Systems by Eun-Young Kang

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com