CS代写 CS4551 - PowCoder代写

Multimedia Software Systems CS4551
JPEG Image Compression Algorithm
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Image Revisited

• We consider here still images, for example:
– Photographs (color or grayscale)
– Fax (bi-level and multilevel)
– Documents (text, handwriting, graphics and photographs) –…
• Components of Images
– Gray Images – single component
– Color Images – r, g, b components
– Sometimes images also have an alpha component.
– Each component is represented as a number of bits per pixel (application dependent).
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Why Compression?
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Some Compression Formats
• Therearemanydifferentuncompressedand compressed formats used in the industry! Some of commonly used compressed formats are
– RIFF – Resource Interchange File Format – GIF – Graphics Interchange File Format
– PNG – Portable Network Graphics
– JPEG – Joint Photographic Expert Groups – JPEG 2000
– MPEG4-VTC (Visual Texture Coding) –…
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Types of Image Compression
• Makes use of Lossless or Lossy Schemes or a Hybrid
• Lossless Schemes
– Already discussed in the previous lectures
• Lossy Schemes
– Frequency Domain Based
• Discrete Fourier Transforms
• Discrete Cosine Transform
• Wavelet Transforms
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
JPEG Background – Requirements
• Be at or near the state of art with regards to compression rate and accompanying fidelity
• Make no assumptions about the type of image – should be applicable to practically any kind of continuous-tone digital source image.
• Have a tractable computational complexity
• Support flexibility by allowing the following modes of operation
– Sequential Encoding
– Lossless Encoding
– Progressive Encoding
– Hierarchical Encoding
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

JPEG Background – Selection Process and Proposed Standard
• JPEGisthestandardalgorithmforcompressionofstill images
– Started in June 1987 for first selection (12→3) – Second selection in 1988
– Finalized and Accepted in 1991
• It is of reasonably low computational complexity, is capable of producing compressed images of high quality, and can provide both lossless and lossy compression of arbitrarily sized grayscale and color images
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
JPEG Coding Scheme
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

JPEG Compression Algorithm
1. The RGB color vectors of the input image are converted into YCbCr values and each channel is encoded independently
2. Each channel image is converted into a series of 8×8 pixel blocks, which are then processed in a raster scan sequence from left to right and from top to bottom
3. Each 8×8 block of pixels is spectrally analyzed using DCT (transform coding) and coefficients are quantized
4. After DCT coding and quantization, coefficients are entropy coded
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
JPEG Compression Algorithm • Supplementdiagram.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

JPEG Encoding
Conversion b/w RGB and YCbCr
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

The CbCr plane at constant luma Y′=0.5
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Color Sub-Sampling
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Color Sub-Sampling (2)
• Each of Cb and Cr has been down sampled by 2. The reconstructed image (right) is not too different from the original (left).
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

2D DCT Formula
• Discrete Cosine Transform (DCT) and Inverse DCT
F(u,v) = 4 cos 16
C(u)C(v) 7 7 (2i+1)u i=0 j=0
(2j+1)v 16
f (i, j) =  4 cos
(2i +1)u (2 j +1)v 16 cos 16
7 7 C(u)C(v) i=0 j=0
where i, j,u,v = 0,1,…7
C ( x ) =  2 i f x = 0
 1 o t h e r w i s e
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Examples of 8×8 DCTs
DCT coefficient values are calculated for the shifted input image samples. (shifted by -128)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

8×8 DCT Coefficients
• The DCT coefficients represent the spatial frequency components within a 8×8 image block
• The (0, 0) coefficient is called DC coefficient. DC coefficient is the coefficient with zero frequency in both dimensions.
• AC coefficients are remaining 63 coefficients with non-zero frequencies.
– As we move to the right of the block, the coefficients represent the energy in higher horizontal frequencies
– As we move down the block, the coefficients represent the energy in higher vertical frequencies
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
DCT Coefficient Quantization
• Each DCT coefficient is quantized independently using a uniform quantizer
• Quantization is defined as division of each DCT coefficient by its corresponding quantizer step size followed by rounding to the
nearest integer.
• The number of quantization intervals per each DCT channel is chosen independently for each coefficient.
• De-quantization
𝑭𝑄′ 𝑢,𝑣 =𝑭𝑄 𝑢,𝑣 ∙𝑸𝑢,𝑣
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
𝑭𝑄 𝑢,𝑣 =Round

Rules for DCT Coefficient Quantization
• Low-frequency coefficients usually have more energy than high- frequency coefficients – therefore require more quantization intervals.
• The Human Visual System is more sensitive to low frequencies. Hence, low-frequency coefficients require more quantization intervals
• The Human Visual System is more sensitive to luminance channels than to chrominance channels.
• Hence, luminance channels require more quantization intervals than chrominance channels
• “Note: the quantization table is not standardized!” CSULA CS451 Multimedia Software Systems by Eun-Young Kang
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
DCT coefficient values are calculated for the shifted input image samples. (shifted by – 128)

Example – Quantization Tables
The Luminance Quantization Table (left) and the chrominance quantization table (right)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Example – DCT and Quantization (1)
• Used the quantization tables in the previous slide.
• DCT coefficient values are calculated for the shifted input image samples. (shifted by -128)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Example – DCT and Quantization (2)
• Used the quantization tables in the previous slide.
• DCT coefficient values are calculated for the shifted input image samples. (shifted by -128)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Preparation for Entropy Coding – Differential Encoding for DCs
• DC is a measure of the average value of 64 image samples
• There is usually strong correlation between the DC coefficients of adjacent blocks
• After quantization, the DC coefficient of a block is encoded as the difference from the DC term of the previous block in the processing order.
• Differential encoding
– Diff(i) = DC(i) – DC(i-1)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Preparation for Entropy Coding – Zigzag Order for ACs
• The quantized DCT coefficients are first processed in zigzag order
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Preparation for Entropy Coding – Zigzag Order for ACs
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Entropy Coding
• Zigzag ordering of coefficients places low frequency coefficients before high-frequency coefficients.
– Low frequency coefficients are more likely to be non-zero
– High frequency coefficients are more likely to be null after quantization
• Inthisway,thesequenceofquantizedcoefficientsis such that there often are long sequences of null values
• Two steps of entropy coding
– Step 1: Intermediate sequence of symbol
– Step 2: Assignment of variable length codes to the symbols
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Intermediate Representation – AC coefficients (1)
• AC coefficients are represented in an intermediate runlength representation, which encodes the runs of zero preceding any non-null coefficient
• Each non-zero AC coefficient is represented in combination with the run-length of the null coefficients before it, using a pair of symbols:
Symbol-1 Symbol-2
(RUNLENGTH,SIZE) (AMPLITUDE)
• Symbol-1 – RUNLENGTH is the number of consecutive null coefficients before the non-zero symbol; SIZE is the size of the symbol used to encode the non-null symbol
• Symbol-2 – AMPLITUDE is the actual (encoded) value CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Intermediate Representation – AC coefficients (2)
• RUNLENGTH must be 15. To represent a run of >15 zeros, use (15,0) that is interpreted as the extension symbol with runlength=16. There can be up to 3 consecutive (15,0) symbol-1 extensions
• If the last run of zeros contains the last DCT coefficient, then the special EOB (End-Of-Block) symbol-1 value (0,0) terminates the representation.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Intermediate Representation – DC coefficient
• DC coefficients are represented by only one pair:
Symbol-1 Symbol-2
(SIZE) (AMPLITUDE)
• Symbol- 1 – SIZE is the size of the symbol used to encode the
non-null symbol
• Symbol- 2 – AMPLITUDE is the actual (encoded) value
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Intermediate Representation – Symbol 2
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Entropy Coding – Step 2
• For both DC and AC coefficients, each symbol-1 is Huffman encoded (variable length prefix code)
– Huffman tables are not specified in the standard and must be input to the encoder
• Each symbol-2 is encoded with a Variable Length Integer (VLI) code
– It is different from a Huffman code because the symbol length is known in advance
– The coding table is hard-wired in the proposal CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Example of Entropy Coding
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Example of Entropy Coding
If the DC term of the previous block is 12, then the difference is +3. Thus the intermediate representation is (2)(3), size=2, amplitude=3.
The intermediate sequence of symbols for this example 8×8 block is
(2)(3), (1,2)(-2), (0,1)(-1), (0,1)(-1), (0,1)(-1) (2,1)(-1), (0,0)
Differential DC VLC (2) 011
AC luminance VLCs (generated by Huffman coding) are (0,0) 1010
(1,2) 11011
(2,1) 11100
VLIs (related to two’s compliment representation)
(3) 11 (-2) 01 (-1) 0
Then the 64 coefficients becomes 0111111011010000000001110001010 (31 bits)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
JPEG Performances
• Assume the pixels of an arbitrary color image are digitized to 8 bits for luminance and chrominance channels with 4:2:2 color sampling scheme.
• Then effectively the source image has 16 bits/pixel
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Artifacts of JPEG – Original 380KB
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Artifacts of JPEG – Compressed 40KB
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Lossless JPEG Mode
• It is wholly independent of DCT processing
• Each sample value is predicted from a combination of nearby sample values, and the prediction residual is entropy coded (Huffman)
• Prediction effectively reduces the entropy of the residual, so that small average symbol length can be achieved.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Lossless JPEG Mode
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Sequential Encoding
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Progressive Encoding
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Progressive Encoding (1)
• Layering or progressive encoding: an image can be transmitted at a low rate (and a low quality) and then progressively improved by subsequent transmission
• Convenient for browsing applications, where a low-quality or low-resolution image is adequate browsing
• Three forms of JPEG progressive encoding:
– Spectral selection
– Successive approximation (SNR scalability)
– Hierarchical (pyramidal) mode
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Progressive Encoding (2)
• Spectralselection
– Initial transmission sends low-frequency DCT coefficients, followed progressively by the higher frequency coefficients, according to the zig-zag scan
– Simple to implement but reconstructed images from the early scans are blurred
• Successive approximation (SNR scalability)
– Only the most significant bits of each coefficient are sent in the first scan, followed by the next most significant bits and so on until all bits have been transmitted
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Progressive Encoding (3)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Progressive Encoding (4) • Hierarchical(pyramidal)mode
– Image can be sent in one of several resolution modes to accommodate different kinds of displays
– Different resolution modes achieved by filtering and down sampling the image in multiples of two in each direction. The resulting image is interpolated and subtracted from the next level to create a residual
– The different resolution modes are transmitted together with the residuals
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Pyramid Decomposition (1)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

Pyramid Decomposition (2)
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
Hierarchical (pyramidal) mode – Encoding/Decoding
• Encoder for three level hierarchical mode (f→f2→f4)
– Reduction of image resolution : Reduce resolution of the input image by a
factor of 2 in each dimension. Repeat this to obtain f4
– Compress lowest-resolution image f4 using JPEG sequential mode ~> F4
– Compress difference image d = f − F′ , where F′ is the interpolated 224 4
(expanded) images of the decoded f4 image ~> D2
– Compress difference image d = f − F′ , where F′ is the interpolated
22 (expanded) images of the decoded f2 image ~> D
• Decoder for three level hierarchical mode
– Decompress F4
– Restoreimagef′ usingF′ +𝐃 ~>F 2422
– RestoreimagefusingF′ +𝐃~>F 2
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

JPEG 2000 Standard
• A New Standard
– efficient, flexible, interactive
• Big emphasis on providing a scalable data stream:
– an embedded set of smaller streams, where each sub- stream gives an efficient R-D compression of the image
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
• OtherJPEG2000Advantagesinclude:
– Lossless and Lossy Compression: There is currently no standard that can provide superior lossless compression and lossy compression in a single bitstream.
– Low Bit-rate Compression: The current JPEG standard offers excellent rate-distortion performance in mid and high bit-rates. However, at bit-rates below 0.25 bpp, subjective distortion becomes unacceptable. This is important if we hope to receive images on our web-enabled ubiquitous devices, such as web-aware wristwatches and so on.
– Large Images: The new standard will allow image resolutions greater than 64K by 64K without tiling. It can handle image size up to 232 − 1.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

– Single Decompression Architecture: The JPEG standard has 44 modes, many of which are application specific and not used by the majority of JPEG decoders.
– Transmission in Noisy Environments: The new standard will provide improved error resilience for transmission in noisy environments such as wireless networks and the Internet.
– Progressive Transmission: The new standard provides seamless quality and resolution scalability from low to high bit-rate. The target bit-rate and reconstruction resolution need not be known at the time of compression.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang
– Region of Interest Coding: The new standard allows the specification of Regions of Interest (ROI) which can be coded with superior quality than the rest of the image. One might like to code the face of a speaker with more quality than the surrounding furniture.
– Computer Generated Imagery: The current JPEG standard is optimized for natural imagery and does not perform well on computer generated imagery.
– Compound Documents: The new standard offers metadata mechanisms for incorporating additional non-image data as part of the file. This might be useful for including text along with imagery, as one important example.
– In addition, JPEG2000 is able to handle up to 256 channels of information whereas the current JPEG standard is only able to handle three color channels.
CSULA CS451 Multimedia Software Systems by Eun-Young Kang

程序代写 CS代考加微信: powcoder QQ: 1823890830 Email: powcoder@163.com

Related Posts