Multimedia Software Systems CS4551
Basic Video Compression Technique
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
What is Video Compression?
Copyright By PowCoder代写 加微信 powcoder
• Image compression
– to reduce spatial redundancy
• Videocompression
– In a video, spatial redundancy exists in each frame like images.
– In addition, temporal redundancy (redundancy between frames) exists and can be exploited for compression reasons.
– Video compression is to reduce spatial redundancy within a frame AND temporal redundancy between frames.
– Each video frame may be encoded differently depending on whether to use spatial or temporal redundancies.
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Modalities of Video Compression
• Depending on which mode (spatial or temporal) you want to exploit for compression reasons, you have two modes of compression for video
• Intraframe (I frame)
– Each frame is encoded as an individual entity (like an “image”)
– Uses Image Compression techniques (e.g. DCT)
• Interframe (P frame, B frame, or multiframe)
– Predictive Encoding but for the temporal domain
– Instead of encoding the current frame directly, we encode the difference between the current frame and a prediction based on previous frames
– Term Used – Motion Compensation
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Intraframe Coding
• These frames are compressed using a combination of a Lossy Scheme such as Transform Coding or Subsampling/Quantization and Lossless Entropy Coding such Huffman or Arithmetic.
• For example, the MPEG/ITU Standard compresses these frames as discussed in JPEG image compression
– Get 8×8 blocks (for each component)
– DCT on each block
– Quantization of all coefficients
– AC zigzag ordering
– DC => DPCM => (size) (value)
– AC => Runlength Encoded => (runlength, size) (value)
– Both AC and DC Coefficients get Huffman encoded to form a bit stream
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Changes from Frame to Frame
• What can happen to a pixel (or pixel region) from one frame to another?
– Nothing! – Like an unchanging background – so do we need to encode information here?
– Changes (slight) due to quantization and noise
– Changes due to the motion of the object
– Changes due to the motion of the camera
– Changes due to environment and lighting
• If nothing has changed – no need to encode
• If changes in pixel color or pixel value due to motion of object or camera, maybe we can predict how the pixels have moved, thereby needing to encode only the change vector
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Vector
• Let’s assume that that if a pixel has moved from frame to frame, the color of the pixel has not changed.
• In other words: A point at 𝑥, 𝑦 in frame𝑛+1withcolor𝑐𝑛+1 𝑥,𝑦 corresponds to some point 𝑥′, 𝑦′ in frame 𝑛 if 𝑐𝑛+1 𝑥, 𝑦 is same or very similarto𝑐𝑛 𝑥′,𝑦′
• If we call this displacement or motion vector 𝑑 = 𝑑𝑥, 𝑑𝑦 , then we have
𝑑 = 𝑑𝑥 , 𝑑𝑦 = 𝑐𝑢𝑟𝑟𝑒𝑛𝑡 − 𝑝𝑟𝑒𝑣𝑖𝑜𝑢𝑠 = 𝑥,𝑦 − 𝑥′,𝑦′
=>𝑑𝑥 =𝑥−𝑥′ and𝑑𝑦 =𝑦−𝑦′
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Vector
Previous Current
For the blue block, what is d = (dx, dy) = ?
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Vector Example
𝑑= 𝑑𝑥,𝑑𝑦 = 𝑥,𝑦 − 𝑥′,𝑦′ assuming that 𝑐𝑛+1 𝑥, 𝑦 is same
or very similar to 𝑐𝑛 𝑥′, 𝑦′
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation
• If we compute motion vector for pixels of the current frame with respect to the previous frame, it means that 𝑐𝑛+1 𝑥, 𝑦 can be predicted from 𝑐𝑛 𝑥′, 𝑦′ => 𝑐𝑛+1 𝑥, 𝑦 = 𝑐𝑛 𝑥 − 𝑑𝑥, 𝑦 − 𝑑𝑦 .
• So instead of storing/transmitting 𝑐𝑛+1 𝑥, 𝑦 for the current frame, we can store the following two things
– the motion vectors 𝑑𝑥,𝑑𝑦
– residual error, 𝑒 𝑥, 𝑦 = 𝑐𝑛+1 𝑥, 𝑦 − 𝑐𝑛 𝑥 − 𝑑𝑥, 𝑦 − 𝑑𝑦
• Then, to recover 𝑐𝑛+1 𝑥, 𝑦 , we use the pixel from the previous frame 𝑐𝑛 𝑥′, 𝑦′ , which can be computed by 𝑐𝑛 𝑥 − 𝑑𝑥, 𝑦 − 𝑑𝑦 the motion vector and the residual error.
–𝑐𝑛+1𝑥,𝑦=𝑐𝑛𝑥−𝑑𝑥,𝑦−𝑑𝑦 +𝑒𝑥,𝑦
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation – Macroblocks
• Do we need to compute and transmit one motion vector d per pixel?
– Computationally intensive!
– Lots of data to send
• Instead, transmit only 1 motion vector per groups of pixels called macroblock (e.g., 16×16 pixels)
• Advantage & Disadvantages:
– Fewer motion vectors to be transmitted
– Faster computationally
– Less precise motion prediction if motion is not constant within the macroblock (e.g., if macroblock covers the edge of a moving object)
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation – Macroblocks
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation – Macroblocks
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
MacroBlock Motion Vectors – Example
(n-1)th frame CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
MacroBlock Motion Vectors – Example
nth frame showing motion vector CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
MacroBlock Motion Vectors – Example
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
MacroBlock Motion Vectors – Example
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
A Note on Frame Segmentation
• This also refers to deciding a size and shape of these macroblocks that are used in computing motion
• Normally a frame is divided in non-overlapping blocks of a certain size B (16×16) and shape (square). Block size and shape affects the performance of compression
– Larger B => fewer blocks to search => fewer motion vectors to encode
– For large B, the movement of objects do not coincide with boundaries of B => larger errors or residuals that need to be encoded
• Thus block size represents a trade off between minimizing the number of motion vectors and maximizing the quality of the matching blocks
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
MacroBlocks at Motion Boundary
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Search and Block Matching
• If the difference between the target block and the candidate block at the same position in the past frame is below some threshold then no motion, else search
• Block matching is the most time consuming – for accurate results you need to do an exhaustive search which is – given a block B in the current frame search for a match block A in the entire previous frame. This may be further optimized: Instead of entire frame,
– Limit search to a region
– Logarithmic Search
– Hierarchical search
• In practice, this search can be limited to a range around the “target block”
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Calculating a Motion Vector per a Target Macroblock
The Previous frame A target macro block at (4,4) at the current frame
Find the best matching block for the target block and calculate the motion vector 𝑑 = 𝑑𝑥 , 𝑑𝑦
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
For 𝑝 = 1,
2𝑝+1 × 2𝑝+1 =9matcheswillbeperformed.
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
First, match a candidate block at
𝑑=𝑐𝑢𝑟𝑟𝑒𝑛𝑡−𝑝𝑟𝑒𝑣𝑖𝑜𝑢𝑠= 1,1
Computer MSE and store CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Second, match a candidate block at 𝑑 = 0,1 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Third, match a candidate block at 𝑑 = −1,1 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Fourth, match a candidate block at 𝑑 = 1,0 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Firth,matchacandidateblockat𝑑= 0,0 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Sixth, match a candidate block at 𝑑 = −1,0 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Seventh, match a candidate block at 𝑑 = −1,1 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Eighth, match a candidate block at 𝑑 = 0, −1 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Last, match a candidate block at 𝑑 = −1, −1 Computer MSE and store
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
For Search Window Size 𝑝 = 1
The Previous frame A target macro block at (4,4) at the current frame
Search window size in red.
Find the best matching block using MSE.
Let’s assume that the best matching candidate block is at (3,4) then, the motion vector for the target block is 𝑑 = 1,0 .
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Search for Motion Vectors
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Search for Motion Vectors
• Sequential Search (a.k.a. full search) – sequentially search the whole (2k+1)×(2k+1) search window area
• Very costly.
• Not good for real- time encoding.
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Search for Motion Vectors
• 2D Logarithmic Search – Acts like binary search. More efficient than sequential search
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Search for Motion Vectors
• Hierarchical Search. The initial motion vector is obtained from images with a significantly reduced resolution and refined.
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Matching Criteria
• Various Criteria are used to decided whether the current block matches a target block
– Mean Absolute Difference (MAD) – Mean Square Difference (MSD)
– Pel Difference Classification (PDC) – Integral Projection
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Matching Criteria
• MeanAbsoluteDifference(MAD)
– The mean absolute difference (MAD) is the most popular block matching criterion. Corresponding pixels from each block are compared and their differences summed, as described by this equation.
𝑚𝑛𝐴𝑝,𝑞 −𝐵𝑝,𝑞 𝑝=1 𝑞=1
– Mean Absolute Difference function for two blocks A and B of size n×m. A[p, q] is the value of the pixel in the pth row and qth column of block A.
– The lower the MAD the better the match and so the candidate block with the minimum MAD should be chosen. The function is alternatively called Mean Absolute Error (MAE).
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Matching Criteria
• MeanSquareDifference(MSD)
– The mean square difference function is similar to the mean absolute difference function, except that the difference between pixels is squared before summation.
1𝐴𝑝,𝑞−𝐵𝑝,𝑞 2 𝑚𝑛 𝑝=1 𝑞=1
– The mean square difference is more commonly called the mean square error (MSE) and the lower this value the better the match. This criterion is said to result in better matches than the MAD criterion
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Matching Criteria
• Pel Difference Classification (PDC)
– The Pel Difference Classification (PDC) function compares each pixel of the target block with its counterpart in the candidate block and classifies each pixel pair as either matching or not matching. Pixels are matching if the difference between their values is less than some threshold and the greater the number of matching pixels the better the match.
𝑚𝑛𝑜𝑟𝑑𝐴𝑝,𝑞−𝐵𝑝,𝑞 ≤𝑡2 𝑝=1 𝑞=1
– Pel Difference Classification function. Ord(e) evaluates to 1 if e is true and 0 otherwise.
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Matching Criteria
• Integral Projection
– Integral projections are calculated by summing the values of pixels from each column and each row of a block. The most attractive feature of this criterion is that values calculated for a particular candidate block can be reused in calculating the integrals for overlapping candidate blocks. This feature is of particular value during an exhaustive search, but less useful in the case of sub-optimal searches. The inventors claims that this choice of matching criterion is less susceptible to noise than others.
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Matching Criteria
• IntegralProjection
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation – Algorithm
• Given a sequence of frames (each have macroblocks)
• Encode first frame as IntraFrame
• For each corresponding macroblock of next frame and current frame, find the difference.
– If difference less than threshold => no motion, find residual error
– If difference above threshold => may be motion, look in search range to find a matching block using matching criteria discussed above. Note motion vector and residual error
– If difference (or total residual error) is too large for a majority of macroblocks, and/or after regular intervals encode the current frame as an Intraframe and proceed to previous step
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation – Encoding
• There are two things to encode here:
– Motion Vector for every macroblock
– Difference or residual for every macroblock
• Motion vectors are typically encoded losslessly (similar to JPEG lossless mode)
• The residuals 𝑒 𝑥, 𝑦 are encoded lossy + lossless (DCT+Entropy) producing variable bit rate (VBR).
• If smooth motion or no motion:
– Motion prediction is good (residuals are small)
– Entropy coded with few bits
• If complex motion or change of scene:
– Motion prediction is bad (residual are large)
– Entropy coded with many bits
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
Motion Compensation – Decoding
CSULA CS4551 Multimedia Software Systems by Eun-Young Kang
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com