FINAL REPORT ON

HEVC Intra Prediction

A PROJECT UNDER THE GUIDANCE OF

DR. K. R. RAO

COURSE: EE5359 – MULTIMEDIA PROCESSING, SPRING 2016

SUBMITTED BY:

Swaroop Krishna Rao (1001256012)

Nikita Thakur (1001102923)

Srirama Kartik Adavi (1001052277)

DEPARTMENT OF ELECTRICAL ENGINEERING

UNIVERSITY OF TEXAS AT ARLINGTON

Table of Contents:

1. Objective………………………………………………………………………………………………………………….5

2. Need for video compression……………………………………………………………………………………….5

3. Fundamental Concepts in Video coding……………………………………………………………………….5

4. HEVC………………………………………………………………………………………………………………………7

4.1 Encoder and Decoder in HEVC……………………………………………………………………………….8

4.2 Features of HEVC………………………………………………………………………………………………….9

4.2.1. Picture Partitioning…………………………………………………………………………………….9

4.2.2 Prediction………………………………………………………………………………………………….11

5. Intra Prediction…………………………………………………………………………………………………………13

5.1 PB Partitioning……………………………………………………………………………………………………..13

5.2 Reference Samples ……………………………………………………………………………………………….14

5.2.1 Reference Sample Generation……………………………………………………………………….14

5.2.2 Reference Sample Substitution……………………………………………………………………..14

5.3 Filtering Process of Reference Samples……………………………………………………………………15

5.4 Angular Prediction…………………………………………………………………………………………………16

5.4.1 Angle Definitions…………………………………………………………………………………………16

5.4.2 Sample Prediction for Angular Prediction Mode………………………………………………18

5.5 DC Prediction…………………………………………………………………………………………………….19

5.6 Planar Prediction…………………………………………………………………………………………………19

5.7 Smoothing Filter…………………………………………………………………………………………………20

6. Intra Mode Coding………………………………………………………………………………………………………20

6.1 Prediction of Luma Intra Modes………………………………………………………………………………21

6.2 Derived Mode for Chroma Intra Prediction……………………………………………………………….22

7. Encoding Algorithm………………………………………………………………………………………………….23

8. Intra Prediction in HM code………………………………………………………………………………………..24

9. Computational Complexity of Intra prediction………………………………………………………25

10. Differences of Intra Prediction techniques between H.264/AVC and HEVC…………………….26

Acknowledgement…………………………………………………………………………………………………………..27

References……………………………………………………………………………………………………………………..28

List of Acronyms and Abbreviations:

AHG: Ad Hoc Group.

AVC: Advanced Video Coding.

CABAC: Context Adaptive Binary Arithmetic Coding.

CTU: Coding Tree Unit.

CTB: Coding Tree Block.

CU: Coding Unit.

DCT: Discrete Cosine Transform.

DST: Discrete Sine Transform.

DVD: Digital Video Disk.

HD: High Definition.

HDR: High Dynamic Range.

HEVC: High Efficiency Video Coding.

HM: HEVC Test Model.

IEC: International Electrotechnical Commission.

ISO: International Organization for Standardization.

ITU: International Telecommunication Union.

JCT: Joint Collaborative Team.

JCT-VC: Joint Collaborative Team on Video Coding.

JPEG: Joint Photographic Experts Group.

KTA: Key Technical Areas.

MPEG: Moving Picture Experts Group.

MPM: Most Probable Mode.

MVC: Multiview Video Coding.

PB: Prediction Block.

PU: Prediction Unit.

RD: Rate Distortion.

RDO: Rate Distortion Optimization.

SATD: Sum of Absolute Transformed Differences.

SCC: Screen Content Coding.

TB: Transform Block.

TU: Transform Unit.

UHD: Ultra High Definition.

VCEG: Video Coding Experts Group.

WCG: Wide Color Gamut.

1. Objective:

The objective of this project is to provide a review of the intra prediction part of the recently developed HEVC standard, produced jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations [1]. HEVC (H.265) [1][2][3][4] is the latest video coding standard and was designed to improve upon its predecessor, H.264/MPEG-4 AVC [5][2][3]. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to the existing H.264 standard: for similar video quality, HEVC bit-streams consume only about half of the bit rate of H.264. Every video frame contains redundant information, and a prediction process is used to remove this redundancy. Intra-picture prediction is a tool in HEVC that predicts data spatially from region to region within a picture, with no dependence on other pictures in the sequence. HEVC achieves higher compression than H.264 thanks to new features such as its quadtree partitioning structure and a larger set of directional intra prediction modes. HEVC is suitable for resolutions up to Ultra High Definition (UHD) video coding.

2. Need for Video Compression:

• Video compression technologies are about reducing and removing redundant video data so

that a digital video file can be effectively sent over a network and stored on computer disks.

• With efficient compression techniques, a significant reduction in file size can be achieved

with little or no adverse effect on the visual quality.

• The video quality, however, can be affected if the file size is further lowered by raising the

compression level for a given compression technique.

3. Fundamental Concepts in Video Coding [1]:

Digital video data consists of a time-ordered sequence of a natural or real-world visual scene in digital form, sampled spatially and temporally. As shown in Fig. 1, a scene is sampled at a point in time to produce a complete frame or an interlaced field (one field consists of half of the data in a frame, spatially sampled on the odd- or even-numbered lines) [1].

Spatial sampling: Samples of the signal are taken at a specific point in time on a rectangular grid in the video image plane, producing a frame. At each intersection point of the grid a sample is taken in the spatial domain; each sample represents a square picture element (pixel), as shown in Figure 1.

Temporal sampling: It means to capture a series of frames sampled at periodic intervals in time

producing a moving video signal. Cameras typically generate approximately 24, 25 or 30 frames

per second. This results in a large amount of information that demands the use of compression.

Figure 1: Spatial and temporal sampling of a video sequence [1]

Color space [1]:

The visual information at each spatio-temporal sample (picture element, or pixel) must be described in digital form by numbers that depict the physical appearance of the sample. For a color scene or picture, at least three color component samples (three different numbers) are required at each pixel position to represent the color accurately. This is done by means of color spaces, namely models that map physical colors to measurable numeric expressions.

Two well-known color spaces are described here:

RGB:

In the RGB color space, a color image sample is represented with three numbers that indicate the

relative proportions of Red (R), Green (G) and Blue (B) (the three additive primary colors of light).

Colors can be created by combining red, green and blue in varying proportions. These three colors

are equally important and so are usually all stored at the same resolution.

YCbCr [1]:

In the YCbCr color space (sometimes referred to as YUV), a color image sample is represented with three numbers: one component indicates the brightness (luminance or luma) while the other two indicate the color. Because the human visual system is more sensitive to luminance than to color, the resolution of the chroma components is often down-sampled with respect to the resolution of the luma component to reduce the amount of information needed to describe each pixel. This color space with chroma subsampling is an efficient way of representing color images. The YCbCr color space can be converted to RGB by means of simple expressions.

Y is the luminance (luma) component; it indicates the brightness in an image and can be calculated as a weighted average of R, G and B [1]:

Y = kr·R + kg·G + kb·B

where kr, kg and kb are weighting factors. The color information can be represented as color

difference (chrominance or chroma) components, where each chrominance component is the

difference between R, G or B and the luminance Y.

The complete description of a color image is given by Y (the luminance component) and three

color differences Cb, Cg and Cr that represent the difference between the color intensity and the

mean luminance of each image sample. Cg is not transmitted because it is possible to extract it

from the other components [1].

YCbCr Sampling Formats:

Chroma subsampling can be performed in different ways as shown in Fig.2. The most common

formats of sampling the images to obtain the three components are:

4:4:4 sampling: each chroma component (Cb, Cr) has the same resolution as the luma component (Y), preserving the full fidelity of the chrominance components.

4:2:2 sampling: the chroma components have half the horizontal resolution of the luma component. For every four luminance samples in the horizontal direction there are two Cb and two Cr samples. 4:2:2 video is used for high-quality color reproduction.

4:2:0 sampling: each chroma component has one fourth of the number of samples of the luma component (half the number of samples in both the horizontal and vertical dimensions). 4:2:0 sampling is widely used for consumer applications such as video conferencing, digital television and digital versatile disk (DVD) storage [1].

Figure 2: A sample of the YCbCr Sampling Formats [1]
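To make the 4:2:0 format concrete, the following is a minimal sketch (not part of any standard or of the HM software; the array layout, function name and even-dimension assumption are ours) of how one full-resolution chroma plane can be reduced to 4:2:0 resolution by averaging each 2×2 neighborhood:

```cpp
#include <vector>
#include <cstdint>

// Illustrative sketch: downsample one full-resolution 8-bit chroma plane (width x height)
// to 4:2:0, i.e. half resolution horizontally and vertically, by averaging 2x2 blocks.
// Assumes width and height are even; the +2 offset gives nearest-integer rounding.
std::vector<uint8_t> downsampleTo420(const std::vector<uint8_t>& src, int width, int height)
{
    std::vector<uint8_t> dst((width / 2) * (height / 2));
    for (int y = 0; y < height; y += 2) {
        for (int x = 0; x < width; x += 2) {
            int sum = src[y * width + x]       + src[y * width + x + 1]
                    + src[(y + 1) * width + x] + src[(y + 1) * width + x + 1];
            dst[(y / 2) * (width / 2) + (x / 2)] = static_cast<uint8_t>((sum + 2) >> 2);
        }
    }
    return dst;
}
```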

4. HEVC:
High Efficiency Video Coding (HEVC) [1] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). The main goal of the HEVC standard is to significantly improve compression performance compared to existing standards (such as H.264/Advanced Video Coding [5]), achieving on the order of a 50% bit-rate reduction at similar visual quality [3].

HEVC is designed to address the existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [3]. It primarily targets consumer applications, as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The next revision of the standard, finalized in 2014, enables new use cases with the support of additional pixel formats such as 4:2:2 and 4:4:4, bit depths higher than 10 bits, embedded bit-stream scalability, 3D video and multiview video [20].

4.1. Encoder and Decoder in HEVC [21]:

Source video, consisting of a sequence of video frames, is encoded or compressed by a video

encoder to create a compressed video bit stream. The compressed bit stream is stored or

transmitted. A video decoder decompresses the bit stream to create a sequence of decoded frames.

The video encoder performs the following steps as shown in Fig. 3:

• Partitioning each picture into multiple units

• Predicting each unit using inter or intra prediction, and subtracting the prediction from the unit

• Transforming and quantizing the residual (the difference between the original picture unit and the prediction)

• Entropy encoding transform output, prediction information, mode information and headers

The video decoder performs the following steps as shown in Fig. 4:

• Entropy decoding and extracting the elements of the coded sequence

• Rescaling and inverting the transform stage

• Predicting each unit and adding the prediction to the output of the inverse transform

• Reconstructing a decoded video image

Figure 3: Block Diagram of HEVC Encoder [21]

Figure 4: Block diagram of HEVC Decoder [21]

4.2. Features of HEVC:

4.2.1 Picture Partitioning [3]:

The previous standards split pictures into block-shaped regions called macroblocks and blocks. Today's high-resolution video content makes the use of larger blocks advantageous for encoding. To support this wide variety of block sizes in an efficient manner, HEVC pictures are divided into so-called Coding Tree Units (CTUs), as shown in Fig. 5 [3]. Depending on the stream parameters, the CTUs in a video sequence can have a size of 64×64, 32×32, or 16×16, as shown in Fig. 6 [3].

Figure 5: Frame split in CTUs [3] Figure 6: Possible sizes of the CTUs [3]

The Coding Tree Unit (CTU) is therefore the basic coding logical unit, which is in turn encoded into an HEVC bit-stream. It consists of three blocks, namely a luma (Y) block that covers a square picture area of L×L samples of the luma component, and two chroma (Cb and Cr) blocks that cover L/2×L/2 samples of each of the two chroma components, together with the associated syntax elements, as shown in Fig. 7. Each block is called a Coding Tree Block (CTB).

Syntax elements describe properties of the different types of units of a coded block of pixels and how the video sequence can be reconstructed at the decoder. This includes the method of prediction (e.g. inter or intra prediction, intra prediction mode, and motion vectors) and other parameters [9].

Figure 7: CTU consist of three CTBs and Syntax Elements [8]

Each CTB has the same size (L×L) as the CTU (64×64, 32×32, or 16×16). However, a CTB may be too large for deciding whether inter-picture or intra-picture prediction should be performed. Therefore, each CTB can be split recursively in a quad-tree structure, from the same size as the CTB down to as small as 8×8. Each block resulting from this partitioning is called a Coding Block (CB) and becomes the decision point for the prediction type (inter or intra prediction) [12]. Fig. 8 illustrates an example of a 64×64 CTB split into CBs.

Figure 8: Illustration of 64×64 CTBs split into CBs [12]

The prediction type, along with other parameters, is coded in the Coding Unit (CU). The CU is thus the basic unit of prediction in HEVC, each of which is predicted from previously coded data. The CU consists of three CBs (Y, Cb and Cr) and the associated syntax elements, as shown in Fig. 9 [12].

Figure 9: A CU consists of three CBs and associated syntax elements [12]

CBs could still be too large to store motion vectors (for inter-picture, i.e. temporal, prediction) or an intra-picture (spatial) prediction mode. Therefore, the Prediction Block (PB) was introduced. Each CB can be split into PBs differently depending on the temporal and/or spatial predictability.
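The recursive quad-tree splitting described in this section can be sketched as follows. This is an illustrative outline rather than HM code: shouldSplit stands in for whatever (e.g. rate-distortion based) split decision the encoder applies, and onCB is a placeholder callback invoked once for each resulting CB.

```cpp
#include <functional>

// Illustrative sketch: recursively partition an L x L coding tree block into coding blocks (CBs).
// shouldSplit models the encoder's split decision; onCB is called for every leaf block.
void partitionCTB(int x, int y, int size, int minCbSize,
                  const std::function<bool(int, int, int)>& shouldSplit,
                  const std::function<void(int, int, int)>& onCB)
{
    if (size > minCbSize && shouldSplit(x, y, size)) {
        int half = size / 2;
        partitionCTB(x,        y,        half, minCbSize, shouldSplit, onCB);
        partitionCTB(x + half, y,        half, minCbSize, shouldSplit, onCB);
        partitionCTB(x,        y + half, half, minCbSize, shouldSplit, onCB);
        partitionCTB(x + half, y + half, half, minCbSize, shouldSplit, onCB);
    } else {
        onCB(x, y, size);   // leaf of the quad-tree: inter/intra prediction is decided here
    }
}
```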

4.2.2 Prediction:
Frames of video are coded using intra or inter prediction:

Intra-frame prediction:

In the spatial domain, redundancy means that pixels (samples) that are close to each other in the same frame or field are usually highly correlated. The appearance of samples in an image is often similar to that of their adjacent neighboring samples; this is called spatial redundancy or intra-frame correlation.

This redundant information in the spatial domain can be exploited to compress the image. Note that with this kind of compression, each picture is compressed without referring to other pictures in the video sequence. This technique is called intra-frame prediction and it is designed to minimize the duplication of data in each picture (spatial-domain redundancy) [7]. It consists of forming a prediction frame and subtracting this prediction from the current frame.

Figure 10: Spatial (intra-frame) correlation in a video sequence [7]

Several methods can be used to remove this redundant information in the spatial domain. Typically, the values of the prediction samples are constructed by combining their adjacent neighboring samples (reference samples) by means of several techniques. In some cases, considerable prediction accuracy can be obtained by means of efficient intra prediction techniques.

Inter-frame prediction:

In the temporal domain, redundancy means that successive frames in time order are usually highly correlated; parts of the scene are therefore repeated in time with little or no change. This type of redundancy is called temporal redundancy or inter-frame correlation [13].

The video can thus be represented more efficiently by coding only the changes in the video content, rather than coding each entire picture repeatedly. This technique is called inter-frame prediction; it is designed to minimize the temporal-domain redundancy and at the same time improve coding efficiency to achieve video compression [7].

Figure 11: Temporal (inter-frame) correlation in a video sequence [7]

To remove the redundant information in the temporal domain, motion-compensated prediction (inter prediction) methods are typically used. Motion compensation (MC) consists of constructing a prediction of the current video frame from one or more previously or subsequently encoded frames (reference frames) by compensating for the differences between the current frame and the reference frame. To achieve this, the motion, or trajectory, between successive blocks of the image is estimated. The motion vectors (which describe how the motion was compensated) and the residuals with respect to the reference frames are coded and sent to the decoder.

5. Intra Prediction:

Intra-picture prediction uses previously decoded boundary samples from spatially neighboring TBs in order to predict a new prediction block (PB). Consequently, the first picture of a video sequence and the first picture at each clean random access point into a video sequence are coded using only intra-picture prediction [1].

Several improvements have been introduced in HEVC in the intra prediction module:

• Due to the larger size of the pictures, the range of supported coding block sizes has been increased.

• A planar mode that guarantees continuity at block boundaries is desired.

• The number of directional orientations has been increased.

• For intra mode coding, efficient coding techniques to transmit the selected mode for each block are needed due to the increased number of intra modes.

• HEVC supports a large variety of block sizes, so consistency across all block sizes is needed.

HEVC employs 35 different intra modes to predict a PB: 33 angular modes, planar mode, and DC mode. Table 1 shows the mode names with their corresponding intra prediction mode indices, following the convention used throughout the standard [11].

Table 1: Specification of intra prediction modes and associated indices [11].

In the end, the video encoder chooses the intra prediction mode that provides the best rate-distortion performance.

5.1. PB partitioning:
The CB can be split into PBs of size M×M or M/2×M/2, as shown in Fig. 12. The first option means that the CB is not split, so the PB has the same size as the CB; it can be used in CUs of all sizes. The second partitioning means that the CB is split into four equally-sized PBs; this can only be used in the smallest, 8×8, CUs. In that case, a flag is used to select which partitioning is used in the CU. Each resulting PB has its own intra prediction mode [1]. Prediction block sizes range from 4×4 to 64×64.

Figure 12: Prediction Block for Intra Prediction [1]

5.2. Reference Samples:

5.2.1. Reference Sample Generation:

The intra sample prediction in HEVC is performed by extrapolating sample values from the reconstructed reference samples as defined by the selected intra prediction mode. Compared to H.264/AVC, HEVC introduces a reference sample substitution process which allows the complete set of intra prediction modes to be used regardless of the availability of the neighboring reference samples. In addition, there is an adaptive filtering process that can pre-filter the reference samples according to the intra prediction mode, block size and directionality to increase the diversity of the available predictors.

5.2.2. Reference Sample Substitution:

Some or all of the reference samples may not be available for prediction due to several reasons.

For example, samples outside of the picture, slice or tile are considered unavailable for prediction.

In addition, when constrained intra prediction is enabled, reference samples belonging to inter-

predicted PUs are omitted in order to avoid error propagation from potentially erroneously

received and reconstructed prior pictures. As opposed to H.264/AVC which allows only DC

prediction to be used in these cases, HEVC allows the use of all its prediction modes after

substituting the non-available reference samples.

For the extreme case with none of the reference samples available, all the reference samples are

substituted by a nominal average sample value for a given bit depth (e.g., 128 for 8-bit data). If

there is at least one reference sample marked as available for intra prediction, the unavailable

reference samples are substituted by using the available reference samples. The unavailable

reference samples are substituted by scanning the reference samples in clock-wise direction and

using the latest available sample value for the unavailable ones. More specifically, the process is

defined as follows:

1. When p[-1][2N-1] is not available, it is substituted by the first available reference sample encountered when scanning the samples in the order p[-1][2N-2], …, p[-1][-1], followed by p[0][-1], …, p[2N-1][-1].

2. All non-available reference samples p[-1][y] with y = 2N-2 … -1 are substituted by the reference sample below, p[-1][y+1].

3. All non-available reference samples p[x][-1] with x = 0 … 2N-1 are substituted by the reference sample to their left, p[x-1][-1].

Figure 13 shows an example of reference sample substitution.

Figure 13: An example of reference sample substitution process. Non-available reference samples are marked as

grey: (a) reference samples before the substitution process (b) reference samples after the substitution process [3]
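A minimal sketch of this substitution process is given below. It is not HM code; the one-dimensional layout of the reference array (bottom-most left sample first, scanned clock-wise to the right-most top sample) and the names are our own, chosen to mirror the three steps above.

```cpp
#include <vector>

// Illustrative sketch of reference sample substitution.
// refs[0]  corresponds to p[-1][2N-1] (bottom-most left reference),
// refs[2N] corresponds to p[-1][-1]   (top-left corner),
// refs[4N] corresponds to p[2N-1][-1] (right-most top reference),
// i.e. the 4N+1 samples are stored in the clock-wise scanning order described above.
// available[k] tells whether refs[k] was reconstructed and usable; bitDepth is e.g. 8.
void substituteReferenceSamples(std::vector<int>& refs,
                                const std::vector<bool>& available,
                                int bitDepth)
{
    const int count = static_cast<int>(refs.size());
    bool anyAvailable = false;
    for (bool a : available) anyAvailable |= a;

    // Extreme case: nothing available -> nominal mid-level value (128 for 8-bit data).
    if (!anyAvailable) {
        int mid = 1 << (bitDepth - 1);
        for (int k = 0; k < count; ++k) refs[k] = mid;
        return;
    }

    std::vector<bool> avail = available;
    // Step 1: make the first sample available by copying the first available one.
    if (!avail[0]) {
        int k = 1;
        while (!avail[k]) ++k;          // terminates: at least one sample is available
        refs[0] = refs[k];
        avail[0] = true;
    }
    // Steps 2 and 3: propagate the latest available value in scanning order.
    for (int k = 1; k < count; ++k) {
        if (!avail[k]) refs[k] = refs[k - 1];
    }
}
```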

5.3. Filtering Process of Reference Samples:

The reference samples used by HEVC intra prediction are conditionally filtered by a smoothing filter, similarly to what was done in the 8×8 intra prediction of H.264/AVC. The intention of this processing is to improve the visual appearance of the prediction block by avoiding steps in the values of the reference samples that could potentially generate unwanted directional edges in the prediction block. For optimal usage of the smoothing filter, the decision to apply the filter is made based on the selected intra prediction mode and the size of the prediction block.

Two types of filtering processes are applied to the reference samples:

• Fig. 14 shows three-tap filtering using two neighboring reference samples: reference sample X is replaced by the filtered value computed from A, X and B, while reference sample Y is replaced by the filtered value computed from C, Y and D.

• Fig. 14 also shows the strong intra smoothing process using corner reference samples: reference sample X is replaced by a linearly filtered value computed from A and B, while Y is replaced by a linearly filtered value computed from B and C [3].

Figure 14: Filtering of Reference sample [3]
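As an illustration, the three-tap variant can be sketched as a [1 2 1]/4 filter applied along one border of reference samples, leaving the first and last samples of the line unchanged (a simplification); the strong intra smoothing variant instead interpolates linearly between the corner samples. The function below is our own sketch, not HM code:

```cpp
#include <vector>

// Illustrative sketch: apply the [1 2 1]/4 three-tap smoothing filter (with rounding)
// to a line of reference samples, keeping the end samples of the line unchanged.
std::vector<int> smoothReferenceLine(const std::vector<int>& line)
{
    std::vector<int> out = line;
    for (size_t k = 1; k + 1 < line.size(); ++k) {
        out[k] = (line[k - 1] + 2 * line[k] + line[k + 1] + 2) >> 2;
    }
    return out;
}
```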

5.4. Angular Prediction:

Angular intra prediction in HEVC is designed to efficiently model different directional structures

typically present in video and image content. The set of available prediction directions has been

selected to provide a good trade-off between encoding complexity and coding efficiency for typical

video material. The sample prediction process itself is designed to have low computational

requirements and to be consistent across different block sizes and prediction directions. This has

been found especially important as the number of block sizes and prediction directions supported

by HEVC intra coding far exceeds those of previous video codecs, such as H.264/AVC. In HEVC

there are four effective intra prediction block sizes ranging from 4×4 to 32×32 samples, each of

which supports 33 distinct prediction directions. A decoder must thus support 132 combinations

of block sizes and prediction directions.

5.4.1. Angle Definitions:

HEVC defines a set of 33 angular prediction directions at 1/32 sample accuracy as illustrated in

Fig. 15. In natural imagery, horizontal and vertical patterns typically occur more frequently than

patterns with other directionalities. Small differences for displacement parameters for modes close

to horizontal and vertical directions take advantage of that phenomenon and provide more accurate

prediction for nearly horizontal and vertical patterns. The displacement parameter differences

become larger closer to diagonal directions to reduce the density of prediction modes for less

frequently occurring patterns.

Table 2 provides the exact mapping from indicated intra prediction mode to angular parameter A

[3]. That parameter defines the angularity of the selected prediction mode (how many 1/32 sample

grid units each row of samples is displaced with respect to the previous row).

Table 2: Angular parameter A defines the directionality of each angular intra prediction mode [3]

Figure 15: Angle definitions of angular intra prediction in HEVC numbered from 2 to 34 and the associated

displacement parameters. H and V are used to indicate the horizontal and vertical directionalities, respectively, while

the numeric part of the identifier refers to the sample position displacements in 1/32 fractions of sample grid

positions [3]

For each octant, eight angles are defined with associated displacement parameters, as shown in Fig. 15. When getting closer to the diagonal directions, the displacement parameter becomes larger in order to reduce the density of prediction modes for less frequently occurring directions. For modes close to the horizontal and vertical directions, the displacement becomes smaller in order to provide more accurate prediction for nearly horizontal and vertical patterns.

In order to calculate the value of each sample of the PB, the angular modes extrapolate from the reference samples depending on the directional orientation, in a way chosen to keep complexity low. When the selected direction is between 2 and 17, the samples located in the above row (the red samples, and possibly the green samples, shown in Fig. 16(a)) are projected as additional samples in the left column, extending the left reference column. When the selected direction is between 18 and 34, the samples located in the left column (the blue samples, and possibly the orange samples, shown in Fig. 16(b)) are projected as samples in the above row, extending the top reference row. In both cases the projected samples have negative indices [3].

Figure 16: Example of diagonal orientation [3]

5.4.2. Sample Prediction for Angular Prediction Mode [3]:
Predicted sample values p[x][y] are obtained by projecting the location of the sample p[x][y] onto the reference sample array according to the selected prediction direction and interpolating a value for the sample at 1/32 sample position accuracy [3].

Sample prediction for the horizontal modes (modes 2–17) is given by:

p[x][y] = ( (32 − f) · ref[ y + i + 1 ] + f · ref[ y + i + 2 ] + 16 ) >> 5

and sample prediction for the vertical modes (modes 18–34) is given by:

p[x][y] = ( (32 − f) · ref[ x + i + 1 ] + f · ref[ x + i + 2 ] + 16 ) >> 5

where ref is the one-dimensional array of (projected) reference samples and i is the projected integer displacement on row y (for vertical modes) or column x (for horizontal modes), calculated as a function of the angular parameter A as:

i = ( (y + 1) · A ) >> 5   (vertical modes),   i = ( (x + 1) · A ) >> 5   (horizontal modes)

f represents the fractional part of the projected displacement on the same row or column and is calculated as:

f = ( (y + 1) · A ) & 31   (vertical modes),   f = ( (x + 1) · A ) & 31   (horizontal modes)

Here >> denotes an arithmetic right shift and & a bit-wise AND.
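A compact sketch of the vertical-mode case follows. It assumes a nonnegative angular parameter A (so that only the extended top reference row is needed) and that the reference samples have already been generated, substituted and filtered as described above; the function and array names are illustrative and do not correspond to HM identifiers.

```cpp
#include <vector>

// Illustrative sketch of angular intra prediction for a vertical mode (18-34) with A >= 0.
// ref  : ref[0] is the top-left reference sample, ref[1..2N] the (extended) top row.
// A    : angular displacement parameter of the selected mode (Table 2).
// pred : N x N prediction block stored row by row.
void predictAngularVertical(const std::vector<int>& ref, int A, int N, std::vector<int>& pred)
{
    pred.assign(N * N, 0);
    for (int y = 0; y < N; ++y) {
        int i = ((y + 1) * A) >> 5;   // integer part of the projected displacement
        int f = ((y + 1) * A) & 31;   // fractional part in 1/32 sample units
        for (int x = 0; x < N; ++x) {
            if (f == 0) {
                // Projection falls exactly on a reference sample: copy it directly.
                pred[y * N + x] = ref[x + i + 1];
            } else {
                // Linear interpolation between the two nearest reference samples.
                pred[y * N + x] = ((32 - f) * ref[x + i + 1] + f * ref[x + i + 2] + 16) >> 5;
            }
        }
    }
}
```

Negative angles additionally require the projected left-column samples (the negative-index samples described in Fig. 16), and the horizontal modes are obtained by swapping the roles of x and y.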

5.5. DC Prediction:
This mode is similar to the DC mode in H.264/MPEG-4 AVC. It is efficient for predicting flat areas of smoothly varying content, but it gives only a coarse prediction of higher-frequency content and as such is not efficient for finely textured areas.

The value of each sample of the PB is the average of the reference samples. As explained before, the reference samples in this case are the boundary samples of the top and left neighboring TBs.
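A minimal sketch of DC prediction (our own illustration, not HM code), assuming the reference samples are already available in two arrays:

```cpp
#include <vector>

// Illustrative sketch of DC intra prediction: the N x N block is filled with the rounded
// average of the N top and N left reference samples.
// top[x] = p[x][-1], x = 0..N-1;  left[y] = p[-1][y], y = 0..N-1.
void predictDC(const std::vector<int>& top, const std::vector<int>& left,
               int N, std::vector<int>& pred)
{
    int sum = 0;
    for (int x = 0; x < N; ++x) sum += top[x];
    for (int y = 0; y < N; ++y) sum += left[y];
    int dcVal = (sum + N) / (2 * N);   // rounded average of the 2N reference samples
    pred.assign(N * N, dcVal);
}
```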

5.6. Planar Prediction:
This mode in HEVC is similar to the plane mode in H.264/MPEG-4 AVC, and is numbered as mode 0. In H.264/MPEG-4 AVC this method is a plane prediction mode for textured images and may introduce discontinuities along the block boundaries. Conversely, in HEVC this mode was improved in order to preserve continuity along the block edges.

Planar mode is essentially defined as the average of two linear predictions using four corner reference samples. With reference to Fig. 17, it is implemented as follows: the sample X is the first sample predicted, as the average of the samples D and E; then the right column samples (blue samples) are predicted by bilinear interpolation between D and X, and the bottom row samples (orange samples) are predicted by bilinear interpolation between E and X. The remaining samples are predicted as averages of bilinear interpolations between the boundary samples and the previously predicted samples [14].

Figure 17: Planar intra prediction mode [14]
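The description above can also be written in a closed form equivalent to the two bilinear interpolations, following the formulation in [3]: each sample is the rounded average of a horizontal and a vertical linear interpolation. The sketch below is our own illustration (array names are assumptions), with top[N] and left[N] playing the roles of the two corner reference samples (D and E in Fig. 17):

```cpp
#include <vector>

// Illustrative sketch of planar intra prediction.
// top[x]  = p[x][-1] for x = 0..N (top row plus the reference just right of the block),
// left[y] = p[-1][y] for y = 0..N (left column plus the reference just below the block).
void predictPlanar(const std::vector<int>& top, const std::vector<int>& left,
                   int N, std::vector<int>& pred)
{
    int log2N = 0;
    while ((1 << log2N) < N) ++log2N;
    pred.assign(N * N, 0);
    for (int y = 0; y < N; ++y) {
        for (int x = 0; x < N; ++x) {
            int hor = (N - 1 - x) * left[y] + (x + 1) * top[N];   // interpolate towards the top-right corner
            int ver = (N - 1 - y) * top[x] + (y + 1) * left[N];   // interpolate towards the bottom-left corner
            pred[y * N + x] = (hor + ver + N) >> (log2N + 1);     // rounded average of both
        }
    }
}
```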

5.7. Smoothing Filter:

HEVC uses a smoothing filter in order to reduce the discontinuities introduced at the block boundaries by the intra prediction modes. It is applied to the boundary samples, namely the first prediction row and column for DC mode, the first prediction row for purely horizontal prediction, and the first prediction column for purely vertical prediction. The smoothing filter consists of a two-tap finite impulse response filter for DC prediction or a gradient-based smoothing filter for horizontal (mode 10) and vertical (mode 26) prediction. Because the chroma components tend to be smooth already, this filter is not used for them; prediction boundary smoothing is only applied to the luma component.

In addition, as described in Section 5.3, a smoothing filter is applied to the reference samples depending on the size of the block and the directionality of the prediction. Thanks to these filters, contouring artefacts caused by steps in the reference samples can be drastically reduced.
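For the DC case, the boundary smoothing can be sketched as below (our own illustration, not HM code); the gradient-based filters for the purely horizontal and vertical modes are analogous, and the exact conditions under which the standard applies this filtering (for example block-size limits) are omitted here.

```cpp
#include <vector>

// Illustrative sketch of boundary smoothing after DC prediction (luma only):
// the first prediction row and column are blended with the adjacent reference samples
// to soften the step between the references and the flat DC-predicted block.
// top[x] = p[x][-1], left[y] = p[-1][y]; pred is the N x N DC-predicted block (row major).
void smoothDCBoundary(const std::vector<int>& top, const std::vector<int>& left,
                      int N, std::vector<int>& pred)
{
    int dcVal = pred[0];                                   // block is constant after DC prediction
    pred[0] = (left[0] + 2 * dcVal + top[0] + 2) >> 2;     // corner sample
    for (int x = 1; x < N; ++x)
        pred[x] = (top[x] + 3 * dcVal + 2) >> 2;           // first prediction row
    for (int y = 1; y < N; ++y)
        pred[y * N] = (left[y] + 3 * dcVal + 2) >> 2;      // first prediction column
}
```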

6. Intra Mode Coding:
While increasing the number of intra prediction modes provides better prediction, efficient intra

mode coding is required to ensure that the selected mode is signaled with minimal overhead. For the luma component, three most probable modes are derived to predict the intra mode, instead of a single most probable mode as in H.264/AVC [5]. Possible redundancies among the three most

probable modes are also considered and redundant modes are substituted with alternative ones to

maximize the signaling efficiency. For chroma intra mode, HEVC introduces a derived mode

which allows efficient signaling of the likely scenario where chroma is using the same prediction

mode as luma. The syntax elements for signaling luma and chroma intra modes are designed by

utilizing the increased number of most probable modes for the luma component and the statistical

behavior of the chroma component.

6.1. Prediction of Luma Intra Mode [3]:
HEVC supports a total of 33 angular prediction modes as well as planar and DC prediction for luma intra prediction for all PU sizes. Due to the large number of intra prediction modes, an H.264/AVC-like mode coding approach based on a single most probable mode is not effective in HEVC. Instead, HEVC defines three most probable modes for each PU based on the modes of the neighboring PUs. The selected number of most probable modes also makes it possible to indicate one of the 32 remaining modes by a CABAC-bypassed fixed-length code, as the distribution of the mode probabilities outside the set of most probable modes has been found to be relatively uniform. The selection of the set of three most probable modes is based on the modes of two neighboring PUs, one to the left of and one above the current PU. Let the intra modes of the left and above PUs be A and B, respectively. If a neighboring PU is not coded as intra or is coded with pulse code modulation (PCM) mode, it is considered to be DC predicted. In addition, B is assumed to be DC mode when the above neighboring PU is outside the current CTU, to avoid introducing an additional line buffer for intra mode reconstruction.

If A is not equal to B, the first two most probable modes, denoted MPM[0] and MPM[1], are set equal to A and B, respectively, and the third most probable mode, denoted MPM[2], is determined as follows:

• If neither A nor B is planar mode, MPM[2] is set to planar mode.

• Otherwise, if neither A nor B is DC mode, MPM[2] is set to DC mode.

• Otherwise (one of the two most probable modes is planar and the other is DC), MPM[2] is set equal to angular mode 26 (directly vertical).

If A is equal to B, the three most probable modes are determined as follows. If they are not angular modes (A and B are less than 2), the three most probable modes are set equal to planar mode, DC mode and angular mode 26, respectively. Otherwise (A and B are greater than or equal to 2), the first most probable mode MPM[0] is set equal to A, and the two remaining most probable modes MPM[1] and MPM[2] are set equal to the neighboring directions of A, calculated as [3]:

MPM[1] = 2 + ( ( A + 29 ) % 32 )
MPM[2] = 2 + ( ( A − 2 + 1 ) % 32 )

where % denotes the modulo operator (i.e., a % b denotes the remainder of a divided by b). Fig. 18

summarizes the derivation process for the three most probable modes MPM[0], MPM[1] and

MPM[2] from the neighboring intra modes A and B. The three most probable modes MPM[0],

MPM[1] and MPM[2] identified as described above are further sorted in an ascending order

according to their mode number to form the ordered set of most probable modes. If the current

intra prediction mode is equal to one of the elements in the set of most probable modes, only the

index in the set is transmitted to the decoder. Otherwise, a 5-bit CABAC bypassed code word is

used to specify the selected mode outside of the set of most probable modes as the number of

modes outside of the set is equal to 32.

Figure 18: Derivation process for the three most probable modes MPM[0], MPM[1] and MPM[2]. A and B indicate

the neighboring intra modes of the left and the above PU, respectively [3]
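The derivation can be summarized by the following sketch (our own illustration, not HM code). It assumes A and B have already been derived as described above, i.e. replaced by DC when the corresponding neighbor is unavailable, not intra coded, PCM coded, or (for B) outside the current CTU; the formulas for the two angular neighbors follow [3].

```cpp
// Illustrative sketch: derive the three most probable luma intra modes from the
// neighboring modes A (left PU) and B (above PU).
// Mode numbering: 0 = planar, 1 = DC, 2..34 = angular (26 = directly vertical).
void deriveMPM(int A, int B, int mpm[3])
{
    const int PLANAR = 0, DC = 1, VER = 26;
    if (A != B) {
        mpm[0] = A;
        mpm[1] = B;
        if (A != PLANAR && B != PLANAR)      mpm[2] = PLANAR;
        else if (A != DC && B != DC)         mpm[2] = DC;
        else                                 mpm[2] = VER;
    } else if (A < 2) {                      // A == B and not an angular mode
        mpm[0] = PLANAR;
        mpm[1] = DC;
        mpm[2] = VER;
    } else {                                 // A == B and angular: A and its two neighbors
        mpm[0] = A;
        mpm[1] = 2 + ((A + 29) % 32);        // one angular neighbor of A (with wrap-around)
        mpm[2] = 2 + ((A - 2 + 1) % 32);     // the other angular neighbor of A
    }
    // The three modes are subsequently sorted in ascending order before signaling.
}
```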

6.2. Derived Mode for Chroma Intra Prediction:
To enable the use of the increased number of directionalities in chroma intra prediction while minimizing the signaling overhead, HEVC introduces the INTRA_DERIVED mode to indicate the case where a chroma PU uses the same prediction mode as the corresponding luma PU. More specifically, for a chroma PU one of five chroma intra prediction modes is signaled: planar, angular 26 (directly vertical), angular 10 (directly horizontal), DC, or the derived mode.

This design is based on the observation that structures in the chroma signal often follow those of the luma signal. When the derived mode is indicated for a chroma PU, intra prediction is performed using the mode of the corresponding luma PU. When the derived mode coincides with one of the four explicitly signaled chroma modes, angular intra mode 34 is used as a substitute for that signaled mode. The substitution process is illustrated in Table 3.

Table 3: Determination of chroma intra prediction mode according to luma intra prediction mode [3]
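A compact way to express Table 3 is the mapping sketched below (illustrative only; the index values and names are our own shorthand for the signaled chroma syntax element):

```cpp
// Illustrative sketch: map the signaled chroma intra mode index (0..4) and the luma
// intra mode of the corresponding PU to the actual chroma intra prediction mode.
// Index 4 is the derived (DM) mode; indices 0..3 select planar, vertical (26),
// horizontal (10) and DC, except that a collision with the luma mode is replaced by mode 34.
int chromaIntraMode(int signalledIdx, int lumaMode)
{
    static const int candidate[4] = { 0, 26, 10, 1 };     // planar, vertical, horizontal, DC
    if (signalledIdx == 4)
        return lumaMode;                                   // derived mode: reuse the luma mode
    int mode = candidate[signalledIdx];
    return (mode == lumaMode) ? 34 : mode;                 // avoid duplicating the luma mode
}
```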

7. Encoding Algorithm:
Due to the large number of intra modes in HEVC, computing the rate-distortion costs of all intra modes is impractical for most applications. In the HEVC reference software, the sum of absolute transformed differences (SATD) between the prediction and the original samples is used to reduce the number of luma intra mode candidates before applying rate-distortion optimization (RDO).

The number of luma intra mode candidates entering full RDO is determined according to the PU size: eight for 4×4 and 8×8 PUs, and three for the other PU sizes. For those luma intra mode candidates, as well as the luma intra modes that are part of the most probable mode set, intra sample prediction and transform coding are performed to obtain both the number of required bits and the resulting distortion. Finally, the luma intra mode which minimizes the rate-distortion cost is selected. For chroma intra encoding, all possible intra chroma modes are evaluated based on their rate-distortion costs, as the number of intra chroma modes is much smaller than that of luma modes. It has been reported that this kind of technique incurs a negligible coding efficiency loss (less than 1% increase in bit rate at aligned objective quality) compared to a full rate-distortion search while reducing the encoding complexity by a factor of three.
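The candidate pre-selection step can be sketched as follows. This is an illustration rather than HM code: satdCost stands in for the SATD-plus-mode-bits cost of predicting the current PU with a given mode, and the most probable modes would be merged into the returned list before full RDO.

```cpp
#include <vector>
#include <algorithm>
#include <functional>
#include <utility>

// Illustrative sketch of rough mode decision: rank the 35 luma intra modes by a fast
// SATD-based cost and keep only the best few candidates for the full RDO search.
// numCandidates is 8 for 4x4 and 8x8 PUs, and 3 for the larger PU sizes.
std::vector<int> selectIntraCandidates(const std::function<double(int)>& satdCost,
                                       int numCandidates)
{
    std::vector<std::pair<double, int>> costs;             // (cost, mode)
    for (int mode = 0; mode < 35; ++mode)
        costs.emplace_back(satdCost(mode), mode);

    std::partial_sort(costs.begin(), costs.begin() + numCandidates, costs.end());

    std::vector<int> candidates;
    for (int k = 0; k < numCandidates; ++k)
        candidates.push_back(costs[k].second);
    return candidates;                                     // MPMs are typically added afterwards
}
```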

8. Intra Prediction in HM code [23]:

Figure 19: Flow chart showing important functions in Intra Prediction

• Library: TLibCommon

• Source files: TComPattern.cpp, TComPrediction.cpp and TComRdCost.cpp

• Functions: xCompressCU, estIntraPredQT, estIntraPredChromaQT, xRecurIntraCodingQT, xRecurIntraChromaCodingQT, initAdiPattern, predIntraLumaAng, calcHAD, xUpdateCandList, fillReferenceSamples, xPredIntraPlanar, xPredIntraAng and predIntraGetPredValDC.

9. Computational Complexity of Intra prediction [26]:

HEVC encoders are expected to be several times more complex than H.264/AVC encoders, whereas the complexity of HEVC decoders does not appear to be significantly different from that of H.264/AVC decoders. The complexity of some key modules such as the transforms, intra-picture prediction, and motion compensation is likely higher in HEVC than in H.264/AVC, while complexity was reduced in others such as entropy coding and deblocking.

HEVC features many more mode combinations as a result of the added flexibility of the quad-tree structures and the increased number of intra-picture prediction modes. An encoder fully exploiting the capabilities of HEVC is thus expected to be several times more complex than an H.264/AVC encoder. This added complexity does, however, bring a substantial benefit in the form of a significantly improved rate-distortion performance.

Table 4: Encoding time [26]

10. Differences of intra prediction techniques between H.264/AVC and HEVC

[3]:

Table 5 shows the comparison of intra prediction techniques between H.264 and HEVC.

Table 5: Differences of intra prediction techniques between H.264 and HEVC [3]

ACKNOWLEDGEMENT

We would sincerely like to thank Dr. K. R. Rao and Tuan Ho for their constant support and

guidance throughout the duration of our project.

References:

[1] G. J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE

Transactions on circuits and systems for video technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.

[2] K. R. Rao, D. N. Kim and J. J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-

4 Part10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

[3] V. Sze, M. Budagavi, and G. J. Sullivan, “High Efficiency Video Coding (HEVC): Algorithms

and Architectures”, Springer, 2014.

[4] M. Wien, “High Efficiency Video Coding: Coding Tools and Specification”, Springer, 2014.

[5] JVT Draft ITU-T recommendation and final draft international standard of joint video

specification (ITU-T Rec. H.264-ISO/IEC 14496-10 AVC), March 2003, JVT-G050 available on

http://ip.hhi.de/imagecom_G1/assets/pdfs/JVT-G050.pdf.

[6] I.E.G.Richardson, “H.264 and MPEG-4 Video Compression Video Coding for Next-

generation Multimedia”, New York, Wiley, 2003

[7] G.J. Sullivan and T.Wiegand, “Rate-Distortion Optimization for Video Compression”. IEEE

Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, Nov. 1998.

[8] J.-R. Ohm et al, “Comparison of the coding efficiency of video coding standards – including

high efficiency video coding (HEVC)”, IEEE Transactions on circuits and systems for video

technology, vol. 22, pp.1669-1684, Dec. 2012. Software and data for reproducing selected results

can be found at ftp://ftp.hhi.de/ieee-tcsvt/2012.

[9] V. Sze and M. Budagavi, “High throughput CABAC entropy coding in HEVC”, IEEE

Transactions on circuits and systems for video technology, vol. 22, pp.1778-1791, Dec. 2012.

[10] I. E. G. Richardson, “The H.264 Advanced Video Compression Standard,” II Edition, Wiley,

2010.

[11] J. Lainema et al, “Intra coding of the HEVC standard”, IEEE Transactions on circuits and

systems for video technology, vol. 22, pp.1792-1801, Dec. 2012.

[12] P. Helle et al, “Block merging for quadtree-based partitioning in HEVC”, IEEE Transactions

on circuits and systems for video technology, vol. 22, pp.1720-1731, Dec. 2012.

[13] G. Hill, “The Cable and Telecommunications Professionals’ Reference: Transport Networks”.

Vol 2, Third Edition, 2008.

[14] N. Ling, “High efficiency video coding and its 3D extension: A research perspective,”

Keynote Speech, ICIEA, pp. 2150-2155, Singapore, July 2012

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6361087.

[15] I. E. G. Richardson, “Video Codec Design: Developing Image and Video Compression

Systems”, Wiley, 2002.

[16] HEVC white paper- http://www.ateme.com/an-introduction-to-uhdtv-and-hevc

[17] G. J. Sullivan, et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”,

IEEE Journal of selected topics in Signal Processing, Vol. 7, No. 6, pp. 1001-1016, Dec. 2013.

[18] D. K. Kwon and M. Budagavi, “Combined scalable and multiview extension of High

Efficiency Video Coding (HEVC)”, IEEE Picture Coding Symposium, pp. 414-417, Dec. 2013.

[19] M. Budagavi, “HEVC/H.265 and recent developments in video coding standards”, EE Department Seminar, UT Arlington, Arlington, 21 Nov. 2014.

[20] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html

[21] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T & ISO/IEC JCTVC-

J0292r1, July 2012.

[22] D. Marpe, et al, “Context-based adaptive binary arithmetic coding in the H.264/AVC video

compression standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol.

13, pp. 620–636, July 2003.

[23] HM C++ Code: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/

[24] HM software manual:

https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/trunk/doc/software-manual.pdf

[25] T.L. da Silva, L.V. Agostini and L.A. da Silva Cruz, “HEVC intra coding acceleration based

on tree intra-level mode correlation”, IEEE Conference 2013 Signal Processing: Algorithms,

Architectures, Arrangements, and Applications, Poznan, Poland, Sept.2013.

[26] H. Zhang and Z. Ma, “Fast intra mode decision for high-efficiency video coding”, IEEE

Transactions on circuits and systems for video technology, Vol.24, pp.660-668, Apr. 2014.

[27] JCT-VC documents can be accessed [online]. Available: http://phenix.int-evry.fr/jct/doc_end_user/current_meeting.php?id_meeting=154&type_order=&sql_type=document_number

[28] VCEG & JCT documents available from http://wftp3.itu.int/av-arch in the video-site and jvt-

site folders

[29] Test Sequences – download

TS.1 http://media.xiph.org/video/derf/

TS.2 http://trace.eas.asu.edu/yuv/

TS.3 http://media.xiph.org/

TS.4 http://www.cipr.rpi.edu/resource/sequences/

TS.5 HTTP://BASAK0ZTAS.NET