INTERNATIONAL TELECOMMUNICATION UNION
CCITT T.81 (09/92)
THE INTERNATIONAL TELEGRAPH AND TELEPHONE CONSULTATIVE COMMITTEE
TERMINAL EQUIPMENT AND PROTOCOLS FOR TELEMATIC SERVICES
INFORMATION TECHNOLOGY –
DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES
Recommendation T.81

Foreword
ITU (International Telecommunication Union) is the United Nations Specialized Agency in the field of telecommunications. The CCITT (the International Telegraph and Telephone Consultative Committee) is a permanent organ of the ITU. Some 166 member countries, 68 telecom operating entities, 163 scientific and industrial organizations and 39 international organizations participate in CCITT which is the body which sets world telecommunications standards (Recommendations).
The approval of Recommendations by the members of CCITT is covered by the procedure laid down in CCITT Resolution No. 2 (Melbourne, 1988). In addition, the Plenary Assembly of CCITT, which meets every four years, approves Recommendations submitted to it and establishes the study programme for the following period.
In some areas of information technology, which fall within CCITT’s purview, the necessary standards are prepared on a collaborative basis with ISO and IEC. The text of CCITT Recommendation T.81 was approved on 18th September 1992. The identical text is also published as ISO/IEC International Standard 10918-1.
___________________
CCITT NOTE
In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized private operating agency.
© ITU 1993
All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.

Contents
Introduction
1 Scope
2 Normative references
3 Definitions, abbreviations and symbols
4 General
5 Interchange format requirements
6 Encoder requirements
7 Decoder requirements
Annex A – Mathematical definitions
Annex B – Compressed data formats
Annex C – Huffman table specification
Annex D – Arithmetic coding
Annex E – Encoder and decoder control procedures
Annex F – Sequential DCT-based mode of operation
Annex G – Progressive DCT-based mode of operation
Annex H – Lossless mode of operation
Annex J – Hierarchical mode of operation
Annex K – Examples and guidelines
Annex L – Patents
Annex M – Bibliography
CCITT Rec. T.81 (1992 E)

Introduction
This CCITT Recommendation | ISO/IEC International Standard was prepared by CCITT Study Group VIII and the Joint Photographic Experts Group (JPEG) of ISO/IEC JTC 1/SC 29/WG 10. This Experts Group was formed in 1986 to establish a standard for the sequential progressive encoding of continuous tone grayscale and colour images.
“Digital Compression and Coding of Continuous-tone Still Images” is published in two parts:
– Requirements and guidelines;
– Compliance testing.
This part, Part 1, sets out requirements and implementation guidelines for continuous-tone still image encoding and decoding processes, and for the coded representation of compressed image data for interchange between applications. These processes and representations are intended to be generic, that is, to be applicable to a broad range of applications for colour and grayscale still images within communications and computer systems. Part 2 sets out tests for determining whether implementations comply with the requirements for the various encoding and decoding processes specified in Part 1.
The user’s attention is called to the possibility that – for some of the coding processes specified herein – compliance with this Recommendation | International Standard may require use of an invention covered by patent rights. See Annex L for further information.
The requirements which these processes must satisfy to be useful for specific image communications applications such as facsimile, Videotex and audiographic conferencing are defined in CCITT Recommendation T.80. The intent is that the generic processes of Recommendation T.80 will be incorporated into the various CCITT Recommendations for terminal equipment for these applications.
In addition to the applications addressed by the CCITT and ISO/IEC, the JPEG committee has developed a compression standard to meet the needs of other applications as well, including desktop publishing, graphic arts, medical imaging and scientific imaging.
Annexes A, B, C, D, E, F, G, H and J are normative, and thus form an integral part of this Specification. Annexes K, L and M are informative and thus do not form an integral part of this Specification.
This Specification aims to follow the guidelines of CCITT and ISO/IEC JTC 1 on Rules for presentation of CCITT | ISO/IEC common text.

INTERNATIONAL STANDARD
ISO/IEC 10918-1 : 1993(E) CCITT Rec. T.81 (1992 E)
CCITT RECOMMENDATION
INFORMATION TECHNOLOGY – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES
1 Scope
This CCITT Recommendation | International Standard is applicable to continuous-tone – grayscale or colour – digital still image data. It is applicable to a wide range of applications which require use of compressed images. It is not applicable to bi-level image data.
This Specification
– specifies processes for converting source image data to compressed image data;
– specifies processes for converting compressed image data to reconstructed image data;
– gives guidance on how to implement these processes in practice;
– specifies coded representations for compressed image data.
NOTE – This Specification does not specify a complete coded image representation. Such representations may include certain parameters, such as aspect ratio, component sample registration, and colour space designation, which are application-dependent.
2 Normative references
The following CCITT Recommendations and International Standards contain provisions which, through reference in this text, constitute provisions of this CCITT Recommendation | International Standard. At the time of publication, the editions indicated were valid. All Recommendations and Standards are subject to revision, and parties to agreements based on this CCITT Recommendation | International Standard are encouraged to investigate the possibility of applying the most recent edition of the Recommendations and Standards listed below. Members of IEC and ISO maintain registers of currently valid International Standards. The CCITT Secretariat maintains a list of currently valid CCITT Recommendations.
– CCITT Recommendation T.80 (1992), Common components for image compression and communication – Basic principles.
3 Definitions, abbreviations and symbols
3.1 Definitions and abbreviations
For the purposes of this Specification, the following definitions apply.
3.1.1 abbreviated format: A representation of compressed image data which is missing some or all of the table specifications required for decoding, or a representation of table-specification data without frame headers, scan headers, and entropy-coded segments.
3.1.2 AC coefficient: Any DCT coefficient for which the frequency is not zero in at least one dimension.
3.1.3 (adaptive) (binary) arithmetic decoding: An entropy decoding procedure which recovers the sequence of symbols from the sequence of bits produced by the arithmetic encoder.
3.1.4 (adaptive) (binary) arithmetic encoding: An entropy encoding procedure which codes by means of a recursive subdivision of the probability of the sequence of symbols coded up to that point.
3.1.5 application environment: The standards for data representation, communication, or storage which have been established for a particular application.
3.1.6 arithmetic decoder: An embodiment of arithmetic decoding procedure.
3.1.7 arithmetic encoder: An embodiment of arithmetic encoding procedure.
3.1.8 baseline (sequential): A particular sequential DCT-based encoding and decoding process specified in this Specification, and which is required for all DCT-based decoding processes.
3.1.9 binary decision: Choice between two alternatives.
3.1.10 bit stream: Partially encoded or decoded sequence of bits comprising an entropy-coded segment.
3.1.11 block: An 8 × 8 array of samples or an 8 × 8 array of DCT coefficient values of one component.
3.1.12 block-row: A sequence of eight contiguous component lines which are partitioned into 8 × 8 blocks.
3.1.13 byte: A group of 8 bits.
3.1.14 byte stuffing: A procedure in which either the Huffman coder or the arithmetic coder inserts a zero byte into the entropy-coded segment following the generation of an encoded hexadecimal X’FF’ byte.
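NOTE – The following Python sketch is an informal illustration of byte stuffing as defined in 3.1.14, together with the matching removal step a decoder would need. The function names stuff_bytes and unstuff_bytes are hypothetical, and the sketch operates on a complete byte string rather than inside a coder.

```python
def stuff_bytes(entropy_coded: bytes) -> bytes:
    """Insert a zero byte after every X'FF' byte (hypothetical helper)."""
    out = bytearray()
    for b in entropy_coded:
        out.append(b)
        if b == 0xFF:
            out.append(0x00)  # the stuffed zero byte
    return bytes(out)


def unstuff_bytes(stuffed: bytes) -> bytes:
    """Drop the zero byte that follows each X'FF' byte (hypothetical helper)."""
    out = bytearray()
    i = 0
    while i < len(stuffed):
        b = stuffed[i]
        out.append(b)
        if b == 0xFF and i + 1 < len(stuffed) and stuffed[i + 1] == 0x00:
            i += 1  # skip the stuffed zero byte
        i += 1
    return bytes(out)
```

With this sketch, the byte sequence 12 FF 34 (hexadecimal) becomes 12 FF 00 34, so an X’FF’ byte inside the entropy-coded segment cannot be mistaken for the first byte of a marker.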
3.1.15 carry bit: A bit in the arithmetic encoder code register which is set if a carry-over in the code register overflows the eight bits reserved for the output byte.
3.1.16 ceiling function: The mathematical procedure which maps a real number to the smallest integer value which is greater than or equal to that real number.
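NOTE – As an informal illustration, the ceiling function can be computed with integer arithmetic only. The sketch below assumes the Annex A relation in which the number of columns of a component is the ceiling of X × Hi / Hmax, where X (the number of samples per line) is not defined in this clause; the function names are hypothetical.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Ceiling of numerator / denominator without floating point."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return -(-numerator // denominator)


def component_columns(x: int, h_i: int, h_max: int) -> int:
    """Columns of a component, assuming ceiling(X * Hi / Hmax)."""
    return ceil_div(x * h_i, h_max)
```

For example, with X = 5, Hi = 1 and Hmax = 2, the sketch returns 3, because 2.5 is rounded up.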
3.1.17 class (of coding process): Lossy or lossless coding processes.
3.1.18 code register: The arithmetic encoder register containing the least significant bits of the partially completed entropy-coded segment. Alternatively, the arithmetic decoder register containing the most significant bits of a partially decoded entropy-coded segment.
3.1.19 coder: An embodiment of a coding process.
3.1.20 coding: Encoding or decoding.
3.1.21 coding model: A procedure used to convert input data into symbols to be coded.
3.1.22 (coding) process: A general term for referring to an encoding process, a decoding process, or both.
3.1.23 colour image: A continuous-tone image that has more than one component.
3.1.24 columns: Samples per line in a component.
3.1.25 component: One of the two-dimensional arrays which comprise an image.
3.1.26 compressed data: Either compressed image data or table specification data or both.
3.1.27 compressed image data: A coded representation of an image, as specified in this Specification.
3.1.28 compression: Reduction in the number of bits used to represent source image data.
3.1.29 conditional exchange: The interchange of MPS and LPS probability intervals whenever the size of the LPS interval is greater than the size of the MPS interval (in arithmetic coding).
3.1.30 (conditional) probability estimate: The probability value assigned to the LPS by the probability estimation state machine (in arithmetic coding).
3.1.31 conditioning table: The set of parameters which select one of the defined relationships between prior coding decisions and the conditional probability estimates used in arithmetic coding.
3.1.32 context: The set of previously coded binary decisions which is used to create the index to the probability estimation state machine (in arithmetic coding).
3.1.33 continuous-tone image: An image whose components have more than one bit per sample.
3.1.34 data unit: An 8 × 8 block of samples of one component in DCT-based processes; a sample in lossless processes.
3.1.35 DC coefficient: The DCT coefficient for which the frequency is zero in both dimensions.
3.1.36 DC prediction: The procedure used by DCT-based encoders whereby the quantized DC coefficient from the previously encoded 8 × 8 block of the same component is subtracted from the current quantized DC coefficient.
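NOTE – The following Python sketch is an informal illustration of DC prediction for one component. It assumes that the prediction starts at zero for the first block, which is stated here as an assumption rather than taken from this clause; the function names are hypothetical.

```python
def dc_differences(quantized_dc, initial_prediction=0):
    """Encoder side: replace each quantized DC value by its difference
    from the quantized DC value of the previous block."""
    prediction = initial_prediction  # assumed starting value
    diffs = []
    for dc in quantized_dc:
        diffs.append(dc - prediction)
        prediction = dc
    return diffs


def dc_from_differences(diffs, initial_prediction=0):
    """Decoder side: rebuild the quantized DC values from the differences."""
    prediction = initial_prediction
    values = []
    for diff in diffs:
        prediction += diff
        values.append(prediction)
    return values
```

For the quantized DC values 12, 15, 15, 9 the sketch produces the differences 12, 3, 0, -6, and the decoder side recovers the original values exactly.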
3.1.37 (DCT) coefficient: The amplitude of a specific cosine basis function – may refer to an original DCT coefficient, to a quantized DCT coefficient, or to a dequantized DCT coefficient.
3.1.38 decoder: An embodiment of a decoding process.
3.1.39 decoding process: A process which takes as its input compressed image data and outputs a continuous-tone image.
3.1.40 default conditioning: The values defined for the arithmetic coding conditioning tables at the beginning of coding of an image.
3.1.41 dequantization: The inverse procedure to quantization by which the decoder recovers a representation of the DCT coefficients.
3.1.42 differential component: The difference between an input component derived from the source image and the corresponding reference component derived from the preceding frame for that component (in hierarchical mode coding).
3.1.43 differential frame: A frame in a hierarchical process in which differential components are either encoded or decoded.
3.1.44 (digital) reconstructed image (data): A continuous-tone image which is the output of any decoder defined in this Specification.
3.1.45 (digital) source image (data): A continuous-tone image used as input to any encoder defined in this Specification.
3.1.46 (digital) (still) image: A set of two-dimensional arrays of integer data.
3.1.47 discrete cosine transform; DCT: Either the forward discrete cosine transform or the inverse discrete cosine transform.
3.1.48 downsampling (filter): A procedure by which the spatial resolution of an image is reduced (in hierarchical mode coding).
3.1.49 encoder: An embodiment of an encoding process.
3.1.50 encoding process: A process which takes as its input a continuous-tone image and outputs compressed image data.
3.1.51 entropy-coded (data) segment: An independently decodable sequence of entropy encoded bytes of compressed image data.
3.1.52 (entropy-coded segment) pointer: The variable which points to the most recently placed (or fetched) byte in the entropy encoded segment.
3.1.53 entropy decoder: An embodiment of an entropy decoding procedure.
3.1.54 entropy decoding: A lossless procedure which recovers the sequence of symbols from the sequence of bits produced by the entropy encoder.
3.1.55 entropy encoder: An embodiment of an entropy encoding procedure.
3.1.56 entropy encoding: A lossless procedure which converts a sequence of input symbols into a sequence of bits such that the average number of bits per symbol approaches the entropy of the input symbols.
3.1.57 extended (DCT-based) process: A descriptive term for DCT-based encoding and decoding processes in which additional capabilities are added to the baseline sequential process.
3.1.58 forward discrete cosine transform; FDCT: A mathematical transformation using cosine basis functions which converts a block of samples into a corresponding block of original DCT coefficients.
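NOTE – The following Python sketch is an informal, direct evaluation of an 8 × 8 forward DCT. It assumes the usual normalization, with a factor of 1/4 and scaling factors Cu, Cv equal to 1/√2 at zero frequency and 1 otherwise; the normative definition is the one in Annex A. The function name fdct_8x8 is hypothetical, and the direct four-loop form is chosen for readability, not speed.

```python
import math


def fdct_8x8(block):
    """Return the 8 x 8 array of DCT coefficients, indexed [v][u], for an
    8 x 8 block of (level-shifted) samples indexed [y][x]."""

    def scale(k):
        # Frequency-dependent scaling factor (Cu or Cv).
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (
                        block[y][x]
                        * math.cos((2 * x + 1) * u * math.pi / 16)
                        * math.cos((2 * y + 1) * v * math.pi / 16)
                    )
            coeffs[v][u] = 0.25 * scale(u) * scale(v) * total
    return coeffs
```

Under this normalization a block in which every sample equals 10 gives a DC coefficient of 80 and AC coefficients that are zero up to floating-point rounding.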
3.1.59 frame: A group of one or more scans (all using the same DCT-based or lossless process) through the data of one or more of the components in an image.
3.1.60 frame header: A marker segment that contains a start-of-frame marker and associated frame parameters that are coded at the beginning of a frame.
3.1.61 frequency: A two-dimensional index into the two-dimensional array of DCT coefficients.
3.1.62 (frequency) band: A contiguous group of coefficients from the zig-zag sequence (in progressive mode coding).
3.1.63 full progression: A process which uses both spectral selection and successive approximation (in progressive mode coding).
3.1.64 grayscale image: A continuous-tone image that has only one component.
3.1.65 hierarchical: A mode of operation for coding an image in which the first frame for a given component is followed by frames which code the differences between the source data and the reconstructed data from the previous frame for that component. Resolution changes are allowed between frames.
3.1.66 hierarchical decoder: A sequence of decoder processes in which the first frame for each component is followed by frames which decode an array of differences for each component and adds it to the reconstructed data from the preceding frame for that component.
3.1.67 hierarchical encoder: The mode of operation in which the first frame for each component is followed by frames which encode the array of differences between the source data and the reconstructed data from the preceding frame for that component.
3.1.68 horizontal sampling factor: The relative number of horizontal data units of a particular component with respect to the number of horizontal data units in the other components.
3.1.69 Huffman decoder: An embodiment of a Huffman decoding procedure.
3.1.70 Huffman decoding: An entropy decoding procedure which recovers the symbol from each variable length code produced by the Huffman encoder.
3.1.71 Huffman encoder: An embodiment of a Huffman encoding procedure.
3.1.72 Huffman encoding: An entropy encoding procedure which assigns a variable length code to each input symbol.
3.1.73 Huffman table: The set of variable length codes required in a Huffman encoder and Huffman decoder.
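NOTE – The following Python sketch informally illustrates how a list of code lengths (HUFFSIZE) and a list of codes (HUFFCODE) can be derived from a 16-entry BITS list, in the manner of the Annex C procedures. It is a sketch under the assumption that codes of each length are assigned consecutive values in increasing order of length; the function name is hypothetical.

```python
def huffman_codes_from_bits(bits):
    """bits[i] is the number of Huffman codes of length i + 1 (16 entries).
    Returns (huffsize, huffcode) as two parallel lists."""
    if len(bits) != 16:
        raise ValueError("BITS must contain exactly 16 counts")
    # One entry per code, in order of increasing code length.
    huffsize = [
        length
        for length, count in enumerate(bits, start=1)
        for _ in range(count)
    ]
    huffcode = []
    code = 0
    previous = huffsize[0] if huffsize else 0
    for size in huffsize:
        code <<= size - previous  # append zero bits when the length grows
        previous = size
        huffcode.append(code)
        code += 1
    return huffsize, huffcode
```

For a hypothetical BITS list with two codes of length 2 and one code of length 3, the sketch yields the lengths 2, 2, 3 and the codes 00, 01, 100 (binary).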
3.1.74 image data: Either source image data or reconstructed image data.
3.1.75 interchange format: The representation of compressed image data for exchange between application environments.
3.1.76 interleaved: The descriptive term applied to the repetitive multiplexing of small groups of data units from each component in a scan in a specific order.
3.1.77 inverse discrete cosine transform; IDCT: A mathematical transformation using cosine basis functions which converts a block of dequantized DCT coefficients into a corresponding block of samples.
3.1.78 Joint Photographic Experts Group; JPEG: The informal name of the committee which created this Specification. The “joint” comes from the CCITT and ISO/IEC collaboration.
3.1.79 latent output: Output of the arithmetic encoder which is held, pending resolution of carry-over (in arithmetic coding).
3.1.80 less probable symbol; LPS: For a binary decision, the decision value which has the smaller probability.
3.1.81 level shift: A procedure used by DCT-based encoders and decoders whereby each input sample is either converted from an unsigned representation to a two’s complement representation or from a two’s complement representation to an unsigned representation.
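NOTE – The following Python sketch informally illustrates the level shift for samples of precision P bits, assuming the shift amount is 2 to the power P − 1. The function names are hypothetical, and any clamping of decoder output to the valid sample range is omitted.

```python
def level_shift(sample: int, precision: int = 8) -> int:
    """Encoder side: unsigned sample -> signed (two's complement) value."""
    return sample - (1 << (precision - 1))


def inverse_level_shift(value: int, precision: int = 8) -> int:
    """Decoder side: signed value -> unsigned sample."""
    return value + (1 << (precision - 1))
```

For 8-bit samples this maps the range 0 to 255 onto −128 to 127, so that the input to the forward DCT is centred on zero.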
3.1.82 lossless: A descriptive term for encoding and decoding processes and procedures in which the output of the decoding procedure(s) is identical to the input to the encoding procedure(s).
3.1.83 lossless coding: The mode of operation which refers to any one of the coding processes defined in this Specification in which all of the procedures are lossless (see Annex H).
3.1.84 lossy: A descriptive term for encoding and decoding processes which are not lossless.
3.1.85 marker: A two-byte code in which the first byte is hexadecimal FF (X’FF’) and the second byte is a value between 1 and hexadecimal FE (X’FE’).
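NOTE – The marker definition in 3.1.85 can be expressed informally as a test on two consecutive bytes. The function name is_marker is hypothetical.

```python
def is_marker(first: int, second: int) -> bool:
    """True if the two bytes form a marker: X'FF' followed by a value
    between 1 and X'FE' inclusive."""
    return first == 0xFF and 0x01 <= second <= 0xFE
```

Under this test the pair X’FF’, X’00’ (an X’FF’ byte followed by a stuffed zero byte) and the pair X’FF’, X’FF’ are not markers.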
3.1.86 marker segment: A marker and associated set of parameters.
3.1.87 MCU-row: The smallest sequence of MCU which contains at least one line of samples or one block-row from every component in the scan.
3.1.88 minimum coded unit; MCU: The smallest group of data units that is coded.
3.1.89 modes (of operation): The four main categories of image coding processes defined in this Specification.
3.1.90 more probable symbol; MPS: For a binary decision, the decision value which has the larger probability.
3.1.91 non-differential frame: The first frame for any components in a hierarchical encoder or decoder. The components are encoded or decoded without subtraction from reference components. The term refers also to any frame in modes other than the hierarchical mode.
3.1.92 non-interleaved: The descriptive term applied to the data unit processing sequence when the scan has only one component.
3.1.93 parameters: Fixed length integers 4, 8 or 16 bits in length, used in the compressed data formats.
3.1.94 point transform: Scaling of a sample or DCT coefficient.
3.1.95 precision: Number of bits allocated to a particular sample or DCT coefficient.
3.1.96 predictor: A linear combination of previously reconstructed values (in lossless mode coding).
3.1.97 probability estimation state machine: An interlinked table of probability values and indices which is used to estimate the probability of the LPS (in arithmetic coding).
3.1.98 probability interval: The probability of a particular sequence of binary decisions within the ordered set of all possible sequences (in arithmetic coding).
3.1.99 (probability) sub-interval: A portion of a probability interval allocated to either of the two possible binary decision values (in arithmetic coding).
3.1.100 procedure: A set of steps which accomplishes one of the tasks which comprise an encoding or decoding process.
3.1.101 process: See coding process.
3.1.102 progressive (coding): One of the DCT-based processes defined in this Specification in which each scan typically improves the quality of the reconstructed image.
3.1.103 progressive DCT-based: The mode of operation which refers to any one of the processes defined in Annex G.
3.1.104 quantization table: The set of 64 quantization values used to quantize the DCT coefficients.
3.1.105 quantization value: An integer value used in the quantization procedure.
3.1.106 quantize: The act of performing the quantization procedure for a DCT coefficient.
3.1.107 reference (reconstructed) component: Reconstructed component data which is used in a subsequent frame of a hierarchical encoder or decoder process (in hierarchical mode coding).
3.1.108 renormalization: The doubling of the probability interval and the code register value until the probability interval exceeds a fixed minimum value (in arithmetic coding).
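NOTE – The following Python sketch informally illustrates the doubling step of renormalization only. It assumes a minimum interval value of X’8000’, as used by the Annex D procedures, and it omits the byte output and carry handling that a real arithmetic coder performs while shifting; the function name and the returned shift count are hypothetical.

```python
def renormalize(interval: int, code: int, minimum: int = 0x8000):
    """Double the probability interval and the code register value until
    the interval is no longer below the assumed minimum.
    Returns (interval, code, number_of_doublings)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    shifts = 0
    while interval < minimum:
        interval <<= 1
        code <<= 1
        shifts += 1
    return interval, code, shifts
```

For example, an interval of X’2000’ is doubled twice to reach X’8000’, and the code register value is shifted left by the same two bit positions.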
3.1.109 restart interval: The integer number of MCUs processed as an independent sequence within a scan.
3.1.110 restart marker: The marker that separates two restart intervals in a scan.
3.1.111 run (length): Number of consecutive symbols of the same value.
3.1.112 sample: One element in the two-dimensional array which comprises a component.
3.1.113 sample-interleaved: The descriptive term applied to the repetitive multiplexing of small groups of samples from each component in a scan in a specific order.
3.1.114 scan: A single pass through the data for one or more of the components in an image.
3.1.115 scan header: A marker segment that contains a start-of-scan marker and associated scan parameters that are coded at the beginning of a scan.
3.1.116 sequential (coding): One of the lossless or DCT-based coding processes defined in this Specification in which each component of the image is encoded within a single scan.
3.1.117 sequential DCT-based: The mode of operation which refers to any one of the processes defined in Annex F.
3.1.118 spectral selection: A progressive coding process in which the zig-zag sequence is divided into bands of one or more contiguous coefficients, and each band is coded in one scan.
3.1.119 stack counter: The count of X’FF’ bytes which are held, pending resolution of carry-over in the arithmetic encoder.
3.1.120 statistical conditioning: The selection, based on prior coding decisions, of one estimate out of a set of conditional probability estimates (in arithmetic coding).
3.1.121 statistical model: The assignment of a particular conditional probability estimate to each of the binary arithmetic coding decisions.
3.1.122 statistics area: The array of statistics bins required for a coding process which uses arithmetic coding.
3.1.123 statistics bin: The storage location where an index is stored which identifies the value of the conditional probability estimate used for a particular arithmetic coding binary decision.
3.1.124 successive approximation: A progressive coding process in which the coefficients are coded with reduced precision in the first scan, and precision is increased by one bit with each succeeding scan.
3.1.125 table specification data: The coded representation from which the tables used in the encoder and decoder are generated and their destinations specified.
3.1.126 transcoder: A procedure for converting compressed image data of one encoder process to compressed image data of another encoder process.
3.1.127 (uniform) quantization: The procedure by which DCT coefficients are linearly scaled in order to achieve compression.
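NOTE – The following Python sketch informally illustrates uniform quantization of a DCT coefficient Svu by a quantization value Qvu, and the corresponding dequantization. It assumes rounding to the nearest integer with ties rounded away from zero; the tie rule and the function names are assumptions made for this illustration.

```python
import math


def quantize(coefficient: float, q: int) -> int:
    """Scale a DCT coefficient by 1/q and round to the nearest integer
    (ties away from zero, an assumption of this sketch)."""
    magnitude = math.floor(abs(coefficient) / q + 0.5)
    return magnitude if coefficient >= 0 else -magnitude


def dequantize(quantized: int, q: int) -> int:
    """Recover a representation of the DCT coefficient."""
    return quantized * q
```

With a quantization value of 10, the coefficient −23.4 is quantized to −2 and dequantized to −20; the difference of 3.4 is the information lost in a lossy process.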
3.1.128 upsampling (filter): A procedure by which the spatial resolution of an image is increased (in hierarchical mode coding).
3.1.129 vertical sampling factor: The relative number of vertical data units of a particular component with respect to the number of vertical data units in the other components in the frame.
3.1.130 zero byte: The X’00’ byte.
3.1.131 zig-zag sequence: A specific sequential ordering of the DCT coefficients from (approximately) lowest spatial frequency to highest.
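NOTE – The following Python sketch informally generates a zig-zag ordering of an 8 × 8 coefficient array by walking the anti-diagonals in alternating directions. The normative ordering is the one given in Annex A; the function name and the (row, column) tuple representation are assumptions of this sketch.

```python
def zigzag_order(n: int = 8):
    """Return the (row, column) positions of an n x n array in zig-zag
    order, starting at the DC position (0, 0)."""
    order = []
    for d in range(2 * n - 1):  # d = row + column of the anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left up
        for r in rows:
            order.append((r, d - r))
    return order
```

The first six positions produced are (0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), and the last is (7, 7).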
3.1.132 3-sample predictor: A linear combination of the three nearest neighbor reconstructed samples to the left and above (in lossless mode coding).
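NOTE – The following Python sketch informally illustrates one possible 3-sample predictor. It assumes that Ra is the reconstructed sample to the left, Rb the sample above and Rc the sample above and to the left, and it uses the combination Ra + Rb − Rc, which is only one of the linear combinations a lossless process may select. The function names are hypothetical.

```python
def predict_3_sample(ra: int, rb: int, rc: int) -> int:
    """One possible 3-sample prediction: left + above - above-left."""
    return ra + rb - rc


def prediction_difference(samples, y: int, x: int) -> int:
    """Difference between the sample at (y, x) and its prediction.
    Valid only for y >= 1 and x >= 1 in this sketch."""
    if y < 1 or x < 1:
        raise ValueError("sketch covers interior samples only")
    prediction = predict_3_sample(
        samples[y][x - 1], samples[y - 1][x], samples[y - 1][x - 1]
    )
    return samples[y][x] - prediction
```

For the 2 × 2 array with rows 10, 12 and 11, 14, the prediction for the lower-right sample is 11 + 12 − 10 = 13, so the difference to be coded is 1.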
3.2 Symbols
The symbols used in this Specification are listed below.
A probability interval
AC AC DCT coefficient
ACji AC coefficient predicted from DC values
Ah successive approximation bit position, high
Al successive approximation bit position, low
Api ith 8-bit parameter in APPn segment
APPn marker reserved for application segments
B current byte in compressed data
B2 next byte in compressed data when B = X’FF’
BE counter for buffered correction bits for Huffman coding in the successive approximation process
BITS 16-byte list containing number of Huffman codes of each length
BP pointer to compressed data
BPST pointer to byte before start of entropy-coded segment
BR counter for buffered correction bits for Huffman coding in the successive approximation process
Bx byte modified by a carry-over
C value of bit stream in code register
Ci component identifier for frame
Cu horizontal frequency dependent scaling factor in DCT
Cv vertical frequency dependent scaling factor in DCT
CE conditional exchange
C-low low order 16 bits of the arithmetic decoder code register
Cmi ith 8-bit parameter in COM segment
CNT bit counter in NEXTBYTE procedure
CODE Huffman code value
CODESIZE(V) code size for symbol V
COM comment marker
Cs conditioning table value
Csi component identifier for scan
CT renormalization shift counter
Cx high order 16 bits of arithmetic decoder code register
CX conditional exchange
dji data unit from horizontal position i, vertical position j
djik dji for component k
D decision decoded
Da in DC coding, the DC difference coded for the previous block from the same component; in lossless coding, the difference coded for the sample immediately to the left
DAC define-arithmetic-coding-conditioning marker
Db the difference coded for the sample immediately above
DC DC DCT coefficient
DCi DC coefficient for ith block in component
DCk kth DC value used in prediction of AC coefficients
DHP define hierarchical progression marker
DHT define-Huffman-tables marker
DIFF difference between quantized DC and prediction
DNL define-number-of-lines marker
DQT define-quantization-tables marker
DRI define restart interval marker
E exponent in magnitude category upper bound
EC event counter
ECS entropy-coded segment
ECSi ith entropy-coded segment
Eh horizontal expansion parameter in EXP segment
EHUFCO Huffman code table for encoder
EHUFSI encoder table of Huffman code sizes
EOB end-of-block for sequential; end-of-band for progressive
EOBn run length category for EOB runs
EOBx position of EOB in previous successive approximation scan
EOB0, EOB1, …, EOB14 run length categories for EOB runs
EOI end-of-image marker
Ev vertical expansion parameter in EXP segment
EXP expand reference components marker
FREQ(V) frequency of occurrence of symbol V
Hi horizontal sampling factor for ith component
Hmax largest horizontal sampling factor
HUFFCODE list of Huffman codes corresponding to lengths in HUFFSIZE
HUFFSIZE list of code lengths
HUFFVAL list of values assigned to each Huffman code
i subscript index
I integer variable
Index(S) index to probability estimation state machine table for context index S
j subscript index
J integer variable
JPG marker reserved for JPEG extensions
JPGn marker reserved for JPEG extensions
k subscript index
K integer variable
Kmin index of 1st AC coefficient in band (1 for sequential DCT)
Kx conditioning parameter for AC arithmetic coding model
L DC and lossless coding conditioning lower bound parameter
Li element in BITS list in DHT segment
Li(t) element in BITS list in the DHT segment for Huffman table t
La length of parameters in APPn segment
LASTK largest value of K
Lc length of parameters in COM segment
Ld length of parameters in DNL segment
Le length of parameters in EXP segment
Lf length of frame header parameters
Lh length of parameters in DHT segment
Lp length of parameters in DAC segment
LPS less probable symbol (in arithmetic coding)
Lq length of parameters in DQT segment
Lr length of parameters in DRI segment
Ls length of scan header parameters
LSB least significant bit
m modulo 8 counter for RSTm marker
mt number of Vi,j parameters for Huffman table t
M bit mask used in coding magnitude of V
Mn nth statistics bin for coding magnitude bit pattern category
MAXCODE table with maximum value of Huffman code for each code length
MCU minimum coded unit
MCUi ith MCU
MCUR number of MCU required to make up one MCU-row
MINCODE table with minimum value of Huffman code for each code length
MPS more probable symbol (in arithmetic coding)
MPS(S) more probable symbol for context-index S
MSB most significant bit
M2, M3, M4, … , M15 designation of context-indices for coding of magnitude bits in the arithmetic coding models
n integer variable
N data unit counter for MCU coding
N/A not applicable
Nb number of data units in MCU
Next_Index_LPS new value of Index(S) after a LPS renormalization
Next_Index_MPS new value of Index(S) after a MPS renormalization
Nf number of components in frame
NL number of lines defined in DNL segment
Ns number of components in scan
OTHERS(V) index to next symbol in chain
P sample precision
Pq quantizer precision parameter in DQT segment
Pq(t) quantizer precision parameter in DQT segment for quantization table t
PRED quantized DC coefficient from the most recently coded block of the component
Pt point transform parameter
Px calculated value of sample
Qji quantizer value for coefficient ACji
Qvu quantization value for DCT coefficient Svu
Q00 quantizer value for DC coefficient
QACji quantized AC coefficient predicted from DC values
QDCk kth quantized DC value used in prediction of AC coefficients
Qe LPS probability estimate
Qe(S) LPS probability estimate for context index S
Qk kth element of 64 quantization elements in DQT segment
rvu reconstructed image sample
R length of run of zero amplitude AC coefficients
Rvu dequantized DCT coefficient
Ra reconstructed sample value
Rb reconstructed sample value
Rc reconstructed sample value
Rd rounding in prediction calculation
RES reserved markers
Ri restart interval in DRI segment
RRRR 4-bit value of run length of zero AC coefficients
RS composite value used in Huffman coding of AC coefficients
RSTm restart marker number m
syx reconstructed value from IDCT
S context index
Svu DCT coefficient at horizontal frequency u, vertical frequency v
SC                  context-index for coding of correction bit in successive approximation coding
Se                  end of spectral selection band in zig-zag sequence
SE                  context-index for coding of end-of-block or end-of-band
SI                  Huffman code size
SIGN                1 if decoded sense of sign is negative and 0 if decoded sense of sign is positive
SIZE                length of a Huffman code
SLL                 shift left logical operation
SLL α β             logical shift left of α by β bits
SN                  context-index for coding of first magnitude category when V is negative
SOF0                baseline DCT process frame marker
SOF1                extended sequential DCT frame marker, Huffman coding
SOF2                progressive DCT frame marker, Huffman coding
SOF3                lossless process frame marker, Huffman coding
SOF5                differential sequential DCT frame marker, Huffman coding
SOF6                differential progressive DCT frame marker, Huffman coding
SOF7                differential lossless process frame marker, Huffman coding
SOF9                sequential DCT frame marker, arithmetic coding
SOF10               progressive DCT frame marker, arithmetic coding
SOF11               lossless process frame marker, arithmetic coding
SOF13               differential sequential DCT frame marker, arithmetic coding
SOF14               differential progressive DCT frame marker, arithmetic coding
SOF15               differential lossless process frame marker, arithmetic coding
SOI                 start-of-image marker
SOS                 start-of-scan marker
SP                  context-index for coding of first magnitude category when V is positive
Sqvu                quantized DCT coefficient
SRL                 shift right logical operation
SRL α β             logical shift right of α by β bits
Ss                  start of spectral selection band in zig-zag sequence
SS                  context-index for coding of sign decision
SSSS                4-bit size category of DC difference or AC coefficient amplitude
ST                  stack counter
Switch_MPS          parameter controlling inversion of sense of MPS
Sz                  parameter used in coding magnitude of V
S0                  context-index for coding of V = 0 decision
t                   summation index for parameter limits computation
T                   temporary variable
Taj                 AC entropy table destination selector for jth component in scan
Tb                  arithmetic conditioning table destination identifier
Tc                  Huffman coding or arithmetic coding table class
Tdj                 DC entropy table destination selector for jth component in scan
TEM                 temporary marker
Th                  Huffman table destination identifier in DHT segment
Tq                  quantization table destination identifier in DQT segment
Tqi                 quantization table destination selector for ith component in frame
U                   DC and lossless coding conditioning upper bound parameter
V                   symbol or value being either encoded or decoded
Vi                  vertical sampling factor for ith component
Vi,j                jth value for length i in HUFFVAL
Vmax                largest vertical sampling factor
Vt                  temporary variable
VALPTR              list of indices for first value in HUFFVAL for each code length
V1                  symbol value
V2                  symbol value
xi                  number of columns in ith component
X                   number of samples per line in component with largest horizontal dimension
Xi                  ith statistics bin for coding magnitude category decision
X1, X2, X3, … , X15 designation of context-indices for coding of magnitude categories in the arithmetic coding models
XHUFCO              extended Huffman code table
XHUFSI              table of sizes of extended Huffman codes
X’values’           values within the quotes are hexadecimal
yi                  number of lines in ith component
Y                   number of lines in component with largest vertical dimension
ZRL                 value in HUFFVAL assigned to run of 16 zero coefficients
ZZ(K)               Kth element in zig-zag sequence of quantized DCT coefficients
ZZ(0)               quantized DC coefficient in zig-zag sequence order

4 General
The purpose of this clause is to give an informative overview of the elements specified in this Specification. Another purpose is to introduce many of the terms which are defined in clause 3. These terms are printed in italics upon first usage in this clause.

4.1 Elements specified in this Specification
There are three elements specified in this Specification:
a) An encoder is an embodiment of an encoding process. As shown in Figure 1, an encoder takes as input digital source image data and table specifications, and by means of a specified set of procedures generates as output compressed image data.
b) A decoder is an embodiment of a decoding process. As shown in Figure 2, a decoder takes as input compressed image data and table specifications, and by means of a specified set of procedures generates as output digital reconstructed image data.
c) The interchange format, shown in Figure 3, is a compressed image data representation which includes all table specifications used in the encoding process. The interchange format is for exchange between application environments.
Figure 1 – Encoder (inputs: source image data and table specifications; output: compressed image data)
Figure 2 – Decoder (inputs: compressed image data and table specifications; output: reconstructed image data)
Figures 1 and 2 illustrate the general case for which the continuous-tone source and reconstructed image data consist of multiple components. (A colour image consists of multiple components; a grayscale image consists only of a single component.) A significant portion of this Specification is concerned with how to handle multiple-component images in a flexible, application-independent way.
Figure 3 – Interchange format for compressed image data (compressed image data, including table specifications, passes from application environment A to application environment B)
These figures are also meant to show that the same tables specified for an encoder to use to compress a particular image must be provided to a decoder to reconstruct that image. However, this Specification does not specify how applications should associate tables with compressed image data, nor how they should represent source image data generally within their specific environments.
Consequently, this Specification also specifies the interchange format shown in Figure 3, in which table specifications are included within compressed image data. An image compressed with a specified encoding process within one application environment, A, is passed to a different environment, B, by means of the interchange format. The interchange format does not specify a complete coded image representation. Application-dependent information, e.g. colour space, is outside the scope of this Specification.
4.2 Lossy and lossless compression
This Specification specifies two classes of encoding and decoding processes, lossy and lossless processes. Those based on the discrete cosine transform (DCT) are lossy, thereby allowing substantial compression to be achieved while producing a reconstructed image with high visual fidelity to the encoder’s source image.
The simplest DCT-based coding process is referred to as the baseline sequential process. It provides a capability which is sufficient for many applications. There are additional DCT-based processes which extend the baseline sequential process to a broader range of applications. In any decoder using extended DCT-based decoding processes, the baseline decoding process is required to be present in order to provide a default decoding capability.
The second class of coding processes is not based upon the DCT and is provided to meet the needs of applications requiring lossless compression. These lossless encoding and decoding processes are used independently of any of the DCT-based processes.
A table summarizing the relationship among these lossy and lossless coding processes is included in 4.11.
The amount of compression provided by any of the various processes is dependent on the characteristics of the particular image being compressed, as well as on the picture quality desired by the application and the desired speed of compression and decompression.

4.3 DCT-based coding
Figure 4 shows the main procedures for all encoding processes based on the DCT. It illustrates the special case of a single-component image; this is an appropriate simplification for overview purposes, because all processes specified in this Specification operate on each image component independently.
Figure 4 – DCT-based encoder simplified diagram (source image data in 8 × 8 blocks, to FDCT, to quantizer, to entropy encoder, to compressed image data; the quantizer and the entropy encoder each take table specifications)
In the encoding process the input component’s samples are grouped into 8 × 8 blocks, and each block is transformed by the forward DCT (FDCT) into a set of 64 values referred to as DCT coefficients. One of these values is referred to as the DC coefficient and the other 63 as the AC coefficients.
Each of the 64 coefficients is then quantized using one of 64 corresponding values from a quantization table (determined by one of the table specifications shown in Figure 4). No default values for quantization tables are specified in this Specification; applications may specify values which customize picture quality for their particular image characteristics, display devices, and viewing conditions.
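The transform and quantization steps described above can be sketched as follows. This is a minimal, unoptimized illustration in Python and not the normative procedure: the function names `fdct_8x8` and `quantize`, the flat test block and the all-16 quantization table are assumptions made for the example, and any preparation of the samples before the FDCT is omitted.

```python
import math


def fdct_8x8(block):
    """Forward 2-D DCT of one 8 x 8 block (a list of 8 rows of 8 samples).

    Returns coeffs[v][u]; coeffs[0][0] is the DC coefficient and the
    other 63 entries are the AC coefficients.
    """
    def c(k):
        # Normalization factor: 1/sqrt(2) for frequency 0, otherwise 1.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantize(coeffs, qtable):
    """Divide each coefficient by its quantization value and round.

    Tie-breaking follows Python's round(); that detail is an assumption
    of this sketch.
    """
    return [[int(round(coeffs[v][u] / qtable[v][u])) for u in range(8)]
            for v in range(8)]


# Hypothetical input: a flat block has energy only in the DC coefficient.
flat_block = [[8] * 8 for _ in range(8)]
example_qtable = [[16] * 8 for _ in range(8)]  # illustrative values only
example_coeffs = fdct_8x8(flat_block)
example_quantized = quantize(example_coeffs, example_qtable)
```

Because the Specification defines no default quantization tables, the table used here stands in for whatever values an application would supply.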
After quantization, the DC coefficient and the 63 AC coefficients are prepared for entropy encoding, as shown in Figure 5. The previous quantized DC coefficient is used to predict the current quantized DC coefficient, and the difference is encoded. The 63 quantized AC coefficients undergo no such differential encoding, but are converted into a one-dimensional zig-zag sequence, as shown in Figure 5.
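The two preparation steps can be sketched in Python as below. The helper names `zigzag_order`, `zigzag_sequence` and `dc_differences` are hypothetical, and the starting prediction of 0 for the first block is an assumption of this sketch, since this overview does not state the initial value.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal where row + column == s,
        # listed from the top-right end to the bottom-left end.
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd diagonals are walked in that direction, even ones reversed.
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order


def zigzag_sequence(block):
    """Flatten a square block of quantized coefficients into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def dc_differences(dc_values, initial_prediction=0):
    """DIFF = DC(i) - DC(i-1) for successive blocks of one component."""
    diffs = []
    prediction = initial_prediction
    for dc in dc_values:
        diffs.append(dc - prediction)
        prediction = dc  # the current DC value predicts the next block
    return diffs


# Hypothetical quantized DC values of three consecutive blocks.
example_diffs = dc_differences([12, 15, 14])
```

Element 0 of the zig-zag sequence is the DC position, so the 63 AC coefficients are elements 1 to 63 of `zigzag_sequence(block)`.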
The quantized coefficients are then passed to an entropy encoding procedure which compresses the data further. One of two entropy coding procedures can be used, as described in 4.6. If Huffman encoding is used, Huffman table specifications must be provided to the encoder. If arithmetic encoding is used, arithmetic coding conditioning table specifications may be provided, otherwise the default conditioning table specifications shall be used.
Figure 6 shows the main procedures for all DCT-based decoding processes. Each step shown performs essentially the inverse of its corresponding main procedure within the encoder. The entropy decoder decodes the zig-zag sequence of quantized DCT coefficients. After dequantization the DCT coefficients are transformed to an 8 × 8 block of samples by the inverse DCT (IDCT).
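The last two decoder steps can be sketched in the same spirit. The following Python fragment is illustrative only: `dequantize` and `idct_8x8` are hypothetical names, the input is assumed to be already entropy decoded and restored from zig-zag order to an 8 × 8 array, and no clamping or other post-processing of the output samples is shown.

```python
import math


def dequantize(quantized, qtable):
    """Multiply each quantized coefficient by its quantization value."""
    return [[quantized[v][u] * qtable[v][u] for u in range(8)]
            for v in range(8)]


def idct_8x8(coeffs):
    """Inverse 2-D DCT: 64 coefficients coeffs[v][u] back to 8 x 8 samples."""
    def c(k):
        # Same normalization factor as in the forward transform.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    samples = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (c(u) * c(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            samples[y][x] = 0.25 * total
    return samples


# Hypothetical input: only the DC coefficient is non-zero.
example_quantized = [[0] * 8 for _ in range(8)]
example_quantized[0][0] = 4
example_qtable = [[16] * 8 for _ in range(8)]  # illustrative values only
example_samples = idct_8x8(dequantize(example_quantized, example_qtable))
```

With only a DC coefficient present, every reconstructed sample takes the same value, which is the block average.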
4.4 Lossless coding
Figure 7 shows the main procedures for the lossless encoding processes. A predictor combines the reconstructed values of up to three neighbourhood samples at positions a, b, and c to form a prediction of the sample at position x as shown in Figure 8. This prediction is then subtracted from the actual value of the sample at position x, and the difference is losslessly entropy-coded by either Huffman or arithmetic coding.
Figure 5 – Preparation of quantized coefficients for entropy encoding (differential DC encoding: DIFF = DCi – DCi – 1 between block i – 1 and block i; AC coefficients AC01 to AC77 taken in zig-zag order)
Figure 6 – DCT-based decoder simplified diagram (compressed image data, to entropy decoder, to dequantizer, to IDCT, to reconstructed image data; the entropy decoder and the dequantizer each take table specifications)
Figure 7 – Lossless encoder simplified diagram (source image data, to predictor, to entropy encoder, to compressed image data; the entropy encoder takes table specifications)
Figure 8 – 3-sample prediction neighbourhood (c and b on the line above, a and x on the current line; a is to the left of x, b is above x, c is above a)
This encoding process may also be used in a slightly modified way, whereby the precision of the input samples is reduced by one or more bits prior to the lossless coding. This achieves higher compression than the lossless process (but lower compression than the DCT-based processes for equivalent visual fidelity), and limits the reconstructed image’s worst-case sample error to the amount of input precision reduction.
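A rough Python sketch of the prediction step and of the reduced-precision variant follows. It is not the normative procedure: the predictor a + b − c is only one illustrative way of combining the three neighbours, the handling of the first line and first column is an assumption of this sketch, and all function names are hypothetical. Entropy coding of the differences is left out.

```python
def predict(samples, y, x, precision=8):
    """Predict the sample at position x from neighbours a, b and c."""
    if y == 0 and x == 0:
        return 1 << (precision - 1)   # assumed mid-range starting value
    if y == 0:
        return samples[y][x - 1]      # first line: only a (left) exists
    if x == 0:
        return samples[y - 1][x]      # first column: only b (above) exists
    a = samples[y][x - 1]
    b = samples[y - 1][x]
    c = samples[y - 1][x - 1]
    return a + b - c                  # illustrative choice of predictor


def encode_differences(samples, precision=8):
    """Difference between each sample and its prediction."""
    return [[samples[y][x] - predict(samples, y, x, precision)
             for x in range(len(samples[0]))]
            for y in range(len(samples))]


def decode_differences(diffs, precision=8):
    """Rebuild the samples; already decoded neighbours drive the predictor."""
    out = []
    for y, row in enumerate(diffs):
        out.append([])
        for x, d in enumerate(row):
            out[y].append(d + predict(out, y, x, precision))
    return out


def reduce_precision(samples, bits):
    """Drop the given number of low-order bits before lossless coding."""
    return [[s >> bits for s in row] for row in samples]


def restore_precision(samples, bits):
    """Scale decoded samples back to the original range."""
    return [[s << bits for s in row] for row in samples]


# Hypothetical 2 x 2 component.
image = [[100, 102], [101, 105]]
exact = decode_differences(encode_differences(image))
reduced = reduce_precision(image, 2)
approx = restore_precision(
    decode_differences(encode_differences(reduced, 6), 6), 2)
```

In this sketch `exact` equals `image`, while every sample of `approx` differs from `image` by less than 2 to the power of the number of dropped bits.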
4.5 Modes of operation
There are four distinct modes of operation under which the various coding processes are defined: sequential DCT-based, progressive DCT-based, lossless, and hierarchical. (Implementations are not required to provide all of these.) The lossless mode of operation was described in 4.4. The other modes of operation are compared as follows.
For the sequential DCT-based mode, 8 × 8 sample blocks are typically input block by block from left to right, and block-row by block-row from top to bottom. After a block has been transformed by the forward DCT, quantized and prepared for entropy encoding, all 64 of its quantized DCT coefficients can be immediately entropy encoded and output as part of the compressed image data (as was described in 4.3), thereby minimizing coefficient storage requirements.
For the progressive DCT-based mode, 8 × 8 blocks are also typically encoded in the same order, but in multiple scans through the image. This is accomplished by adding an image-sized coefficient memory buffer (not shown in Figure 4) between the quantizer and the entropy encoder. As each block is transformed by the forward DCT and quantized, its coefficients are stored in the buffer. The DCT coefficients in the buffer are then partially encoded in each of multiple scans. The typical sequence of image presentation at the output of the decoder for sequential versus progressive modes of operation is shown in Figure 9.
There are two procedures by which the quantized coefficients in the buffer may be partially encoded within a scan. First, only a specified band of coefficients from the zig-zag sequence need be encoded. This procedure is called spectral selection, because each band typically contains coefficients which occupy a lower or higher part of the frequency spectrum for that 8 × 8 block. Secondly, the coefficients within the current band need not be encoded to their full (quantized) accuracy within each scan. Upon a coefficient’s first encoding, a specified number of most significant bits is encoded first. In subsequent scans, the less significant bits are then encoded. This procedure is called successive approximation. Either procedure may be used separately, or they may be mixed in flexible combinations.
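The two procedures can be illustrated on a single block with the Python sketch below. It is a simplification under stated assumptions: `ss` and `se` follow the Ss and Se symbols (start and end of the band in the zig-zag sequence), `low_bit_count` is a hypothetical parameter for the number of less significant bits deferred to later scans, and the sign is carried separately rather than coded as the actual processes would code it.

```python
def spectral_band(zz, ss, se):
    """Spectral selection: coefficients ss..se (inclusive) of the zig-zag sequence."""
    return zz[ss:se + 1]


def split_for_scans(coeff, low_bit_count):
    """Successive approximation: split one coefficient across scans.

    Returns (is_negative, high_part, low_bits). The high part holds the
    most significant bits sent first; low_bits lists the remaining bits
    from most to least significant, one per later scan.
    """
    magnitude = abs(coeff)
    high_part = magnitude >> low_bit_count
    low_bits = [(magnitude >> bit) & 1
                for bit in range(low_bit_count - 1, -1, -1)]
    return coeff < 0, high_part, low_bits


def merge_scans(is_negative, high_part, low_bits):
    """Rebuild the coefficient once all of its scans have been received."""
    magnitude = high_part
    for bit in low_bits:
        magnitude = (magnitude << 1) | bit
    return -magnitude if is_negative else magnitude


# Hypothetical zig-zag sequence of one block: a few low-frequency values.
zz = [40, -13, 7, 0, 2] + [0] * 59
first_band = spectral_band(zz, 1, 5)       # AC coefficients 1 to 5
parts = split_for_scans(zz[1], 2)          # -13 with two bits deferred
rebuilt = merge_scans(*parts)
```

Mixing the procedures amounts to applying `split_for_scans` to the coefficients of each band returned by `spectral_band`.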
In hierarchical mode, an image is encoded as a sequence of frames. These frames provide reference reconstructed components which are usually needed for prediction in subsequent frames. Except for the first frame for a given component, differential frames encode the difference between source components and reference reconstructed components. The coding of the differences may be done using only DCT-based processes, only lossless processes, or DCT-based processes with a final lossless process for each component. Downsampling and upsampling filters may be used to provide a pyramid of spatial resolutions as shown in Figure 10. Alternatively, the hierarchical mode can be used to improve the quality of the reconstructed components at a given spatial resolution.
Hierarchical mode offers a progressive presentation similar to the progressive DCT-based mode but is useful in environments which have multi-resolution requirements. Hierarchical mode also offers the capability of progressive coding to a final lossless stage.
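A two-level pyramid of this kind can be sketched in Python as follows. The filters are assumptions chosen for brevity (2 × 2 averaging for downsampling, sample replication for upsampling), the function names are hypothetical, image dimensions are assumed to be even, and the coding of each frame is omitted so that only the frame arithmetic is shown.

```python
def downsample(img):
    """Halve both dimensions by averaging each 2 x 2 group (assumed filter)."""
    return [[(img[y][x] + img[y][x + 1]
              + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def upsample(img):
    """Double both dimensions by sample replication (assumed filter)."""
    out = []
    for row in img:
        wide = [value for value in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def subtract(a, b):
    """Differential frame: source component minus reference component."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Reconstruction: reference component plus decoded differences."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical 4 x 4 source component.
source = [[10, 12, 20, 22],
          [14, 16, 24, 26],
          [30, 30, 40, 40],
          [30, 30, 40, 44]]
first_frame = downsample(source)           # low-resolution first frame
reference = upsample(first_frame)          # reference for the next frame
differential_frame = subtract(source, reference)
reconstructed = add(reference, differential_frame)
```

If the differential frame is coded losslessly, `reconstructed` equals `source`, which corresponds to progressive coding to a final lossless stage.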