
INTRODUCTION


INFORMATION THEORY & GENERIC COMPRESSION TECHNIQUES


TOPICS TO BE COVERED

Need for Compression

Data Sources and Information Representation

Overview of Lossless Compression Techniques

Overview of Lossy Compression Techniques


NEED FOR COMPRESSION

Multimedia information has to be stored and
transported efficiently.

Multimedia information is bulky –
Interlaced HDTV: 1080 × 1920 × 30 × 12 ≈ 745 Mb/s!

Use technologies with more bandwidth

• Expensive

• Research

Find ways to reduce the number of bits to transmit
without compromising on the “information content”

• Compression Technology

• Reliable Transmission


INFORMATION THEORY

Pioneered primarily by Claude Shannon

Information Theory deals with two subtopics:

Source Coding – How can a communication system
efficiently transmit the information that a source
produces? Relates to Compression

Channel Coding – How can a communication system
achieve reliable communication over a noisy channel?
Relates to minimizing Transmission Error


TYPES OF COMPRESSION

Lossless (entropy coding)

• Does not lose information – the signal is perfectly
reconstructed after decompression

• Produces a variable bit-rate

• It is not guaranteed to actually reduce the data
size

Lossy

• Loses some information – the signal is not
perfectly reconstructed after decompression

• Produces any desired constant bit-rate


MODEL INFORMATION AT THE SOURCE

Model data at the Source as a Stream of Symbols – This
defines the “Vocabulary” of the source.

Each symbol in the vocabulary is represented by bits

If your vocabulary has N symbols, each symbol is represented with log2 N bits.

• Speech – 16 bits/sample: N = 2^16 = 65,536 symbols

• Color image – 3 × 8 bits/sample: N = 2^24 ≈ 17 × 10^6 symbols

• 8×8 image blocks – 8 × 64 bits/block: N = 2^512 ≈ 10^154 symbols
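The bit counts above follow directly from log2 N. A small illustrative check (not part of the original slides; the vocabulary sizes are the ones listed above):

```python
import math

# Bits per symbol for a vocabulary of N symbols = ceil(log2(N)).
for name, bits in [("speech sample", 16), ("RGB color pixel", 24), ("8x8 block of 8-bit pixels", 512)]:
    N = 2 ** bits
    print(f"{name}: N = 2^{bits} symbols -> {math.ceil(math.log2(N))} bits/symbol")
```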


LOSSLESS COMPRESSION

Lossless compression techniques ensure no loss of
data after compression/decompression.

Coding: “Translate” each symbol in the vocabulary into
a “binary codeword”. Codewords may have different
binary lengths.

Example: You have 4 symbols (a, b, c, d). Each could be represented in binary with 2 bits, but instead each is coded using a different number of bits:

a (00) → 000

b (01) → 001

c (10) → 01

d (11) → 1

Goal of Coding is to minimize the average symbol length


AVERAGE SYMBOL LENGTH

Symbol length l(i) = binary length of the i-th symbol

Source emits M symbols (one every T seconds)

Symbol i has been emitted m(i) times:  M = Σ_{i=1}^{N} m(i)

Number of bits emitted:  L_total = Σ_{i=1}^{N} m(i) l(i)

Average length per symbol:  L = L_total / M = (1/M) Σ_{i=1}^{N} m(i) l(i)

Average bit rate: L / T
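A minimal sketch of these definitions in code (the symbol counts and the symbol period T below are hypothetical, chosen only for illustration; the code lengths are those of the a/b/c/d example from the previous slide):

```python
def average_symbol_length(counts, lengths):
    """counts[i] = m(i), lengths[i] = l(i); returns (L_total, L) = (total bits, bits/symbol)."""
    M = sum(counts)                                         # M = sum of m(i)
    L_total = sum(m * l for m, l in zip(counts, lengths))   # L_total = sum of m(i) * l(i)
    return L_total, L_total / M                             # L = L_total / M

counts = [10, 10, 30, 50]        # hypothetical emission counts m(a), m(b), m(c), m(d)
lengths = [3, 3, 2, 1]           # l(a), l(b), l(c), l(d) for the code a->000, b->001, c->01, d->1
L_total, L = average_symbol_length(counts, lengths)
T = 0.001                        # hypothetical: one symbol every T seconds
print(L_total, L, L / T)         # total bits, average bits/symbol, average bit rate (bits/s)
```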


MINIMUM AVERAGE SYMBOL LENGTH

Main goal is to minimize the number of bits being transmitted → minimize the average symbol length.

Basic idea for reducing the average symbol length: assign shorter codewords to symbols that appear more frequently and longer codewords to symbols that appear less frequently.

Theorem: “the Average Binary Length of the encoded
symbols is always greater than or equal to the entropy H
of the source”

What is the entropy of a source of symbols and how is it
computed?


SYMBOL PROBABILITIES

Probability P(i) of a symbol: the relative frequency with which it occurs in the transmission, defined as P(i) = m(i)/M

From Probability Theory we know

0 ≤ P(i) ≤ 1  and  Σ_{i=1}^{N} P(i) = 1

The average symbol length can then be written as

L = (1/M) Σ_{i=1}^{N} m(i) l(i) = Σ_{i=1}^{N} (m(i)/M) l(i) = Σ_{i=1}^{N} P(i) l(i)


ENTROPY

Entropy is defined as

H = − Σ_{i=1}^{N} P(i) log2 P(i)

The entropy is characteristic of a given source of
symbols

H is highest (equal to log2 N) if all symbols are equally probable

Entropy is small (but always ≥ 0) when some symbols are much more likely to appear than others
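A direct transcription of the definition above (a minimal sketch; `probs` is any list of symbol probabilities that sum to 1):

```python
import math

def entropy(probs):
    """H = -sum_i P(i) * log2(P(i)); symbols with P(i) = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))  # equally probable: H = log2(4) = 2.0 bits/symbol
print(entropy([0.97, 0.01, 0.01, 0.01]))  # highly skewed: H is much smaller (about 0.24)
```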


ENTROPY EXAMPLE
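As a worked illustration, take the eight-symbol source used in the Huffman example later in this lecture, with P = {0.01, 0.02, 0.05, 0.09, 0.18, 0.2, 0.2, 0.25}:

H = − Σ_{i=1}^{8} P(i) log2 P(i) ≈ 2.58 bits/symbol

which is below the maximum of log2 8 = 3 bits/symbol that an equally probable source would give.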


TAXONOMY OF COMPRESSION TECHNIQUES

Compression techniques are broadly classified into

• Lossless Compression Techniques, also known
as Entropy Coding

• Lossy Compression Techniques


Compression Techniques

• Lossless
  – Repetitive Sequence Compression
  – Statistical Compression
  – Dictionary Based

• Lossy
  – Prediction Based
  – Frequency Based
  – Signal Importance
  – Hybrid

EXAMPLES OF ENTROPY ENCODING

Types of techniques vary depending on how much you
statistically analyze the signal

Run-Length Encoding
A sequence of elements c1, c2, … is mapped to a sequence of runs (ci, li), where ci = code and li = length of the i-th run (a small sketch follows this list)

Repetition Suppression
Repetitive occurrences of a specific character are replaced by a special flag and a count, e.g., ab000000000 → ab9

Pattern Substitution
A frequently occurring pattern is replaced by a specific symbol, e.g., LZW

Arithmetic Coding
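A minimal sketch of run-length encoding as described above (illustrative only; practical schemes such as the CCITT fax coder discussed later also entropy-code the run lengths):

```python
from itertools import groupby

def run_length_encode(sequence):
    """Map a sequence of elements to runs (c_i, l_i): value and length of the i-th run."""
    return [(value, len(list(group))) for value, group in groupby(sequence)]

print(run_length_encode("aaaabbbcca"))   # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
```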


LZW – PATTERN SUBSTITUTION

Algorithm:

1. Initialize the dictionary to contain all blocks of length one (D = {a, b} in the example below).

2. Search for the longest block W which has
appeared in the dictionary.

3. Encode W by its index in the dictionary.

4. Add W followed by the first symbol of the next
block to the dictionary.

5. Go to Step 2.


PATTERN SUBSTITUTION – EXAMPLE
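A minimal sketch of the LZW encoder described above, traced on the hypothetical input "abababab" with initial dictionary D = {a, b}:

```python
def lzw_encode(message, alphabet):
    # Step 1: initialize the dictionary with all blocks of length one.
    dictionary = {symbol: index for index, symbol in enumerate(alphabet)}
    codes, w = [], ""
    for symbol in message:
        if w + symbol in dictionary:
            w += symbol                               # keep growing the longest known block W (step 2)
        else:
            codes.append(dictionary[w])               # step 3: encode W by its index
            dictionary[w + symbol] = len(dictionary)  # step 4: add W + next symbol to the dictionary
            w = symbol                                # step 5: continue from the next symbol
    if w:
        codes.append(dictionary[w])                   # emit the final block
    return codes, dictionary

codes, final_dict = lzw_encode("abababab", "ab")
print(codes)       # [0, 1, 2, 4, 1]
print(final_dict)  # {'a': 0, 'b': 1, 'ab': 2, 'ba': 3, 'aba': 4, 'abab': 5}
```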


HUFFMAN CODING

Huffman code assignment is based on the frequency of occurrence of the symbols. It uses a binary tree structure and the algorithm is as follows:

1. Associate the leaves of the binary tree with the list of probabilities.

2. Take the two smallest probabilities in the list and make the corresponding nodes siblings. Generate an intermediate node as their parent and label the branch from the parent to one child 1 and the branch from the parent to the other child 0.

3. Replace the two probabilities and associated nodes in the list by a single new intermediate node carrying the sum of the two probabilities. If the list contains only one element, quit. Otherwise, go to step 2.


HUFFMAN CODING (2)

Codeword formation:
Follow the path from the root of the tree to the symbol and accumulate the labels of all the branches.

Example
N = 8 symbols: {a, b, c, d, e, f, g, h}
3 bits per symbol (N = 2^3 = 8)
P(a) = 0.01, P(b) = 0.02, P(c) = 0.05, P(d) = 0.09,
P(e) = 0.18, P(f) = 0.2, P(g) = 0.2, P(h) = 0.25

Average length per symbol: L = Σ_{i=1}^{8} P(i) × 3 = 3 bits/symbol

Entropy: H = − Σ_{i=1}^{8} P(i) log2 P(i) = 2.5828 ≈ 0.86 L


HUFFMAN CODING (2)
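A sketch of the tree construction for the eight-symbol example above (a min-heap repeatedly pulls out the two smallest probabilities; the particular 0/1 branch labels are just one valid choice, but the resulting codeword lengths match the next slide):

```python
import heapq

def huffman_code(probabilities):
    """Build a Huffman code from {symbol: probability}; returns {symbol: codeword}."""
    # Heap entries: (probability, tie-breaker, partial codewords for this node's leaves).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)    # the two smallest probabilities become siblings
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}        # branch to one child labelled 0
        merged.update({s: "1" + c for s, c in codes1.items()})  # branch to the other labelled 1
        heapq.heappush(heap, (p0 + p1, counter, merged))        # parent carries the summed probability
        counter += 1
    return heap[0][2]

P = {"a": 0.01, "b": 0.02, "c": 0.05, "d": 0.09,
     "e": 0.18, "f": 0.2, "g": 0.2, "h": 0.25}
code = huffman_code(P)
lengths = {s: len(code[s]) for s in P}
print(lengths)                            # a:6, b:6, c:5, d:4, e:3, f:2, g:2, h:2
print(sum(P[s] * lengths[s] for s in P))  # average length L_Huff ~= 2.63 bits/symbol
```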


HUFFMAN CODING (3)

Average length per symbol (with the Huffman code constructed above):

L_Huff = Σ_{i=1}^{8} P(i) l(i) = 6×0.01 + 6×0.02 + 5×0.05 + 4×0.09 + 3×0.18 + 2×0.2 + 2×0.2 + 2×0.25 = 2.63 bits/symbol

Efficiency of the encoder = H / L_Huff = 2.5828 / 2.63 ≈ 98%


CCITT GROUP 3 1-D FAX ENCODING

Facsimile: a bilevel image, an 8.5″ × 11″ page scanned left to right at 200 dpi ≈ 3.74 Mb (1728 pixels on each line)

Encoding Process

• Count each run of white/black pixels

• Encode the run-lengths with Huffman coding

Probabilities of run-lengths are not the same for black
and white pixels

Decompose each run-length n as n = k × 64 + m (with 0 ≤ m < 64), and create Huffman tables for both k and m

Special codewords for end-of-line, end-of-page, synchronization

Average compression ratio on text documents: 20-to-1

ARITHMETIC CODING

Map a message of symbols to a real number interval. If you have a vocabulary of M symbols –

1. Divide the interval [0,1) into segments corresponding to the M symbols; the segment of each symbol has a length proportional to its probability.

2. Choose the segment of the first symbol in the string message.

3. Divide the segment of this symbol again into M new segments with length proportional to the symbols' probabilities.

4. From these new segments, choose the one corresponding to the next symbol in the message.

5. Continue steps 3) and 4) until the whole message is coded.

6. Represent the segment's value by a binary fraction.

ARITHMETIC CODING (2)

Suppose you have a vocabulary consisting of two symbols X, Y having prob(X) = 2/3 and prob(Y) = 1/3

If we are only concerned with encoding length-2 messages, then we can map all possible messages to intervals in the range [0,1):

To encode a message, just send enough bits of a binary fraction that uniquely specifies the interval.

ARITHMETIC CODING (2)

Here is how code words are formed for the above example with messages of length 2.

ARITHMETIC CODING (3)

Here is how code words are formed for the above example with messages of length 3.

LOSSY COMPRESSION

Decompressed signal is not like the original signal – data loss

Objective: minimize the distortion for a given compression ratio

• Ideally, we would optimize the system based on perceptual distortion (difficult to compute)

• We'll need a few more concepts from statistics…

STATISTICAL DEFINITIONS

y = the original value of a sample

ŷ = the sample value after compression/decompression

e = y − ŷ = error (or noise or distortion)

σy² = power (variance) of the signal

σe² = power (variance) of the error

Signal to Noise Ratio (SNR) = 10 log10 (σy² / σe²)

LOSSY COMPRESSION: SIMPLE EXAMPLES

Subsampling: retain only samples on a subsampling grid (spatial/temporal). See examples in the previous lecture

• Compression achieved by reducing the number of samples

• Has fundamental limitations: see the sampling theorem

Quantization: quantize with fewer bits

• Compression achieved by reducing the number of bits per sample

• As the quantization interval size increases, compression increases (so does the error!)

• Scalar vs. Vector Quantization

Can we do better than simple quantization and subsampling?

DIFFERENTIAL PCM

If the (n−1)-th sample has value y(n−1), what is the "most probable" value for y(n)?

The simplest case: the predicted value for y(n) is y(n−1)

Differential PCM Encoding (DPCM):

• Don't quantize and transmit y(n)

• Quantize and transmit the residual d(n) = y(n) − y(n−1)

Decoding: ŷ(n) = y(n−1) + d̂(n), where d̂(n) is the quantized version of d(n)

We can show that SNR_DPCM > SNR_PCM


DPCM – EXAMPLE
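A small illustrative sketch of the DPCM scheme above (the sample values and the uniform quantizer step are hypothetical):

```python
def quantize(value, step=4):
    """A crude uniform quantizer (hypothetical step size, for illustration only)."""
    return step * round(value / step)

samples = [100, 104, 109, 111, 115, 118]    # hypothetical input y(0), y(1), ...

# Encoder: quantize and transmit the residual d(n) = y(n) - y(n-1)
d_hat = [quantize(samples[n] - samples[n - 1]) for n in range(1, len(samples))]

# Decoder: y^(n) = y(n-1) + d^(n)
# (note it uses the *original* previous sample; the closed-loop variant on the next
#  slide removes this assumption, since a real decoder only has its own reconstructions)
reconstructed = [samples[0]]
for n, d in enumerate(d_hat, start=1):
    reconstructed.append(samples[n - 1] + d)

print(d_hat)          # residuals are small -> need fewer bits than the raw samples
print(reconstructed)
```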


CLOSED-LOOP DPCM

In DPCM, the decoder reconstructs ŷ(n) = y(n−1) + d̂(n)

The decoder must know y(n−1)!

In Closed-Loop DPCM, instead of computing the difference d(n) = y(n) − y(n−1), we compute

d(n) = y(n) − ŷ(n−1)

Quantize d(n) to get d̂(n) and transmit d̂(n)

At the decoder, this implies ŷ(n) = ŷ(n−1) + d̂(n)


CLOSED-LOOP DPCM
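A sketch of the closed-loop encoder/decoder pair (same hypothetical samples and quantizer as in the DPCM example; the key difference is that the encoder predicts from its own reconstruction ŷ(n−1), so it stays in step with the decoder):

```python
def quantize(value, step=4):
    return step * round(value / step)            # hypothetical uniform quantizer

def closed_loop_dpcm_encode(samples, step=4):
    """Encoder predicts each sample from its own reconstruction y^(n-1)."""
    residuals, y_hat_prev = [], samples[0]       # assume y(0) is transmitted as-is
    for y in samples[1:]:
        d_hat = quantize(y - y_hat_prev, step)   # d(n) = y(n) - y^(n-1), then quantize
        residuals.append(d_hat)
        y_hat_prev += d_hat                      # track exactly what the decoder will compute
    return residuals

def closed_loop_dpcm_decode(first_sample, residuals):
    y_hat = [first_sample]
    for d_hat in residuals:
        y_hat.append(y_hat[-1] + d_hat)          # y^(n) = y^(n-1) + d^(n)
    return y_hat

samples = [100, 104, 109, 111, 115, 118]
residuals = closed_loop_dpcm_encode(samples)
print(closed_loop_dpcm_decode(samples[0], residuals))
# Encoder and decoder use the same y^(n-1), so quantization errors do not accumulate.
```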


TRANSFORM CODING

In Transform Coding a segment of information
undergoes an invertible mathematical transformation.

Segments of information can be sets of samples (assume M samples), such as –

• an image frame

• an 8×8 pixel block (M = 64)

• a segment of speech

Transform Coding works as follows –

• Apply a suitable invertible transformation (typically a matrix multiplication)

x(1), x(2), …, x(M) → y(1), y(2), …, y(M) (channels)

• Quantize and transmit y(1), …, y(M)


TRANSFORM CODING (2)
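A minimal numerical sketch of the channel idea (a toy 2-point sum/difference transform stands in for a real transform such as the DCT discussed below; the sample values and quantizer steps are hypothetical):

```python
import math

def forward_transform(x0, x1):
    """A 2-point orthonormal transform: an 'average' channel and a 'difference' channel."""
    s = 1 / math.sqrt(2)
    return s * (x0 + x1), s * (x0 - x1)

def inverse_transform(y0, y1):
    s = 1 / math.sqrt(2)       # this particular matrix is its own inverse
    return s * (y0 + y1), s * (y0 - y1)

def quantize(value, step):
    return step * round(value / step)

x = (100, 104)                                   # hypothetical pair of samples
y0, y1 = forward_transform(*x)                   # most of the signal power lands in y0
y0_hat, y1_hat = quantize(y0, step=2), quantize(y1, step=4)   # finer step = more bits for y0
print(inverse_transform(y0_hat, y1_hat))         # close to the original (100, 104)
```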


ISSUES WITH TRANSFORM CODING

Quantizers in different channels may have different numbers of levels → each channel ultimately might yield a different number of bits.

Bit budget B: if we quantize y(1), …, y(M) using B(1), …, B(M) bits respectively, then

B = Σ_{i=1}^{M} B(i)

Optimal bit allocation: allocate more bits to those
channels that have the highest variance

It can be shown that the transformation increases the
quantization SNR

Transformation also reduces the signal entropy!
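A sketch of one classical high-rate allocation rule consistent with the principle above (this particular formula is a textbook result, not necessarily the one intended in this lecture): each channel gets the average bit budget plus half the log2 of its variance relative to the geometric mean.

```python
import math

def allocate_bits(variances, total_bits):
    """B(i) = B/M + 0.5*log2(var_i / geometric_mean(variances)), clamped and rounded."""
    M = len(variances)
    geo_mean = math.exp(sum(math.log(v) for v in variances) / M)
    raw = [total_bits / M + 0.5 * math.log2(v / geo_mean) for v in variances]
    return [max(0, round(b)) for b in raw]       # rounding may change the total slightly

channel_variances = [400.0, 50.0, 10.0, 2.0]     # hypothetical per-channel variances
print(allocate_bits(channel_variances, total_bits=16))   # e.g. [6, 4, 3, 2]
```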


DISCRETE COSINE TRANSFORM (DCT)

The DCT is a kind of transform coding, with the transform matrix T defined as follows.
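A standard orthonormal form of the M-point DCT (DCT-II) matrix, which is presumably what is intended here:

T(k, n) = c(k) · cos( (2n + 1) k π / (2M) ),   k, n = 0, 1, …, M − 1

where c(0) = √(1/M) and c(k) = √(2/M) for k ≥ 1.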

It has the property that different channels represent the signal power along different (spatial or temporal) frequencies, similarly to the (discrete) Fourier transform.



SUBBAND CODING

It is a kind of transform coding (although operating on
the whole signal)

It is not implemented as a matrix multiplication but as a
bank of filters followed by decimation (i.e., subsampling)

Again, each channel represents the signal energy
content along different frequencies. Bits can then be
allocated to the coding of an individual subband signal
depending upon the energy within the band


A SIMPLE SCHEME OF SUBBAND CODING
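A minimal two-band sketch in the spirit of such a scheme (a hypothetical Haar-style filter pair: the low band is the pairwise average, the high band is the pairwise difference, each decimated by 2):

```python
def analyse_two_band(signal):
    """Split a signal into a low band (averages) and a high band (differences),
    each decimated by 2 -- a toy two-channel filter bank."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return low, high

def synthesise_two_band(low, high):
    """Perfect reconstruction for the toy filter bank above."""
    signal = []
    for l, h in zip(low, high):
        signal.extend([l + h, l - h])
    return signal

x = [10, 12, 9, 11, 40, 42, 39, 41]             # hypothetical samples
low, high = analyse_two_band(x)
print(low)    # [11.0, 10.0, 41.0, 40.0]  -- most of the energy: spend more bits here
print(high)   # [-1.0, -1.0, -1.0, -1.0]  -- little energy: few (or zero) bits suffice
print(synthesise_two_band(low, high))           # reconstructs x exactly (no quantization yet)
```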