COMP2610/6261 – Information Theory Lecture 15: Shannon-Fano-Elias and Interval Coding
Williamson
Research School of Computer Science
24 September, 2018

1 The Trouble with Huffman Codes
2 Interval Coding
    Shannon-Fano-Elias Coding
    Lossless property
    The Prefix Property and Intervals
    Decoding
    Expected Length

Prefix Codes as Trees (Recap)

C2 = {0, 10, 110, 111}

[Figure: binary tree for C2, illustrating the total symbol code budget]
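
As a quick check on the recap, here is a minimal Python sketch (illustrative, not from the slides) verifying that C2 spends the entire symbol code budget, i.e. that its Kraft sum equals 1:

# A codeword of length l spends 2^(-l) of the total symbol code budget;
# a complete prefix code such as C2 spends all of it.
C2 = ["0", "10", "110", "111"]
kraft_sum = sum(2 ** -len(w) for w in C2)
print(kraft_sum)  # 0.5 + 0.25 + 0.125 + 0.125 = 1.0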

The Source Coding Theorem for Symbol Codes

Source Coding Theorem for Symbol Codes
For any ensemble X there exists a prefix code C such that
    H(X) ≤ L(C, X) < H(X) + 1.
In particular, Shannon codes C, those with lengths l_i = ⌈log2 (1/p_i)⌉, have expected code length within 1 bit of the entropy.

Huffman Coding: Example (MacKay, Example 5.15)

Let A_X = {a, b, c, d, e} and P = {0.25, 0.25, 0.2, 0.15, 0.15}.
The Huffman algorithm repeatedly merges the two least probable outcomes, appending a 0 or 1 to distinguish them, and assigns binary strings to all the symbols after |A_X| − 1 steps:
    step 1: 0.15 + 0.15 = 0.3
    step 2: 0.2 + 0.25 = 0.45
    step 3: 0.25 + 0.3 = 0.55
    step 4: 0.45 + 0.55 = 1.0
The codewords are then obtained by concatenating the binary digits in reverse order:
    C = {00, 10, 11, 010, 011}
The codelengths agree with column 4 of Table 5.5 in MacKay.

Huffman Codes: Advantages and Disadvantages

Advantages:
    Huffman codes are provably optimal amongst prefix codes
    Algorithm is simple and efficient

Disadvantages:
    Assumes a fixed distribution of symbols
    The extra bit in the SCT:
    - If H(X) is large, not a problem
    - If H(X) is small (e.g., ∼ 1 bit for English), code lengths can be about twice the entropy

Huffman codes are the best possible symbol code, but symbol coding is not always the best type of code.

This lecture: a different way of coding (interval coding), via Shannon-Fano-Elias codes.
These have a worse guarantee than Huffman codes, but will lead us to the powerful arithmetic coding procedure.

1 The Trouble with Huffman Codes
2 Interval Coding
    Shannon-Fano-Elias Coding
    Lossless property
    The Prefix Property and Intervals
    Decoding
    Expected Length

Coding via Cumulative Probabilities

Suppose X is an ensemble with probabilities (p_1, ..., p_{|X|}).
Define the cumulative distribution function by
    F(x) = Σ_{i ≤ x} p_i
and the modified cumulative distribution function by
    F̄(x) = Σ_{i < x} p_i + (1/2) p(x) = F(x) − (1/2) p(x).
The Shannon-Fano-Elias (SFE) codeword for x is F̄(x) written in binary and truncated to l(x) = ⌈log2 (1/p(x))⌉ + 1 bits.

Lossless Property

Truncating F̄(x) to l(x) bits loses less than 2^{−l(x)} ≤ (1/2) p(x), so
    ⌊F̄(x)⌋_{l(x)} > F̄(x) − (1/2) p(x) = F(x − 1).
We also have
    ⌊F̄(x)⌋_{l(x)} + 2^{−l(x)} ≤ F̄(x) + 2^{−l(x)}
                              ≤ F̄(x) + (1/2) p(x)
                              = F(x),
so
    [⌊F̄(x)⌋_{l(x)}, ⌊F̄(x)⌋_{l(x)} + 2^{−l(x)}) ⊂ [F(x − 1), F(x)).

The intervals for each codeword are thus trivially disjoint, since we know
each of the [F(x − 1), F(x)) intervals is disjoint. The SFE code is prefix-free!
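
To make the construction concrete, here is a minimal Python sketch of the SFE encoder (the function name sfe_code is illustrative, and the example distribution is the one from the decoding example later in the lecture); it computes F̄(x), truncates it to l(x) bits, and asserts the containment just proved:

from math import ceil, log2

def sfe_code(probs):
    """Shannon-Fano-Elias codewords for probabilities p_1, ..., p_K."""
    codewords = []
    F = 0.0  # running F(x - 1)
    for p in probs:
        Fbar = F + p / 2                 # modified CDF: F(x) - p(x)/2
        length = ceil(log2(1 / p)) + 1   # l(x) = ceil(log2 1/p(x)) + 1
        # Truncate the binary expansion of Fbar to l(x) bits
        bits, frac = "", Fbar
        for _ in range(length):
            frac *= 2
            bits += str(int(frac))
            frac -= int(frac)
        # Codeword interval must sit inside the symbol interval [F(x-1), F(x))
        low = int(bits, 2) / 2 ** length
        assert F <= low and low + 2 ** -length <= F + p
        codewords.append(bits)
        F += p
    return codewords

print(sfe_code([2/9, 1/9, 1/3, 1/3]))  # ['0001', '01000', '100', '110']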

Two Types of Interval
The symbol interval for some outcome x_i is (assuming F(x_0) = 0)
    [F(x_{i−1}), F(x_i))
These intervals are disjoint for each outcome.

The codeword interval for some outcome x_i is
    [⌊F̄(x_i)⌋_{l(x_i)}, ⌊F̄(x_i)⌋_{l(x_i)} + 1/2^{l(x_i)})
This is a strict subset of the symbol interval.

All strings in the codeword interval start with the same prefix; this is not true in general for the symbol interval.

1 The Trouble with Huffman Codes
2 Interval Coding
    Shannon-Fano-Elias Coding
    Lossless property
    The Prefix Property and Intervals
    Decoding
    Expected Length

Shannon-Fano-Elias Coding: Decoding

To decode a given bitstring:
1 start with the first bit, and compute the corresponding binary interval
2 if the interval is strictly contained within that of a codeword:
    1 output the codeword's symbol
    2 skip over any redundant bits for this codeword
    3 repeat from (1) for the rest of the bitstring
3 else, include the next bit and compute the corresponding binary interval

We might be able to stop early owing to redundancies in SFE.
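
A minimal sketch of this decoder in Python (again with an illustrative name, sfe_decode); it narrows the binary interval bit by bit, emits a symbol as soon as the interval fits inside that symbol's interval [F(x−1), F(x)), which is what allows stopping early, and then skips any redundant trailing bits of that codeword:

from math import ceil, log2

def sfe_decode(probs, bits):
    """Decode SFE-coded bits back to 0-based symbol indices."""
    # Symbol interval boundaries F(0), F(1), ..., F(K) and codeword lengths
    bounds, F = [0.0], 0.0
    for p in probs:
        F += p
        bounds.append(F)
    lengths = [ceil(log2(1 / p)) + 1 for p in probs]

    symbols, pos = [], 0
    while pos < len(bits):
        low, high = 0.0, 1.0  # current binary interval
        used, found = 0, None
        while found is None and pos + used < len(bits):
            mid = (low + high) / 2
            low, high = (low, mid) if bits[pos + used] == "0" else (mid, high)
            used += 1
            for i in range(len(probs)):
                if bounds[i] <= low and high <= bounds[i + 1]:
                    found = i  # interval fits inside symbol i's interval
                    break
        if found is None:
            break  # not enough bits left for another full codeword
        symbols.append(found)
        pos += lengths[found]  # skip any redundant bits of this codeword
    return symbols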

Shannon-Fano-Elias Coding: Decoding Example

Let p = {2/9, 1/9, 1/3, 1/3}. Suppose we want to decode 01000:
Find the symbol interval containing the codeword interval for 01000, i.e. [0.25, 0.28125)

    0     → [0, 0.5)
    01    → [0.25, 0.5)
    010   → [0.25, 0.375)
    0100  → [0.25, 0.3125)
    01000 → [0.25, 0.28125)

We could actually stop once we see 0100, since [0.25, 0.3125) ⊂ [2/9, 3/9) ≈ [0.222, 0.333)
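
Running the sfe_decode sketch from the previous slide reproduces this trace; it identifies the symbol after reading 0100 and skips the redundant fifth bit:

print(sfe_decode([2/9, 1/9, 1/3, 1/3], "01000"))  # [1], i.e. the second symbol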

1 The Trouble with Huffman Codes
2 Interval Coding
    Shannon-Fano-Elias Coding
    Lossless property
    The Prefix Property and Intervals
    Decoding
    Expected Length

Expected Code Length of SFE Codes

The extra bit in the code lengths l(a_i) = ⌈log2 (1/p_i)⌉ + 1 is because we effectively code the halved probabilities p_i/2, and
    log2 (2/p_i) = log2 (1/p_i) + log2 2 = log2 (1/p_i) + 1

What is the expected length of a SFE code C for ensemble X with probabilities p?

    L(C, X) = Σ_{i=1}^{K} p_i l(a_i)
            = Σ_{i=1}^{K} p_i (⌈log2 (1/p_i)⌉ + 1)
            ≤ Σ_{i=1}^{K} p_i (log2 (1/p_i) + 2)
            = H(X) + 2

Similarly, H(X) + 1 ≤ L(C, X) for the SFE codes.
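
As a numeric check of both bounds, a short Python snippet (using the example distribution from the decoding slide) confirms H(X) + 1 ≤ L(C, X) ≤ H(X) + 2:

from math import ceil, log2

p = [2/9, 1/9, 1/3, 1/3]
H = sum(pi * log2(1 / pi) for pi in p)              # entropy: about 1.89 bits
L = sum(pi * (ceil(log2(1 / pi)) + 1) for pi in p)  # expected SFE length: about 3.44 bits
assert H + 1 <= L <= H + 2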

Why bother?
Let X be an ensemble, C_SFE be a Shannon-Fano-Elias code for X, and C_H be a Huffman code for X. Then

    H(X) ≤ L(C_H, X) ≤ H(X) + 1 ≤ L(C_SFE, X) ≤ H(X) + 2

where the first two inequalities are the Source Coding Theorem,
so why not just use Huffman codes?
SFE is a stepping stone to a more powerful type of code: arithmetic coding.
Roughly: try to apply SFE to a whole block of outcomes.

Summary and Reading

Main points:
    Problems with Huffman coding: it assumes a fixed symbol distribution, and the extra bit in the SCT can hurt
    Binary strings to/from intervals in [0, 1]
    Shannon-Fano-Elias coding:
    - Code C via the (modified) cumulative distribution function for p
    - H(X) + 1 ≤ L(C, X) ≤ H(X) + 2
    - Extra bit guarantees interval containment

Reading:
    Interval coding: MacKay §6.1 and §6.2
    Shannon-Fano-Elias coding: Cover & Thomas §5.9

Next time:
    Extending SFE coding to sequences of symbols
