COMP2610/6261 – Information Theory Lecture 19: Block Codes and the Coding Theorem
Williamson
Research School of Computer Science
9 October, 2018
Channel Capacity: Recap
The largest possible reduction in uncertainty achievable across a channel is its capacity
Channel Capacity
The capacity C of a channel Q is the largest mutual information between its input and output for any choice of input ensemble. That is,
C = max_{p_X} I(X; Y)
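As a sanity check of this definition, here is a minimal Python sketch (assuming NumPy; the BSC with flip probability f = 0.15 is an illustrative choice, not part of the definition) that approximates C by brute-force search over binary input distributions:

    import numpy as np

    def mutual_information(p_x, Q):
        # I(X;Y) in bits; Q[y, x] = P(y | x), p_x[x] = P(x).
        p_y = Q @ p_x
        return sum(p_x[x] * Q[y, x] * np.log2(Q[y, x] / p_y[y])
                   for x in range(Q.shape[1])
                   for y in range(Q.shape[0])
                   if p_x[x] > 0 and Q[y, x] > 0)

    # Illustrative channel: a BSC with flip probability f = 0.15.
    f = 0.15
    Q = np.array([[1 - f, f],
                  [f, 1 - f]])

    # Crude grid search over input distributions (p, 1 - p); the max is C.
    C = max(mutual_information(np.array([p, 1 - p]), Q)
            for p in np.linspace(0.001, 0.999, 999))
    print(C)  # ~0.39 bits, attained at the uniform input p = 0.5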
1 Block Codes
2 The Noisy-Channel Coding Theorem
3 Extended Channels
Communicating over Noisy Channels
Suppose we know we have to communicate over some channel Q, and we want to build an encoder/decoder pair to reliably send a message s over Q.
s → Encoder → x → Q → y → Decoder → ŝ
Block Codes
We now consider codes that make repeated use of a noisy channel to communicate a predefined set of messages S = {1, 2, . . . , S}.
Recall a general encoder is of the form
enc : S → X^N
Equivalently, each s ∈ S = {1, 2, . . . , S} is paired with a unique block of symbols x ∈ X^N
Thus, we can imagine there being S unique codewords {x(1), . . . , x(S)}, where each codeword has block length N
Block Codes: Example
Suppose S = {1, 2, 3, 4}
Message ID s    Encoding
1               00
2               01
3               10
4               11
Block size N = 2
Codewords x(1) = 00, x(2) = 01, and so on
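As an illustration, this example code is literally a lookup table from message IDs to length-2 blocks; a hypothetical Python sketch:

    # The example block code as a lookup table: enc maps each message ID
    # in S = {1, 2, 3, 4} to a unique block of N = 2 binary symbols.
    codewords = {1: "00", 2: "01", 3: "10", 4: "11"}

    def enc(s: int) -> str:
        return codewords[s]

    assert enc(3) == "10"  # x(3) = 10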
Block Codes: Formally
We formalise the preceding with the following notion:
(N,K) Block Code
Given a channel Q with inputs X and outputs Y, an integer N > 0, and K > 0, an (N, K) Block Code for Q is a list of S = 2^K codewords
C = {x(1), x(2), . . . , x(2^K)}
where each x(s) ∈ X^N consists of N symbols from X.
The code is parameterised by the block length N and the number of messages that are encoded.
We parametrise by K = log2 S for mathematical convenience; K doesn't have to be an integer.
Block Codes and Rates
An (N, K) block code makes N uses of a channel to transmit one of S possible outcomes
We can measure the amount of “information” contained in each use as:
Rate of an (N, K ) Block Code
The rate of an (N, K) block code is (log2 S)/N = K/N bits per channel use.
Block Codes: Examples
Examples (for Binary Symmetric Channel Q)
A (1, 1) block code: C = {0, 1} — Rate: 1
A (3, 2) block code: C = {000, 001, 100, 111} — Rate: 2/3
A (3, log2 3) block code: C = {001, 010, 100} — Rate: (log2 3)/3 ≈ 0.53
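A short Python sketch checking these three rates (the code sets are copied from the examples above):

    from math import log2

    # Rate = log2(S) / N, where S is the number of codewords.
    for code, N in [({"0", "1"}, 1),
                    ({"000", "001", "100", "111"}, 3),
                    ({"001", "010", "100"}, 3)]:
        print(log2(len(code)) / N)
    # 1.0, 0.666..., 0.528...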
Decoding Block Codes
An (N, K) block code sends each message s ∈ S = {1, 2, . . . , 2^K} over a channel Q as x(s) ∈ X^N.
The receiver sees the block y ∈ Y^N, and attempts to infer s via some dec : Y^N → S.
Even if X = Y, the decoder must allow for any y ∈ Y^N, not just the expected codewords {x(1), . . . , x(2^K)}
Decoding Block Codes: Formally
Block Decoder
A decoder for an (N, K) block code is a mapping that associates each y ∈ Y^N with an ŝ ∈ {1, 2, . . . , 2^K}.
Ideally, we would like the decoded ŝ to be maximally likely to be equal to s.
Optimal Decoder
An optimal decoder for a code C, channel Q, and prior P(s) maps y to the ŝ such that P(ŝ|y) is maximal.
That is, dec_opt(y) = argmax_s P(s|y) = argmax_s P(y|s) · P(s)
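Below is a minimal Python sketch of such an optimal decoder for a memoryless channel; the function name dec_opt, the tuple form of the codewords, and the BSC test values are illustrative assumptions rather than anything fixed by the definition:

    import numpy as np

    def dec_opt(y, codewords, Q, prior):
        # Optimal (MAP) decoder: argmax over s of P(y | x(s)) * P(s).
        # The channel is memoryless, so P(y | x) factorises over symbols;
        # Q[b, a] = P(symbol b received | symbol a sent).
        posteriors = [np.prod([Q[b, a] for a, b in zip(x, y)]) * p
                      for x, p in zip(codewords, prior)]
        return int(np.argmax(posteriors)) + 1  # messages labelled 1, 2, ...

    # Illustrative check: BSC with f = 0.15 and the code {000, 111}.
    f = 0.15
    Q = np.array([[1 - f, f],
                  [f, 1 - f]])
    print(dec_opt((0, 1, 0), [(0, 0, 0), (1, 1, 1)], Q, prior=[0.5, 0.5]))  # 1

Because the channel is memoryless, P(y|s) factorises into per-symbol transition probabilities, which is all the sketch relies on.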
Decoding Block Codes: Examples
Example: The (3, 1) block code C = {000, 111} and majority vote decoder d : {0,1}^3 → {1, 2} defined by
d(000) = d(001) = d(010) = d(100) = 1
d(111) = d(110) = d(101) = d(011) = 2
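A sketch of this decoder in Python (message labels 1 and 2 as in the example):

    # Majority vote decoder d : {0,1}^3 -> {1, 2}.
    def majority_decode(y):
        return 2 if sum(y) >= 2 else 1  # more 1s -> message 2, else message 1

    assert majority_decode((0, 0, 1)) == 1
    assert majority_decode((1, 0, 1)) == 2

For a BSC with f < 1/2 and equiprobable messages, this majority vote coincides with the optimal decoder above, since P(y|x) decreases with the number of bits in which y and x disagree.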
1 Block Codes
2 The Noisy-Channel Coding Theorem
3 Extended Channels
Rates and Reliability
Ideally, we would like to have high rate for our channel code
Low rate implies that we are being “wasteful” with our channel use
But intuitively, at high rates, we run the risk of losing reliability
If N is small, we may be more easily “confused” about an input
How to measure reliability?
Reliability
Want an encoder/decoder pair to reliably send a message over a channel Q.
s_in → Encoder → x → Q → y → Decoder → s_out
Probability of (Block) Error
Given a channel Q, the probability of (block) error for a code is
p_B = P(s_out ≠ s_in) = Σ_{s_in} P(s_out ≠ s_in | s_in) P(s_in)
and its maximum probability of (block) error is
p_BM = max_{s_in} P(s_out ≠ s_in | s_in)
As P(s_out ≠ s_in | s_in) ≤ p_BM for all s_in, we get p_B ≤ Σ_{s_in} p_BM P(s_in) = p_BM, and so if p_BM → 0 then p_B → 0.
Reliability: Example
Suppose s ∈ {a, b} and we encode by a → 000 and b → 111.
To decode, we count the number of 1s and 0s and decode to the majority value to determine s:
000, 001, 010, 100 → a and 111, 110, 101, 011 → b
If the channel Q is binary symmetric, let A = {000, 001, 010, 100} and B = {111, 110, 101, 011} be the sets of received blocks decoded as a and b respectively. Then
p_B = P(s_in ≠ s_out)
    = P(y ∈ B | 000) p_a + P(y ∈ A | 111) p_b
    = [f³ + 3f²(1−f)] p_a + [f³ + 3f²(1−f)] p_b
    = f³ + 3f²(1−f).
p_BM = max(P(y ∈ B | 000), P(y ∈ A | 111)) = f³ + 3f²(1−f).
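To sanity-check the algebra, a short Python sketch that enumerates all 2³ received blocks; f = 0.15 is an illustrative value matching the BSC example used later in the lecture:

    from itertools import product

    f = 0.15  # illustrative flip probability for the BSC

    def block_error_given(codeword):
        # P(s_out != s_in | s_in): sum P(y | codeword) over the received
        # blocks y that the majority vote decodes to the *other* message.
        err = 0.0
        for y in product((0, 1), repeat=3):
            p = 1.0
            for xi, yi in zip(codeword, y):
                p *= (1 - f) if xi == yi else f
            if (sum(y) >= 2) != (sum(codeword) >= 2):  # decoded differently
                err += p
        return err

    print(block_error_given((0, 0, 0)))   # ~0.0609
    print(f**3 + 3 * f**2 * (1 - f))      # same value, from the formula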
Achievable Rates
Ideally, we would like to consider rates of transmission for which we can guarantee small maximum probability of block error
Even more ideally, we would like rates for which we can guarantee arbitrarily small maximum probability of block error
We will call such rates achievable.
Achievable Rate
A rate R over a channel Q is said to be achievable if, for any ε > 0, there is an (N, K) block code and decoder such that its rate K/N ≥ R and its maximum probability of block error satisfies
p_BM = max_{s_in} P(s_out ≠ s_in | s_in) < ε
Achievable Rates
Achievable rates sound nice in theory, but surely they cannot exist? Surely we will have to drive R → 0 to get small error probability?
Remarkably, we have:
Noisy-Channel Coding Theorem (Brief)
If Q is a channel with capacity C then the rate R is achievable if and only if R ≤ C, that is, the rate is no greater than the channel capacity.
The Noisy-Channel Coding Theorem Example
From the last lecture: the BSC Q with f = 0.15 has capacity C ≈ 0.39 bits.
Suppose we want error less than ε = 0.1 and rate R > 0.25.
The NCCT tells us there should be, for N large enough, an (N, K) code with K/N ≥ 0.25.
Indeed, we showed the code C = {000, 111} with majority vote decoder has probability of error f³ + 3f²(1−f) ≈ 0.061 < 0.1 for Q and rate 1/3 > 0.25: for N = 3 there is a (3, 1) code meeting the requirements.
However, there is no code with the same ε and rate 1/2 > 0.39 = C.
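A quick Python check of these numbers (H2 is the binary entropy; the capacity formula C = 1 − H2(f) for the BSC is from the last lecture):

    from math import log2

    def H2(p):
        # binary entropy in bits (0 < p < 1)
        return -p * log2(p) - (1 - p) * log2(1 - p)

    f = 0.15
    C = 1 - H2(f)                      # BSC capacity from the last lecture
    pB = f**3 + 3 * f**2 * (1 - f)     # block error of the (3,1) code
    print(round(C, 2), round(pB, 3))   # 0.39, 0.061 < epsilon = 0.1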
The Noisy Typewriter Channel
This channel simulates a noisy "typewriter". Inputs and outputs are the 26 letters A through Z plus space. With probability 1/3, each letter is either: unchanged; changed to the next letter; or changed to the previous letter.
Inputs X = {A, B, . . . , Z, -}; Outputs Y = {A, B, . . . , Z, -}
Transition probabilities (columns indexed by input, alphabet ordered A, B, . . . , Z, -, with cyclic wrap-around):

    Q = | 1/3  1/3   0    0   ···   0   1/3 |
        | 1/3  1/3  1/3   0   ···   0    0  |
        |  0   1/3  1/3  1/3  ···   0    0  |
        |  ⋮    ⋮    ⋮    ⋮    ⋱    ⋮    ⋮  |
        | 1/3   0    0    0   ···  1/3  1/3 |

The transition matrix for this channel has a diagonal structure: all of the probability mass is concentrated around the diagonal.
(Figure: reproduction of MacKay, Information Theory, Inference, and Learning Algorithms, p. 148 — the model channels: binary symmetric channel, binary erasure channel, Z channel, and the noisy typewriter, each drawn with its transition probabilities.)
Noisy Channel Coding Theorem
Any rate R < C is achievable for Q (i.e., for any tolerance ε > 0, an (N, K) code with rate K/N ≥ R exists with max. block error p_BM < ε).
Consider a simple example:
For the noisy typewriter Q:
The capacity is C = log2 9 ≈ 3.17 bits. It is achieved by using only the non-confusable subset of inputs {B, E, H, . . . , Z}: these 9 letters have disjoint sets of possible outputs, so every received letter identifies its input unambiguously.
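A short Python sketch of this calculation (assuming NumPy; it builds the 27 × 27 circulant matrix from the earlier slide and, since the channel is symmetric, evaluates I(X;Y) at the uniform input, which achieves capacity here):

    import numpy as np

    # Circulant 27x27 transition matrix: each input goes to itself or a
    # cyclic neighbour with probability 1/3 (alphabet A..Z plus space).
    n = 27
    Q = np.zeros((n, n))
    for x in range(n):
        for y in (x - 1, x, x + 1):
            Q[y % n, x] = 1 / 3

    # Evaluate I(X;Y) at the uniform input.
    p_x = np.full(n, 1 / n)
    p_y = Q @ p_x
    I = sum(p_x[x] * Q[y, x] * np.log2(Q[y, x] / p_y[y])
            for x in range(n) for y in range(n) if Q[y, x] > 0)
    print(I, np.log2(9))  # both ~3.17 bits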