
COMP2610/6261 – Information Theory – Lecture 19: Block Codes and the Coding Theorem

COMP2610/6261 – Information Theory
Lecture 19: Block Codes and the Coding Theorem

Robert C. Williamson

Research School of Computer Science


9 October, 2018

1 / 28

Channel Capacity: Recap

The largest possible reduction in uncertainty achievable across a channel
is its capacity

Channel Capacity
The capacity C of a channel Q is the largest mutual information between
its input and output for any choice of input ensemble. That is,

C = max_{pX} I(X; Y)

2 / 28
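As a quick numerical sanity check (not from the slides; the function names are our own), the sketch below brute-forces C = max_{pX} I(X; Y) for a binary symmetric channel. For f = 0.15 it returns roughly 0.39 bits, the capacity value quoted later in this lecture.

import numpy as np

def mutual_information(p_x, Q):
    """I(X;Y) in bits for input distribution p_x and channel matrix Q,
    where Q[j, i] = P(y = j | x = i)."""
    p_y = Q @ p_x                          # output marginal P(y)
    joint = Q * p_x                        # joint[j, i] = P(y = j, x = i)
    mask = joint > 0
    return float(np.sum(joint[mask] *
                        np.log2(joint[mask] / np.outer(p_y, p_x)[mask])))

def bsc_capacity(f, grid=1001):
    """Brute-force C = max over p_x of I(X;Y) for a BSC with flip probability f."""
    Q = np.array([[1 - f, f],
                  [f, 1 - f]])
    return max(mutual_information(np.array([1 - p, p]), Q)
               for p in np.linspace(0, 1, grid))

print(round(bsc_capacity(0.15), 2))        # 0.39 bits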

1 Block Codes

2 The Noisy-Channel Coding Theorem

3 Extended Channels

3 / 28

1 Block Codes

2 The Noisy-Channel Coding Theorem

3 Extended Channels

4 / 28

Communicating over Noisy Channels

Suppose we know we have to communicate over some channel Q and we
want to build an encoder/decoder pair to reliably send a message s over Q.

sin → Encoder → x → Channel Q → y → Decoder → sout
5 / 28

Block Codes

We now consider codes that make repeated use of a noisy channel to
communicate a predefined set of messages S = {1, 2, . . . ,S}

Recall a general encoder is of the form

enc : S → X^N

Equivalently, each s ∈ S = {1, 2, . . . ,S} is paired with a unique block of
symbols x ∈ X^N

Thus, we can imagine there being S unique codewords {x(1), . . . , x(S)},
where each codeword has block length N

6 / 28


Block Codes: Example

Suppose S = {1, 2, 3, 4}

Message ID s Message encoding
1 00
2 01
3 10
4 11

Block size N = 2

Codewords x(1) = 00, x(2) = 01, and so on

7 / 28
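A block encoder of this kind is just a lookup table from messages to codewords. A minimal illustrative sketch (the variable names are ours, not from the lecture):

# Hypothetical encoder for the example above: each message s in {1, 2, 3, 4}
# is mapped to a block of N = 2 binary symbols.
enc = {1: "00", 2: "01", 3: "10", 4: "11"}
dec = {x: s for s, x in enc.items()}   # inverse table (noiseless channel only)

assert dec[enc[3]] == 3                # the round trip recovers the message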

Block Codes: Formally

We formalise the preceding with the following notion:

(N,K ) Block Code
Given a channel Q with inputs X and outputs Y , an integer N > 0, and
K > 0, an (N,K) Block Code for Q is a list of S = 2^K codewords

C = {x(1), x(2), . . . , x(2^K)}

where each x(s) ∈ X^N consists of N symbols from X.

The code is parameterised by the length of the block, and the number of
messages that are encoded

We parametrise by K = log2 S for mathematical convenience

Doesn’t have to be an integer

8 / 28

Block Codes and Rates

An (N,K ) block code makes N uses of a channel to transmit one of S
possible outcomes

We can measure the amount of “information” contained in each use as:

Rate of an (N,K ) Block Code

The rate of an (N,K) block code is (log2 S)/N = K/N bits per channel use.

9 / 28

Block Codes: Examples

Examples (for Binary Symmetric Channel Q)

A (1, 1) block code: C = {0, 1}. Rate: 1

A (3, 2) block code: C = {000, 001, 100, 111}. Rate: 2/3

A (3, log2 3) block code: C = {001, 010, 100}. Rate: (log2 3)/3 ≈ 0.53

10 / 28
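The rates above follow directly from the definition on the previous slide. A small sketch (our own helper, not part of the lecture) that reproduces them:

import math

def code_rate(codewords):
    """Rate (log2 S)/N = K/N of a block code given as equal-length strings."""
    S = len(codewords)                             # number of messages
    N = len(codewords[0])                          # block length
    return math.log2(S) / N

print(code_rate(["0", "1"]))                       # 1.0
print(code_rate(["000", "001", "100", "111"]))     # 2/3 ≈ 0.667
print(code_rate(["001", "010", "100"]))            # log2(3)/3 ≈ 0.53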

Decoding Block Codes

An (N,K) block code sends each message s ∈ S = {1, 2, . . . , 2^K} over a
channel Q as x(s) ∈ X^N

The receiver sees the block y ∈ Y^N , and attempts to infer s via some

dec : Y^N → S

Even if X = Y , the decoder must allow for any y ∈ Y^N , not just the expected
codewords {x(1), . . . , x(2^K)}

11 / 28


Decoding Block Codes: Formally

Block Decoder
A decoder for an (N,K) block code is a mapping that associates each
y ∈ Y^N with an ŝ ∈ {1, 2, . . . , 2^K}.

Ideally, we would like the decoded ŝ to be maximally likely to be equal to s

Optimal Decoder
An optimal decoder for a code C, channel Q, and prior P(s) maps y to ŝ
such that P(ŝ|y) is maximal.

That is, dec_opt(y) = argmax_s P(s|y) = argmax_s P(y|s) · P(s)

12 / 28

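For a memoryless binary symmetric channel, P(y|s) factorises over symbols, so the optimal decoder can be written in terms of the Hamming distance between y and each codeword. A minimal sketch of such a MAP decoder, assuming a BSC with flip probability f (the function and variable names are ours):

import math

def map_decode(y, codewords, prior, f):
    """MAP decoder over a BSC(f): returns the index s maximising
    P(y | x(s)) P(s), where P(y | x) = f^d (1 - f)^(N - d) and
    d is the Hamming distance between y and x."""
    N = len(y)
    best_s, best_score = None, -math.inf
    for s, x in enumerate(codewords):
        d = sum(yi != xi for yi, xi in zip(y, x))
        score = d * math.log(f) + (N - d) * math.log(1 - f) + math.log(prior[s])
        if score > best_score:
            best_s, best_score = s, score
    return best_s

# With a uniform prior and f < 1/2 this reduces to nearest-codeword
# (majority vote) decoding for the repetition code {000, 111}.
print(map_decode("010", ["000", "111"], [0.5, 0.5], f=0.1))   # 0
print(map_decode("110", ["000", "111"], [0.5, 0.5], f=0.1))   # 1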

Decoding Block Codes: Examples

Example The (3, 1) block code C = {000, 111} and majority vote decoder
d : {0, 1}^3 → {1, 2} defined by

d(000) = d(001) = d(010) = d(100) = 1

d(111) = d(110) = d(101) = d(011) = 2

13 / 28

1 Block Codes

2 The Noisy-Channel Coding Theorem

3 Extended Channels

14 / 28

Rates and Reliability

Ideally, we would like to have a high rate for our channel code

Low rate implies that we are being “wasteful” with our channel use

But intuitively, at high rates, we run the risk of losing reliability

If N is small, we may be more easily “confused” about an input

How to measure reliability?

15 / 28

Reliability

We want an encoder/decoder pair to reliably send messages over a channel
Q.

sin → Encoder → x → Channel Q → y → Decoder → sout

Probability of (Block) Error
Given a channel Q, the probability of (block) error for a code is

pB = P(sout ≠ sin) = Σ_{sin} P(sout ≠ sin | sin) P(sin)

and its maximum probability of (block) error is

pBM = max_{sin} P(sout ≠ sin | sin)

As P(sout ≠ sin | sin) ≤ pBM for all sin, we get pB ≤ Σ_{sin} pBM P(sin) = pBM,

and so if pBM → 0 then pB → 0.

16 / 28


Reliability: Example

Suppose s ∈ {a, b} and we encode by a → 000 and b → 111.
To decode, we count the number of 1s and 0s in the received block and
decide s by majority vote:

A = {000, 001, 010, 100} → a   and   B = {111, 110, 101, 011} → b

If the channel Q is binary symmetric,

pB = P(sin ≠ sout)
   = P(y ∈ B|000) pa + P(y ∈ A|111) pb
   = [f³ + 3f²(1 − f)] pa + [f³ + 3f²(1 − f)] pb
   = f³ + 3f²(1 − f).

In fact,

pBM = max(P(y ∈ B|000), P(y ∈ A|111)) = f³ + 3f²(1 − f).

17 / 28
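A quick enumeration check of the closed form above (our own sketch; the value f = 0.1 is chosen only for illustration and is not from the slide):

from itertools import product

def block_error_prob(f, codewords=("000", "111"), prior=(0.5, 0.5)):
    """Exact p_B for the repetition code over a BSC(f) with majority-vote
    decoding, found by enumerating every possible received block y."""
    def p_y_given_x(y, x):
        d = sum(a != b for a, b in zip(y, x))          # Hamming distance
        return f**d * (1 - f)**(len(x) - d)

    def decode(y):                                     # majority vote
        return 1 if y.count("1") > y.count("0") else 0

    pB = 0.0
    for s, x in enumerate(codewords):
        for bits in product("01", repeat=len(x)):
            y = "".join(bits)
            if decode(y) != s:
                pB += prior[s] * p_y_given_x(y, x)
    return pB

f = 0.1
print(block_error_prob(f))               # ≈ 0.028
print(f**3 + 3 * f**2 * (1 - f))         # matches the closed form above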


Achievable Rates

Ideally, we would like to consider rates of transmission for which we can
guarantee small maximum probability of block error

Even more ideally, we would like rates for which we can guarantee
arbitrarily small maximum probability of block error

We will call such rates achievable

Achievable Rate
A rate R over a channel Q is said to be achievable if, for any ε > 0, there
is an (N,K) block code and decoder such that its rate K/N ≥ R and its
maximum probability of block error satisfies

pBM = max_{sin} P(sout ≠ sin | sin) < ε

18 / 28

Achievable Rates

Achievable rates sound nice in theory, but surely they cannot exist?
Surely we will have to drive R → 0 to get small error probability?

Remarkably, we have:

Noisy-Channel Coding Theorem (Brief)
If Q is a channel with capacity C then the rate R is achievable if and only
if R ≤ C, that is, the rate is no greater than the channel capacity.

19 / 28

The Noisy-Channel Coding Theorem
Example

Example:

In last lecture: the BSC Q with f = 0.15 has capacity C = 0.39 bits.

Suppose we want error less than ε = 0.05 and rate R > 0.25

The NCCT tells us there should be, for N large enough, an (N,K )
code with K/N ≥ 0.25

Indeed, we showed the code C = {000, 111} with majority vote decoder
has probability of error 0.028 < 0.05 for Q and rate 1/3 > 0.25.

For N = 3 there is a (3, 1) code meeting the requirements.

However, there is no code with the same ε and rate 1/2, since 1/2 > 0.39 = C.

20 / 28


The Noisy Typewriter Channel

This channel simulates a noisy “typewriter”. Inputs and outputs are the 26
letters A through Z plus a space. With probability 1/3, each letter is either
unchanged, changed to the next letter, or changed to the previous letter
(circularly, so the space is adjacent to A).

[Diagram: each input letter is mapped to itself and to its two neighbours in
the circular alphabet, each with probability 1/3]

Inputs X = {A, B, . . . , Z, space};
Outputs Y = {A, B, . . . , Z, space};
Transition probabilities

Q = [ 1/3  1/3   0    0   · · ·   0   1/3
      1/3  1/3  1/3   0   · · ·   0    0
       0   1/3  1/3  1/3  · · ·   0    0
                      . . .
      1/3   0    0    0   · · ·  1/3  1/3 ]
The transition matrix for this channel has a diagonal structure: all of the
probability mass is concentrated around the diagonal.

21 / 28
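As a side check (our own sketch, not from the slides), the mutual information of this channel under a uniform input, which is optimal here by symmetry, is H(Y) − H(Y|X) = log2 27 − log2 3 = log2 9 ≈ 3.17 bits; this is the capacity value used on the following slides.

import numpy as np

# Noisy typewriter: 27 symbols in a circle; each input is mapped to itself
# or one of its two neighbours, each with probability 1/3.
A = 27
Q = np.zeros((A, A))
for i in range(A):
    for j in (i - 1, i, i + 1):
        Q[j % A, i] = 1 / 3            # Q[j, i] = P(y = j | x = i)

p_x = np.full(A, 1 / A)                # uniform input (optimal by symmetry)
p_y = Q @ p_x                          # also uniform
I = -np.sum(p_y * np.log2(p_y)) - np.log2(3)   # H(Y) - H(Y|X), H(Y|X) = log2(3)
print(I, np.log2(9))                   # both ≈ 3.17 bits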


Noisy Channel Coding Theorem

NCCT
Any rate R < C is achievable for Q (i.e., for any tolerance ε > 0, an (N,K)
code with rate K/N ≥ R exists with max. block error pBM < ε)

Consider a simple example. For the noisy typewriter Q:

The capacity is C = log2 9

For any ε > 0 and R < C we can choose N = 1 . . .

. . . and code messages using C = {B, E, . . . , Z}

[Embedded excerpt from MacKay §9.7, Figure 9.5: a non-confusable subset of
inputs for the noisy typewriter; with block length N = 1, code {B, E, . . . , Z}
and nearest-letter decoding, the block error probability is zero]

Since |C| = 9 we have K = log2 9, so K/N = log2 9 ≥ R for any R < C, and C
has zero error, so pBM = 0 < ε

22 / 28

1 Block Codes

2 The Noisy-Channel Coding Theorem

3 Extended Channels

23 / 28

Noisy Channel Coding Theorem: How Is This Possible?

The main “trick” to minimising pBM is to construct an (N,K) block code
with (almost) non-confusable codewords

A code such that the sets of y that the codewords x(s) are sent to by Q
have low probability of intersection

This is possible because extended channels look like the noisy typewriter!

24 / 28

Extended Channels

When used N times, a channel Q from X to Y can be seen as an extended
channel taking “symbols” from X^N to “symbols” in Y^N.
Extended Channel
The Nth extended channel of Q from X to Y is a channel from X^N to Y^N
with transition probability from x ∈ X^N to y ∈ Y^N given by

P(y|x) = ∏_{n=1}^{N} P(yn|xn)

Example: The BSC Q with f = 0.1 from X = {0, 1} to Y = {0, 1} has the N = 2
extended channel from X^2 = {00, 01, 10, 11} to Y^2 = {00, 01, 10, 11} with

Q2 = [ 0.81  0.09  0.09  0.01
       0.09  0.81  0.01  0.09
       0.09  0.01  0.81  0.09
       0.01  0.09  0.09  0.81 ]

[Figure (MacKay 9.6): extended channels obtained from a binary symmetric
channel with transition probability 0.15, for N = 1, 2, 4]

25 / 28
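Because the channel is memoryless, P(y|x) factorises over symbols, so the transition matrix of the Nth extended channel is the N-fold Kronecker power of Q. A small sketch (our own code) that reproduces the Q2 shown above for a BSC with f = 0.1:

import numpy as np

def extended_channel(Q, N):
    """Transition matrix of the Nth extended channel of a memoryless channel Q:
    P(y|x) = prod_n P(y_n|x_n), i.e. the N-fold Kronecker power of Q."""
    QN = np.array([[1.0]])
    for _ in range(N):
        QN = np.kron(QN, Q)
    return QN

f = 0.1
Q = np.array([[1 - f, f],              # Q[j, i] = P(y = j | x = i) for a BSC(f)
              [f, 1 - f]])
print(extended_channel(Q, 2))
# Rows and columns are ordered 00, 01, 10, 11; the result matches the Q2
# above: 0.81 on the diagonal, 0.09 for one flipped bit, 0.01 for two.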
Extended Channels and the Noisy Typewriter

As N increases, any extended channel looks like the noisy typewriter!

[Figures (MacKay 9.6, 9.7): the extended binary symmetric channel and the
extended Z channel for N = 1, 2, 4, shown alongside the noisy typewriter
channel]

26 / 28
Extended Channels and the Noisy Typewriter

Why does this happen?

Remember that as N gets larger, sequences x = x1x2 . . . xN start looking
typical

For a given x, the corresponding P(y|x) will also be concentrated on a few
sequences

Formalising this will require a notion of joint typicality

27 / 28

Summary and Reading

Main Points

The Noisy Typewriter
Extended Channels
Block Codes
The Noisy-Channel Coding Theorem (Statement only)

Reading

MacKay §9.6
Cover & Thomas §7.5

Next time: Detail of the NCCT, joint typicality, and a sketch of the proof!

28 / 28