COMP2610/6261 – Information Theory Lecture 18: Channel Capacity
R. Williamson
Research School of Computer Science
8 October, 2018
1 Channel Capacity
2 Computing Capacities
Channels: Recap
Example: sending the message “Hi” over a telephone line
Encoder: Telephone handset
Channel: Analogue telephone line
Decoder: Telephone handset
Destination: Mark
Channels: Recap
A discrete channel Q consists of:
an input alphabet X = {a1,…,aI}
an output alphabet Y = {b1,…,bJ}
transition probabilities P(y|x).
The channel Q can be expressed as a matrix
Qj,i = P(y = bj | x = ai)
This represents the probability of observing bj given that we transmit ai
The Binary Noiseless Channel
One of the simplest channels is the Binary Noiseless Channel. The received symbol is always equal to the transmitted symbol – there is no probability of error, hence noiseless.
Inputs X = {0, 1}; Outputs Y = {0, 1}; Transition probabilities
Q = [ 1  0 ]
    [ 0  1 ]
The Binary Symmetric Channel
Each symbol sent across a binary symmetric channel has a chance of being “flipped” to its counterpart (0 → 1; 1 → 0)
Inputs X = {0, 1}; Outputs Y = {0, 1}; Transition probabilities with P(flip) = f
Q = [ 1−f    f  ]
    [  f    1−f ]
The Z Channel
Symbols may be corrupted over the channel asymmetrically.
Inputs X = {0, 1}; Outputs Y = {0, 1}; Transition probabilities
Q = [ 1    f  ]
    [ 0   1−f ]
Communicating over Noisy Channels
Suppose we know we have to communicate over some channel Q and we want to build an encoder/decoder pair to reliably send a message s over Q.
sin → Encoder → x → Q → y → Decoder → sout
Reliability is measured via probability of error — that is, the probability of incorrectly decoding sout given sin as input:
P(sout ≠ sin) = Σ_s P(sout ≠ sin | sin = s)·P(sin = s)
1 Channel Capacity
2 Computing Capacities
Mutual Information for a Channel
A key quantity when using a channel is the mutual information between its inputs X and outputs Y :
I (X ; Y ) = H (X ) − H (X |Y ) = H (Y ) − H (Y |X )
This measures how much what was received reduces uncertainty about what was transmitted
Note this requires that we specify some particular input distribution p(X)
A channel is only specified by its transition matrix!
Mutual Information for a Channel: Example
For the noiseless channel, H(X|Y) = 0, so I(X;Y) = H(X). If pX = (0.9, 0.1) then I(X;Y) = H2(0.1) ≈ 0.47 bits.
Mutual Information for a Channel: Example
For the binary symmetric channel with f = 0.15 and pX = (0.9, 0.1) we have
p(Y = 1) = p(Y = 1 | X = 1)·p(X = 1) + p(Y = 1 | X = 0)·p(X = 0)
         = (1 − f)·0.1 + f·0.9
         = 0.085 + 0.135 = 0.22,
and so H(Y) = H2(0.22) ≈ 0.76 bits.
Further, H(Y | X = 0) = H(Y | X = 1) = H2(0.15) ≈ 0.61. So, I(X;Y) ≈ 0.76 − 0.61 = 0.15 bits
Mutual Information for a Channel: Example
For the Z channel with f = 0.15 and the same pX we have H(Y) ≈ 0.42 and H(Y|X) ≈ 0.061, so I(X;Y) ≈ 0.36 bits
So, intuitively, the ordering by reliability is noiseless > Z > symmetric
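The three examples above can be reproduced numerically. Below is a minimal sketch (not part of the lecture): it computes I(X;Y) = H(Y) − H(Y|X) directly from a transition matrix Q and an input distribution pX. The helper names entropy_bits and mutual_information are illustrative choices, not a standard API.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits, with the convention 0 * log 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

def mutual_information(Q, pX):
    """I(X;Y) = H(Y) - H(Y|X) for a channel with Q[j, i] = P(y = b_j | x = a_i)."""
    Q, pX = np.asarray(Q, dtype=float), np.asarray(pX, dtype=float)
    pY = Q @ pX                                               # output distribution p(y)
    H_Y_given_X = sum(px * entropy_bits(Q[:, i])              # average column entropy
                      for i, px in enumerate(pX))
    return entropy_bits(pY) - H_Y_given_X

f, pX = 0.15, [0.9, 0.1]
channels = {
    "noiseless": np.eye(2),
    "BSC":       np.array([[1 - f, f], [f, 1 - f]]),
    "Z":         np.array([[1.0, f], [0.0, 1 - f]]),
}
for name, Q in channels.items():
    print(name, round(mutual_information(Q, pX), 2))          # 0.47, 0.15, 0.36
```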
Channel Capacity
The mutual information measure for a channel depends on the choice of input distribution pX. If H(X) is small then I(X;Y) ≤ H(X) is small.
The largest possible reduction in uncertainty achievable across a channel is its capacity.
Channel Capacity
The capacity C of a channel Q is the largest mutual information between its input and output for any choice of input ensemble. That is,
C = max_{pX} I(X;Y)
Later, we will see that the capacity determines the rate at which we can communicate across a channel with arbitrarily small error.
1 Channel Capacity
2 Computing Capacities
Computing Capacities
Definition of capacity for a channel Q with inputs AX and outputs AY:
C = max_{pX} I(X;Y)
How do we actually calculate this quantity?
1 Compute the mutual information I(X ; Y ) for a general pX
2 Determine which choice of pX maximises I(X;Y)
3 Use that maximising value to determine C
Binary Symmetric Channel:
We first consider the binary symmetric channel with AX = AY = {0, 1} and flip probability f . It has transition matrix
Q = [ 1−f    f  ]
    [  f    1−f ]
Computing Capacities Binary Symmetric Channel – Step 1
The mutual information can be expressed as I(X;Y) = H(Y) − H(Y|X). We therefore need to compute two terms: H(Y) and H(Y|X) so we need the distributions P(y) and P(y|x).
Computing H(Y):
P(y = 0) = (1 − f)·P(x = 0) + f·P(x = 1) = (1 − f)·p0 + f·p1
P(y = 1) = (1 − f)·P(x = 1) + f·P(x = 0) = f·p0 + (1 − f)·p1
In general, q := pY = Q pX, so the above calculation is just
q = pY = [ 1−f    f  ] [ p0 ]
         [  f    1−f ] [ p1 ]
Using H2(q) = −q log2 q − (1 − q) log2(1 − q) and letting q = q1 = P(y = 1) we see the entropy is
H(Y) = H2(q1) = H2(f·p0 + (1 − f)·p1)
Computing Capacities Binary Symmetric Channel – Step 1
Computing H(Y|X):
Since P(y|x) is described by the matrix Q, we have
H(Y | x = 0) = H2(P(y = 1 | x = 0)) = H2(f)
and similarly,
H(Y | x = 1) = H2(P(y = 0 | x = 1)) = H2(f)
So,
H(Y|X) = Σ_x P(x)·H(Y | x) = Σ_x P(x)·H2(f) = H2(f)·Σ_x P(x) = H2(f)
Computing I(X;Y): Putting it all together gives
I(X;Y) = H(Y) − H(Y|X) = H2(f·p0 + (1 − f)·p1) − H2(f)
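As a quick check (a sketch, not part of the lecture; H2 here is a hypothetical helper name), the closed form reproduces the 0.15 bits computed earlier for f = 0.15 and pX = (0.9, 0.1):

```python
import numpy as np

def H2(p):
    """Binary entropy in bits, with H2(0) = H2(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

f, (p0, p1) = 0.15, (0.9, 0.1)
I_XY = H2(f * p0 + (1 - f) * p1) - H2(f)   # = H2(0.22) - H2(0.15)
print(round(I_XY, 2))                      # 0.15
```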
Computing Capacities Binary Symmetric Channel – Steps 2 and 3
Binary Symmetric Channel (BSC) with flip probability f ∈ [0, 1]:
I(X;Y) = H2(f·p0 + (1 − f)·p1) − H2(f)

BSC (f = 0) and pX = (0.5, 0.5):
I(X;Y) = H2(0.5) − H2(0) = 1 bit

BSC (f = 0.15) and pX = (0.5, 0.5):
I(X;Y) = H2(0.5) − H2(0.15) ≈ 0.39 bits

BSC (f = 0.15) and pX = (0.9, 0.1):
I(X;Y) = H2(0.22) − H2(0.15) ≈ 0.15 bits

Maximise I(X;Y): Since I(X;Y) is symmetric in p1, it is maximised when p0 = p1 = 0.5, in which case C = H2(0.5) − H2(f) = 1 − H2(f), i.e. C ≈ 0.39 bits for the BSC with f = 0.15.

[Figure 9.2 (MacKay): the mutual information I(X;Y) for a binary symmetric channel with f = 0.15 as a function of the input distribution p1.]
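Since I(X;Y) for the BSC depends on pX only through p1, a one-dimensional sweep is enough to locate the maximum numerically. The sketch below (assuming the H2 helper from the earlier sketch) confirms the optimum at p1 = 0.5 with C ≈ 0.39 bits for f = 0.15.

```python
import numpy as np
# assumes the H2 helper defined in the earlier sketch
f = 0.15
ps = np.linspace(0.0, 1.0, 1001)
mi = np.array([H2(f * (1 - p) + (1 - f) * p) - H2(f) for p in ps])
print(ps[mi.argmax()], round(mi.max(), 2))   # 0.5, 0.39  (= 1 - H2(0.15))
```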
Channel Capacity: Example
For a binary symmetric channel, we could also argue
I(X;Y) = H(Y) − H(Y|X)
        = H(Y) − Σ_x p(X = x)·H(Y | X = x)
        = H(Y) − Σ_x p(X = x)·H2(f)
        = H(Y) − H2(f) ≤ 1 − H2(f),
where equality in the last line holds for uniform pX (which makes Y uniform, so H(Y) = 1)
Symmetric Channels
The BSC was easy to work with due to considerable symmetry.
Symmetric Channel
A channel with inputs AX, outputs AY and matrix Q is symmetric if AY can be partitioned into subsets Y′ ⊆ AY so that each sub-matrix Q′ containing only the rows for outputs in Y′ has:
columns that are all permutations of each other
rows that are all permutations of each other
Symmetric Channels: Examples
Example 1: AX = AY = {0, 1}
Q = [ 0.9  0.1 ]
    [ 0.1  0.9 ]
Symmetric. Subsets: {0, 1}

Example 2: AX = {0, 1}, AY = {0, ?, 1}
Q = [ 0.7  0.1 ]
    [ 0.2  0.2 ]
    [ 0.1  0.7 ]
Symmetric. Subsets: {0, 1}, {?}

Example 3: AX = AY = {0, 1}
Q = [ 0.9  0 ]
    [ 0.1  1 ]
Not Symmetric

If one of our partitions has just one row, then every element in that row must be equal for the columns to be permutations of each other.
Simplest case: all rows and columns are permutations of each other, but this is not a requirement.
Channel Capacity for Symmetric Channels
For symmetric channels, the capacity-achieving input distribution has a simple form:
If Q is symmetric, then its capacity is achieved by a uniform distribution over X .
Exercise 10.10 in MacKay
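As an illustration (a sketch reusing the hypothetical mutual_information helper from the earlier example), the symmetric {0,1} → {0,?,1} channel from the examples slide has its capacity achieved by the uniform input:

```python
import numpy as np
# reuses the mutual_information helper sketched earlier
Q = np.array([[0.7, 0.1],    # P(y = 0 | x)
              [0.2, 0.2],    # P(y = ? | x)
              [0.1, 0.7]])   # P(y = 1 | x)
C = mutual_information(Q, [0.5, 0.5])   # uniform input achieves capacity for symmetric Q
print(round(C, 3))                      # ~0.365 bits
```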
Computing Capacities in General
What can we do if the channel is not symmetric?
We can still calculate I(X ; Y ) for a general input distribution pX
Finding the maximising pX is more challenging
What to do once we know I(X;Y)?
I (X ; Y ) is concave in pX =⇒ single maximum
For binary inputs, just look for stationary points (not for |AX| > 2), i.e., where (d/dp) I(X;Y) = 0 for pX = (1 − p, p) (a numerical sketch follows this list)
In general, need to consider distributions that place 0 probability on one of the inputs
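Because I(X;Y) is concave in pX, a bounded one-dimensional maximisation also works for any binary-input channel, symmetric or not. A sketch (assuming the mutual_information helper from the earlier sketch and SciPy's minimize_scalar), applied to the Z channel whose analytic treatment follows:

```python
import numpy as np
from scipy.optimize import minimize_scalar
# assumes the mutual_information helper sketched earlier
f = 0.15
Q_z = np.array([[1.0, f],
                [0.0, 1 - f]])   # Z channel
res = minimize_scalar(lambda p: -mutual_information(Q_z, [1 - p, p]),
                      bounds=(0.0, 1.0), method="bounded")
print(round(res.x, 3), round(-res.fun, 3))   # ~0.445, ~0.685
```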
Computing Capacities in General
Example (Z Channel with P(y = 0 | x = 1) = f):
H(Y) = H2(P(y = 1)) = H2((1 − f)·p1)
H(Y|X) = p0·H2(P(y = 1 | x = 0)) + p1·H2(P(y = 0 | x = 1))
       = p0·H2(0) + p1·H2(f)
       = p1·H2(f)
So
I(X;Y) = H2((1 − f)·p1) − p1·H2(f)

[Figure 9.3 (MacKay): the mutual information I(X;Y) for a Z channel with f = 0.15 as a function of the input distribution p1. It is maximised at p1 ≈ 0.445, giving C(Q_Z) ≈ 0.685; the optimal input uses symbol 0 more frequently than 1.]
Computing Capacities in General
Example (Z Channel):
Showed earlier that I(X;Y) = H2((1 − f)p) − pH2(f) so solve
d/dp I(X;Y) = 0  ⟺  (1 − f)·log2[(1 − (1 − f)·p) / ((1 − f)·p)] = H2(f)
             ⟺  (1 − (1 − f)·p) / ((1 − f)·p) = 2^(H2(f)/(1−f))
             ⟺  p = (1/(1 − f)) / (1 + 2^(H2(f)/(1−f)))

For f = 0.15, we get p = (1/0.85) / (1 + 2^(0.61/0.85)) ≈ 0.44 and so
C = H2(0.38) − 0.44·H2(0.15) ≈ 0.685
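This closed form can be checked numerically (a sketch reusing the hypothetical H2 helper from the earlier sketch):

```python
# reuses the H2 helper from the earlier sketch
f = 0.15
p_star = (1 / (1 - f)) / (1 + 2 ** (H2(f) / (1 - f)))   # optimal P(x = 1)
C = H2((1 - f) * p_star) - p_star * H2(f)
print(round(p_star, 3), round(C, 3))   # ~0.445, ~0.685
```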
Homework: Show that (d/dp) H2(p) = log2((1 − p)/p)
Why Do We Care?
We have a template for computing channel capacity for generic channels
But what does this tell us?
How, if at all, does it relate to the error probability when decoding?
What, if anything, does it tell us about the amount of redundancy we can get away with when encoding?
We will see next time that there is a deep connection between the capacity and the best achievable rate of transmission
Rates above the capacity cannot be achieved while ensuring arbitrarily small error probabilities
Summary and Conclusions
Mutual information between input and output should be large
Depends on input distribution
Capacity is the maximal possible mutual information
Can compute easily for symmetric channels
Can compute explicitly for generic channels