CS代考 EECS 70 Discrete Mathematics and Probability Theory Fall 2021

EECS 70 Discrete Mathematics and Probability Theory Fall 2021
Error Correcting Codes
In this note, we will discuss the problem of transmitting messages across an unreliable communication chan- nel. The channel may cause some parts of the message (“packets”) to be lost, or dropped; or, more seriously, it may cause some packets to be corrupted. We will learn how to “encode” the message by introducing re- dundancy into it in order to protect against both of these types of errors. Such an encoding scheme is known as an “error correcting code.” Error correcting codes are a major object of study in mathematics, com- puter science and electrical engineering; they belong to a field known as “Information Theory,” one of the core computer sciences1, along with the theory of computation, control theory, communication theory, and estimation/learning/signal-processing theory. In addition to the beautiful theory underlying them (which we will glimpse in this note), they are of great practical importance: every time you use your cellphone, satellite TV, DSP, cable modem, disk drive, CD-ROM, DVD player etc., or send and receive data over the internet, you are using error correcting codes to ensure that information is transmitted reliably. Error-correcting codes are also used in data centers and to speed up parallel computations.
There are, very roughly speaking, (at least) two distinct flavors of error correcting codes: algebraic2 codes, which are based on polynomials over finite fields, and combinatorial codes, which are based on graph theory. In this note we will focus on algebraic codes, and in particular on so-called Reed-Solomon codes (named after two of their inventors). In doing so, we will be making essential use of the properties we learned about polynomials in the last lecture. These error-correcting codes also have a fundamentally linear-algebraic flavor to them as well, as will become clear throughout this note. Being based in finite fields allows the “information as degrees of freedom” perspective from linear-algebraic modeling to be made literal in terms of discrete symbols, bit-rates, etc.
Erasure Errors
We will consider two situations in which we wish to transmit information over an unreliable channel. The first is exemplified by the Internet, where the information (say a file) is broken up into packets, and the unreliability is manifest in the fact that some of the packets are lost during transmission, as shown below:
We refer to such errors as erasure errors. Suppose that the message consists of n packets and suppose that
Engineers and the American Institute of Electrical Engineers. The Radio Engineers were intimately involved with communication,
This is also reflected in an interesting history. The IEEE was formed by the merger of two societies — the Institute of Radio
c1 c2 c3 c4 c5 c6
cryptography, radar, etc. The IRE and AIEE merged to form the IEEE in 1963, but the IRE was already bigger than the AIEE by
1957. This merger was natura3l becau1se ele5ctron0ics was becoming an important implementation substrate for both sets of e3nginee-rs 506
and the modern theory of electrical systems used the same mathematics as radio and signal processing (as well as control). This is
since you simply experience EECS as a single entity, but the history remains in some of the course numbers.
geometry — the branch of mathematics that studies the roots of polynomial equations.
EECS 70, Fall 2021, Note 9 1
encoded message
m1 m2 m3 m4 c1 – c3 c4 c5 why the word “EE” is often used to describe these computer sciences. At Berkeley, of course, this is all just a historical footnote
2The ones we study are representative of what are called algebraic-geometry codes, because of their connections to algebraic
at most k packets are lost during transmission. We will show how to encode the initial message consisting of n packets into a redundant encoding consisting of n + k packets such that the recipient can reconstruct the message from any n received packets. Note that in this setting the packets are labeled with headers, and thus the recipient knows exactly which packets were dropped during transmission.
We can assume without loss of generality that the content of each packet is a number modulo q, where q is a prime. For example, the content of the packet might be a 32-bit string and can therefore be regarded as a number between 0 and 232 − 1; then we could choose q to be any prime larger than 232. The properties of polynomials over GF(q) (i.e., with coefficients and values reduced modulo q) are perfectly suited3 to solve this problem and are the backbone of this error-correcting scheme.
To see this, let us denote the message to be sent by m1,…,mn, where each mi is a number in GF(q), and make the following crucial observations:
1. There is a unique polynomial P(x) of degree n − 1 such that P(i) = mi for 1 ≤ i ≤ n (i.e., P(x) contains all of the information about the message, and evaluating P(i) gives the intended contents of the i-th packet).
2. The message to be sent is now m1 = P(1), . . . , mn = P(n). We can generate additional packets by evalu- ating P(x) at points n + j. (Recall that our transmitted codeword should be redundant, i.e., it should contain more packets than the original message to account for the lost packets. This is the distinction between a codeword and a message. A codeword is what is transmitted and has redundancy by construction while the message is something that the user gives us and does not necessarily contain any redundancy.) Thus the transmitted codeword is c1 = P(1), c2 = P(2), . . . , cn+k = P(n + k). Since we are working modulo q, we must make sure that n + k ≤ q, but this condition does not impose a serious constraint since q is assumed to be very large.
3. We can uniquely reconstruct P(x) from its values at any n distinct points, since it has degree n − 1. This means that P(x) can be reconstructed from any n of the transmitted packets (not just the original n packets). Once we have reconstructed the polynomial P, we can evaluate P(x) at x = 1, . . . , n to recover the original message m1,…,mn.
Suppose Alice wants to send Bob a message of n = 4 packets and she wants to guard against k = 2 lost packets. Then, assuming the packets can be coded up as integers between 0 and 6, Alice can work over GF (7) (since 7 ≥ n + k = 6; of course, in real applications, we would be working over a much larger field!). SupposethemessagethatAlicewantstosendtoBobism1 =3,m2 =1,m3 =5,andm4 =0. Theunique polynomial of degree n − 1 = 3 described by these 4 points is P(x) = x3 + 4×2 + 5.
Exercise. We derived this polynomial using Lagrange interpolation mod 7. Check this derivation, and verify alsothatindeedP(i)=mi for1≤i≤4.
Since k = 2, Alice must evaluate P(x) at 2 extra points: P(5) = 6 and P(6) = 1. Now, Alice can transmit the encoded codeword which consists of n+k = 6 packets, where cj = P(j) for 1 ≤ j ≤ 6. So Alice will send c1 =P(1)=3,c2 =P(2)=1,c3 =P(3)=5,c4 =P(4)=0,c5 =P(5)=6,andc6 =P(6)=1.
Now suppose packets 2 and 6 are dropped, in which case we have the following situation:
3In real-world implementations, we do not do this. Instead, we work directly in finite fields that have size 232 because that is a prime power and working with fields that are a power-of-two in size is convenient for computer operations. However, the efficient construction of such fields is beyond the scope of 70. A taste can be had in Math 114 and in EE229B.
EECS 70, Fall 2021, Note 9 2
m1 m2 m3 m4
c1 c2 c3 c4 c5 c6
encoded message
3 – 5 0 6 –
c1 – c3 c4 c5 – the values that Bob received (3, 5, 0, and 6), he uses Lagrange interpolation and computes the follow- ing basis polynomials (where everything should be interpreted mod 7):
∆1(x) = (x−3)(x−4)(x−5) ≡ 2(x−3)(x−4)(x−5) −24
∆3(x) = (x−1)(x−4)(x−5) ≡ 2(x−1)(x−4)(x−5) 4
∆4(x) = (x−1)(x−3)(x−5) ≡ 2(x−1)(x−3)(x−5) −3
∆5(x) = (x−1)(x−3)(x−4) ≡ (x−1)(x−3)(x−4) 8
(mod 7) (mod 7) (mod 7)
(Note that we have used the fact here that the inverses of −24, 4, −3 and 8 (mod 7) are 2, 2, 2 and 1 respectively.) He then reconstructs the polynomial P(x) = 3 · ∆1(x) + 5 · ∆3(x) + 0 · ∆4(x) + 6 · ∆5(x) = x3 + 4×2 + 5 (mod 7). Bob then evaluates m2 = P(2) = 1, which is the packet that was lost from the original codeword. More generally, no matter which two packets were dropped, following exactly the same method Bob can always reconstruct P(x) and thus the original underlying message.
Exercise. Check Bob’s calculation above, and verify that he really does reconstruct the correct polynomial, as claimed. Remember that all arithmetic must be done mod 7.
Let us consider what would happen if Alice sent one fewer packet. If Alice only sent c j for 1 ≤ j ≤ n + k − 1, then with k erasures, Bob would only receive cj for n−1 distinct values j. Thus, Bob would not be able to reconstruct P(x) (since there are q polynomials of degree at most n − 1 that agree with the n − 1 packets which Bob received)! This error-correcting scheme is therefore optimal: it can recover the n characters of the transmitted message from any n received characters, but recovery from any smaller number of characters is impossible.
Polynomial Interpolation Revisited
Let us take a brief digression to discuss another method of polynomial interpolation (different from Lagrange interpolation discussed in the previous note) which will be useful in handling general errors. We need to make the underlying linear-algebra here more explicit so that we can better lean on it going forward.
Again, the goal of the algorithm will be to take as input d +1 pairs (x1,y1),··· ,(xd+1,yd+1), and output the polynomialp(x)=adxd+···+a1x+a0 suchthatp(xi)=yi fori=1tod+1.
The first step of the algorithm is to write a system of d + 1 linear equations in d + 1 variables: the variables are
the coefficients of the polynomial, a0 , . . . , ad . Each equation is obtained by fixing x to be one of d + 1 values:
x1,··· ,xd+1. Note that in p(x), x is a variable and a0,…,ad are fixed constants. In the equations below, these
roles are swapped: xi is a fixed constant and a0,…,ad are variables. For example, the i-th equation is the
result of fixing x to be xi, and saying that the value of the polynomial is yi: adxd +ad−1xd−1 +···+a0 = yi. ii
Now solving these equations gives the coefficients of the polynomial p(x). For example, suppose we are EECS 70, Fall 2021, Note 9 3
given the three pairs (−1, 2), (0, 1), and (2, 5); our goal is to construct the degree-2 polynomial p(x) which goes through these points. The first equation says a2(−1)2 + a1(−1) + a0 = 2. Simplifying, we get a2 − a1 + a0 = 2. Similarly, the second equation says a2 (0)2 + a1 (0) + a0 = 1, or a0 = 1. And the third equation says a2 (2)2 + a1 (2) + a0 = 5 So we get the following system of equations:
a2 − a1 + a0 = 2 a0 = 1 4a2 + 2a1 + a0 = 5
Substituting for a0 and multiplying the first equation by 2 we get: 2a2 − 2a1 = 2
4a2 + 2a1 = 4
Then, adding the two equations we find that 6a2 = 6, so a2 = 1, and plugging back in we find that a1 = 0. Thus, we have determined the polynomial p(x) = x2 + 1. To justify this method more carefully, we must show that such systems of equations always have a solution and that it is unique. Fortunately, the work that we did in the previous note via has already established this. (Can you see why?) This is one of the advantages of being able to look at the same problem from different angles — this theme will become more prominent in later parts of the course.
General Errors
Let us now return to our main topic of error correction, and consider a much more challenging scenario. Suppose that Alice wishes to communicate with Bob over a noisy channel (say, via a modem). Her mes- sage is m1,…,mn, where we may think of the mi’s as characters (either bytes or characters in the English alphabet). The problem now is that some of the characters are corrupted during transmission due to channel noise. Thus Bob receives exactly as many characters as Alice transmits, but k of them are corrupted, and Bob has no idea which k these are! If you’d like, you can think of these as being corrupted by a malicious adversary that is only limited by being able to corrupt no more than k characters. Recovering from such gen- eral errors is much more challenging than recovering from erasure errors, though once again polynomials hold the key. As we shall see, Alice can still guard against k general errors, at the expense of transmitting only 2k additional characters (twice as many as in the erasure case we saw above).
We will again think of each character as a number modulo q for some prime q. (For the English alphabet, q is some prime larger than 26, say q = 29.) As before, we can describe the message by a polynomial P(x) of degree n − 1 over GF(q), such that P(1) = m1, . . . , P(n) = mn. As before, to cope with the errors Alice will transmit additional characters obtained by evaluating P(x) at additional points. As mentioned above, in order to guard against k general errors, Alice must transmit 2k additional characters: thus the encoded codewordisc1,…,cn+2k wherecj =P(j)for1≤ j≤n+2k,andn+kofthesecharactersthatBobreceives are uncorrupted. As before, we must put the mild constraint on q that it be large enough so that q > n + 2k.
For example, if Alice wishes to send n = 4 characters to Bob via a modem in which k = 1 of the characters is corrupted, she must redundantly send an encoded message consisting of 6 characters. Suppose she wants to transmit the same message (3150) as in our erasure example above, and that c1 is corrupted from 3 to 2. This scenario can be visualized in the following figure:
From Bob’s viewpoint, the problem of reconstructing Alice’s message is the same as reconstructing the polynomial P(x) from the n+2k received characters r1,r2,…,rn+2k. In other words, Bob is given n+2k values, r1,r2,…,rn+2k modulo q, with the promise that there is a polynomial P(x) of degree n−1 over
EECS 70, Fall 2021, Note 9 4
m1 m2 m3 m4
c1 c2 c3 c4 c5 c6
encoded message
3 – 5 0 6 –
c1 – c3 c4 c5 – Bob
m1 m2 m3 m4
c1 c2 c3 c4 c5 c6
encoded message
2 1 5 0 6 1
r1 r2 r3 r4 r5 r6
GF (q) such that P(i) = ri for n + k distinct values of i between 1 and n + 2k. Bob must reconstruct P(x) from c1 c2 c3 c4 c5
thisdata(intheaboveexample,n+k=5,andr2 =P(2)=1,r3 =P(3)=5,r4 =P(4)=0,r5 =P(5)=6, 306 20603
and r6 = P(6) = 1). Note, however, thaetnBcodbedoemsensostakgneow which of the n + k values are correct! m1 m2 m3 r1 r2 r3 r4 r5
Does Bob even have sufficient information to reconstruct P(x)? Our first observation shows that this is at
least plausible: for any given subset of n + k values of i between 1 and n + 2k, there is a unique polynomial P(x) such that P(i) = ri at these values of i. To see this, suppose that P′(x) is any other polynomial of degree n − 1 that goes through these n + k points. Then among these n + k points there are at most k errors, and therefore on at least n of the points we must have P′(i) = P(i). But, as we saw in the previous note (Property 2), a polynomial of degree n−1 is uniquely defined by its values at n points, and therefore P′(x) and P(x) must be the same polynomial.
To summarize, then, Bob’s task is to find a polynomial P(x) of degree n − 1 such that P(i) = ri for at least n + k values of i. If he can do this, he can be sure that the polynomial he has found is in fact the original polynomial P(x).
But how can Bob efficiently find such a polynomial? The issue at hand is the locations of the k errors. Let e1,…,ek be the k locations at which errors occurred (so that P(i) = ri for all i ∈/ {e1,…,ek}). Bob doesn’t know where these errors are.
Now Bob could try to guess where the k errors lie, but this would take too long (it would take exponential time, in fact). Instead, Bob will employ a clever trick based on the following definition. Consider the so-called error-locator polynomial
E(x) = (x−e1)(x−e2)···(x−ek).
Note that E(x) is a polynomial of degree k (since x appears k times). Note also that Bob does not know this polynomial explicitly, because he does not know the positions ei of the errors. However, he will use the polynomial symbolically, and will eventually compute the values ei that appear in it!
Let us make a simple but crucial observation about this polynomial:
P(i)E(i) = riE(i) for 1 ≤ i ≤ n+2k. (1)
To see this, note that it holds at points i at which no error occurred since at those points P(i) = ri; and it is trivially true at points i at which an error occurred since then E(i) = 0.
This observation forms the basis of a very clever algorithm invented by Berlekamp and Welch. Looking more closely at the equalities in (1), we will show that they can be made to correspond to n + 2k linear equations in n + 2k unknowns, from which the locations of the errors and coefficients of P(x) can be easily deduced (in analogous fashion to the interpolation scheme we saw in the previous section).
Define the polynomial Q(x) := P(x)E (x), which has degree n + k − 1 and is therefore described by n + k coefficients a0,a1,…,an+k−1. The error-locator polynomial E(x) = (x−e1)···(x−ek) has degree k and is described by k+1 coefficients b0,b1,…,bk but the leading coefficient (the coefficient bk of xk) is always 1.
EECS 70, Fall 2021, Note 9 5
So we can write:
Q(x) = an+k−1xn+k−1 +an+k−2xn+k−2 +···+a1x+a0; E(x) = xk +bk−1xk−1 +···+b1x+b0.
(Remember that we don’t yet know any of the coefficients ai or bi.)
Now notice that, substituting Q(x) = P(x)E (x) in (1), we get the n + 2k equations
Q(i) = riE(i) for 1 ≤ i ≤ n+2k. Writing out the ith equation using the coefficients of Q(x) and E(x), we get
an+k−1in+k−1 +an+k−2in+k−2 +···+a1i+a0 = ri(ik +bk−1ik−1 +···+b1i+b0)
This is a set of n+2k linear equations, one for each value of i, in the n+2k unknowns a0,a1,…,an+k−1,
(mod q). b0,b1,…,bk−1. We can solve the systems of linear equations and get E(x) and Q(x). We can then compute
the ratio Q(x) to obtain P(x). This is best illustrated by an example.
Example. Suppose we are working over GF(7) and Alice wants to send Bob the n = 3 characters “3,” “0,”
and “6” over a modem. Turning to the analogy of the English alphabet, this is equivalent to using only the
(x) = x2 +x+1, Bob which is the unique polynomial of degree 2 such that P(1) = 3, P(2) = 0, and P(3) = 6.
c1 c2 c3 c4 c5 c6
first 7 letters of the alphabet, where a = 0, . . . , g = 6. So the message which Alice wishes for Bob to receive 3150 3-506-
is“dag”.ThenAliceinterpolates(e.g.,usinegncLoadgeradnmge:sesxaegrecise!)tofindthepolynomial
m1 m2 m3 m4 c1 – c3 c4 c5 –
SupposeAliceknowsthatuptok=1chacra1ctecr2cocu3ldcb4eco5rruc6pted;thensheneedstotransmitthen+2k=5 3150 215061
characters P(1) = 3, P(2) = 0, P(3) = 6, P(4) = 0, and P(5) = 3 to Bob. Suppose P(1) is actually corrupted,
m1 m2 m3 m4
encoded message
r1 r2 r3 r4 r5 r6
and Bob receives the character 2 instead of 3 (i.e., Alice sends the encoded codeword “dagad” but Bob instead receives “caAgliacde”). Summarizing, we have the following situation: E (x) = (x − e1 ) = x + b0 be the error-locator polynomial—remember, Bob doesn’t know what e1 = −b0 is yet since he doesn’t know where the error occurred—and let Q(x) = a3 x3 + a2 x2 + a1 x + a0 (where again the coefficients ai are unknown). Now Bob just substitutes i = 1, i = 2, . . . , i = 5 into the equations Q(i) = riE(i) from (1) above and simplifies to get five linear equations in five unknowns (recall that we are working modulo 7 and that ri is the value Bob received for the i-th character):
a3 + a2 + a1 + a0 + 5b0 = 2 a3 + 4a2 + 2a1 + a0 = 0 6a3 +2a2 +3a1 +a0 +b0 = 4 a3 + 2a2 + 4a1 + a0 = 0 6a3 +4a2 +5a1 +a0 +4b0 = 1
c1 c2 c3 c4 c5
encoded message
r1 r2 r3 r4 r5
ECS 70, Fall 2021, Note 9