Last Modified: May 23, 2020
CS 179: Introduction to Graphical Models: Spring 2020
Homework 5
Due Date: Friday, June 5th
The submission for this homework should be a single PDF file containing all of the relevant code, figures, and any text explaining your results. When coding your answers, try to write functions to encapsulate and reuse code, instead of copying and pasting the same code multiple times. This will not only reduce your programming efforts, but also make it easier for us to understand and give credit for your work. Show and explain the reasoning behind your work!
A Jupyter notebook with some helpful code structure has been provided; you may want to use it as a starting point.
Background: Error Correcting Codes
In this homework, we will explore error correcting codes, probability, and graphical models. Whenever we wish to represent data in an error-prone medium, whether transmitting it wirelessly from a phone or storing it on flash or magnetic drives, it is often very useful to encode it in such a way that the inevitable errors that creep into the data can be corrected. For a broad overview, see https://en.wikipedia.org/wiki/Error_detection_and_correction.
As a simple model of noise, suppose that whenever we transmit (or store and retrieve) a bit, there is an independent probability p that we read the wrong bit value, i.e., if we stored a zero we read a one, and vice versa. (This is called the “binary symmetric channel” model.)
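For intuition, the binary symmetric channel is easy to simulate. The sketch below is only an illustration (the helper name flip_bits is ours, not from the template); it flips each transmitted bit independently with probability p:

    import numpy as np

    def flip_bits(T, p, rng=None):
        """Simulate a binary symmetric channel: flip each bit of T independently with probability p."""
        rng = np.random.default_rng() if rng is None else rng
        T = np.asarray(T)
        errors = rng.random(T.shape) < p          # True wherever the channel corrupts a bit
        return np.bitwise_xor(T, errors.astype(T.dtype))

    # e.g., send the codeword 000 over a channel with p = 0.05
    print(flip_bits([0, 0, 0], 0.05))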
Repetition codes. The simplest possible error correcting code is a repetition code, where we simply transmit multiple copies of the same bit. If we mean to send a 0, we can send “000”, and for 1 send “111”. Then, if one bit is transmitted incorrectly, we can still recover the correct value. The original “0” is called the data bit(s), while the transmitted “000” is called the codeword. Note that any error correction forces us to send extra bits; the ratio of the data length to the codeword length is called the rate, so this is a rate-1/3 repetition code. It is also easy to see that we can correct one error (e.g., decode 010 → 0) by simple majority vote, but cannot correct two errors (e.g., 110 → 1, since it is more probable this is 111 with a single bit error).
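A rate-1/3 repetition code and its majority-vote decoder fit in a few lines; the following is just an illustrative sketch, not the template's code:

    import numpy as np

    def rep_encode(D, k=3):
        """Repeat each data bit k times: [0, 1] -> [0, 0, 0, 1, 1, 1]."""
        return np.repeat(np.asarray(D), k)

    def rep_decode(T, k=3):
        """Majority vote within each group of k received bits."""
        groups = np.asarray(T).reshape(-1, k)
        return (groups.sum(axis=1) > k // 2).astype(int)

    print(rep_decode([0, 1, 0, 1, 1, 0]))   # -> [0 1]: one error in each group of three is corrected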
Hamming codes. A more sophisticated approach uses more carefully chosen parity bits to add redundancy. The (7,4) Hamming code (for 4 data bits encoded into 7 code bits) is given by
    0000 → 0000000        0001 → 1101001
     ...                    ...
    1000 → 1110000        1001 → 0011001
     ...                    ...
where the first, second, and fourth digits of each codeword are the redundant, parity-check bit additions. See the table at https://en.wikipedia.org/wiki/Hamming(7,4) for the full code, and an elegant Venn-diagram illustration of how the data bits and parity-check bits are related. For the purpose of our problem, it is sufficient to know that (a) we transmit 7 bits for any block of 4 data bits, and (b) if we encounter an error in zero or one bits, we can recover from (correct for) the error, while errors in two or more bits will cause at least one error in the four data bits.
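For concreteness, here is one way to write the (7,4) encoder and single-error corrector, using the bit ordering from the Wikipedia article (parity bits in positions 1, 2, and 4); the template's checks may order the bits differently, so treat this only as an illustration:

    import numpy as np

    def hamming74_encode(D):
        """Encode 4 data bits (d1..d4) as 7 code bits, ordered p1 p2 d1 p3 d2 d3 d4."""
        d1, d2, d3, d4 = D
        p1 = d1 ^ d2 ^ d4
        p2 = d1 ^ d3 ^ d4
        p3 = d2 ^ d3 ^ d4
        return np.array([p1, p2, d1, p3, d2, d3, d4])

    def hamming74_correct(T):
        """Correct at most one flipped bit using the syndrome, then return the 4 data bits."""
        T = np.array(T)
        s1 = T[0] ^ T[2] ^ T[4] ^ T[6]     # parity over positions 1, 3, 5, 7
        s2 = T[1] ^ T[2] ^ T[5] ^ T[6]     # parity over positions 2, 3, 6, 7
        s3 = T[3] ^ T[4] ^ T[5] ^ T[6]     # parity over positions 4, 5, 6, 7
        pos = s1 + 2 * s2 + 4 * s3         # syndrome = 1-based error position (0 = no error found)
        if pos:
            T[pos - 1] ^= 1
        return T[[2, 4, 5, 6]]             # the data bits d1..d4

    print(hamming74_encode([1, 0, 0, 1]))            # -> [0 0 1 1 0 0 1], matching the table above
    print(hamming74_correct([0, 0, 1, 1, 0, 1, 1]))  # one bit flipped; recovers [1 0 0 1]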
Problem 1: Probability
Let us analytically evaluate the probability of making an error using either no code, a repetition code, or a Hamming code under various settings. We will send “blocks” of size B = 4 bits, and wish to recover these blocks error-free. For the purposes of this problem, let the probability p of an error be p = 0.05.
(a) Compute the probability that, with no encoding at all, we will make at least one mistake in sending a block of four bits.
(b) Now compute the probability that, with a rate 1/3 repetition code, we will make at least one error in the block of four bits. Note that we are now sending 12 bits: 3 copies of each of the four data bits; any single error in each of these four sets of 3 bits can be recovered from, but not two errors.
(c) Now compute the probability that, with a Hamming (7,4) code, we will make at least one error in the four bits. (Again, we send 7 bits, and decode all 4 correctly if we have at most one error in the 7.)
You should find that the repetition code is slightly better, in terms of the probability of error; however, this ignores the fact that it must transmit almost twice as many bits as the Hamming code. In order to make this a bit more comparable, let us suppose that we are willing to use rate 1/2, but no lower – so, we are willing to use 8 bits to send a block of size 4, but no more. Then, our repetition code can only afford to repeat two bits, and must send the other two unencoded.
(d) Compute the probability of making at least one error if we use a repetition code for two bits, and send two bits unencoded. (So: single errors in the first 3 bits and second 3 bits are recoverable, but single errors in the last two bits are not.)
The difference between these two codes, intuitively, is that in the repetition code we introduce redundancy for individual bits, while in the Hamming code we introduce redundancy that spans several bits. By distributing the redundancy better across the data bits, we can improve our ability to correct for (rare) errors.
Problem 2: Simulations
Let us next set up a basic framework for encoding a data block, simulating bit errors, and then decoding and checking for correctness. First, we will use it to confirm the Hamming result from the previous problem.
In the homework template, you will find a function encode(D, checks) that takes a data block D and a set of parity checks checks and returns an encoded block T. Parity checks for a (7,4) Hamming code are also provided; for example, T4 equals the parity (XOR) of D0, D1, and D2, and so on. Finally, a simple greedy_decode function can decode the codeword block. (It is written slightly more generally than necessary.)
(a) Using the provided functions, encode the data [0,1,1,0], and verify that the resulting codeword decodes correctly.
(b) Try decoding after introducing a single bit error, and again after introducing a second bit error. Comment on your results.
(c) Finally, a small simulation loop (also provided) will estimate the probability of an error in blocks of size B=4. Try this, and verify that it matches your analytic solution from Problem 1.
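For reference, a simulation loop of roughly this shape (written here in terms of the illustrative hamming74_encode, hamming74_correct, and flip_bits helpers sketched above, not the template's functions) should produce an estimate close to your Problem 1(c) answer:

    import numpy as np

    def estimate_block_error(p=0.05, trials=100_000, seed=0):
        """Estimate the fraction of 4-bit blocks decoded incorrectly after a Hamming (7,4) round trip."""
        rng = np.random.default_rng(seed)
        errors = 0
        for _ in range(trials):
            D = rng.integers(0, 2, size=4)                 # random 4-bit data block
            T = hamming74_encode(D)                        # 7-bit codeword
            D_hat = hamming74_correct(flip_bits(T, p, rng))
            errors += not np.array_equal(D, D_hat)
        return errors / trials

    print(estimate_block_error())   # should be close to the analytic value from Problem 1(c)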
Problem 3: Encoding Larger Blocks
Elegant, hand-designed codes like the Hamming code are used in many systems; compact discs use a class of codes called Reed-Solomon codes. But for many systems, graphical models provide the best error correction techniques. Convolutional codes work as a finite state machine and are decoded using dynamic programming on a Markov chain; here, though, we’ll look at one of the best types of codes for large blocks: LDPC codes and loopy graphical models.
(a) Suppose that we have a large block (say, B = 100) that we wish to transmit without error. If we encode this with 25 Hamming (7,4) codes, we will have an error if any one of these 25 codes introduces an error. Making use of your result from Problem 1(c), compute (analytically) the probability of this happening.
(b) Code provided in the template will generate a parity check array for this encoding. Run your simulation procedure on this (larger) code and verify that your estimate matches your analytic solution.
LDPC codes use many parity checks randomly distributed through the data block in order to create a large amount of interconnected redundancy. To ensure that every bit has some redundancy, we will generate one check bit for each data bit, and connect it to two other randomly selected data bits. You can use the provided parity check generator to create a random LDPC code.
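The provided generator handles this for you; purely to illustrate the idea, a random LDPC-style check structure might be built as follows (the list-of-index-triples format here is made up and may not match the template's array layout):

    import numpy as np

    def random_ldpc_checks(B, rng=None):
        """For each data bit i, make one parity check over {i, j, k}, with j, k chosen at random."""
        rng = np.random.default_rng() if rng is None else rng
        checks = []
        for i in range(B):
            j, k = rng.choice([b for b in range(B) if b != i], size=2, replace=False)
            checks.append((i, int(j), int(k)))        # check bit = D[i] ^ D[j] ^ D[k]
        return checks

    def ldpc_encode(D, checks):
        """Append one parity bit per check to the data block (a systematic, rate-1/2 encoding)."""
        D = np.asarray(D)
        parity = [D[i] ^ D[j] ^ D[k] for (i, j, k) in checks]
        return np.concatenate([D, np.array(parity)])

    rng = np.random.default_rng(0)
    checks = random_ldpc_checks(100, rng)
    T = ldpc_encode(rng.integers(0, 2, size=100), checks)
    print(T.shape)                                    # 200 transmitted bits for 100 data bits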
LDPC Decoding. To decode an LDPC code, we want to find the closest (i.e., most probable) valid codeword: a weighted constraint satisfaction problem. We can build a graphical model that encodes the parity constraints, along with the probability of each bit being in error, and then either find the most likely joint configuration (most probable codeword), or the most likely value of each bit individually (the maximum of each bit’s marginal probability). Unfortunately, the resulting graph is high treewidth, so we will resort to approximate inference to do the prediction.
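Concretely, the model has one evidence factor per transmitted bit and one hard constraint factor per parity check. The template builds this model for you (and its factor representation may differ), but written as plain NumPy tables the pieces look roughly like this:

    import numpy as np

    p = 0.05   # channel error probability

    def evidence_factor(r):
        """Unary factor for a bit whose received value is r: f(x) = P(receive r | send x)."""
        return np.array([1 - p, p]) if r == 0 else np.array([p, 1 - p])

    # Hard parity-constraint factor over (D[i], D[j], D[k], check bit C):
    # f = 1 if C equals D[i] ^ D[j] ^ D[k], and 0 otherwise.
    parity_factor = np.zeros((2, 2, 2, 2))
    for di in range(2):
        for dj in range(2):
            for dk in range(2):
                parity_factor[di, dj, dk, di ^ dj ^ dk] = 1.0

The product of all these factors is (proportional to) the posterior over codewords given the received bits; LBP approximates its marginals, which are then rounded to produce the decoded bits.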
(c) Your template code contains a decoding function based on running loopy belief propagation (LBP) on a graphical model. Test this routine by encoding a data block with your LDPC code, introducing a small number of errors, and decoding.
(d) Use your simulation loop to estimate the block error rate (probability of making a mistake in the B = 100 size block) using your LDPC code. Note that the LBP decoding is pretty slow (due in part to Python’s limitations), so I suggest only running 100 simulation trials; if you have a lot of time on your hands, feel free to run more.
(e) (Extra Credit) Modify your simulation loops to also measure the bit error rate, i.e., the average number of bits that are incorrect in the decoding. Compare this value between the Hamming (7,4) code and the LDPC code.
(f) (Extra Credit) Modify the decoding function to use an approximate MAP estimation function instead, for example the dual decomposition implementation in gm.messagepass. Estimate the block error rate for this decoding method and compare it with the LBP performance.