COMP 3711 − Design and Analysis of Algorithms 2022 Spring Semester − Written Assignment # 3
Distributed: March 20, 2022 Due: April 3, 2022, 11:59 PM
Your solution should contain
(i) your name, (ii) your student ID #, and (iii) your email address at the
Copyright By PowCoder代写 加微信 powcoder
top of its first page. Some Notes:
• Please write clearly and briefly. In particular, your solutions should be written or printed on clean white paper with no watermarks, i.e., student society paper is not allowed.
• Please follow the guidelines on doing your own work and avoiding plagiarism as described in the class policies. You must acknowledge individuals who assisted you, or sources where you found solutions. Failure to do so will be considered plagiarism.
• Please make a copy of your assignment before submitting it. If we can¡ ̄t find your submission, we will ask you to resubmit the copy.
• Submit a SOFTCOPY of your assignment to Canvas by the deadline. The softcopy should be one PDF file (no word or jpegs permitted, nor multiple files). If your submission is a scan of a handwritten solution, make sure that it is of high enough resolution to be easily read. At least 300dpi and possible denser.
Problem 1 [30 pts] Greedy File Allocation
Give an O(nlogn) algorithm for assigning location of n files on a magnetic tapes that minimizes the expected time necessary to read a file. As input you are given the size of the files and the probability that each file will be accessed.
Background: Files used to be stored on magnetic tapes rather than disks. Reading a file from tape is not like reading a file from a random access disk. On a tape, the files are laid out sequentially, one after another. To access a file on a tape we first must scan all the other files preceding the file, which takes a significant amount of time. The ordering of the files on the tape therefore defines how long it takes to read any particular file.
More explicitly, an ordering of the files will just be the ordering of how to store the files on the tape. For example, the ordering (3,1,2,4) means that file 1 will be placed as the 2nd file, file 2 will be placed as the 3rd file, file 3 will be placed as the 1st file and file 4 will be placed as the 4th file.
Suppose L[i] is the size of file i. The time Ti it takes to read file i is the sum of the sizes of the files located before file i plus the size of file i. In the example above, if L = [2, 6, 4, 8], then
T3 =L[3]=4,
T1 =L[3]+L[1]=4+2=6
T2 =L[3]+L[1]+L[2]=4+2+6=12
T4 =L[3]+L[1]+L[2]+L[4]=4+2+6+8=20
Suppose that you know that i will be accessed with probability pi and the ordering in which the files are on the tape. The average time to read a file will be
To continue the example above, if the probabilities were p = [1, 1 , 1, 1] then 6 12 4 2
the average read time would be
1·6+ 1 ·12+1·4+1·20=13. 6 12 4 2
On the other hand, if we had the same probabilities but the files were in the order (1, 3, 4, 2) then the reading times would be
T1′ = L[1] = 2
T3′ =L[1]+L[3]=2+4=6
T4′ =L[1]+L[3]+L[4]=2+4+8=14
T2′ =L[1]+L[3]+L[4]+L[2]=2+4+8+6=20
and the average read time would be
1·2+ 1 ·20+1·6+1·14=10.5, 6 12 4 2
which is less than the time for the previous ordering. 2
The problem:
Input is an array of L[1..n] of the n files sizes and a list p1,p2,··· ,pn of the probability of accessing each file.
Give an O(n log n) algorithm that outputs an optimal ordering of the files on the tape, i.e., an ordering that minimizes the expected time to read a file.
To do this answer the following questions:
(a) First assume that all the files are equally likely, i.e., for all files pi = n1. Prove that in this case, inserting the files in order of increasing size gives an optimal solution.
For example, if pi = 0.2 for all i and L[1..5] = [2, 5, 7, 1, 4] they should be inserted in the increasing size order 1, 2, 4, 5, 7.
Note that this property would lead to an O(n log n) algorithm for solving the problem in the equal probability case.
(b) Show that in the general case, sorting the files by size and inserting them on the tape in order of increasing size does not have to give an optimal solution.
(c) Describe a rule that does solve the problem and show that it leads to the optimal ordering. Also, explain how this gives you an O(n log n) greedy algorithm for solving the problem.
(To get full credits, your ordering must lead to an O(n log n) solution.)
Problem 2 [15 pts]
The table below lists a vocabulary of 8 characters (a to h) and their frequencies
in a document.
i12345678 ai abcdefgh
f(ai) 12 3 4 9 20 14 8 13
Apply Huffman coding to construct an optimal codebook. Your solution
should contain:
(a) A full Huffman tree.
(b) The final codebook that contains the binary codewords representing a to h (sorted in the alphabetical order on a to h).
Rules, Recommendations and Hints:
• In part (a) you should explicitly label each edge in the tree with a ’0’ or ’1’. You should also label each leaf with its corresponding character ai and f(ai) value.
• Part (b) should be a table with 2 columns. The first column should be the letters a to h. The second column should be the codeword associated with each character.
Problem 3 [25 pts] Greedy
You are given two arrays ROW[1..n] and COLUMN[1..m] of natural num- bers. ROW[i] counts the number of 1’s in the i-th row of a n × m binary matrix. Similarly, COLUMN[j] counts the number of 1’s in the j-th row of the matrix.
For example, ROW [2, 2] and COLUMN [2, 1, 1, 0] correspond to the 2×4 matrix 1 1 0 0
1 0 1 0 . Summing all the elements across row 1 we get 1+1+0+0 = 2.
Similarly across row 2, we get 1+0+1+0 = 2. Thus, ROW is [2,2]. If we sum the elements across the columns, we get 1+1 = 2, 1+0 = 1, 0+1 = 1, and 0 + 0 = 0. Therefore, COLUMN is [2, 1, 1, 0].
i 1234 1100 2
1010 2 COLUMN 2 1 1 0 ROW
(a) Show that the matrix decoding is not unique.
(b) Design a greedy algorithm for reconstructing the matrix.
Problem 4 [30 pts] Greedy/Dynamic Programming
Suppose that there are n gas stations along a circular route. The target is to start from any potential gas station i move to i + 1, i + 2, … and return to the starting location i without running out of gas. Additionally, you are given two integer arrays G and C of length n. G[i] represents the amount of gas you refill in station i, while C[i] indicates the amount of gas you need to consume to visit the next station i + 1. Design an algorithm such that it finds a potential starting location i. Note that in each gas station you refill the maximum amount available and the tank does not have a limit. Also, it is possible that you might not be able to complete the trip, thus your algorithm should return -1.
For example, given the two arrays below, we could start the trip from i = 3. Meaning we start with 20 gallons in our tank. To move to station i = 4, we consume 5 gallons, leaving 15 gallons in the tank. In location i = 4 we refill 1 gallon increasing the tank to 16 gallons and consume 6 to visit station i = 1. Skipping a few steps, we finally arrive to our starting point i = 3 with 9 gallons left in our tank.
Note that it is not possible to start from any other gas station. For example, if we start from the gas station at position i = 1 we refill 2 gallons and consume 1 gallon to visit i = 2. The tank contains 1 gallon and we refill with 3 more gallons, but to get to the next station we need 5 gallons, which is not enough since we only got 4 gallons in the tank.
i 1234 G(gasrefill) 2 3 20 1 C (gas consumption) 1 5 5 6
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com