CS代考 COMP 3711 – Design and Analysis of Algorithms 2022 Spring Semester – Writte

COMP 3711 – Design and Analysis of Algorithms 2022 Spring Semester – Written Assignment 2 Distributed: March 4, 2022
Due: March 18, 2022
Your solutions should contain (i) your name, (ii) your student ID #, and (iii) your email address
Some Notes:

Copyright By PowCoder代写 加微信 powcoder

􏵵 Please write clearly and briefly. In particular, your solutions should be written or printed on clean white paper with no watermarks, i.e., student society paper is not allowed.
􏵵 Please follow the guidelines on doing your own work and avoiding plagia- rism as described in the class policies. You must acknowledge indi- viduals who assisted you, or sources where you found solutions. Failure to do so will be considered plagiarism.
􏵵 The term Documented Pseudocode means that you must include clear comments inside your pseudocode.
􏵵 Please make a copy of your assignment before submitting it. If we can’t find your submission, we will ask you to resubmit the copy.
􏵵 Submit a SOFTCOPY of your assignment to Canvas by the deadline. The softcopy should be one PDF file (no word or jpegs permitted, nor multiple files). If your submission is a scan of a handwritten solution, make sure that it is of high enough resolution to be easily read. At least 300dpi and possible denser.

Problem 1 [20 pts] Greedy algorithms
You are given a set I of n time intervals. A time instant ti hits a time interval Ij = [sj,ej] if sj ≤ ti ≤ ej. Your task is to design a greedy algorithm that finds a smallest set of time instants that hits all the time intervals.
For example, suppose that I = {[2, 4], [1, 4], [4, 5], [3, 5]}. Then the set {3, 5} of time instants hits all these time intervals as each time interval is hit by either instant 3 or 5. the optimal solution is the time instant 4, which hits all intervals.
Give a greedy algorithm that finds the smallest set of time instants. Provide both documented pseudocode and an explanation in words as to what it does. Justify the correctness of your algorithm.
For a group of time intervals, we call the group accessible if the intersection of all these time intervals is non-empty, which means these intervals can be hit by a single time instant in their intersection. A partition of I is optimal if each part of the partition is accessible and we cannot merge any two parts to get an accessible group. The underlying idea of the algorithm is simple, i.e., find the optimal partition.
First, we sort all time intervals in I = {I1 , …, In } according to their starting times. Second, scan from the first time interval. Once we find an integer j such that {I1,…,Ij} is accessible and {I1,…,Ij+1} is not accessible, we maintain {I1,…Ij} as an accessible group and start a new accessible group construction from Ij+1. The pseudocode is as follows.
Algorithm 1: Algorithm for finding the optimal partition
Sort I according to starting times;
R ← ∅ \\ R stores the maximal accessible groups ; G ← ∅;
for i from 1 to n do
if G ∪ {Ii} is accessible then G ← G ∪ {Ii};
R ← R ∪ {G};
G ← {Ii}; end
Since each element Gj ∈ R is accessible, we can construct a set {t1, …, t|R|} of time instants such that tj hits all time intervals in Gj, where |R| is the cardinality of R. Finally, {t1, …, t|R|} is returned as the solution.

The correctness is guaranteed by the fact that R is an optimal partition of I. Assume that {Ii,…,Ij} is a group in R. According to Algorithm 1, the following properties are satisfied:
􏵵 􏶁i≤k≤j Ik ̸= ∅
􏵵 􏶁i≤k≤j+1 Ik = ∅
Let [si,j,ei,j] be the intersection of {Ii,…,Ij}. We then have ei,j < sj+1. Since I is sorted according to the starting time of each time interval, for any group in R before {Ii, ..., Ij }, its intersection does not intersect Ii. Thus, {Ii, ...., Ij } can not be merged with a group before it to produce a new accessible group. For any group in R behind {Ii,...,Ij}, we have ei,j smaller than the starting time of intersection of this group. Thus, {Ii, ...., Ij } can not be merged with a group behind it to produce a new accessible group. Thus, R is an optimal partition of I. Problem 2 [20 pts] Radix Sorting You are given a set of 10 decimal integers in the range of 1 to 65535: A = [32121, 28932, 34789, 54233, 52608, 18987, 47689, 35700, 23781, 61231]. (a) Please conduct Radix sort on A using Base 10. Illustrate your result after each step. (b) NowconvertthesedecimalintegerstohexadecimalandconductRadix sort again, this time using Base 16. Illustrate your result after each step. (a) The result is as follows: (b) After conversion, the array is A = [7d79, 7104, 87e5, d3d9, cd80, 4a2b, ba49, 8b74, 5ce5, ef 2f ]. And the result is as follows: Problem 3 [30 pts] (Selection Procedure) Recall the selection problem that we learned in class. Given an unsorted array A = [x1 , x2 , ..., xn ], we need to return the ith smallest element in A. We have already learned a randomized algorithm for the selection problem with expected running time O(n). The Problem: (a) Consider a new pivot choosing strategy. First, we generate a uniform random permutation of A. Let [y1,...,yn] denote the permutation. The probability that xi is yj for an arbitrary 1 ≤ j ≤ n is exactly Next, we choose yi as the pivot with a probability of Pr[yi is chosen as a pivot] = i . 1+···+n After replacing the original pivot selection strategy with the new one, we have a new randomized algorithm for the selection problem. Assume that we can produce a uniform random permutation of A in O(n) time and a pivot can be chosen in O(1) following the random distribution. Please analyse the expected running time of the new algorithm. Hints: You can follow the analysis in class and compute the proba- bility that a good pivot is chosen. (b) Now suppose that you are given two unsorted arrays A and B, con- taining m and n elements respectively. You are also given a black box O(n) selection algorithm SEL that can be called on A and B separately and, in addition, a small number (say, ten) of extra mem- ory words that can be used to store some extra data. Using SEL as a subroutine, design an O(n + m) algorithm for finding the kth smallest element in the combined set of O(m + n) elements. You may not merge the two arrays in another array and call an algo- rithm (either SEL or something you wrote) on this third array. This is because there is no extra memory available to build this third array. (a) Directly from the running time analysis of the original selection procedure, we call a pivot ”good” if it is between 25%- and 75%-percentile of A. First, we analyse the probability fo a random pivot being good. Due to random permutation, the probability that yi is a good pivot is Pr[yi isagoodpivot]=21. Pr [random pivot is good] = 􏰫 Pr [yi is chosen and yi is good] After i good pivots, total number of items left to process is at most 􏵹 3 􏵺i n. 4 Let i-th stage be the time between the i-th good pivot and the (i + 1)-st good pivot. Let Xi be the number of pivots in the ith stage. We have E[Xi] = 2(waiting time) Let Yi be the running time of the ith stage. Since the number of items being processed in array in the ith stage is at most 􏵹3􏵺i n, we have 4 􏵿 􏵻3􏵼i 􏶀 􏵻3􏵼i E[Yi]≤E Xi 4 n =2 4 n. Thus, the expected total running time is no more than 􏰪i E[Yi], which is O(n). (b) The key idea is how to shrink the search space for the kth smallest element. By SEL, we can find the medians of A and B, respectively. In addition, we can get the ranks of A’s median and B’s median in the combined array in linear time, i.e, O(n+m) time. Without loss of generality, assume that A’s median is larger than B’s median. We can distinguish different cases as follows: Then we have Pr [yi is chosen] – The kth element is smaller than both A’s median and B’s median. Then the kth element locates in the combined array of the left half of A and the left half of B. The sub-problem is finding the kth smallest element in this combined array. – The kth element is between A’s median and B’s median. Then the kth element locates in the combined array of the left half of A and the right half of B. Let k1 be the value of k minus the rank of B’s median. The sub-problem is finding the kth smallest element in 1 this combined array. – The kth element is larger than both A’s median and B’s median. Then the kth element locates in the combined array of the right half of A and the right half of B. Let k2 be the value of k minus the sum of the ranks of A’s median and B’s median. The sub-problem is finding the kth smallest element in this combined array. The algorithm is as following. For the recursive algorithm, half of the elements are discard in each recursion, which means the input size of the sub-problem is half of the input size of the original problem. By considering all the extra operations that include finding the medians of A and B and comparing them to the target element, the running time recurrence is T (n + m) = T ((n + m)/2) + O(n). Algorithm 2: SELECT (A, B, k): Return the kth smallest element in the combined array of A and B if k=1then x ← SEL(A, 1); y ← SEL(B, 1); Return the smaller element between x and y; m1 ← A’s median ; m2 ← B’s median ; r1 ← the rank of m1 in the combined array; r2 ← the rank of m2 in the combined array; if k≤r1 andk≤r2 then A1 ← the left half of A; B1 ← the left half of B; k1 ← k; if k≥r1 andk≥r2 then A1 ← the right half of A; B1 ← the right half of B; k1 ← (k − r1 − r2); if m1 ≥ m2 then A1 ← the left half of A; B1 ← the right half of B; k1 ←(k−r2); A1 ← the right half of A; B1 ← the left half of B; k1 ←(k−r1); Return SELECT (A1, B1, k1); end Problem 4 [30 pts] (Using Selection) Note: Recall the Selection problem of calculating Select(A,p,r,i) that we learned in class. We actually learned a randomized linear, i.e., O(r−p+1), time algorithm for solving Selection. We also stated that a deterministic linear time algorithm for solving this problem exists. For the sake of this problem, you may assume that you have been given this deterministic lin- ear time algorithm and can use it as a subroutine. (calling it exactly as Select(A, p, r, i)). You may not use any other algorithm as a subroutine without explicitly providing the code and proving correctness from scratch. The Manhattan Distance between two 2-dimensional points p = (x, y) and p′ =(x′,y′)is d1(p, p′) = |x − x′| + |y − y′| In what follows you are given real numbers x1, ..., xn and y1, ..., yn. They are NOT necessarily sorted. Define the n two-dimensional points pi = (xi,yi). The Manhattan Ware- house problem is to find a point p = (x,y) that minimizes the average distance to all the pi. That is, to find p such that ∀p′ =(x′,y′)∈R2,􏰫d1(p,pi)≤􏰫d1(p′,pi). (Dividing both sides of the equation by n gives the average.) Design a O(n) worst case time algorithm for solving the Manhattan Ware- house problem. We split this into two parts. (A) Solve the one-dimensional problem. Given (unsorted) x1, ...xn, define the function f (x) = 􏰫 |x − xi |. Call x ̄ a center if it minimizes f(x), i.e., ∀x ∈ R, f(x ̄) ≤ f(x). (a) Prove that there exist z1 , z2 such that z1 ≤ z2 and (i) f(x) is monotonically decreasing for x ∈ [−∞,z1]. (ii) f(z1) = f(x) = f(z2) for all x ∈ [z1,z2]. (iii) f(x) is monotonically increasing for x ∈ [z2 , ∞]. Note that it is possible that z1 = z2. (b) Give an O(n) time algorithm for finding a center of n one- dimensional points x1, ..., xn. First give documented pseudocode for your algorithm. In addition, below your algorithm, explain in words/symbols what your algorithm is doing and justify the correctness. (c) Explain why your algorithm runs in O(n) time. (B) Solve the Manhattan Warehouse Problem. (a) Give an O(n) time algorithm for solving the Manhattan Ware- house Problem given n points p1, p2, ..., pn where pi = (xi, yi). First give documented pseudocode for your algorithm. In addition, below your algorithm, explain in words/symbols what your algorithm is doing and justify the correctness. (b) Explain why your algorithm runs in O(n) time. – For simplicity, you may assume that none of the xi or yi repeat, i.e., there is no pair i,j such that xi =xj or yi =yj. – Part A(a) is only meant as a hint to get you started. Note that if you know z1, z2, then (you must prove this) any x ∈ [z1, z2] will be a solution to the one-dimensional center problem. A more detailed understanding of the behavior of f(x), that lets you find z1,z2, therefore permits solving the remainder of A. You may, if you like, fully analyze the behavior of f(x) in A(a), write this as a mathematical lemma and then quote that lemma in A(c). – We strongly recommend that before trying to solve part A you choose 5 or 6 points xi and then graph the resultant function f(x). Seeing the diagram should help you solve the problem. – The algorithm for part (B) can use the result of part (A) as a sub- routine. – The main work in solving this problem is in understanding how to solve it using selection. Once you understand that, the actual pseu- docode might be quite short. Solutions: Before starting it is worthwhile to look at examples of f(x) drawn for 5 and 6 points. For 5 points the minimum is exactly at f(x3). For 6 points, the minimum is any x satisfying x3 ≤ x ≤ x4. Note that in the first problem, f(x) reaches its minimum exactly at x = x3, the median point. In the second, f(x) reaches its minimum at all x ∈ [x3, x4], i.e., at all points between the two minimum. Caveat: To simplify the illustration, the example assumed that the xi are sorted. In the real input they are not. The main work in the problem was to identify that it can be reduced to finding the median which can be done in O(n) time WITHOUT sorting the input. A (a) For the analysis (NOT for the algorithm), sort the xi and relabel them as y1,y2,...,yn where y1 < y2 < ··· < yn. Then, the yi are just a relabelling of the xi, so f (x) + 􏰫 |x − xi | = 􏰫 |x − yt |. Then, we can expand f(x) as −nx +(f(y )+ny ),y≤y f(x)= −(n−2i)x +(f(yi)+(n−2i)yi),1≤i≤n and yi ≤y≤yi+1 nx + (f (yn ) − nyn ) , y ≤ yn Recall that y = mx + b is the equation of a line. · If m < 0, y is a decreasing function; · If m = 0, y is a constant function; · If m > 0, y is an increasing function.
f(x) is a piecewise linear function with slope −n towards −∞, slope n towards ∞ and slope that is increasing by 2 at each successive yi value.
The slope of f(x) is negative as long as n − 2i > 0 and the slope is positive when n − 2i < 0. Note that that if n is even, for i = n/2, n − 2i = 0 so there’s an interval in which the slope is 0. If n is odd then n − 2i > 0 for i < ⌈n/2⌉ and n − 2i < 0 for i ≥ ⌈n/2⌉. This indicates that we need to split the analysis into two cases. One for n even and one for n odd. If n is odd: the slope of f(x) switches from negative to positive at y⌈n/2⌉ so · f(x) is monotonically decreasing for x ≤ y⌈n/2⌉ and 11 · monotonically increasing for x > y⌈n/2⌉.
The statement is therefore correct for z1 = yn/2 = z2.
Note that this implies that f(x) achieves its minimum at x0 = y⌈n/2⌉ .
If n is even:
· the slope of f(x) is negative for x < yn/2 and · positive for x < yn/2+1. · he slope of f(x) is zero for yn/2 ≤ x ≤ yn/2+1. The statement is therefore correct for z1 = yn/2 and z2 = yn/2+1. Note that this implies that f(x) achieves its minimum for every x satisfying yn/2 ≤ x ≤ yn/2+1. (b) The algorithm is Center(x1, ..., xn) : 1. Create array A[1, ..., n] with A[i] = xi 2. Return Select(A, 1, n, ⌈n/2⌉) % Find median of the xi This is just using the given procedure to find the median item. Correctness follows directly from (a). (c) Follows from the linearity of the Select procedure given at the start of the problem. B (a) Warehouse(x1,...,xn,y1,...,yn): 1. x = Center(x1, ..., xn) % Find median of the xi 2. y = Center(y1, ..., yn) % Find median of the yi 3. Return (x, y) Separately find the medians of the xi and yi. Combined, those give you the x and y coordinates of the solution to the Warehouse problem. The cost of the Warehouse problem can be written as n􏵽n􏵾􏵽n􏵾 􏰫d1(p,pi)= 􏰫|x−xi| + 􏰫|y−yi| . i=1 i=1 i=1 Since there is no connection between the x part and the y part, we can minimize this by separately finding x that minimizes 􏰪ni=1 |x − xi| and y that minimizes 􏰪ni=1 |y − yi|. Thus, the algorithm is correct. (b) Follows from the linearity of the procedure given in part (a). 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com