1
1.
CSCI 570 – Fall 2020 – HW 3
Graded Problems
There is a stream of integers that comes continuously to a small server. The job of the server is to keep track of k largest numbers that it has seen so far. The server has the following restrictions:
(a) It can process only one number from the stream at a time, which means it takes a number from the stream, processes it, finishes with that number and takes the next number from the stream. It cannot take more than one number from the stream at a time due to memory restriction.
(b) It has enough memory to store up to k integers in a simple data structure (e.g. an array), and some extra memory for computation (like comparison, etc.).
(c) The time complexity for processing one number must be better than ¦¨(k). Anything that is ¦¨(k) or worse is not acceptable.
Design an algorithm on the server to perform its job with the requirements listed above. Solution: Use a binary min-heap on the server.
(a) Do not wait until k numbers have arrived at the server to build the heap, otherwise you would incur a time complexity of O(k). Instead, build the heap on-the-fly, i.e. as soon as a number arrives, if the heap is not full, insert the number into the heap and execute Heapify(). The first k numbers are obviously the k largest numbers that the server has seen.
(b) When a new number x arrives and the heap is full, compare x to the minimum number r in the heap located at the root, which can be done in O(1) time. If x ¡Ü r, ignore x. Otherwise, run Extract-min() and insert the new number x into the heap and call Heapify() to maintain the structure of the heap.
(c) Both Extract-min() and Heapify() can be done in O(log k) time. Hence, the overall complexity is O(log k).
When we have two sorted lists of numbers in non-descending order, and we need to merge them into one sorted list, we can simply compare the first two elements of the lists, extract the smaller one and attach it to the end of the new list, and repeat until one of the two original lists become empty, then we attach the remaining numbers to the end of the new list and it¡¯s done. This takes linear time. Now, try to give an algorithm using O(n log k) time to merge k sorted lists (you can also assume that they contain numbers in non-descending order) into one sorted list, where n is the total number of elements in all the input lists. (Hint: Use a min-heap for k-way merging.)
Solt. Construct a min-heap of the k sublists, that is each sublist is a node in the constructed min-heap. When comparing two sublists, compare their first elements (that is their minimum elements). The creation of this min-heap will cost O(k) time.
Then we run the extract min algorithm and extract the minimum element from the root list. Then update the root sublist and heapify the min-heap according to the new minimum element in the root sublist. (If the root sublist becomes empty during this step, take any leaf sublist and make it the root and then heapify). Each extraction takes O(log k) time.
2.
1
Since we extract n elements in total, the running time is O(n log k + k) = O(n log k) (since k < n). 2 Practice Problems 1. Reading Assignment: Kleinberg and Tardos, Chapter 2.5. Page 2