Proj1-spec
Submission Deadline for the Project is 23:59:59 PM, on 25 Sep, 2017¶
Instructions:¶
You need to implement your code in submission.py file, provided along with this notebook.
For all questions, your codes in the file submission.py need to output your results according to corresponding format specified in Proj1-spec.ipynb.
We will not consider any output printed out on the screen. All results should be returned in appropriate data structures return by corresponding functions.
This Project is not designed to be a full-fledged software program, so we will not evaluate it deliberately using any invalid inputs.
You are not allowed to change the names of the functions to be submitted in submission.py.
If you choose to skip a question, leave the corresponding function body in the file submission.py as it is (i.e., keep the pass line), otherwise it will affect your marks for other questions.
You are allowed to add other functions and import modules (which you may have to in this project), but you are not allowed to define global variables, i.e., only functions are allowed in submission.py. Also be careful not to import unnecessary modules, as it may lead to unexpected errors.
You are allowed to submit as many times as you want before the deadline, but ONLY the latest version will be kept and marked.
For your convenience, we will process your codes submitted before the Deadline using a sample dataset, you should not consider these results as your final grades.
For Final Grades we will be using a different dataset, so your final scores may vary.
In [ ]:
# Question 0: An example (0 point)
We illustrate the steps for completing the **project** notebook and prepare **`submission.py`** by solving the `Question 0`. In this example question, you are required to implement a function that takes two integer arguments `a` and `b`, and output their sum.
You will be provided with the definition of the function, as given below:
In [23]:
def add(a, b): # do not change the heading of the function
pass # **replace** this line with your code
Step 1: You need to put your implementation in the function body like below (you should remove the pass line from the function body):
In [24]:
def add(a, b): # do not change the heading of the function
return a + b
Step 2: you need to paste your code to submission.py, which originally contains only function definitions. We have added this function in submission.py as a sample solution to Question 0 for your reference.
Here is the demonstration of our testing environment, which explains how we will evaluate your submission.
In [25]:
import submission
result = submission.add(5,3)
print(result)
8
As you can see, all you need to/have to do is to implement the function; do not print anything out from your own implementation. (c.f., Instruction-#3)
Question 1: List Intersection using Galloping Search (35 point)¶
You need to implement a function named: gallop_to using the galloping search algorithm.
Specifications:¶
We will test your implementation of gallop_to using a seperate function intersection_galloping, which is already provided to you in this notebook.
Input of the intersection_galloping function consists of two objects of the InvertedList class, initiated with sorted list of integers, e.g.,
a = InvertedList([2, 4, 6, 8, 10, 12, 14, 16, 18])
b = InvertedList([1, 2, 4, 8, 16, 32])
intersection_galloping(a, b) calls the function: gallop_to by passing, an InvertedList object to traverse, and an integer value that we need to gallop to.
In addition to galloping, each run of gallop_to will also return an integer count representing the number of times you assess a value (under the current pointer) in the InvertedList object passed to the gallop_to function.
Final output of intersection_galloping will be a list of integers, and a count representing the sum of all the count variables returned by the gallop_to function.
Example:¶
(i) intersection_galloping(a, b)¶
(a) input:¶
a = InvertedList([2, 4, 6, 8, 10, 12, 14, 16, 18])
b = InvertedList([1, 2, 4, 8, 16, 32])
(b) output:¶
([2, 4, 8, 16], 17)
(ii) gallop_to(a, 8)¶
(a) input:¶
a = InvertedList([2, 4, 6, 8, 10, 12, 14, 16, 18])
val = 8
In the above example, for InvertedList object a with current cursor position = 2, corresponding to current value = 6, gallop_to(a, 8) will return a count = 2, which corresponds to the sum of 1st peek at cursor position = 2, value = 6, and a 2nd peek at cursor position = 4, value = 10, after we doubled the interval.
(b) output:¶
2
Note: In gallop_to, you are only allowed to update count, for the elements of InvertedList that you checked as:
smaller than val (i.e., elements< val), and the first element that is greater than or equal to the val (i.e., element >= val).
The InvertedList class¶
To facilitate the implementation of Question 1, we provide this InvertedList class which encapsulates all the important operations needed to be performed on an inverted list.
In [1]:
class InvertedList:
def __init__(self, l):
self.data = l[:] # make a copy
self.cur = 0 # the cursor
def get_list(self):
return self.data
def eol(self):
# we use cur == len(list) to indicate EOL
return False if self.cur < len(self.data) else True
def next(self, val = 1):
# does not allow cur to be out-of-range, but use len(list) to indicate EOL
self.cur = min(self.cur + val, len(self.data))
def elem(self):
if self.eol():
return None
else:
return self.data[self.cur]
def peek(self, pos):
# look at the element under the current cursor, but does not advance the cursor.
if pos < len(self.data):
return self.data[pos]
else:
return None
def reset(self):
self.cur = 0
You need to implement your code below, and copy this code in the file submission.py
In [27]:
def gallop_to(a, val):# do not change the heading of the function
pass # **replace** this line with your code
Intersection Algorithm using gallop_to()¶
After implementing the gallop_to() function, you will be using Intersection Algorithm, i.e., intersection_galloping to test your results.
In this function, we initialize a variable named: count to store the total number of times you peek a value in InvertedList passed to the gallop_to() function.
You will observe, the galloping search algorithm quickly skip over the list to reposition the cursor.
In [1]:
import submission
def intersection_galloping(a, b):
# just in case these lists have been traversed.
a.reset()
b.reset()
count = 0
ret = []
while not a.eol() and not b.eol():
if a.elem() == b.elem():
ret.append(a.elem())
a.next() # Note that here you are only allowed to move the cursor of one InvertedList Object.
else:
if a.elem() < b.elem():
count = count + submission.gallop_to(a,b.elem())
else:
count = count + submission.gallop_to(b,a.elem())
# end_while
return ret, count
After implementation of the gallop_to function, you can test your implementation here. It should result an output ([2, 4, 8, 16], 17)
In [2]:
a = InvertedList([2, 4, 6, 8, 10, 12, 14, 16, 18])
b = InvertedList([1, 2, 4, 8, 16, 32])
result = intersection_galloping(a, b)
print(result)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
—-> 1 a = InvertedList([2, 4, 6, 8, 10, 12, 14, 16, 18])
2 b = InvertedList([1, 2, 4, 8, 16, 32])
3 result = intersection_galloping(a, b)
4 print(result)
NameError: name ‘InvertedList’ is not defined
Question 2: Logarithmic Merge (35 point)¶
In this question, we simulate the implementation of (binary) Logarithmic_Merge for Index Merging, given in L4-slides (36-37).
You are required to implement the Logarithmic Megre function: Logarithmic_merge(index, cut_off, buffer_size).
The function takes 3 parameters as input,i.e., (i) index, (ii) cut_off, and (iii) buffer_size.
First parameter, index, represents the live streaming index, we simulate the index using a list of unordered integers.
Second parameter, cut_off, is the cut_off point for the streaming index, it is illustrated in the example below.
Third parameter, buffer_size, represents the maximum size of your memory buffer. When index is greater than or equal to the buffer_size, we write down the index in a file.
Note: You are not supposed to actually write down your index in the file, you will be simulating the index writing part by adding the index to a list.¶
Example for cut_off:¶
index = [15, 46, 19, 93, 73, 64, 33, 80, 73, 26, 22, -77-, 27, …..]
Assume that you have access to live-streaming data, which is being populated in your list named: index. The parameter, cut_off defines a threshold for the index. For-example, cut_off = 12 will restrict the function:Logarithmic_merge(index, cut_off, buffer_size) to use top 12 values, i.e., it will only consider values upto and including the value 77.
Specifications:¶
Your inputs will be the (i) index, i.e., unsorted list of integers, (ii) cut_off, i.e., the cut-off point, and (iii) buffer_size, i.e., the maximum size of index that can be kept in memory.
Your output should be a list of sub-lists, where each sub-list stores intermediate result of logarithmic merge. First sub-list will store the value for Z0, the second sub-list will store the value of L0, and the subsequent sub-lists will store values for L1, L2, L3 and so on…
Note: Don’t remove empty sub-lists from your final result. It will effect your evaluation.¶
Example:¶
input: (index, cut_off, buffer_size), e.g., ([15, 46, 19, 93, 73, 64, 33, 80, 73, 26, 22, 77, 27], 10, 3)¶
output: List of sub-lists storing all the intermediate results, and the final result, e.g., [[26], [33, 73, 80], [15, 19, 46, 64, 73, 93]].¶
Note:
Z0 should be added at the start of the final list, followed by L0, L1, L3, and so on…
In the example output, Z0 = [26], L0 = [33, 73, 80], L1 = [15, 19, 46, 64, 73, 93]
index within a sub-list should be sorted.
You need to implement your code below, and copy this code in the file submission.py
In [3]:
def Logarithmic_merge(index, cut_off, buffer_size): # do not change the function heading
pass # **replace** this line with your code
After implementation of Logarithmic_merge, you can test your implementation here. You should see the following output:
[[26], [33, 73, 80], [15, 19, 46, 64, 73, 93]]
In [5]:
import random
import submission
random.seed(1999)
n = 13
index = [random.randint(1, 100) for _ in range(n)]
result = submission.Logarithmic_merge(index, 10, 3) #cut_off = 10, initial buffer_size = 3
print(result) # expect to see a list of lists
[[26], [33, 73, 80], [15, 19, 46, 64, 73, 93]]
Question 3: Decoders for Gamma, Delta, and Rice codes (30 point)¶
In this part you will implement decoders for Gamma, Delta, and Rice codes.
Specifications:¶
For Gamma, and Delta decoders, your input will be a string of bits (0s or 1s) without any space. Your output for Gamma, and Delta decoders should be a list of integers.
For Rice decoder, input constitutes of the (i) code, i.e., a string of bits (0s or 1s) without any space, and (ii) an integer representing the value of b. Here b = 2^k. Output of Rice decoder should be a list of integers.
Note, for Rice Decoder, in contrast to the information provided in (L5, slide57), we assume the input of the decoder is comig from q = [x]/b, and r = (x) – (q.b).¶
(i) Examples for Gamma and Delta:¶
(a) input: (String x), e.g., (“1111111100000000011111111000001001”)¶
(b) output: list of numbers, e.g., [256, 265]¶
(ii) Example for Rice Decoder:¶
(a) input: (String x , int b) e.g., (“11110101111011”,4)¶
(b) output: list of numbers, e.g., [18, 19]¶
You need to implement your code below, and copy this code in the file submission.py
In [32]:
def decode_gamma(inputs):# do not change the function heading
pass # **replace** this line with your code
After implementation of decode_gamma, you can test your implementation here. You should see the following output [256, 265]
In [33]:
import submission
inputs = “1111111100000000011111111000001001”
result = submission.decode_gamma(inputs)
print(result)
[256, 265]
You need to implement your code below, and copy this code in the file submission.py
In [34]:
def decode_delta(inputs):# do not change the function heading
pass # **replace** this line with your code
After implementation of decode_delta, you can test your implementation here. You should see the following output: [256, 265]
In [35]:
import submission
inputs = “111000100000000111000100001001”
result = submission.decode_delta(inputs)
print(result)
[256, 265]
You need to implement your code below, and copy this code in the file submission.py
In [36]:
def decode_rice(inputs, b):# do not change the function heading
pass # **replace** this line with your code
After implementation of decode_rice, you can test your implementation here. you should see the following output: [18, 19]
In [37]:
import submission
inputs = “11110101111011”
b = 4
result = submission.decode_rice(inputs, b)
print(result)
[18, 19]