assignment6
Assignment 6 – Data Structure¶
Only submit .py files to moodle!¶
Q1. (difficulity: **)¶
We store data in bits in our computers, data compression is a way to use fewer bits than the original representation, so the size of the data can be reduced.¶
Compression can be either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression, for example, the file format “.zip” is one of the example of lossless compression. Lossy compression reduces bits by removing unnecessary or less important information, for example,”.jpg” is one of the lossy compression for images.¶
In this question, you will be focused in lossless compression. For example, say we have a mapping for encoding as follows:¶
Using the mapping above, we can therefore encode a message, for example:¶
$BADHEADFEED \Rightarrow 001000011111100000011101100100011$ ¶Huffman code is a type of encoding method that is commonly used for lossless data compression in our computers. It has variable length of its codes, that is, the size of each code is not necessary the same. To construct Huffman code, we first need to count the occurrence or frequency of each character. Using the same example from above, we have¶
Then, we can simplify the table by removing those $0$ occurrence¶
Now we construct a tree using the following rule:¶
– All the characters are leaf nodes¶
– Form a sub-tree by taking the least probable sub-trees¶
– Sort the dictionary keys in ascendant order according to ASCII each step¶
One may illustrate the process by the following,¶
The Huffman code for each character is determined by its leaf position of the tree. That is, counting from the root, left corresponds to $0$ and right corresponds to $1$. So, $A$ is the right most leaf, and hence the corresponding code is $111$.¶
Therefore, using the Huffman coding, we have¶
$BADHEADFEED \Rightarrow 000111011101011101001101001$ ¶So, compare to the fixed length code, Huffman code of $BADHEADFEED$ uses only $27$ bits, which is $6$ bits less than the fixed length code.¶
ABOUT THE PROGRAM BODY, USE THE FOLLOWING TREE STRUCTURE, REMEMBER TO MODIFY THE FILENAMES YOURSELF: ¶
In [ ]:
class Node():
def __init__(self, v):
self.value = v
self.left = None
self.right = None
class BinaryTree():
def __init__(self, root):
self.root = Node(root)
def preorder(self, current, traversal):
# root -> left -> right
if current is not None:
traversal.append(current.value)
self.preorder(current.left, traversal)
self.preorder(current.right, traversal)
return traversal
def inorder(self, current, traversal):
# left -> root -> right
if current is not None:
self.inorder(current.left, traversal)
traversal.append(current.value)
self.inorder(current.right, traversal)
return traversal
def postorder(self, current, traversal):
# left -> right -> root
if current is not None:
self.postorder(current.left, traversal)
self.postorder(current.right, traversal)
traversal.append(current.value)
return traversal
def breadth_first(self, current):
if current is None:
return
q = Queue()
q.enqueue(current)
traversal = []
while q.get_size() > 0:
traversal.append(q.front())
node = q.dequeue()
if node.left is not None:
q.enqueue(node.left)
if node.right is not None:
q.enqueue(node.right)
return traversal
def height(self, current):
if current is None:
return -1
else:
left_h = self.height(current.left)
right_h = self.height(current.right)
return 1 + max(left_h, right_h)
def size(self, current):
return len(self.breadth_first(current))
def leaves(self, current, traversal, code):
# root -> left -> right
if current is not None:
if (current.left is None) and (current.right is None):
traversal[current.value] = code
self.leaves(current.left, traversal, code+’0′)
self.leaves(current.right, traversal, code+’1′)
return traversal
You are ONLY ALLOWED to use the tree $\verb|Tree()|$ data structure, and the node $\verb|Node()|$ data structure from lecture notes (above) to stores the corresponding characters, NO new classes can be created, NO external packages are allowed, and NO modification of the existing Tree and Node classes.¶
Then, write a program that outputs a Huffman encoded message given an input of English message (in upper case).¶
time: 3 seconds¶
filename: assignment_6_Q1.py¶
input filename: assignment_6_Q1_input.txt¶
output filename: assignment_6_Q1_output.txt¶
In [ ]:
In [ ]:
In [2]:
class Queue():
def __init__(self):
self.qu = []
def is_empty(self):
return self.qu == []
def enqueue(self, item):
self.qu.append(item)
def dequeue(self): # this is modified
if self.is_empty():
return False
else:
return self.qu.pop(0)
def get_queue(self): # not a must
return self.qu
def get_size(self):
return len(self.qu)
def front(self): # this is a special function for the tree structure only
if not self.is_empty():
return self.qu[0]
class Vertex():
def __init__(self, n):
self.name = n
class Edge():
def __init__(self, v, c):
self.v = v # list of length 2
self.cap = c
class Graph():
def __init__(self, vertices, edges):
self.vertex_set = vertices #list
self.edge_set = edges #list
def add_vertex(self, v):
self.vertex_set.append(v)
def add_edge(self, e):
self.edge_set.append(e)
def neighbours(self, vertex):
neighbours = []
for e in self.edge_set:
for i in range(2):
if vertex == e.v[i]:
neighbours.append(e.v[1-i])
return neighbours
def traverse(self, current):
if current is None:
return
q = Queue()
q.enqueue(current)
traversal = []
seen = []
while q.get_size() > 0:
traversal.append(q.front())
node = q.dequeue()
seen.append(node)
#print(node.name)
# find neighbours
for i in self.neighbours(node):
if i not in traversal and i not in seen:
q.enqueue(i)
return traversal
def is_connected(self, v1, v2):
if v2 in self.traverse(v1):
return True
else:
return False
a = Vertex(‘a’)
b = Vertex(‘b’)
e_ab = Edge([a, b], 1)
c = Vertex(‘c’)
d = Vertex(‘d’)
e_ac = Edge([a, c], 1)
G = Graph([a,b,c,d], [e_ab, e_ac])
# for i in G.neighbours(a):
# print(i.name)
# for i in G.traverse(d):
# print(i.name)
print(G.is_connected(c,d))
False
In [ ]: