程序代写代做 data structure C algorithm graph Huffman & Intro to Graphs—Monday, February 24/Tuesday, February 25

Huffman & Intro to Graphs—Monday, February 24/Tuesday, February 25
Readings
• Lecture Notes Chapter 15: Huffman Coding
• Lecture Notes Chapter 16: Graph Representations & BFS
Problems Problem 1
What are pros and cons of Huffman Coding?
Solution
Pros: Huffman encodings represents only the characters that are contained in the text, without wasting space in the encoding length for characters that are not present. Additionally, characters that occur most take less bits to encode each time.
Cons: It does not help with longer patterns in the text. Storing the symbol table takes up additional space. Therefore it isn’t appropriate for very small files where the table size would be significant. Encoding a text requires a preprocessing step to generate the table, does not work with a data stream.
Problem 2
Construct an optimal Huffman coding for the following alphabet and frequency table:
What is the average encoded character length for the above encoding?
Solution
The following tree would be produced:
/\ A /\
B /\ C /\
ED
Problem 3
Solution Set
CIS 121—Data Structures and Algorithms—Spring 2020
Character: Frequency:
A 0.4
B 0.3
C 0.15
D 0.1
E 0.05
lavg =􏰄d(c)∗freq(c)=.4∗1+.3∗2+.15∗3+.1∗4+.05∗4=2.05 c∈R
You have an alphabet with n > 2 letters and frequencies. You perform Huffman encoding on this alphabet, and notice that the character with the largest frequency is encoded by a 0. In this alphabet, symbol i occurs with probability pi, where p1 ≥ p2 ≥ p3 ≥ … ≥ pn.
Given this alphabet and encoding, prove that there does not exist an assignment of probabilities to p1 through pn such that p1 < 1/3?. 1 Solution We can use a simple proof by contradiction. Assume that there exists an assignment such that p1 < 1. 3 Look at the last step of the Huffman algorithm, that is the step when our two final nodes are merged into one node. Let these two final nodes be called x and y. Because character 1 has an encoding length of 1, it must have been included in this step. WLOG let x be our single character 1. This is the final step of Huff- man, so we know that px+py = 1. Given our assumption that p1 < 1, this tells us that py = 1−px → py > 2. 33
We know that because n > 2, y must be a group containing at least 2 characters. So, examine the time when y was created. Let the nodes which combined to y be called a and b. We know that pa + pb > 2 , which
3
implies that max{pa , pb } > 1 . Here we have reached contradiction. We know Huffman uses the smallest two 3
nodes at all steps, but at the step y was created, node x was still available where px < 1 . Thus, it would Solution Set 3 We can perform a BFS traversal and just keep track of which elements have been seen. For example, we can store vertices we have seen in a set and just track whether or not any previously seen node is encountered again by checking if it is in the set. Since we are simply doing a BFS, this algorithm runs in linear time. Additional Problems Problem 5 Design an algorithm to determine whether or not a connected undirected graph has a cycle in O(n) time. Solution Perform a BFS traversal and terminate early if you explore at least n edges. Recall that a undirected connected acyclic graph is by definition a tree. We know that a connected graph on n nodes is a tree iff it has exactly n − 1 edges. An additional edge would signify that two nodes are connected by two independent paths, implying the existence of a cycle. This algorithm will take O(n) time to check each vertex and O(n) to check each edge (since we are checking at most n edges). Thus, the running time is O(n). have been chosen instead of max{a, b}. This contradiction then proves the original claim. Problem 4 Design an algorithm to determine whether or not an undirected graph has a cycle. Solution 2