Algorithms & Data Structures (Winter 2022) Algorithm Paradigms – Greedy
Announcements
Copyright By PowCoder代写 加微信 powcoder
• Complete Search
• Divide and Conquer.
• Dynamic Programming.
• Introduction. • Examples.
Greedy – Example
Given 1 ≤ C ≤ 5 chambers which can store 0, 1, or 2 specimens,
1 ≤ S ≤ 2C specimens, and M: a list of mass of the S specimens, determine in which chamber we should store each specimen in order to minimize i.e. sum of differences between the mass in each chamber w.r.t A. where Xi is the total mass of specimens in chamber A is the average of all mass over C chambers
Greedy – Example
• Observations.
• If there exists an empty chamber, at least one chamber with 2 specimens must be moved to this empty chamber! Otherwise the empty chambers contribute too much to IMBALANCE!
Greedy – Example
• Observations. If S > C, then S−C specimens must be paired with one other specimen already in some chambers.
Greedy – Example
• Observations. If S < 2C, add dummy 2C−S specimens with mass 0.
• Forexample,C=3,S=4,M={5,1,2,7}→C=3,S=6,M={5,1,2,7,0,0}.
• Then, sort these specimens based on their mass such that M1 ≤M2 ≤...≤M2C−1≤M2C
• Forexample,M={5,1,2,7,0,0}→{0,0,1,2,5,7}.
Greedy - Example
• By adding dummy specimens and then sorting them, a greedy
strategy ‘appears’. We can now:
• Pair the specimens with masses M1&M2C and put them in chamber 1, then
• Pair the specimens with masses M2&M2C−1 and put them in chamber 2, and so on
Greedy – Example – Data Compression
• We’ve seen how greedy algorithms can be used to commit to certain parts of a solution, based entirely on relatively short- sighted considerations. We will study now a problem on which a greedy rule is used, essentially, to shrink the size of the problem instance, so that an equivalent smaller problem can then be solved by recursion.
• Given a string X, efficiently encode X into a smaller string Y (Saves memory and/or bandwidth).
• Since computers ultimately operate on sequences of bits, one needs encoding schemes that take text written in richer alphabets (such as the alphabets underpinning human languages) and converts this text into long strings of bits.
Greedy – Example – Data Compression
• Given a string X, efficiently encode X into a smaller string Y.
• A data file of 100,000 characters contains only the characters
a–f, with the frequencies indicated.
300000 bits
• The simplest way to do this would be to use a fixed number of bits for each symbol in the alphabet, and then just concatenate the bit strings for each symbol to form the text.
• The letters in most human alphabets do not get used equally frequently.
• English: the letters e, t, a, o, i, and n get used much more frequently than
q, j, x, and z (by more than an order of magnitude)
Greedy – Example – Data Compression
• Given a string X, efficiently encode X into a smaller string Y.
• A data file of 100,000 characters contains only the characters a–f, with the frequencies indicated.
300000 bits 224000 bits
Greedy – Example – Data Compression
• Given a string X, efficiently encode X into a smaller string Y.
• A data file of 100,000 characters contains only the characters
a–f, with the frequencies indicated.
300000 bits 224000 bits
• To give frequent characters short codewords and infrequent characters long codewords.
• To consider here only codes in which no codeword is also a prefix of some other codeword.
Greedy – Example – Data Compression
• A good approach: Huffman encoding
• Compute frequency f(c) for each character c.
• Encode high-frequency characters with short code words
• No code word is a prefix for another code
• Use an optimal encoding tree to determine the code word.
• Any prefix-free binary code can be visualized as a binary tree with the encoded characters stored at the leaves.
Greedy – Huffman encoding
• A code is a mapping of each character of an alphabet to a binary code-word
• A prefix code is a binary code such that no code-word is the prefix of another code-word
• An encoding tree represents a prefix code
• Each external node (leaf) stores a character
• The code word of a character is given by the path from the root to the external node storing the character (0 for a left child and 1 for a right child)
Greedy – Huffman encoding
Initial string: X = acda
Encoded string: Y = 00 011 10 00 15
Greedy – Huffman encoding
• An encoding tree represents a prefix code
• The tree is a full binary tree.
• Everynonleafnodehastwochildren
• Each external node (leaf) stores a character.
• Thenumberofleavesis|alphabet|(oneforeachletterofthealphabet) • Thenumberofinternalnodesis|alphabet|-1
• The code word of a character is given by the path from the root to the external node storing the character (0 for a left child and 1 for a right child)
• Wecancomputethenumberofbitsrequired.
Greedy – Huffman encoding
• This is exactly the same cost function we considered for optimizing binary search trees, but the optimization problem is different, because code trees are not required to keep the keys in any particular order.
• The search for an optimal prefix code can be viewed as the search for a binary tree T, together with a labeling of the leaves of T, that minimizes the number of bits used (length of the path).
Greedy – Huffman encoding
• The search for an optimal prefix code can be viewed as the search for a binary tree T, together with a labeling of the leaves of T, that minimizes the number of bits used (length of the path).
Greedy – Huffman encoding
• X = abracadabra
• T1 encodes X into 29 bits, T2 encodes X into 24 bits T1 T2
Greedy – Huffman encoding
• The search for an optimal prefix code can be viewed as the search for a binary tree T, together with a labeling of the leaves of T, that minimizes the number of bits used (length of the path).
• In 1951, as a PhD student at MIT, developed the following greedy algorithm to produce such an optimal code
Huffman’s Algorithm
• There is an optimal prefix code, with corresponding tree T∗, in which the two lowest-frequency letters are assigned to leaves that are siblings in T∗ and have the largest depth of any leaf.
By deleting the lowest-frequency letters yields an instant with a smaller alphabet
Merge the two least frequent letters and recurse
X = abracadabra Frequencies
abcdr 52112
abcdr acdbr
Greedy – Example – Data Compression
• Suppose we want to encode the following helpfully self-descriptive sentence.
discovered by
• let’s ignore the forty-four spaces, nineteen apostrophes, nineteen commas, three hyphens, and only one period.
• An the frequency table.
Greedy – Example – Data Compression
• Huffman’s algorithm picks out the two least frequent letters, breaking ties arbitrarily—in this case, say, Z and D—and merges them together into a single new character DZ with frequency 3. This new character becomes an internal node in the code tree we are constructing, with Z and D as its children; it doesn’t matter which child is which. The algorithm then recursively constructs a Huffman code for the new frequency table
Greedy – Example – Data Compression
• After 19 merges, all 20 letters have been merged together.
Greedy – Example – Data Compression
Greedy – Example – Data Compression
• Altogether, the encoded message is 649 bits long.
• Different Huffman codes encode the same characters differently, possibly with code words of different length, but the overall length of the encoded message is the same for every Huffman code: 649 bits.
Huffman’s Algorithm
<
2n – 1 INSERT
2n – 2 EXTRACTMIN
If using a binary heap, each of these O(n) operations requires O(logn)
Huffman’s Algorithm
• Given a string X,
Huffman’s algorithm construct a prefix code the minimizes the size of the encoding of X
• It runs in time
O(d + n log n), where d is the size of X and n is the number of distinct characters of X
• A heap-based priority queue is used as an auxiliary structure
• Complete Search
• Divide and Conquer.
• Dynamic Programming. • Introduction.
• Examples.
• Introduction. • Examples.
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com