Map
I A Map is what Java calls a PhoneDirectory. I Let¡¯s have a look at the Map interface.
I V put(K key, V value) is like addOrChangeEntry. I V get(K key) is like lookupEntry.
I V remove(Object Key) is like removeEntry.
I We have already implemented a PhoneDirectory using an unsorted or sorted array or doubly linked list.
I You are currently implementing Map as a BST. I It still takes O(n) to get or put in the worst case.
I If the items are not inserted in random order. I We need a way to keep the tree balanced.
This week¡¯s application
I We need a nice application for our Map. I Yet another game!
I Daily Jumble
I Need to unscramble words.
I Puzzle has ¡°rtpocmue¡±?
I Unscrambled is ¡°computer¡±.
I HowcanaMaphelpustodothat?
Slow Way
I We have a dictionary file. I Read it in.
I Try every possible ordering of ¡°rtpocmue¡±. I Look up each one in the dictionary.
I What is the running time?
I Lookup might be O(log n) time, good.
I But the number of orderings is 8! = 40,320, bad!.
Using a Map
I Let¡¯s use a Map.
I The value will be “computer”.
I What will be the key?
I How about the letters in alphabetical order? I That is “cemoprtu”.
I To get ready:
I Read each word from the dictionary file,
I Put it into the Map.
I The key will be the letters of the word in alphabetical order. I The value will be the word.
I To solve a scramble ¡°rtpmceuo¡±:
I Alphabetize it to ¡°cemoprtu¡±.
I Look it up in the map: ¡°computer¡±.
I Does anyone see a problem?
I The words ¡°dare¡±, ¡°dear¡±, and ¡°read¡±
I will all be stored under the key ¡°ader¡±.
I So the value will be ¡°read¡± because it is last.
I Solution is to use List
Implementation using our Map implementations
I I will run the Jumble solver for you using the our different implementations of PhoneDirectory or Map.
I PDMap is a way to convert a PhoneDirectory to a Map.
I The lookup might be as fast as O(n) (for SortedPD) but that¡¯s not the
problem.
I The problem is adding all the words in the dictionary. I We never got add faster than O(n)
I If we add n words, that¡¯s O(n2),
I which is pretty slow.
I It¡¯s even slower for a bigger dictionary, as I will show you. I To get to entry i in the list takes i steps.
I An array can do it in one step (theArray[i])
I but then adding at the head will cost O(n)
I The linked list lets us add quickly once we get there, I but it takes a while to get there.
I BST is faster,
I but still much slower than a balanced tree (TreeMap).
I We need to learn how to balance the tree.
Cost of reallocate
I Let¡¯s think about ArrayBasedPD first!
I What is the cost of add (not addOrChangeEntry)?
I addOrChangeEntry is O(n) because?
I add is O(n) when we have to reallocate,
I but we don¡¯t have to reallocate very often.
I Call add n times (starting with an empty array) costs O(n), not O(n2).
Cost of Add
I Let¡¯s start with an array of size 1 and count the number of array assignments.
I add(1)
I 1:n=1a=1
I add(2) (need to reallocate)
I 12:n=2a=3
I add(3) (need to reallocate)
I 123_: n=3a=6
I add(4)
I 1234: n=4a=7
I add(5) (need to reallocate)
I 12345___: n=4a=12
I add(6), add(7), add(8)
I 12345678: n=8a=15
I add(9) (need to reallocate)
I 123456789_______:n=9a=24
I add(10) – add(16)
I 12345678910111213141516: n=16a=31
I add(17)
I 1234567891011121314151617_______________:
n=16 a=48
Cost of Add
I The total number of assignments is never more than 3n (check!). Why?
I Adding an item the right half of the array requires one assignment now
I plus two more in the future to move it and one item in the left half.
I After that, it is always in the left half, so a newer item will pay to move it from now on.
I It¡¯s actually very much like Social Security!
I Question: will this still work if we only increase the array size by 50%
when we reallocate?
I Yes. Each item added to the right third will pay 1 now and 3 in the future
to move it and 2 items in the left two-thirds. I So the total never exceeds 4n.
I Which is still O(n).
I This is called amortized analysis.
Tree Balancing
I What does this have to do with balancing trees?
I Suppose we start with a balanced tree with 100 nodes in the left and
right half.
I Then we add nodes using the add method you implemented in lab before break.
I When the right half of a (sub)tree is more than twice as big as the left half, we rebalance that tree.
I For this to happen, we have to add at least 100 more nodes to the tree. I So when we rebalance, the tree has to become 50% bigger before we
have to rebalance.
I Does that sound familiar?
I That tells us the total cost of rebalancing never exceeds O(n) per subtree to which we add each item.
I But each subtree is at most 2/3 of its parent tree
I because the subtree is it most twice as big as the other subtree.
I So we add each item to trees of size n, (2/3)n, (2/3)2 n, (2/3)3 n, . . .. I If k > log1.5 n then (2/3)k n < 1, so not more than log1.5 n trees.
I So we only add each item to O(log n) trees.
I So the cost of adding n items is O(n log n).
Rebalancing using Linked Lists
I Question: How can we rebalance an entire tree without losing the correct order of the nodes?
I Answer: The nodes will also have next and previous fields. The class will have a root pointer and a head and tail pointer.
I It will be a binary tree and a linked list. (Bwa ha ha!!)
I So when we want to rebalance, we will just create a balanced tree from the list starting at the minimum of the tree and with the same size as the tree.
Linked List into Balanced Tree
I rebalance(head, size) will take a list starting at head with size nodes
I and return (the root) of a balanced tree.
I rebalance calls rebalanceLeft(head, size - size/2) to get the root and left subtree
I and then it calls rebalance(root.next, size/2) to get the right subtree. I rebalanceLeft(head, size) calls rebalanceLeft(head, size/2) to get the
root of the left subtree
I and rebalanceLeft(left.next, size - size/2) to get the root and the right subtree.
I Some assembly required!!