CS61B Lecture #31
• More balanced search structures (DS(IJ), Chapter 9 Coming Up:
• Pseudo-random Numbers (DS(IJ), Chapter 11)
Last modified: Thu Nov 1 19:39:39 2018
Copyright By PowCoder代写 加微信 powcoder
CS61B: Lecture #31 1
Really Efficient Use of Keys: the Trie
• Haven’t said much about cost of comparisons.
• For strings, worst case is length of string.
• Therefore should throw extra factor of key length, L, into costs:
– Θ(M ) comparisons really means Θ(M L) operations.
– So to look for key X, keep looking at same chars of X M times.
• Can we do better? Can we get search cost to be O(L)?
Idea: Make a multi-way decision tree, with one decision per character
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 2
The Trie: Example
• Set of keys
{a, abase, abash, abate, abbas, axolotl, axe, fabric, facet}
• Ticked lines show paths followed for “abash” and “fabric” • Each internal node corresponds to a possible prefix.
• Characters in path to node = that prefix.
e ax o axolotl✷
abase✷ abash✷ Last modified: Thu Nov 1 19:39:39 2018
abbas✷ axe✷
CS61B: Lecture #31 3
Adding Item to a Trie
• Result of adding bat and faceplate. • New edges ticked.
aba abbas✷ axe✷ axolotl✷
abas abate✷ p face
fabric✷ fac ste
eht abase✷ abash✷ faceplate✷ facet✷
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 4
A Side-Trip: Scrunching
• For speed, obvious implementation for internal nodes is array in- dexed by character.
• Gives O(L) performance, L length of search key.
• [Looks as if independent of N, number of keys. Is there a depen-
• Problem: arrays are sparsely populated by non-null values—waste of space.
Idea: Put the arrays on top of each other!
• Use null (0, empty) entries of one array to hold non-null elements of
• Use extra markers to tell which entries belong to which array.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 5
Scrunching Example
Small example: (unrelated to Tries on preceding slides)
• Three leaf arrays, each indexed 0..9
0123456789 0123456789
bass trout pike
0123456789
ghee milk oil
salt cumin
• Now overlay them, but keep track of original index of each item:
A3: 0 1* 2 3 4 5* 6 7 8 9*
A2: 0 1 2* 3 4 5 6* 7* 8 9
A1: 0* 1 2 3 4 5* 6 7* 8 9
0 -1 1 -1 2 5 5 7 6 7 9
Last modified: Thu Nov 1 19:39:39 2018
CS61B: Lecture #31 6
trout pike
milk oil mace
• The scrunching idea is cute, but
– Not so good if we want to expand our trie.
– A bit complicated.
– Actually more useful for representing large, sparse, fixed tables with many rows and columns.
• Furthermore, number of children in trie tends to drop drastically when one gets a few levels down from the root.
• So in practice, might as well use linked lists to represent set of node’s children. . .
• . . . but use arrays for the first few levels, which are likely to have more children.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 7
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
• Typical example:
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 8
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
• Typical example: ⇓
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 9
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 10
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 11
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 12
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 13
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 14
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 15
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 16
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 17
• Typical example:
Probabilistic Balancing: Skip Lists
• A skip list can be thought of as a kind of n-ary search tree in which we choose to put the keys at “random” heights.
• More often thought of as an ordered list in which one can skip large segments.
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• To search, start at top layer on left, search until next step would
overshoot, then go down one layer and repeat.
• In list above, we search for 125 and 127. Gray nodes are looked at; darker gray nodes are overshoots.
• Heights of the nodes were chosen randomly so that there are about 1/2 as many nodes that are > k high as there are that are k high.
• Makes searches fast with high probability.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 18
Example: Adding and deleting
• Starting from initial list:
−∞ 10 20 25 30 40 50 55 60 90 95 100 115 120 125 130 140 150 ∞
• In any order, we add 126 and 127 (choosing random heights for
them), and remove 20 and 40:
−∞ 10 25 30 50 55 60 90 95 100 115 120 125 126 127 130 140 150 ∞ • Shaded nodes here have been modified.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 19
• Balance in search trees allows us to realize Θ(lg N ) performance. • B-trees, red-black trees:
– Give Θ(lg N ) performance for searches, insertions, deletions.
– B-trees good for external storage. Large nodes minimize # of
I/O operations • Tries:
– Give Θ(B) performance for searches, insertions, and deletions, where B is length of key being processed.
– But hard to manage space efficiently.
• Interesting idea: scrunched arrays share space. • Skip lists:
– Give probable Θ(lg N ) performace for searches, insertions, dele- tions
– Easy to implement.
– Presented for interesting ideas: probabilistic balance, random- ized data structures.
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 20
Summary of Collection Abstractions
contains, iterator
Blue: Java has corresponding interface Green: Java has no corresponding interface
Ordered Set
Unordered Set
Priority Queue
contains, iterator get
Sorted Set
Unordered Map
Ordered Map
Last modified: Thu Nov 1 19:39:39 2018
CS61B: Lecture #31 21
Data Structures that Implement Abstractions
• List: arrays, linked lists, circular buffers • Set
– OrderedSet
∗ Priority Queue: heaps
∗ Sorted Set: binary search trees, red-black trees, B-trees, sorted arrays or linked lists
– Unordered Set: hash table
• Unordered Map: hash table
• Ordered Map: red-black trees, B-trees, sorted arrays or linked lists
Last modified: Thu Nov 1 19:39:39 2018 CS61B: Lecture #31 22
Corresponding Classes in Java
Multiset (Collection)
• List: ArrayList, LinkedList, Stack, ArrayBlockingQueue,
ArrayDeque
– OrderedSet
∗ Priority Queue: PriorityQueue
∗ Sorted Set (SortedSet): TreeSet
– Unordered Set: HashSet
• Unordered Map: HashMap
• Ordered Map (SortedMap): TreeMap
Last modified: Thu Nov 1 19:39:39 2018
CS61B: Lecture #31 23
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com