程序代写代做代考 algorithm go C AVL COMP90038 – Algorithms and Complexity Lecture 16

COMP90038 – Algorithms and Complexity Lecture 16
COMP90038 Algorithms and Complexity
Lecture 16: Time/Space Tradeoffs—String Search Revisited (with thanks to Harald Søndergaard & Michael Kirley)
Casey Myers Casey.Myers@unimelb.edu.au David Caro Building (Physics) 274

COMP90038 – Algorithms and Complexity
Lecture 16
Review from Lecture 15: AVL Trees
• For a binary (sub-) tree, let the balance factor be the difference between the height of its left sub-tree and that of its right sub-tree.
• An AVL tree is a BST in which the balance factor is -1, 0, or 1, for every sub-tree.
• The notion of “balance” that is implied by the AVL condition is sufficient to guarantee that the depth of an AVL tree with 𝑛 nodes is Θ log 𝑛 .

COMP90038 – Algorithms and Complexity
Lecture 16
Review from Lecture 15: AVL Trees
• For a binary (sub-) tree, let the balance factor be the difference between the height of its left sub-tree and that of its right sub-tree.
• An AVL tree is a BST in which the balance factor is -1, 0, or 1, for every sub-tree.
• The notion of “balance” that is implied by the AVL condition is sufficient to
guarantee that the depth of an AVL tree with 𝑛 nodes is Θ log 𝑛 .
✘✘

COMP90038 – Algorithms and Complexity
Lecture 16
Review from Lecture 15: AVL Trees Rotations

COMP90038 – Algorithms and Complexity Lecture 16
Review from Lecture 15: 2-3 Trees
• A node that holds two items (a so-called 3-node) has (at most) three children.
• A 2-3 tree stores up to 2 items in each tree node.
• Insertions, splits and promotions are used to grow and balance the tree.

COMP90038 – Algorithms and Complexity
Lecture 16
Spending Space to Save Time
• Often we can find ways of decreasing the time required to solve a problem, by using additional memory in a clever way.
• For example, in Lecture 6 we considered the simple recursive way of finding the 𝑛th Fibonacci number and discovered that the algorithm uses exponential time.
• However, suppose the same algorithm uses a table to tabulate the function FIB as we go:
– As soon as an intermediate result FIB(𝑖) has been found, it is not simply returned to the caller; the value is first placed in slot 𝑖 of a table (an array). Each call to FIB first looks in this table to see if the required value is there, and only if it is not, the usual recursive process kicks in.

COMP90038 – Algorithms and Complexity
Lecture 16
Fibonacci Numbers with Tabulation
• We assume that, from the outset, all entries of the table 𝐹 are 0.

COMP90038 – Algorithms and Complexity
Lecture 16
Fibonacci Numbers with Tabulation
• We assume that, from the outset, all entries of the table 𝐹 are 0.
Complexity is exponential in 𝑛

COMP90038 – Algorithms and Complexity
Lecture 16
Fibonacci Numbers with Tabulation
• We assume that, from the outset, all entries of the table 𝐹 are 0.
Complexity is exponential in 𝑛
• (I show this code just so that you can see the principle; in Lecture 6 we already discovered a different linear-time algorithm, so here we don’t really need tabulation.)

COMP90038 – Algorithms and Complexity
Lecture 16
Fibonacci Numbers with Tabulation
• We assume that, from the outset, all entries of the table 𝐹 are 0.
Initial
𝐹 = 0,0,⋯,0
𝑛=2
result=FIB(1)+FIB(0) =1+1=2
𝐹 = 0,0,2,0,⋯,0
• (I show this code just so that you can see the principle; in Lecture 6 we already discovered a different linear-time algorithm, so here we don’t really need tabulation.)

COMP90038 – Algorithms and Complexity
Lecture 16
Fibonacci Numbers with Tabulation
• We assume that, from the outset, all entries of the table 𝐹 are 0.
Initial
𝐹 = 0,0,⋯,0
𝑛=2
result=FIB(1)+FIB(0) =1+1=2
𝐹 = 0,0,2,0,⋯,0
𝑛=3
result=FIB(2)+FIB(1) =2+1=3
𝐹 = 0,0,2,3,0, ⋯ , 0
• (I show this code just so that you can see the principle; in Lecture 6 we already discovered a different linear-time algorithm, so here we don’t really need tabulation.)

COMP90038 – Algorithms and Complexity
Lecture 16
Fibonacci Numbers with Tabulation
• We assume that, from the outset, all entries of the table 𝐹 are 0.
Initial
𝐹 = 0,0,⋯,0
𝑛=2
result=FIB(1)+FIB(0) =1+1=2
𝐹 = 0,0,2,0,⋯,0
𝑛=2
result=FIB(2)+FIB(1) =2+1=3
𝐹 = 0,0,2,3,0, ⋯ , 0
𝑛=4
result=FIB(3)+FIB(2) =3+2=5
𝐹 = 0,0,2,3,5,0, ⋯ , 0
• (I show this code just so that you can see the principle; in Lecture 6 we already discovered a different linear-time algorithm, so here we don’t really need tabulation.)

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• Suppose we need to sort large arrays, but we know that they will hold keys taken from a small, fixed set (so lots of duplicate keys).
• For example, suppose all keys are single digits in array 𝐴 : 𝐴=633810879253531876512153
• Then we can, in a single linear scan count the occurrences of each key in array 𝐴 and store the result in a small table:
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝑂𝑐𝑐
0
0
0
0
0
0
0
0
0
0

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
12
12
16
18
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=1:𝑆 𝐶𝑢𝑚𝑢𝐴1 =𝑆𝐶𝑢𝑚𝑢6 =𝑆18 ←6 𝐶𝑢𝑚𝑢 𝐴 1 = 𝐶𝑢𝑚𝑢 6 ←18-1=17
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
12
12
16
17
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=1:𝑆 𝐶𝑢𝑚𝑢𝐴1 =𝑆𝐶𝑢𝑚𝑢6 =𝑆18 ←6 𝐶𝑢𝑚𝑢 𝐴 1 = 𝐶𝑢𝑚𝑢 6 ←18-1=17
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
6

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
12
12
16
17
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=2:𝑆 𝐶𝑢𝑚𝑢𝐴2 =𝑆𝐶𝑢𝑚𝑢3 =𝑆12 ←3 𝐶𝑢𝑚𝑢 𝐴 2 = 𝐶𝑢𝑚𝑢 3 ←12-1=11
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
6

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
11
12
16
17
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=2:𝑆 𝐶𝑢𝑚𝑢𝐴2 =𝑆𝐶𝑢𝑚𝑢3 =𝑆12 ←3 𝐶𝑢𝑚𝑢 𝐴 2 = 𝐶𝑢𝑚𝑢 3 ←12-1=11
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
3
6

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
11
12
16
17
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=3:𝑆 𝐶𝑢𝑚𝑢𝐴3 =𝑆𝐶𝑢𝑚𝑢3 =𝑆11 ←3 𝐶𝑢𝑚𝑢 𝐴 3 = 𝐶𝑢𝑚𝑢 3 ←11-1=10
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
3
6

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
10
12
16
17
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=3:𝑆 𝐶𝑢𝑚𝑢𝐴3 =𝑆𝐶𝑢𝑚𝑢3 =𝑆11 ←3 𝐶𝑢𝑚𝑢 𝐴 3 = 𝐶𝑢𝑚𝑢 3 ←11-1=10
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
3
3
6

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
10
12
16
17
20
23
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=4:𝑆 𝐶𝑢𝑚𝑢𝐴4 =𝑆𝐶𝑢𝑚𝑢8 =𝑆23 ←8 𝐶𝑢𝑚𝑢 𝐴 4 = 𝐶𝑢𝑚𝑢 8 ← 23-1=22
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
3
3
6

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
10
12
16
17
20
22
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=4:𝑆 𝐶𝑢𝑚𝑢𝐴4 =𝑆𝐶𝑢𝑚𝑢8 =𝑆23 ←8 𝐶𝑢𝑚𝑢 𝐴 4 = 𝐶𝑢𝑚𝑢 8 ← 23-1=22
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
3
3
6
8

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
5
7
10
12
16
17
20
22
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=5:𝑆 𝐶𝑢𝑚𝑢𝐴5 =𝑆𝐶𝑢𝑚𝑢1 =𝑆5 ←1 𝐶𝑢𝑚𝑢 𝐴 5 =𝐶𝑢𝑚𝑢 1 ←5-1=4
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
3
3
6
8

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
1
4
7
10
12
16
17
20
22
24
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=5:𝑆 𝐶𝑢𝑚𝑢𝐴5 =𝑆𝐶𝑢𝑚𝑢1 =𝑆5 ←1 𝐶𝑢𝑚𝑢 𝐴 5 =𝐶𝑢𝑚𝑢 1 ←5-1=4
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
1
3
3
6
8

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
0
1
5
8
12
12
16
18
20
23
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=24:𝑆 𝐶𝑢𝑚𝑢𝐴24 =𝑆𝐶𝑢𝑚𝑢3 =𝑆8 ←3 𝐶𝑢𝑚𝑢 𝐴 24 =𝐶𝑢𝑚𝑢 3 ←8-1=7
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
0
1
1
1
1
2
2
3
3
3
3
5
5
5
5
6
6
7
7
8
8
8
9

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• We can now create a sorted array S 1, ⋯ , 𝑛 of the items by simply slotting items into pre-determined slots in S (a third linear scan).
𝐴=633810879253531876512153
𝑘𝑒𝑦
0
1
2
3
4
5
6
7
8
9
𝐶𝑢𝑚𝑢
0
1
5
7
12
12
16
18
20
23
• Place the first record (with key 6) in S 18 and decrement 𝐶𝑢𝑚𝑢 6 (so that the next ‘6’ will go into slot 17), and so on.
𝑖=24:𝑆 𝐶𝑢𝑚𝑢𝐴24 =𝑆𝐶𝑢𝑚𝑢3 =𝑆8 ←3 𝐶𝑢𝑚𝑢 𝐴 24 =𝐶𝑢𝑚𝑢 3 ←8-1=7
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
6
3
3
8
1
0
8
7
9
2
5
3
5
3
1
8
7
6
5
1
2
1
5
3
0
1
1
1
1
2
2
3
3
3
3
3
5
5
5
5
6
6
7
7
8
8
8
9

COMP90038 – Algorithms and Complexity Lecture 16
Sorting by Counting
• Note that this gives is a linear-time sorting algorithm (for the cost of some extra space).
• However, it only works in situations where we have a small range of keys, known in advance.
• The method never performs a key-to-key comparison.
• The time complexity of key-comparison based sorting has been proven to be Ω 𝑛log𝑛 .

COMP90038 – Algorithms and Complexity
Lecture 16
String Matching Revisited
• We earlier discussed the brute-force approach to string search.
• “Strings” are usually built from a small, pre-determined alphabet.
• Most of the better algorithms rely on some pre-processing of strings before the actual matching process starts.
• The pre-processing involves the construction of a small table (of predictable size).
• Levitin refers to this as “input enhancement”.

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Comparing from right to left in the pattern.
• Very good for random text strings
• We can do better than just observing a mismatch here.
• Because the pattern has no occurrence of I, we might as well slide it 4 positions
along.
• This is based only on knowing the pattern.

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Here we can slide the pattern 3 positions, because the last occurrence of E in the pattern is its first position.

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• What happens when we have longer partial matches?
• The shift is determined by the last character in the pattern.

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
𝑗
𝑆h𝑖𝑓𝑡
[0,0,0,0]
𝑃[𝑗]
𝑚 − (𝑗 + 1)
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
0
𝑗
𝑆h𝑖𝑓𝑡
[5,0,0,0]
𝑃[𝑗]
𝑚 − (𝑗 + 1)
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
1
𝑗
𝑆h𝑖𝑓𝑡
[5,5,0,0]
𝑃[𝑗]
𝑚 − (𝑗 + 1)
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
2
𝑗
𝑆h𝑖𝑓𝑡
[5,5,5,0]
𝑃[𝑗]
𝑚 − (𝑗 + 1)
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
3
𝑗
𝑆h𝑖𝑓𝑡
[5,5,5,5]
𝑃[𝑗]
𝑚 − (𝑗 + 1)
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
3
𝑗
0
𝑆h𝑖𝑓𝑡
[5,4,5,5]
𝑃[𝑗]
T
𝑚 − (𝑗 + 1)
4
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
3
𝑗
1
𝑆h𝑖𝑓𝑡
[3,4,5,5]
𝑃[𝑗]
A
𝑚 − (𝑗 + 1)
3
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
3
𝑗
2
𝑆h𝑖𝑓𝑡
[2,4,5,5]
𝑃[𝑗]
A
𝑚 − (𝑗 + 1)
2
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Building (calculating) the shift table is easy.
• Let 𝑎 be the size of the alphabet.
𝑃
[T,A,A,C,G]
𝑚
5
𝑎
4
𝑖
3
𝑗
3
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑃[𝑗]
C
𝑚 − (𝑗 + 1)
1
• Let’s consider a simple example with a small alphabet: the nucleotides from DNA: [A T G C] (so 𝑎 = 4).
• Thepatternsis𝑃=TAACG(so𝑚=5)
• The string is (n = 20)
𝑇 = GACCGCGTGAGATAACGTCA

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
𝑖
𝑘
𝑃[𝑚 − 1 − 𝑘]
𝑇[𝑖 − 𝑘]
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
𝑘
𝑃[𝑚 − 1 − 𝑘]
𝑇[𝑖 − 𝑘]
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
4
𝑘
0
𝑃[𝑚 − 1 − 𝑘]
G
𝑇[𝑖 − 𝑘]
G
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
4
𝑘
1
𝑃[𝑚 − 1 − 𝑘]
C
𝑇[𝑖 − 𝑘]
C
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
4
𝑘
2
𝑃[𝑚 − 1 − 𝑘]
A
𝑇[𝑖 − 𝑘]
C
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
4+5=9
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
9
𝑘
0
𝑃[𝑚 − 1 − 𝑘]
G
𝑇[𝑖 − 𝑘]
A
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
9+2=11
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
11
𝑘
0
𝑃[𝑚 − 1 − 𝑘]
G
𝑇[𝑖 − 𝑘]
A
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
11+2=13
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
13
𝑘
0
𝑃[𝑚 − 1 − 𝑘]
G
𝑇[𝑖 − 𝑘]
A
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
13+2=15
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
15
𝑘
0
𝑃[𝑚 − 1 − 𝑘]
G
𝑇[𝑖 − 𝑘]
C
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
15+1=16
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
16
𝑘
0
𝑃[𝑚 − 1 − 𝑘]
G
𝑇[𝑖 − 𝑘]
G
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
16
𝑘
1
𝑃[𝑚 − 1 − 𝑘]
C
𝑇[𝑖 − 𝑘]
C
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
16
𝑘
2
𝑃[𝑚 − 1 − 𝑘]
A
𝑇[𝑖 − 𝑘]
A
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
16
𝑘
3
𝑃[𝑚 − 1 − 𝑘]
A
𝑇[𝑖 − 𝑘]
A
𝑖−𝑚+1
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
𝑃
[𝑇,𝐴,𝐴,𝐶,𝐺]
𝑚
5
𝑛
20
𝑆h𝑖𝑓𝑡
[2,4,5,1]
𝑖
16
𝑘
4
𝑃[𝑚 − 1 − 𝑘]
T
𝑇[𝑖 − 𝑘]
T
𝑖−𝑚+1
12
𝑖 + 𝑆h𝑖𝑓𝑡[𝑇[𝑖]]
Alphabet = [A T G C]
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
𝑇
G
A
C
C
G
C
G
T
G
A
G
A
T
A
A
C
G
T
C
A
𝑃
T
A
A
C
G

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• We can also consider posting a sentinel: Append the pattern 𝑃 to the end of the text 𝑇 so that a match is guaranteed.

COMP90038 – Algorithms and Complexity
Lecture 16
Horspool’s String Search Algorithm
• Unfortunately the worst-case behaviour of Horspool’s algorithm is still Ο(𝑚 × 𝑛), like the brute-force methods.
• However, in practice, for example, when used on English texts, it is linear, and fast.

COMP90038 – Algorithms and Complexity
Lecture 16
Other Important String Search Algorithms
• Horspool’s algorithm was inspired by the famous Boyer-Moore algorithm (BM), also covered in Levitin’s book. The BM algorithm is very similar, but has a more sophisticated shifting strategy, which makes it Ο(𝑚 + 𝑛).
• Another famour string search algorithm is the Knuth-Morris-Pratt algorithm (KMP).
• KMP is very good when the alphabet is small, say, we need to search through very
long bit strings.
• Also, we shall soon meet the Rabin-Karp algorithm (RK), albeit briefly.

COMP90038 – Algorithms and Complexity Lecture 16
Coming Up Next
• We look at the hugely important technique of hashing (Levitin 7.3) , a standard way of implementing a “dictionary”.
• Hashing is arguably the best example of how to gain speed by using additional space to great effect.

Related Posts