
Algorithms & Data Structures (Winter 2022) Comp250 – Review
School of Computer Science

What should you already know?


• Linear Data Structures.
• Array lists, singly and doubly linked lists, stacks, queues.
• Induction and Recursion
• Tools for analysis of algorithms.
• Recurrences.
• Asymptotic notation (big O, big Omega, big Theta)
• Basics in non-linear data structures.
• Trees, heaps, maps, graphs.

What will I need for Comp251?
• Linear Data Structures.
• Array lists, singly and doubly linked lists, stacks, queues.
• Induction and Recursion
• Tools for analysis of algorithms.
• Recurrences.
• Asymptotic notation (big O, big Omega, big Theta)
• Basics in non-linear data structures.
• Trees, heaps, maps, graphs.
Data Type VS Data Structure
Figure taken from Comp251 – 2014
• Sometimes the boundary between DT and DS is unclear and arbitrary. In Comp251, we will use many partially implemented DS during the analysis of algorithms (we will not implement them fully).
• Enough detail to discuss the complexity of the algorithm.

Data Type VS Data Structure

Data Structures Classification
• Basic data structures (built-in libraries).
• Linear DS.
• Non-Linear DS.
• Data structures (Own libraries).
• Graphs.
• Union-Find Structures.
• Segment Tree.

Data Structures Classification
• Basic data structures (built-in libraries).
• Linear DS (ordering the elements sequentially).
• Static Array (Array in C/C++ and in Java).
• Resizeable array (C++ STL and Java ArrayList).
• Linked List (C++ STL and Java LinkedList).
• Stack (C++ STL and Java Stack).
• Queue (C++ STL and Java Queue).

Data Structures Classification
• Basic data structures (built-in libraries).
• Non-Linear DS.
• Balanced Binary Search Tree (C++ STL map/set and Java TreeMap/TreeSet).
• AVL and Red-Black trees = Balanced BST.
• Map stores (key -> data) VS Set only stores the key.
• Heap (C++ STL: priority_queue and Java PriorityQueue).
• A heap is a complete binary tree.
• Heap property VS BST property.
• Hash Table (Java HashMap/HashSet/HashTable).
• Non synchronized vs synchronized.
• Null vs non-null keys/values.
• Predictable iteration (using LinkedHashMap) vs non predictable.

Deciding the Order of the Tasks
• Returns the newest task (stack)
• Returns the oldest task (queue)
• Returns the most urgent task (priority queue)
• Returns the easiest task (priority queue)

Stack
• Last in, first out (LIFO)
• Stacks model piles of objects (such as dinner plates)
• Supports three constant-time operations
• Push(x): inserts x into the stack
• Pop(): removes the newest item
• Top(): returns the newest item
• Very easy to implement using an array
• Useful for:
• Processing nested formulas
• Depth-first graph traversal
• Data storage in recursive algorithms
• Reversing a sequence
• Matching brackets
• And a lot more
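The "very easy to implement using an array" point can be sketched as follows. This is a minimal fixed-capacity version for illustration only, not the library java.util.Stack, and it omits overflow/underflow checks:

```java
// Minimal fixed-capacity stack of ints backed by an array (a sketch).
// All three operations run in O(1) time.
class IntStack {
    private final int[] data;
    private int size = 0;                        // number of elements stored

    IntStack(int capacity) { data = new int[capacity]; }

    void push(int x)  { data[size++] = x; }      // insert x on top
    int pop()         { return data[--size]; }   // remove and return newest item
    int top()         { return data[size - 1]; } // peek at newest item
    boolean isEmpty() { return size == 0; }
}
```

Pushing 1, 2, 3 and then popping returns 3 first: last in, first out.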

Queue
• First in, first out (FIFO)
• Supports three constant-time operations
• Enqueue(x): inserts x into the queue
• Dequeue(): removes the oldest item
• Front(): returns the oldest item
• Implementation is similar to that of stack
• Useful for:
• implementing buffers
• simulating waiting lists
• shuffling cards
• And a lot more
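A brief usage sketch with Java's built-in Queue interface (ArrayDeque is a common implementation; the task names are illustrative only):

```java
import java.util.ArrayDeque;
import java.util.Queue;

class QueueDemo {
    // Returns the order in which tasks are served: oldest first (FIFO).
    static String serve(String... tasks) {
        Queue<String> q = new ArrayDeque<>();
        for (String t : tasks) q.add(t);      // Enqueue: insert at the back
        StringBuilder order = new StringBuilder();
        while (!q.isEmpty()) {
            order.append(q.poll());           // Dequeue: remove the oldest item
        }
        return order.toString();
    }
}
```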

Priority Queue
• Each element in a PQ has a priority value
• Three operations:
• Insert(x, p) : inserts x into the PQ, whose priority is p
• RemoveTop(): removes the element with the highest priority
• Top(): returns the element with the highest priority
• All operations can be done quickly if implemented using a heap (unlike with a sorted array)
• priority_queue (C++), PriorityQueue (Java)
• Useful for
• Maintaining schedules / calendars
• Simulating events
• Breadth-first search in a graph
• Sweepline geometric algorithms
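A usage sketch of Java's built-in PriorityQueue. One assumption is made explicit here: Java's PriorityQueue is a min-heap, so poll() returns the smallest element, i.e. "highest priority" means smallest value:

```java
import java.util.PriorityQueue;

class PQDemo {
    // Drains a PriorityQueue, returning its elements in priority order.
    static int[] drain(int... values) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        for (int v : values) pq.add(v);       // Insert: O(log n) per element
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++)
            out[i] = pq.poll();               // RemoveTop: O(log n) per element
        return out;
    }
}
```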

Heap
• Complete binary tree with the heap property:
• The value of a node ≥ the values of its children
• What is the difference between full vs complete?
• The root node has the maximum value
• Constant-time top() operation
• Inserting/removing a node can be done in O(log n) time without breaking the heap property
• May need rearrangement of some nodes

• Start from the root, number the nodes 1, 2, . . . from left to right
• Given a node k, it is easy to compute the indices of its parent and children
• Parent node: floor(k/2)
• Children: 2k, 2k + 1
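These index computations can be written down directly (a sketch, assuming the 1-based numbering described above):

```java
// Index arithmetic for a heap stored in an array with 1-based numbering:
// node k has parent floor(k/2) and children 2k and 2k+1.
class HeapIndex {
    static int parent(int k)     { return k / 2; } // integer division = floor
    static int leftChild(int k)  { return 2 * k; }
    static int rightChild(int k) { return 2 * k + 1; }
}
```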

BST Binary Search Tree
• The idea is that each node has, at most, two children
• A binary tree with the following property: for each node v,
• value of v ≥ values in v ’s left subtree
• value of v < values in v’s right subtree

BST Binary Search Tree
• Supports three operations
• Insert(x): inserts a node with value x
• Delete(x): deletes a node with value x, if there is any
• Find(x): returns the node with value x, if there is any
• Many extensions are possible
• Count(x): counts the number of nodes with value less than or equal to x
• GetNext(x): returns the smallest node with value ≥ x
• Curious about the extensions? Please register for Comp321 next term 😛

BST Binary Search Tree
• A simple implementation cannot guarantee efficiency
• In the worst case, the tree height becomes n (which makes the BST useless)
• Guaranteeing O(log n) running time per operation requires balancing the tree (hard to implement)
• For example, AVL and Red-Black trees
• We will skip the details of these balanced trees because we will fully cover them during the next lectures
• What does balanced mean?
• The heights of the two child subtrees of any node differ by at most one

Question for you
• Basic data structures (built-in libraries). Do you recognize the problem?

Hash Tables
• The key is used as an index to locate the associated value.
• Content-based retrieval, unlike position-based retrieval.
• Hashing is the process of generating a key value.
• An ideal algorithm must distribute the hash values evenly => the buckets will tend to fill up evenly = fast search.
• A hash bucket containing more than one value is known as a “collision”.
• Open addressing => a simple rule decides where to put a new item when the desired space is already occupied.
• Chaining => we associate a linked list with each table location.
• Hash tables are excellent dictionary data structures.
We will cover all the details about hash tables next lecture
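A minimal sketch of dictionary-style usage with Java's built-in HashMap (which resolves collisions by chaining); the course codes and credit values here are illustrative only:

```java
import java.util.HashMap;
import java.util.Map;

class HashDemo {
    // Content-based retrieval: the key, not a position, locates the value.
    static Integer lookup(String key) {
        Map<String, Integer> credits = new HashMap<>();
        credits.put("COMP250", 3);   // the key is hashed to a bucket index
        credits.put("MATH141", 4);
        return credits.get(key);     // expected O(1) with a good hash function
    }
}
```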

Data Structures
• Data structures (Own Libraries).
• Graphs.
• We will talk a lot about them in the last part of the course.
• Union-Find Disjoint Sets.
• We will cover them a lot during the first part of the course, starting in two lectures.
• Segment tree.
• We will not cover it, but if you are curious register for Comp321 next term 😉

What have we covered so far?
• Linear Data Structures.
• Array lists, singly and doubly linked lists, stacks, queues.
• Induction and Recursion
• Tools for analysis of algorithms.
• Recurrences.
• Asymptotic notation (big O, big Omega, big Theta)
• Basics in non-linear data structures.
• Trees, heaps, maps, graphs.

Recursion – Definitions
• Recursive definition:
• A definition that is defined in terms of itself.
• Recursive method:
• A method that calls itself (directly or indirectly).
• Recursive programming:
• Writing methods that call themselves to solve problems recursively.
Did you watch a movie called ‘Inception’?

Recursion – Example
c(x) = total number of credits required to complete course x
c(COMP462) = ?
= 3 credits + #credits for prerequisites   # COMP462 has 2 prerequisites: COMP251 & MATH323
= 3 credits + c(COMP251) + c(MATH323)      # the function c calls itself twice
c(COMP251) = ? c(MATH323) = ?
c(COMP251) = 3 credits + c(COMP250) #COMP250 is a prerequisite
• Substitute c(COMP251) into the formula:
• c(COMP462) = 3 credits + 3 credits + c(COMP250) + c(MATH323)
• c(COMP462) = 6 credits + c(COMP250) + c(MATH323)

Recursion – Example
c(COMP462) = 6 credits + c(COMP250) + c(MATH323)
c(COMP250) = ? c(MATH323) = ?
c(COMP250) = 3 credits   # no prerequisite
c(COMP462) = 6 credits + 3 credits + c(MATH323)
c(MATH323) = ?
c(MATH323) = 3 credits + c(MATH141)
c(COMP462) = 9 credits + 3 credits + c(MATH141)
c(MATH141) = ?
c(MATH141) = 4 credits # no prerequisite
c(COMP462) = 12 credits + 4 credits = 16 credits
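The recursion above can be sketched directly in code. The prerequisite table below is a hypothetical stand-in for the real course catalogue, containing only the courses from the example:

```java
import java.util.List;
import java.util.Map;

class Credits {
    // Hypothetical credit and prerequisite tables mirroring the example.
    static final Map<String, Integer> CREDITS = Map.of(
        "COMP462", 3, "COMP251", 3, "COMP250", 3, "MATH323", 3, "MATH141", 4);
    static final Map<String, List<String>> PREREQS = Map.of(
        "COMP462", List.of("COMP251", "MATH323"),
        "COMP251", List.of("COMP250"),
        "COMP250", List.of(),
        "MATH323", List.of("MATH141"),
        "MATH141", List.of());

    // c(x) = credits of x + sum of c(p) over the prerequisites p of x.
    static int c(String course) {
        int total = CREDITS.get(course);   // base contribution
        for (String p : PREREQS.get(course))
            total += c(p);                 // recursive case
        return total;                      // base case: no prerequisites, loop body never runs
    }
}
```

Evaluating c("COMP462") reproduces the hand computation: 3 + (3 + 3) + (3 + 4) = 16 credits.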

Recursion – Algorithm Structure
• Every recursive algorithm involves at least 2 cases:
• base case: A simple occurrence that can be answered
• recursive case: A more complex occurrence of the problem that cannot be directly answered but can instead be described in terms of smaller occurrences of the same problem.
• Some recursive algorithms have more than one base or recursive case, but all have at least one of each.
• A crucial part of recursive programming is identifying these cases.

Recursion – Algorithm Example
• Algorithm binarySearch(array, start, stop, key)
• A sorted array
• the region start…stop (inclusive) to be searched
• the key to be found
• Output: returns the index at which the key has been found, or returns -1 if the key is not in array[start…stop].
• Example: Does the following sorted array A contain the number 6?
Call: binarySearch(A, 0, 7, 6)

Recursion – Algorithm Example
Search [0:7]
5 < 6 ⇒ look into the right half of the array
Search [4:7]
7 > 6 ⇒ look into the left half of the array
Search [4:4]
6 is found. Return 4 (the index)

Recursion – Algorithm Example
int bsearch(int[] A, int i, int j, int x) {
  if (i <= j) {              // the region to search is non-empty
    int e = (i + j) / 2;     // middle index (floor)
    if (A[e] > x) {
      return bsearch(A, i, e - 1, x);
    } else if (A[e] < x) {
      return bsearch(A, e + 1, j, x);
    } else {
      return e;              // key found at index e
    }
  } else {
    return -1;               // value not found
  }
}

What have we covered so far?
• Linear Data Structures.
• Array lists, singly and doubly linked lists, stacks, queues.
• Induction and Recursion
• Tools for analysis of algorithms.
• Recurrences.
• Asymptotic notation (big O, big Omega, big Theta)
• Basics in non-linear data structures.
• Trees, heaps, maps, graphs.

Recursion (CS) VS Recurrence (Math)
• A recurrence is a function that is defined in terms of
• one or more base cases, and
• itself, with smaller arguments.
• Examples:

T(n) = 1            if n = 1
T(n) = T(n-1) + 1   if n > 1

T(n) = 1                      if n = 1
T(n) = T(n/3) + T(2n/3) + n   if n > 1

• Many technical issues:
• Floors and ceilings
• Exact vs. asymptotic functions
• Boundary conditions
• We usually express both the recurrence and its solution using asymptotic notation.
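The first example recurrence unrolls to T(n) = T(n−1) + 1 = T(n−2) + 2 = ⋯ = T(1) + (n−1) = n. A direct recursive evaluation, mirroring the mathematical definition, agrees:

```java
class Rec {
    // T(n) = 1 if n == 1, and T(n-1) + 1 otherwise (the first example above).
    static int T(int n) {
        return n == 1 ? 1 : T(n - 1) + 1;
    }
}
```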

Algorithm Analysis
• Q: How to estimate the running time of a recursive algorithm?
• A:
1. Define a function T(n) representing the time spent by your algorithm to execute an entry of size n
2. Write a recursive formula computing T(n)
3. Solve the recurrence
• n can be anything that characterizes accurately the size of the input
(e.g. size of the array, number of bits)
• We count the number of elementary operations (e.g. addition, shift) to estimate the running time.
• We often aim to compute an upper bound rather than an exact count.

Algorithm Analysis (binary search)
int bsearch(int[] A, int i, int j, int x) {
  if (i <= j) {              // the region to search is non-empty
    int e = (i + j) / 2;
    if (A[e] > x) {
      return bsearch(A, i, e - 1, x);
    } else if (A[e] < x) {
      return bsearch(A, e + 1, j, x);
    } else {
      return e;
    }
  } else {
    return -1;               // value not found
  }
}

T(n) = c            if n <= 1
T(n) = T(n/2) + c'  if n > 1

• n is the size of the array


Substitution method
• How to solve a recursive equation?
1. Guess the solution.
2. Use induction to find the constants and show that the solution works.

Mathematical Induction
• Many statements of the form “for all n ≥ n0, P(n)” can be proven with a logical argument called mathematical induction.
• The proof has two components:
• Base case: P(n0)
• Induction step: for any n≥n0 , if P(n) then P(n+1)


Mathematical Induction – Example 1
Claim: for any n ≥ 1, 1 + 2 + 3 + 4 + ⋯ + n = n·(n+1)/2
• Base case: n = 1
• Induction step:
for any k ≥ 1, if 1 + 2 + 3 + 4 + ⋯ + k = k·(k+1)/2
then 1 + 2 + 3 + 4 + ⋯ + k + (k+1) = (k+1)·(k+2)/2

Mathematical Induction – Example 1
Assume 1 + 2 + 3 + 4 + ⋯ + k = k·(k+1)/2   (induction hypothesis)
then 1 + 2 + 3 + 4 + ⋯ + k + (k+1) = k·(k+1)/2 + (k+1)
= (k·(k+1) + 2·(k+1))/2
= (k+2)·(k+1)/2 ✓
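The identity can also be sanity-checked numerically for small n (a check, not a proof):

```java
class SumCheck {
    // Computes 1 + 2 + ... + n by a loop.
    static long sumTo(long n) {
        long s = 0;
        for (long i = 1; i <= n; i++) s += i;
        return s;
    }
    // The closed form proved by induction above.
    static long closedForm(long n) {
        return n * (n + 1) / 2;
    }
}
```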

Mathematical Induction – Example 2
Claim: for any n ≥ 1, 1 + 3 + 5 + 7 + ⋯ + (2·n − 1) = n²
• Base case: n = 1: 1 = 1²
• Induction step:
for any k ≥ 1, if 1 + 3 + 5 + 7 + ⋯ + (2·k − 1) = k²
then 1 + 3 + 5 + 7 + ⋯ + (2·(k+1) − 1) = (k+1)²

Mathematical Induction – Example 2
Assume 1 + 3 + 5 + 7 + ⋯ + (2·k − 1) = k²   (induction hypothesis)
then 1 + 3 + 5 + 7 + ⋯ + (2·k − 1) + (2·(k+1) − 1)
= k² + 2·(k+1) − 1
= k² + 2·k + 1
= (k+1)² ✓

Substitution method
• How to solve a recursive equation?
1. Guess the solution.
2. Use induction to find the constants and show that the solution works.
• Now we have the background to do this.

Substitution method – Binary Search
T(n) = 0           if n = 1
T(n) = T(n/2) + 1  if n > 1

Note: set the constants c = 0 and c' = 1
Guess: T(n) = log₂ n
Base case: T(1) = log₂ 1 = 0 ✓
Inductive case:
Assume T(n/2) = log₂(n/2)   (induction hypothesis; the argument can be anything < n)
T(n) = T(n/2) + 1 = log₂(n/2) + 1 = log₂ n − log₂ 2 + 1 = log₂ n ✓

Substitution method – Binary Search
Taken from Prof. ’s lecture notes Comp250

What have we covered so far?
• Linear Data Structures.
• Array lists, singly and doubly linked lists, stacks, queues.
• Induction and Recursion
• Tools for analysis of algorithms.
• Recurrences.
• Asymptotic notation (big O, big Omega, big Theta)
• Basics in non-linear data structures.
• Trees, heaps, maps, graphs.

Running time – Binary Search
Best case: The value is exactly in the middle of the array. ⇒ Ω(1)
Worst case: You recursively search until you reach an array of size 1 (Note: it does not matter whether you find the key or not). ⇒ O(log₂ n)
A tree representing binary search. The array being searched here is [20, 30, 40, 50, 80, 90, 100] (from Wikipedia)

Running time – Binary Search
The time it takes for an algorithm to run depends on:
• constant factors (often implementation dependent)
• the size n of the input
• the values of the input, including arguments if applicable...
A tree representing binary search.
The array being searched here is [20, 30, 40, 50, 80, 90, 100] (from Wikipedia)

Running ‘time’ – Measure
• Goal: Analyze an algorithm written in pseudocode and describe its running time as a function f(n) of the input size n
• Without having to write code
• Experimental VS Theoretical analysis
• In a way that is independent of the computer used
• To achieve that, we need to
• Make simplifying assumptions about the running time of each basic (primitive) operation
• Study how the number of primitive operations depends on the size of the problem solved

Running ‘time’ – Primitive Operations
• Simple computer operation that can be performed in time that is always the same, independent of the size of the bigger problem solved (we say: constant time)
– Assigning a value to a variable: x ← 1
– Calling a method: Expos.addWin()
– Note: doesn’t include the time to execute the method
– Returning from a method: return x;
– Arithmetic operations on primitive types: x + y, r*3.1416, x/y, etc.
– Comparisons on primitive types: x == y
– Conditionals: if (...) then ... else ...
– Indexing into an array: A[i]
– Following an object reference: Expos.losses
• Note: Multiplying two Large Integers is not a primitive operation, because the running time depends on the size of the numbers multiplied.

Recursion – Algorithm Example
int bsearch(int[] A, int i, int j, int x) {
  if (i <= j) {
    int e = (i + j) / 2;
    if (A[e] > x) {
      return bsearch(A, i, e - 1, x);
    } else if (A[e] < x) {
      return bsearch(A, e + 1, j, x);
    } else {
      return e;
    }
  } else {
    return -1;
  }
}
Each line costs a constant number of primitive operations (Tcond, Tcomp, ...), repeated log(|A|) times in the worst-case scenario.

Running ‘time’ – Primitive Operations
• Counting each type of primitive operations is tedious
• The running time of each operation is roughly comparable:
Tassign ≈ Tcomp ≈ Tarith ≈ ... ≈ Tindex = 1 primitive operation
• We are only interested in the number of primitive operations performed
• Worst-case running time for binary search becomes:

T(n) = c            if n <= 1
T(n) = T(n/2) + c'  if n > 1
• When n is large, T(n) grows approximately like log(n). Then we will write: T(n) is O(log(n)) => T(n) is big O of log(n)

Towards a formal definition of big O
• Let t(𝑛) be a function that describes the time it takes for some algorithm on input size 𝑛.
• We would like to express how t(𝑛) grows with 𝑛, as 𝑛 becomes large i.e. asymptotic behavior.
• Unlike with limits, we want to say that t(n) grows like certain simpler functions such as n, log₂ n, √n, n², 2ⁿ, ...

Towards a formal definition of big O
• Let t(n) and g(n) be two functions, where n ≥ 0. We say t(n) is asymptotically bounded above by g(n) if there exists n₀ such that, for all n ≥ n₀,
t(n) ≤ g(n)

Towards a formal definition of big O
• Let t(n) and g(n) be two functions, where n ≥ 0.
We say t(n) is O(g(n)) if there exist two positive constants n₀ and c such that, for all n ≥ n₀,
t(n) ≤ c · g(n)
• Intuition
• “f(n) is O(g(n))” if and only if there exists a point n₀ beyond which f(n) is less than some fixed constant times g(n).
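As an illustration with a made-up running-time function: t(n) = 3n + 5 is O(n), since taking c = 4 and n₀ = 5 gives 3n + 5 ≤ 4n for all n ≥ 5. A brute-force check of the definition over a finite range (which can suggest, but not prove, the bound):

```java
class BigOCheck {
    static long t(long n) { return 3 * n + 5; }   // example running-time function
    static long g(long n) { return n; }           // candidate bounding function

    // Checks that t(n) <= c * g(n) for every n in [n0, limit].
    static boolean boundedAbove(long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++)
            if (t(n) > c * g(n)) return false;
        return true;
    }
}
```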

Towards a formal definition of big O
• Never write O(3n), O(5·log₂ n), etc.
• Instead, write O(n), O(log₂ n), etc.
• Why? The point of the big O notation is to avoid dealing with constant factors. Writing them is technically correct, but we don’t do it…
• n₀ and c are not uniquely defined. For a given n₀ and c that satisfy the definition, we can increase one or both and satisfy it again. There is no “better” choice of constants.
• However, we generally want a “tight” upper bound (asymptotically), since such a bound gives us more information (Note: this is not the same as a smaller n₀ or c). For instance, an f(n) that is O(n) is also O(n²) and O(2ⁿ), but O(n) is more informative.

Growth of functions
(from stackoverflow)
Tip: It is helpful to memorize the relationship between basic functions.

Growth of functions
If the unit is in seconds, this would make ~10^11 years…

Growth of functions

Towards a formal definition of big Ω
• Let t(n) and g(n) be two functions with n ≥ 0.
• We say t(n) is Ω(g(n)) if there exist two positive constants n₀ and c such that, for all n ≥ n₀,
t(n) ≥ c · g(n)
• Note: This is the opposite of the big O notation. The function g is now used as a “lower bound”.

Towards a formal definition of big Theta
• Let t(n) and g(n) be two functions, where n ≥ 0.
We say t(n) is Θ(g(n)) if there exist three positive constants n₀, c₁, and c₂ such that, for all n ≥ n₀,
c₁ · g(n) ≤ t(n) ≤ c₂ · g(n)
Note: if t(n) is Θ(g(n)), then it is also O(g(n)) and Ω(g(n)).
