程序代写代做代考 game graph algorithm go data structure C CSC373

CSC373
Algorithm Design, Analysis & Complexity
Nisarg Shah
373F20 – Nisarg Shah 1

Introduction
• Instructors
➢ Nisarg Shah
o cs.toronto.edu/~nisarg, nisarg@cs, SF 2301C o LEC 0101 and 0102
• TAs: Too many to list • Disclaimer!
Totally useless this semester!
➢ First online version of the course, so expect a bumpy ride at the start, but hopefully, we’ll get through together
➢ Use any of the feedback mediums (email, Piazza, …) to let me know if you have any suggestions for improvement
373F20 – Nisarg Shah 2

Course Information
• Course Page www.cs.toronto.edu/~nisarg/teaching/373f20/
➢ All the information below is in the course information sheet, available on
Piazza
• Discussion Board piazza.com/utoronto.ca/fall2020/csc373
• Grading – MarkUs
➢ Link will be distributed after about a week or two ➢ LaTeX preferred, scans are OK!
• All times in Eastern time zone, all zoom links on the course page
373F20 – Nisarg Shah 3

Lectures
• Time & Place: Tue 4-5pm, Thu 1-3pm, Zoom
• Details
➢ Delivered live
➢ 10 minute break after every 50 minutes of lecture
➢ Students can ask questions using Zoom’s chat feature
➢ One TA will be present to continuously answer questions ➢ I might also answer questions once in a while
373F20 – Nisarg Shah 4

Tutorials
• Time & Place: Tue 5-6pm, Zoom
• Details
➢ Delivered live by TAs
➢ Problem sets will be posted early on the course webpage o Easier problems that are warm-up to assignments/exams
➢ Please try them before coming to the tutorials
➢ TAs will explain the problems, allow you to discuss them in breakout rooms,
and then go over key parts of the solutions
➢ Solutions will be posted later on the course webpage
373F20 – Nisarg Shah 5

Tutorials
• Further details
➢ Each section is divided into three parts (A,B,C)
➢ Students divided by birth month: A = Jan-Apr, B = May-Aug, C = Sep-Dec ➢ Feel free to attend a different tutorial than the one you’re assigned
o EXCEPT when the tutorial slot is being used for a test
➢ If the attendance is low, the number of tutorials per section may be reduced
373F20 – Nisarg Shah 6

Office Hours
• Time & Place: Wed 4-5pm, Fri 10-11am, Zoom ➢ Do you have conflicts with these slots? Poll!
• Details
➢ I will conduct them
➢ Use the “raise hand” feature
➢ I will call upon the raised hands in order
➢ When called upon, unmute and ask the question
➢ Always phrase your question in a way that doesn’t give away your solutions or approach to an assignment problem
o Just like in a physical office
373F20 – Nisarg Shah 7

Tests
• 2 term tests, one end-of-term test (final exam)
• Time & Place: Tue 5-6pm (tutorial slot)
➢ Need to be able to attend live!
➢ I’m considering using part of the Tue 4-5pm lecture slot to give you more time
• Tentative Plan
➢ Open book, closed internet
➢ You may be asked to join a zoom link and keep your video on
➢ If you have a question, you can “raise hand”, and I or a TA can take you to a breakout room to answer your question
➢ Upload scanned answer sheet at the end (we’ll do a mock run of this)
373F20 – Nisarg Shah 8

Assignments
• 4 assignments, best 3 out of 4
• Group work
➢ In groups of up to three students
➢ Best way to learn is for each member to try each problem
• Questions will be more difficult
➢ May need to mull them over for several days; do not expect to start and finish
the assignment on the same day! ➢ May include bonus questions
• Submission on MarkUs, more details later ➢ May need to compress the PDF
373F20 – Nisarg Shah 9

Grading Policy
• 3 homeworks • 2 term tests
• Final exam
* 10% = 30%
* 20% = 40%
* 30% = 30%
• NOTE: To pass, you must earn at least 40% on the final exam
373F20 – Nisarg Shah 10

Approximate Due Dates
• Please note the word approximate!
➢ Assignment 1: ➢ Assignment 2: ➢ Assignment 3: ➢ Assignment 4: ➢ Midterm 1:
➢ Midterm 2:
Apx. Oct 9 Apx. Oct 30 Apx. Nov 13 Apx. Nov 27 Apx. Oct 20 Apx. Nov 17
• Conflicts
➢ The tests are during the tutorial slot, so there should ideally be no conflict ➢ That said, if you think you’ll have a conflict, let me know at the earliest
373F20 – Nisarg Shah 11

Textbook
• Primary reference: lecture slides • Primary textbook (required)
➢ [CLRS] Cormen, Leiserson, Rivest, Stein: Introduction to Algorithms.
• Supplementary textbooks (optional)
➢ [DPV] Dasgupta, Papadimitriou, Vazirani: Algorithms. ➢ [KT] Kleinberg; Tardos: Algorithm Design.
373F20 – Nisarg Shah 12

Other Policies
• Collaboration
➢ Free to discuss with classmates or read online material
➢ Must write solutions in your own words
o Easier if you do not take any pictures/notes from discussions
• Citation
➢ For each question, must cite the peer (write the name) or the online sources (provide links), if you obtained a significant insight directly pertinent to the question
➢ Failing to do this is plagiarism!
373F20 – Nisarg Shah 13

Other Policies
• “No Garbage” Policy
➢ Borrowed from: Prof. Allan Borodin (citation!)
1. Partial marks for viable approaches
2. Zero marks if the answer makes no sense
3. 20% marks if you admit to not knowing how to approach the question (“I do not know how to approach this question”)
• 20% > 0% !!
373F20 – Nisarg Shah 14

Other Policies
• Late Days
➢ 4 total late days across all 4 assignments
➢ Managed by MarkUs
➢ At most 2 late days can be applied to a single assignment
➢ Already covers legitimate reasons such as illness, university activities, etc. o Petitions will only be granted for circumstances which cannot be covered by this
373F20 – Nisarg Shah 15

Zoom Features
• Just to get acquainted, let’s try out the following features: ➢ Polls (already tried)
➢ Chat
➢ Reactions
➢ Raise hand ➢Yes/No
➢ Breakout rooms
373F20 – Nisarg Shah 16

Enough with the boring stuff.
373F20 – Nisarg Shah 17

What will we study? Why will we study it?
373F20 – Nisarg Shah 18

Muhammad ibn Musa al-Khwarizmi c. 780 – c. 850
373F20 – Nisarg Shah 19

What is this course about?
• Algorithms
➢ Ubiquitous in the real world
o From your smartphone to self-driving cars o From graph problems to graphics problems o…
➢ Important to be able to design and analyze algorithms
➢ For some problems, good algorithms are hard to find
o For some of these problems, we can formally establish complexity results
o We’ll often find that one problem is easy, but its minor variants are suddenly hard
373F20 – Nisarg Shah 20

What is this course about?
• Algorithms
➢ Algorithms in specialized environments or using advanced techniques
o Distributed, parallel, streaming, sublinear time, spectral, genetic… ➢ Other concerns with algorithms
o Fairness, ethics, …
➢ …mostly beyond the scope of this course
373F20 – Nisarg Shah 21

What is this course about?
• Topics in this course ➢ Divide and Conquer
➢ Greedy
➢ Dynamic programming
➢ Network flow
➢ Linear programming
➢ NP-completeness (not really an algorithm design paradigm) ➢ Approximation algorithms (if time permits)
➢ Randomized algorithms (if time permits)
373F20 – Nisarg Shah 22

What is this course about?
• How do we know which paradigm is right for a given problem?
➢ A very interesting question!
➢ Subject of much ongoing research…
o Sometimes, you just know it when you see it…
• How do we analyze an algorithm?
➢ Proof of correctness
➢ Proof of running time
o We’ll try to prove the algorithm is efficient in the worst case o In practice, average case matters just as much (or even more)
373F20 – Nisarg Shah 23

What is this course about?
• What does it mean for an algorithm to be efficient in the worst case?
➢ Polynomial time
➢ It should use at most poly(n) steps on any n-bit input o 𝑛, 𝑛2, 𝑛100, 100𝑛6 + 237𝑛2 + 432, …
➢ If the input to an algorithm is a number 𝑥, the number of bits of input is log 𝑥 o This is because it takes log 𝑥 bits to represent the input 𝑥 in binary
o So the running time should be polynomial in log 𝑥, not in 𝑥
➢ How much is too much?
373F20 – Nisarg Shah 24

What is this course about?
373F20 – Nisarg Shah 25

What is this course about?
373F20 – Nisarg Shah 26

What is this course about?
• What if we can’t find an efficient algorithm for a problem? ➢ Try to prove that the problem is hard
➢ Formally establish complexity results ➢ NP-completeness, NP-hardness, …
• We’ll often find that one problem may be easy, but its simple variants may suddenly become hard
➢ Minimum spanning tree (MST) vs bounded degree MST ➢ 2-colorability vs 3-colorability
373F20 – Nisarg Shah 27

I’m not convinced.
Will I really ever need to know how to design abstract algorithms?
373F20 – Nisarg Shah 28

At the very least…
This will help you prepare for your
technical job interview!
Real Microsoft interview question:
• Given an array 𝑎, find indices (𝑖, 𝑗) with thelargest𝑗−𝑖suchthat𝑎 𝑗 >𝑎[𝑖]
• Greedy? Divide & conquer?
373F20 – Nisarg Shah 29

Disclaimer
• The course is theoretical in nature
➢ You’ll be working with abstract notations, proving correctness of algorithms, analyzing the running time of algorithms, designing new algorithms, and proving complexity results.
• Something for everyone…
➢ If you’re somewhat scared going into the course
➢ If you’re already comfortable with the proofs, and want challenging problems
373F20 – Nisarg Shah 30

Related/Follow-up Courses
• Direct follow-up
➢ CSC473: Advanced Algorithms
➢ CSC438: Computability and Logic
➢ CSC463: Computational Complexity and Computability
• Algorithms in other contexts
➢ CSC304: Algorithmic Game Theory and Mechanism Design (self promotion!)
➢ CSC384: Introduction to Artificial Intelligence ➢ CSC436: Numerical Algorithms
➢ CSC418: Computer Graphics
373F20 – Nisarg Shah 31

Divide & Conquer
373F20 – Nisarg Shah 32

History?
• Maybe you saw a subset of these algorithms?
➢ Mergesort – 𝑂 𝑛 log 𝑛
➢ Karatsuba algorithm for fast multiplication – 𝑂 𝑛log2 3 ➢ Largest subsequence sum in 𝑂 𝑛
➢…
rather than 𝑂 𝑛2
• Have you seen some divide & conquer algorithms before? ➢ Maybe in CSC236/CSC240 and/or CSC263/CSC265
➢ Write “yes”/”no” in chat
373F20 – Nisarg Shah 33

Divide & Conquer
• General framework
➢ Break (a large chunk of) a problem into two smaller subproblems of the same
type
➢ Solve each subproblem recursively and independently
➢ At the end, quickly combine solutions from the two subproblems and/or solve any remaining part of the original problem
• Hard to formally define when a given algorithm is divide-and- conquer…
• Let’s see some examples!
373F20 – Nisarg Shah 34

Master Theorem
• Here’s the master theorem, as it appears in CLRS
➢ Useful for analyzing divide-and-conquer running time
➢ If you haven’t already seen it, please spend some time understanding it
373F20 – Nisarg Shah 35

Master Theorem
Intuition: Compare f(n) with nlogba. The larger determines the recurrence solution.
373F20 – Nisarg Shah 36

Counting Inversions
• Problem
➢ Given an array 𝑎 of length 𝑛, count the number of pairs (𝑖, 𝑗) such that 𝑖 < 𝑗 but𝑎𝑖 >𝑎[𝑗]
• Applications ➢ Voting theory
➢ Collaborative filtering
➢ Measuring the “sortedness” of an array
➢ Sensitivity analysis of Google’s ranking function
➢ Rank aggregation for meta-searching on the Web
➢ Nonparametric statistics (e.g., Kendall’s tau distance)
373F20 – Nisarg Shah 37

Counting Inversions
• Problem
➢ Count (𝑖, 𝑗) such that 𝑖 < 𝑗 but 𝑎 𝑖 > 𝑎[𝑗]
• Brute force
➢ Check all Θ 𝑛2 pairs
• Divide & conquer
➢ Divide: break array into two equal halves 𝑥 and 𝑦
➢ Conquer: count inversions in each half recursively
➢ Combine:
o Solve (we’ll see how): count inversions with one entry in 𝑥 and one in 𝑦 o Merge: add all three counts
373F20 – Nisarg Shah 38

Counting Inversions
• From Kevin Wayne’s slides
373F20 – Nisarg Shah 39

Counting Inversions
373F20 – Nisarg Shah 40

Counting Inversions
373F20 – Nisarg Shah 41

Counting Inversions
• How do we formally prove correctness?
➢ Induction on 𝑛 is usually very helpful
➢ Allows you to assume correctness of subproblems
• Running time analysis
➢ Suppose 𝑇(𝑛) is the running time for inputs of size 𝑛
𝑛
➢ Master theorem says this is 𝑇 𝑛 = 𝑂(𝑛 log 𝑛)
➢ Our algorithm satisfies 𝑇 𝑛 = 2 𝑇 Τ + 𝑂(𝑛) 2
373F20 – Nisarg Shah 42

Without Master Theorem
𝑛
Let’s say 𝑇 𝑛 = 2 𝑇 Τ + 2𝑛 2
373F20 – Nisarg Shah 43

Closest Pair in R2
• Problem:
➢ Given 𝑛 points of the form (𝑥𝑖 , 𝑦𝑖 ) in the plane, find the closest pair of points.
• Applications:
➢ Basic primitive in graphics and computer vision
➢ Geographic information systems, molecular modeling, air traffic control ➢ Special case of nearest neighbor
• Brute force: Θ 𝑛2
373F20 – Nisarg Shah 44

Intuition from 1D?
• In 1D, the problem would be easily 𝑂(𝑛 log 𝑛)
➢ Sort and check!
• Sorting attempt in 2D
➢ Find closest points by x coordinate ➢ Find closest points by y coordinate
• Non-degeneracy assumption
➢ No two points have the same x or y coordinate
373F20 – Nisarg Shah 45

Intuition from 1D?
• Sorting attempt in 2D
➢ Find closest points by x or y coordinate ➢ Doesn’t work!
1+𝜖 1
2
1 1+𝜖
373F20 – Nisarg Shah 46

Closest Pair in R2
• Let’s try divide-and-conquer!
➢ Divide: points in equal halves by drawing a vertical line 𝐿
➢ Conquer: solve each half recursively
➢ Combine: find closest pair with one point on each side of 𝐿 ➢ Return the best of 3 solutions
Seems like Ω(𝑛2)
373F20 – Nisarg Shah 47

Closest Pair in R2
• Combine
➢ We can restrict our attention to points within 𝛿 of 𝐿 on each side, where 𝛿 = best of the solutions in two halves
373F20 – Nisarg Shah 48

Closest Pair in R2
• Combine (let 𝛿 = best of solutions in two halves) ➢ Only need to look at points within 𝛿 of 𝐿 on each side,
➢ Sort points on the strip by 𝑦 coordinate
➢ Only need to check each point with next 11 points in sorted list!
Wait, what? Why 11?
373F20 – Nisarg Shah 49

Why 11?
• Claim:
➢ If two points are at least 12 positions apart in the
sorted list, their distance is at least 𝛿 • Proof:
➢ No two points lie in the same 𝛿/2 × 𝛿/2 box
➢ Two points that are more than two rows apart are at distance at least 𝛿
373F20 – Nisarg Shah 50

Recap: Karatsuba’s Algorithm
• Fast way to multiply two 𝑛 digit integers 𝑥 and 𝑦
• Brute force: 𝑂(𝑛2) operations
• Karatsuba’s observation:
➢ Divide each integer into two parts
o𝑥𝑦= 𝑥𝑦 ∗10𝑛+ 𝑥𝑦 +𝑥𝑦 ∗10Τ +(𝑥𝑦) 111221222
𝑛
o𝑥𝑦+𝑥𝑦=𝑥+𝑥 𝑦+𝑦 −𝑥𝑦−𝑥𝑦 1221 1212 1122
➢ Running time
𝑛𝑛
o𝑥=𝑥 ∗10Τ +𝑥,𝑦=𝑦 ∗10Τ +𝑦 122122
𝑛
➢ Four Τ -digit multiplications can be replaced by three 2
𝑛 log 3
o𝑇𝑛 =3𝑇 Τ +𝑂(𝑛)⇒𝑇𝑛 =𝑂𝑛 2
2
373F20 – Nisarg Shah 51

Strassen’s Algorithm
• Generalizes Karatsuba’s insight to design a fast algorithm for multiplying two 𝑛 × 𝑛 matrices
➢ Call 𝑛 the “size” of the problem 𝐶𝐶𝐴𝐴𝐵𝐵
11 12 = 11 12 ∗ 11 12
𝐶𝐶𝐴𝐴𝐵𝐵 21 22 21 22 21 22
➢ Naively, this requires 8 multiplications of size 𝑛/2 o𝐴 ∗𝐵,𝐴 ∗𝐵,𝐴 ∗𝐵,𝐴 ∗𝐵,…
11 11 12 21 11 12 12 22
➢ Strassen’s insight: replace 8 multiplications by 7
oRunningtime:𝑇 𝑛 =7𝑇 Τ +𝑂(𝑛 )⇒𝑇 𝑛 =𝑂 𝑛 2
2
𝑛 2 log7
373F20 – Nisarg Shah 52

Strassen’s Algorithm
𝐶𝐶𝐴𝐴𝐵𝐵 11 12 = 11 12 ∗ 11 12
𝐶𝐶𝐴𝐴𝐵𝐵 21 22 21 22 21 22
373F20 – Nisarg Shah 53

Median & Selection
• Selection:
➢ Given array 𝐴 of 𝑛 comparable elements, find 𝑘th smallest
➢ 𝑘 = 1 is min, 𝑘 = 𝑛 is max, 𝑘 = ➢ 𝑂 𝑛 is easy for min/max
• What about 𝑘-selection?
➢ 𝑂(𝑛𝑘) by modifying bubble sort ➢𝑂 𝑛log𝑛 by sorting
➢ 𝑂 𝑛 + 𝑘 log 𝑛 using min-heap ➢ 𝑂(𝑘 + 𝑛 log 𝑘) using max-heap
𝑛 + 1 Τ2 is median
• Q: What about just 𝑂(𝑛)?
• A: Yes! Selection is easier than sorting.
373F20 – Nisarg Shah 54

QuickSelect • Find a pivot 𝑝
• Divide 𝐴 into two sub-arrays
➢ 𝐴𝑙𝑒𝑠𝑠 = elements ≤ 𝑝, 𝐴𝑚𝑜𝑟𝑒 = elements > 𝑝
➢ If 𝐴𝑙𝑒𝑠𝑠 ≥ 𝑘, return 𝑘th smallest in 𝐴𝑙𝑒𝑠𝑠 , otherwise return (𝑘 − 𝐴𝑙𝑒𝑠𝑠 )th smallest in 𝐴𝑚𝑜𝑟𝑒
• Problem?
➢ If pivot is close to the min or the max, then we basically get
𝑇 𝑛 ≤ 𝑇 𝑛 − 1 + 𝑂(𝑛), which only gives 𝑇 𝑛 = 𝑂 𝑛2
➢ Want to reduce 𝑛 − 1 to a fraction of 𝑛 (like 𝑛/2, 5𝑛/6, etc)
373F20 – Nisarg Shah 55

Finding a Good Pivot
• Divide 𝑛 elements into Τ groups of 5 each 5
𝑛
373F20 – Nisarg Shah 56

Finding a Good Pivot
• Divide 𝑛 elements into Τ groups of 5 each 5
• Find the median of each group
𝑛
373F20 – Nisarg Shah 57

Finding a Good Pivot
• Divide 𝑛 elements into Τ groups of 5 each 5
• Find the median of each group
• Find the median of 𝑛/5 medians
𝑛
373F20 – Nisarg Shah 58

Finding a Good Pivot
• Divide 𝑛 elements into Τ groups of 5 each 5
• Find the median of each group
• Find the median of 𝑛/5 medians
• Use this median of medians as the pivot in quickselect
• Q: Why does this work?
𝑛
373F20 – Nisarg Shah 59

Analysis
• How many elements can be ≤ 𝑝∗? ➢ Out of 𝑛/5 medians, 𝑛/10 are > 𝑝∗
373F20 – Nisarg Shah 60

Analysis
• How many elements can be ≤ 𝑝∗? ➢ Out of 𝑛/5 medians, 𝑛/10 are > 𝑝∗
373F20 – Nisarg Shah 61

Analysis 𝑛𝑛∗
• Τ ofthe Τ mediansare≤𝑝 10 5
➢ For each such median, there are 3 elements ≤ 𝑝∗
7𝑛 ∗
➢ So there can be at most Τ elements that can be > 𝑝 10
373F20 – Nisarg Shah 62

Analysis
•Thus,𝐴 ≤Τ 𝑚𝑜𝑟𝑒 7𝑛 10
➢ (These are rough calculations…)
• How does this factor into overall algorithm analysis?
➢Similarly, 𝐴 ≤ Τ 𝑙𝑒𝑠𝑠 10
7𝑛
373F20 – Nisarg Shah 63

Analysis
• Divide 𝑛 elements into Τ groups of 5 each
• Find the median of each group
𝑛5
𝑂(𝑛)
𝑇(𝑛/5) 𝑂(𝑛) 𝑇(7𝑛/10)
∗𝑛
• Find 𝑝 = median of Τ medians 5
• Create 𝐴𝑙𝑒𝑠𝑠 and 𝐴𝑚𝑜𝑟𝑒 according to 𝑝∗ • Run selection on one of 𝐴𝑙𝑒𝑠𝑠 or 𝐴𝑚𝑜𝑟𝑒
•𝑇𝑛≤𝑇Τ+𝑇 Τ +𝑂(𝑛) 𝑛 5 7𝑛 10
•Note:Τ+ Τ = Τ
𝑛 5 7𝑛 10 9𝑛 10
➢ Only a fraction of 𝑛, so by the Master theorem, 𝑇 𝑛
= 𝑂(𝑛)
373F20 – Nisarg Shah 64

Residual Notes
• Best algorithm for a problem?
➢ Typically hard to determine
➢ We still don’t know best algorithms for multiplying two 𝑛-digit integers or two 𝑛 × 𝑛 matrices
o Integer multiplication
• Breakthrough in March 2019: first 𝑂(𝑛 log 𝑛) time algorithm • It is conjectured that this is asymptotically optimal
o Matrix multiplication
• 1969(Strassen):𝑂(𝑛2.807) • 1990:𝑂(𝑛2.376)
• 2013:𝑂(𝑛2.3729)
• 2014:𝑂(𝑛2.3728639)
373F20 – Nisarg Shah 65

Residual Notes
• Best algorithm for a problem?
➢ Usually, we design an algorithm and then analyze its running time
➢ Sometimes we can do the reverse:
o E.g., if you know you want an 𝑂(𝑛2 log 𝑛) algorithm
o Master theorem suggests that you can get it by 𝑇𝑛=4𝑇𝑛ൗ +𝑂𝑛2
o So maybe you want to break your problem into 4 problems of size 𝑛/2 each, and then do 𝑂(𝑛2) computation to combine
2
373F20 – Nisarg Shah 66

Residual Notes
• Access to input
➢ For much of this analysis, we are assuming random access to elements of input
➢ So we’re ignoring underlying data structures (e.g. doubly linked list, binary tree, etc.)
• Machine operations
➢ We’re only counting the number of comparison or arithmetic operations
➢ So we’re ignoring issues like how real numbers are stored in the closest pair problem
➢ When we get to P vs NP, representation will matter
373F20 – Nisarg Shah 67

Residual Notes
• Size of the problem
➢ Can be any reasonable parameter of the problem
➢ E.g., for matrix multiplication, we used 𝑛 as the size ➢ But an input consists of two matrices with 𝑛2 entries
➢ It doesn’t matter whether we call 𝑛 or 𝑛2 the size of the problem ➢ The actual running time of the algorithm won’t change
373F20 – Nisarg Shah 68