
Analysis of Algorithms, I
CSOR W4231

Computer Science Department


Columbia University
The dynamic programming principle; segmented least squares

1 Segmented least squares
An exponential recursive algorithm
2 A Dynamic Programming (DP) solution
A quadratic iterative algorithm
Applying the DP principle

1 Segmented least squares
An exponential recursive algorithm
2 A Dynamic Programming (DP) solution
A quadratic iterative algorithm
Applying the DP principle

Linear least squares fitting
A foundational problem in statistics: find a line of best fit through some data points.

Linear least squares fitting
Input: a set P of n data points (x1, y1), (x2, y2), ..., (xn, yn); we assume x1 < x2 < ... < xn.
Output: the line L: y = ax + b minimizing the error err(L, P) over all choices of a and b, where

a = (n · Σ xi·yi − (Σ xi)(Σ yi)) / (n · Σ xi^2 − (Σ xi)^2)   (2)
b = (Σ yi − a · Σ xi) / n   (3)
err(L, P) = Σ_{i=1}^{n} (yi − a·xi − b)^2   (4)

Segmented least squares: a single line is often a poor fit when the data change trend. Instead, partition P into contiguous segments S1, S2, ..., SmA (a segmentation A with mA segments), and fit each segment Sk with its best-fit line Lk. Let C > 0 be a fixed multiplier. The cost of partition A is

Σ_{Sk ∈ A} err(Lk, Sk) + mA · C
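The closed-form fit for a single segment can be sketched in Python. This is an illustration of the standard least-squares formulas the slides cite as equations (2), (3), (4), not code from the course:

```python
# Best-fit line y = a*x + b for one segment, plus its squared error.
# Points are (x, y) pairs; formulas follow the standard closed forms.
def fit_segment(points):
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    # With a single point (or identical x's) the denominator is 0;
    # any line through the point fits, so fall back to a horizontal line.
    a = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / n
    err = sum((y - a * x - b) ** 2 for x, y in points)
    return a, b, err
```

For points lying exactly on y = 2x + 1, the fit recovers a = 2, b = 1 with zero error.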

Segmented least squares
This problem is an instance of change detection in data mining and statistics.
Input: A set P of n data points pi = (xi, yi) as before.
Output: A segmentation A* = {S1, S2, ..., SmA*} of P whose cost

Σ_{Sk ∈ A*} err(Lk, Sk) + mA* · C

is minimum.

A brute force approach
We can find the optimal partition (that is, the one incurring the minimum cost) by exhaustive search.
▶ Enumerate every possible partition (segmentation) and compute its cost.
▶ Output the one that incurs the minimum cost.
△ There are Ω(2^n) partitions.
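To see where the Ω(2^n) count comes from: a contiguous segmentation is determined by choosing, independently for each of the n − 1 gaps between consecutive points, whether to cut there, giving 2^(n−1) partitions. A small Python sketch (illustration only, not part of the algorithm):

```python
from itertools import product

# Enumerate all contiguous segmentations of n points by deciding,
# for each of the n-1 gaps, whether to cut there: 2^(n-1) partitions.
def all_segmentations(n):
    """Yield each segmentation as a list of 1-indexed (start, end) pairs."""
    for cuts in product([False, True], repeat=n - 1):
        segs, start = [], 1
        for gap, cut in enumerate(cuts, 1):
            if cut:
                segs.append((start, gap))
                start = gap + 1
        segs.append((start, n))
        yield segs
```

For n = 10 points this already yields 512 partitions; the count doubles with every additional point.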

A crucial observation regarding the last data point
Consider the last point pn in the data set.
▶ pn belongs to a single segment in the optimal partition.
▶ That segment starts at an earlier point pi, for some 1 ≤ i ≤ n.
This suggests a recursive solution: if we knew where the last segment starts, then we could remove it and recursively solve the problem on the remaining points {p1, ..., pi−1}.

A recursive approach
▶ Let OPT(j) = minimum cost of a partition of the points p1, ..., pj.
▶ Then, if the last segment of the optimal partition is {pi, ..., pn}, the cost of the optimal solution is

OPT(n) = err(L, {pi, ..., pn}) + C + OPT(i − 1).

▶ But we don’t know where the last segment starts! How do we find the point pi?
▶ Set

OPT(n) = min_{1 ≤ i ≤ n} { err(L, {pi, ..., pn}) + C + OPT(i − 1) }.

A recurrence for the optimal solution
Notation: let e_{i,j} = err(L, {pi, ..., pj}), for 1 ≤ i ≤ j ≤ n.

OPT(n) = min_{1 ≤ i ≤ n} { e_{i,n} + C + OPT(i − 1) }.

If we apply the above expression recursively to remove the last segment, we obtain the recurrence

OPT(j) = min_{1 ≤ i ≤ j} { e_{i,j} + C + OPT(i − 1) }    (5)

1. We can precompute and store all e_{i,j} using equations (2), (3), (4) in O(n^3) time. Can be improved to O(n^2).
2. The natural recursive algorithm arising from recurrence (5) is not efficient (think about its recursion tree!).

Exponential-time recursion
Notation: T(n) = time to compute OPT(n), that is, the cost of the optimal partition for n points.

T(n) ≥ T(n − 1) + T(n − 2).

▶ Can show that T(n) ≥ F_n, the n-th Fibonacci number (by strong induction on n).
▶ From optional problem 6a in Homework 1, F_n = Ω(2^{n/2}).
▶ Hence T(n) = Ω(2^{n/2}).
⇒ The recursive algorithm requires Ω(2^{n/2}) time.
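The blow-up is easy to observe directly. Below is a sketch (hypothetical Python, with a dummy error table `e` and penalty `C` as placeholders) of the naive recursion from recurrence (5), instrumented with a call counter:

```python
# Naive recursion for OPT(j): every invocation of opt_naive(j) recurses on
# opt_naive(0), ..., opt_naive(j-1), so the number of calls doubles with
# each additional point (exactly 2^j calls for j points).
def opt_naive(j, e, C, calls):
    calls[0] += 1
    if j == 0:
        return 0.0
    return min(e[i][j] + C + opt_naive(i - 1, e, C, calls)
               for i in range(1, j + 1))
```

With a zero error table and n = 10, the counter reaches 2^10 = 1024 calls, matching the T(n) ≥ T(n − 1) + T(n − 2) lower bound.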

1 Segmented least squares
An exponential recursive algorithm
2 A Dynamic Programming (DP) solution
A quadratic iterative algorithm
Applying the DP principle

Are we really that far from an efficient solution?
Recall the Fibonacci problem from HW1: an exponential recursive algorithm, but a polynomial iterative solution.
1. Overlapping subproblems: spectacular redundancy in the computations of the recursion tree.
2. Easy-to-compute recurrence for combining the smaller subproblems: F_n = F_{n−1} + F_{n−2}.
3. Iterative, bottom-up computations: we computed and stored the subproblems from smallest (F_0, F_1) to largest (F_n), iteratively.
4. Small number of subproblems: we only solved n − 1 subproblems.
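As a concrete reminder, the bottom-up Fibonacci computation can be sketched as follows (a minimal Python illustration, not the homework's code):

```python
# Bottom-up Fibonacci: compute subproblems from smallest (F_0, F_1)
# to largest (F_n), so each value is computed exactly once -- O(n) time
# instead of the exponential naive recursion.
def fib(n):
    if n < 2:
        return n
    prev, curr = 0, 1  # F_0, F_1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
```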

Elements of DP in segmented least squares
Our problem exhibits similar properties.
1. Overlapping subproblems
2. Easy-to-compute recurrence for combining optimal solutions to smaller subproblems into the optimal solution of a larger subproblem (once smaller subproblems have been solved)
3. Iterative, bottom-up computations: compute the subproblems from smallest (0 points) to largest (n points), iteratively.
4. Small number of subproblems: we only need to solve n subproblems.

A dynamic programming approach
OPT(j) = min_{1 ≤ i ≤ j} { e_{i,j} + C + OPT(i − 1) }

▶ The optimal solution to the subproblem on p1, ..., pj contains optimal solutions to smaller subproblems.
▶ Recurrence (5) provides an ordering of the subproblems from smaller to larger, with the subproblem of size 0 being the smallest and the subproblem of size n the largest.
⇒ There are n + 1 subproblems in total. Solving the j-th subproblem requires Θ(j) = O(n) time.
⇒ The overall running time is O(n^2).
▶ Boundary condition: OPT(0) = 0.
▶ Segment {pk, ..., pj} appears in the optimal solution only if the minimum in the expression above is achieved for i = k.

An iterative algorithm for segmented least squares
Let M be an array of n + 1 entries (indexed 0 through n) such that

M[i] = cost of an optimal partition of the first i data points

SegmentedLS(n, P)
  M[0] = 0
  for all pairs i ≤ j do
    Compute e_{i,j} for segment pi, ..., pj using (2), (3), (4)
  for j = 1 to n do
    M[j] = min_{1 ≤ i ≤ j} { e_{i,j} + C + M[i − 1] }
  Return M[n]

Running time: the time required to fill in the dynamic programming array M is O(n^3) + O(n^2), dominated by precomputing the e_{i,j}. Can be brought down to O(n^2).
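A self-contained Python sketch of SegmentedLS (an illustration, not the course's code). Prefix sums give each segment error e_{i,j} in O(1) after O(n) preprocessing, which is the O(n^2) improvement the slide mentions:

```python
# DP for segmented least squares. Returns M with M[j] = cost of an
# optimal partition of the first j points; points must be sorted by x.
def segmented_ls(points, C):
    n = len(points)
    sx = [0.0] * (n + 1)
    sy = [0.0] * (n + 1)
    sxx = [0.0] * (n + 1)
    sxy = [0.0] * (n + 1)
    syy = [0.0] * (n + 1)
    for k, (x, y) in enumerate(points, 1):
        sx[k] = sx[k - 1] + x
        sy[k] = sy[k - 1] + y
        sxx[k] = sxx[k - 1] + x * x
        sxy[k] = sxy[k - 1] + x * y
        syy[k] = syy[k - 1] + y * y

    def err(i, j):
        # Squared error of the best-fit line through p_i, ..., p_j (1-indexed).
        m = j - i + 1
        Sx, Sy = sx[j] - sx[i - 1], sy[j] - sy[i - 1]
        Sxx, Sxy = sxx[j] - sxx[i - 1], sxy[j] - sxy[i - 1]
        Syy = syy[j] - syy[i - 1]
        denom = m * Sxx - Sx * Sx
        a = (m * Sxy - Sx * Sy) / denom if denom != 0 else 0.0
        b = (Sy - a * Sx) / m
        # Expansion of sum (y - a*x - b)^2 in terms of the prefix sums.
        return (Syy + a * a * Sxx + m * b * b
                - 2 * a * Sxy - 2 * b * Sy + 2 * a * b * Sx)

    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(err(i, j) + C + M[i - 1] for i in range(1, j + 1))
    return M
```

With four collinear points and C = 1, the optimal partition is a single zero-error segment, so M[4] = 1.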

Reconstructing an optimal segmentation
We can reconstruct the optimal partition recursively, using array M and error matrix e.
OPTSegmentation(j)
  if (j == 0) then return
  else
    Find 1 ≤ i ≤ j such that M[j] = e_{i,j} + C + M[i − 1]
    OPTSegmentation(i − 1)
    Output segment {pi, ..., pj}

▶ Initial call: OPTSegmentation(n)
▶ Running time?
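A self-contained Python sketch of the reconstruction (illustration only). For simplicity it recomputes segment errors directly rather than caching them, and it backtracks iteratively instead of recursively:

```python
def segment_error(points, i, j):
    """Squared error of the best-fit line through points i..j (1-indexed)."""
    seg = points[i - 1:j]
    m = len(seg)
    sx = sum(x for x, _ in seg)
    sy = sum(y for _, y in seg)
    sxx = sum(x * x for x, _ in seg)
    sxy = sum(x * y for x, y in seg)
    denom = m * sxx - sx * sx
    a = (m * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / m
    return sum((y - a * x - b) ** 2 for x, y in seg)

def opt_segmentation(points, C):
    n = len(points)
    # Fill the DP array M, as in SegmentedLS.
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(segment_error(points, i, j) + C + M[i - 1]
                   for i in range(1, j + 1))
    # Backtrack: for each j, find an i achieving the minimum, emit the
    # segment {p_i, ..., p_j}, and continue from i - 1.
    segments, j = [], n
    while j > 0:
        i = next(i for i in range(1, j + 1)
                 if abs(M[j] - (segment_error(points, i, j) + C + M[i - 1])) < 1e-9)
        segments.append((i, j))
        j = i - 1
    return list(reversed(segments))
```

On data that follows two different lines, the reconstruction recovers the two segments.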

Obtaining efficient algorithms using DP
1. Optimal substructure: the optimal solution to the problem contains optimal solutions to the subproblems.
2. A recurrence for the overall optimal solution in terms of optimal solutions to appropriate subproblems. The recurrence should provide a natural ordering of the subproblems from smaller to larger and require polynomial work for combining solutions to the subproblems.
3. Iterative, bottom-up computation of subproblems, from smaller to larger.
4. Small number of subproblems (polynomial in n).

Dynamic programming vs Divide & Conquer
▶ They both combine solutions to subproblems to generate the overall solution.
▶ However, divide and conquer starts with a large problem and divides it into smaller pieces, working top-down.
▶ Dynamic programming works from the bottom up: it solves the smallest subproblems first and builds optimal solutions to steadily larger problems.
