Week 12 Studio Sheet (Solutions)
Useful advice: The following solutions pertain to the theoretical problems given in the tutorial classes. You are strongly advised to attempt the problems thoroughly before looking at these solutions. Simply reading the solu- tions without thinking about the problems will rob you of the practice required to be able to solve complicated problems on your own. You will perform poorly on the exam if you simply attempt to memorise solutions to the tutorial problems. Thinking about a problem, even if you do not solve it will greatly increase your understanding of the underlying concepts. Solutions are typically not provided for Python implementation questions. In some cases, psuedocode may be provided where it illustrates a particular useful concept.
Assessed Preparation
Problem 1. Consider the following flow network. Edge labels of the form f /c denote the current flow f and
the total capacity c .
s 1/1 0/1 t
(a) Drawthecorrespondingresidualnetwork
(b) Identifyanaugmentingpathintheresidualnetworkandstateitscapacity
(c) Augmenttheflowofthenetworkalongtheaugmentingpath,showingtheresultingflownetwork
(a) Residualcapacityisshowninblue,reversibleflowisshowninred.
(b) Oneaugmentingpathiss →a →b →d →t withcapacity1.
(c) Notethatthereareotherpossibleaugmentingpathsandthereforeothersolutions.
Problem2. ShowthestepstakenbyKahn’salgorithmforcomputingatopologicalorderofthefollowinggraph. bcd
Kahn’s algorithm initially adds vertices a and h into the queue since they have an indegree of zero. We add a to the topological order, which enqueues b and e since a was their only incoming edge. We next addhandenqueuei.Wethenaddb,e,theni.Thisenqueuesf andj.Weaddf whichenqueuesc.We add j which enqueues k . We add c which enqueues d . We add k and then d , which finally enqueues g , which we add. The final topological order is a , h , b , e , i , f , j , c , k , d , g . Note that other orders are possible. For example, we could have started with h instead of a .
Studio Problems
Problem3. Deviseanalgorithmfordeterminingwhetheragivendirectedacyclicgraphhasauniquetopological
ordering. That is, determine whether there is more than one valid topological ordering.
The easiest way to approach this is to modify Kahn’s algorithm. Recall that the queue used by Kahn’s con- tains vertices that have no dependencies left and hence could be added to the topological order. There- fore if the queue ever contains more than one element, the topological order is not unique. Otherwise, the topological order is unique.
Problem 4. Devise an algorithm for counting the number of paths between two given vertices s and t in a directed acyclic graph. Don’t worry about the magnitude of the answer (i.e. assume that all arithmetic operations
take constant time, despite the fact that the answer might be exponential in the input size.)
We will approach this using a dynamic programming algorithm. Suppose we are at some vertex u . Then the number of different ways in which we could get to t involves trying all of our different edges (u , v ) and then counting the number of resulting paths from all of the v . So let us define the following subproblems.
DP[u] = {The number of paths from u to t }. The recurrence is then given by
1 if u = t , DP[v ] otherwise.
DP[u]= v∈adj[u]
The answer is the value of DP[s ]. The subproblems are dependent in a reverse topological order, since in order to compute DP[u] we must know the value of all of u’s descendants.
1: 2: 3: 4: 5: 6: 7: 8: 9:
functionCOUNT_PATHS(G=(V,E),s,t)
Set count[1..n ] = 0
count[t] = 1
Set order = reverse(topological_sort(G)) for each vertex u in order do
for each edge (u,v) adjacent to u do count[u] = count[u] + count[v]
end for end for
return count[s] endfunction
Problem 5. Complete the Ford-Fulkerson method for the network in Problem 1, showing the final flow network with a maximum flow.
Problem 6. Using your solution to Problem 5, list the vertices in the two components of a minimum s − t cut in the network in Problem 1. Identify the edges that cross the cut and verify that their capacity adds up to the value of the maximum flow.
To find the minimum cut after running the Ford-Fulkerson algorithm, we simply identify all vertices reachable from the source in the residual graph. These vertices will be on one side of the cut, and all remaining vertices will be on the other. In this graph, there are no outgoing edges in the residual graph, since the two edges outgoing from s are full. So the source is on one side of the cut, and everything else is on the other.
Component 1: {s }
Component2: {a,b,c,d,t}
The flow across this cut is indeed 7, as expected.
Problem7. LetG beaflownetworkandlet f beavalidflowonG. Provethatthenetoutflowoutofs isequal to the net inflow into t .
Suppose the set of all vertices is V . We know that the flow of every cut is equal to the flow of the network. Therefore the capacity of the cut ({s }, {V − {s }} (which is the flow out of s ) must equal the capacity of the cut ({V − {t }}, {t } (which is the capacity into t ).
Problem 8. Consider a variant of the maximum network flow problem in which we allow for multiple source vertices and multiple sink vertices. We retain all of the capacity and flow conservation constraints of the original maximum flow problem. As in the original problem, all of the sources and sinks are excluded from the flow conservation constraint. Describe a simple method for solving this problem.
We create a new graph by adding a “super-source” and “super-sink” vertex. The super source is connected to each source by an edge with capacity equal to the total capacities of all the outgoing edges from that source. Similarly, each sink is connected by an edge to the super sink, and the capacity of that edge is equal to the total capacities of all incoming edges to that sink. We then solve the flow problem on this new graph as normal.
Problem 9. A Hamiltonian path in a graph G = (V , E ) is a path in G that visits every vertex v ∈ V exactly once. On general graphs, computing Hamiltonian paths is NP-Hard. Describe an algorithm that finds a Hamiltonian path in a directed acyclic graph in O (V + E ) time or reports that one does not exist.
A Hamiltonian path must begin at some vertex and then travel through every other vertex without ever revisiting a previous one. The first thing to note is that the starting vertex must therefore be a vertex with no incoming edges, since such a vertex can never be visited in any other way by the path. This means that a Hamiltonian path will not exist if there are multiple such vertices. This should remind us of something similar, topological orderings!
Let’s argue that a Hamiltonian path must be a topological ordering of a graph. Consider the sequence of vertices in a Hamiltonian path. If a vertex has an edge that travels to a previous vertex (meaning that the two are out of topological order), then this would produce a cycle in the graph, which is impossible since the graph is acyclic by assumption. Therefore a Hamiltonian path, if one exists, is a topological order of the graph.
Suppose that a Hamiltonian path exists. Then since every adjacent pair of vertices must be connected, the topological order of the graph must be unique, since every vertex has a dependency on the one before it. Therefore if the topological order is not unique, a Hamiltonian path will not exist. Finally, suppose that
there is a unique topological order, then this implies that there are no pairs of vertices in the order that are not adjacent, since otherwise they could be swapped without breaking the dependency constraints. Since every pair of vertices in the order must be adjacent, this constitutes a Hamiltonian path.
We can conclude that a Hamiltonian path exists if and only if the graph has a unique topological order, and if so, the Hamiltonian path is the topological order.
Problem 10. Consider a directed acyclic graph representing the hierarchical structure of n employees at a company. Each employee may have one or many employees as their superior. The company has decided to give raises to m of the top employees. Unfortunately, you are not sure exactly how the company decides who is considered the top employees, but you do know for sure that a person will not receive a raise unless all of their superiors do.
Describe an algorithm that given the company DAG and the value of m , determines which employees are guaran- teed to receive a raise, and which are guaranteed to not receive a raise. Your algorithm should run in O (V 2 + V E ) time.
We can rephrase this problem in terms of topological orderings. An employee will receive a raise if they are among the top m employees, which means that they must be in the first m elements of a topological order. However, we can not simply compute a topological order and check the first m elements since the order is not guaranteed to be unique. More concretely, an employee is guaranteed a raise if they are in the first m elements of every possible topological order. Similarly, an employee is guaranteed to not get a raise if they are never in the first m elements in any topological order.
In order to determine this, consider a particular vertex v in a directed acyclic graph. The vertices that must come after it in a topological ordering are all of the vertices that it can reach, since they are all of its dependants. Conversely, all of the vertices that must come before v are the ones that can reach v . All other vertices could go before or after v .
Therefore we want to count for each employee, the number of employees that are reachable in the com- pany DAG, and conversely, the number of employees that can reach them in the company DAG. We can count the number of vertices reachable from a particular employee in O (V + E ) time with a simple depth- first search. Similarly, to count the number of employees that reach them, we can perform a depth-first search in the company DAG with all of the edges reversed. Let the number of reachable employees be r , and the number of employees that reach them be s . An employee is guaranteed to be in the first m elements of any topological order if and only if
Similarly, an employee is guaranteed to not be in the first m elements if and only if
We can therefore run this procedure for every employee and output the results. We perform two depth- first searches per employee, taking O (V +E ) time each, and hence the total time complexity is O (V 2 +V E ) as required.
Problem11. Givenalistofn integersr1,r2,…,rn andm integersc1,c2,…,cm,wewishtodeterminewhetherthere existsann×mmatrixconsistingofzerosandoneswhoserowsumsarer1,r2,…,rn respectivelyandwhosecolumn sums are c1,c2,…,cm respectively. Describe an algorithm for solving this problem by using a flow network.
Constuct a flow network as follows: Create two sets of vertices, A and B. Each vertex in A (a1,a2,…an) correspondstooneofthen rows,andeachvertexinB (b1,b2,…bm)correspondstooneofthem columns. Add a source and a sink. For 1 ≤ i ≤ n, add an edge from the source to ai with capacity ri. For each 1 ≤ j ≤ n, add an edge from vertex bj to the sink with capacity cj . Add edges ei,j between every pair of vertices (ai ,bj ) with capacity 1. Determine the maximum flow in this network. If all the edges from the sourcearefull,thenwecanconstructsuchamatrix,andmoreoever,theelementsAi,j ofthematrixare equal to the flows along ei , j . If any edge from the source is not full, such a matrix does not exist.
To see why this method works, consider the flows into and out of the ai vertices. We notice that a row hasri 1siffthevertexai hasatotaloutflowofri (becauseeachedgeoutofai correspondstooneofthe valuesincolumni,andalltheedgesfromai havecapacity1),butthisisonlypossibleiftheflowfrom thesourcetoai isexactlyri becausetheonlyedgethatflowsintoai isfromthesource.Theseedgeshave capacity ri , so they all need to be full.
Problem 12. Consider the problem of allocating a set of jobs to one of two supercomputers. Each job must be allocated to exactly one of the two computers. The two computers are configured slightly differently, so for each job, you know how much it will cost on each of the computers. There is an additional constraint. Some of the jobs are related, and it would be preferable, but not required, to run them on the same computer. For each pair of related jobs, you know how much more it will cost if they are run on separate computers. Give an efficient algorithm for determining the optimal way to allocate the jobs to the computers, where your goal is to minimise the total cost.
Create a graph with (bidirectional) edges between related jobs, whose capacities are the cost to separate them. Add edges from s to all jobs with capacity cost to run on computer 1, and edges from all jobs to t with capacity cost to run on computer 2. The minimum cut will be the best total cost since you are paying to either put the job on computer 1 or 2, and paying for related jobs that are cut from each other when assigned to different computers.
Supplementary Problems
Problem 13. Consider a variant of the maximum network flow problem in which vertices also have capacities. That is for each vertex except s and t , there is a maximum amount of flow that can enter and leave it. Describe a simple transformation that can be made to such a flow network so that this problem can be solved using an ordinary maximum flow algorithm1.
For each non-source non-sink vertex v, do the following: Create two new vertices vin and vout. Take all the inflowing edges to v and replace v with vin. Take all the outflowing edges from v and replace v with vout . Create an edge from vin to vout with capacity equal to the capacity of v . Delete v from the network.
Now we can solve for the maximum flow in this graph, and the solution will be the maximum flow in the original graph as well.
Problem14. Twopathsinagraphareedgedisjointiftheyhavenoedgesincommon.Givenadirectednetwork, we would like to determine the maximum number of edge-disjoint paths from vertex s to vertex t .
1 Do not try to modify the Ford-Fulkerson algorithm. In general, it is always safer when solving a problem to reduce the problem to another known problem by transforming the input, rather than modifying the algorithm for the related problem.
(a) Describe how to determine the maximum number of edge-disjoint s − t paths.
(b) Whatisthetimecomplexityofthisapproach?
(c) Howcouldwemodifythisapproachtofindvertex-disjointpaths,i.e.pathswithnoverticesincommon
(a) Given a directed graph G, create a flow network G’ from G as follows: Let s be the source, and t be the sink. Give every edge a capacity of 1. Now we can run the maximum flow algorithm, and the resulting flow is the number of edge-disjoint paths.
This works because the number of edge-disjoint paths that pass through a vertex v is at most min(indegree(v ), outdegree(v )). The maximum flow through a vertex v is the same value, because all edges are capacity 1.
(b) The time complexity of Ford-Fulkerson is bounded by O (E f ) where E is the number of edges and f isthemaximumflow.Here,themaximumflowistheoutdegreeofs,sothecomplexityisO(E× outdegree(s )) which is bounded by O (E V ).
(c) Createtheflownetworkasspecifiedinparta),butaddcapacitiesforeachvertex,andthenperform the transform described in the solution to problem 13. This ensures that the flow through each ver- tex is at most 1, wheras before the flow through a vertex was equal to the number of paths through it.
Problem15. Youarearadiostationhostandareinchargeofschedulingthesongsforthecomingweek.Every song can be classified as being from a particular era, and a particular genre. Your boss has given you a strict set of requirements that the songs must conform to. Among the songs chosen, for each of the n eras 1 ≤ i ≤ n , you must play exactly ni songs of that era, and for each of the m genres 1 ≤ j ≤ m, you must play exactly mj songs fromthatgenre.Ofcourse,ni =mj.Deviseanalgorithmthatgivenalistofallofthesongsyoucouldchoose from, their era and their genre, determines a subset of them that satisfies the requirements, or determines that it is not possible.
Makeabipartitegraphwithverticesu1…ui correspondingtoeras,andv1…vj correspondingtogenres. Each song is represented by an edge from its genre to its era (of capacity 1 if you can only play each song once. How would you allow for each song to be played some other number of times at most?). Add a sourcewithoutgoingedgestoeachgenrevertexuk withcapacitynk (where1≤k≤n).Addasinkvertex, with an edge from each vertex vl to the sink with capacity ml (where 1 ≤ l ≤ n). Run Ford-Fulkerson on this graph, and if a flow of capacity ni = mj is achieved, then it is possible to play songs in the required combination. Otherwise it is not. The flow along song edges tells you how many times to play each song.
Problem 16. Recall the concept of a directed acyclic graph (DAG). A path cover of a DAG is a set of paths that include every vertex exactly once (multiple paths can not touch the same vertex). A minimum path cover is a path cover consisting of as few paths as possible. Devise an efficient algorithm for finding a minimum path cover of a DAG. [Hint: Use bipartite matching]
Create a bipartite graph where each side is a copy of V . For each edge in the graph, put that edge (u , v ), add an edge in the bipartite graph (u on the left, v on the right).
Solve for the maximum cardinality matching (using flow). I claim that any unmatched vertex on the left is the beginning of a path, hence the answer is |V | – #successful matches. To see why this is true, note
that the matching is essentially choosing for each vertex, who do I go to next in my path? Each vertex can be in one path only, hence it can be matched to one vertex, and vice versa, at most one vertex can match to it. The unmatched vertices on the left must therefore be the beginnings of paths, and the unmatched vertices on the right are the ends of paths. In fact, if you imagine traversing the matching in the bipartite graph, the paths you traverse will be precisely the path cover.
Fun fact: extend this to non-acyclic graphs and run this algorithm, and you are determining a cycle cover, a set of disjoint cycles that cover the graph, by the same logic.
Problem 17. (Advanced) A useful application of maximum flow to people interested in sports is the baseball elimination problem. Consider a season of baseball in which some games have already been played, and the schedule for all of the remaining games is known. We wish to determine whether a particular team can possibly end up with the most wins. For example, consider the following stats.
Team1 Team2 Team3 Team4
Wins Games Left 30 5
It is trivial to determine that Team 4 has no chance of winning, since even if they win all 9 of their remaining games, they will be at least one game behind Team 1. However, things get more interesting if Team 4 can win enough games to reach the current top score.
Team1 Team2 Team3 Team4
Wins Games Left 30 5
In this case, Team 4 can reach 31 wins, but it doesn’t matter since the other teams have enough games left that one of them must reach 32 wins. We can determine the answer with certainty if we know not just the number of games remaining, but the exact teams that will play in each of them. A complete schedule consists of the number of wins of each team, the number of games remaining, and for each remaining game, which team they will be playing against. An example schedule might look like the following.
Games Remaining
Wins Total vsT1 Team129 5 0 Team2 28 10 2 Team328 8 1 Team425 9 2
vsT2 vsT3 vsT4 2 1 2
Describe an algor