程序代写代做代考 C algorithm data mining game AI graph Lecture slides by Kevin Wayne
 Copyright © 2005 Pearson-Addison Wesley


Lecture slides by Kevin Wayne
 Copyright © 2005 Pearson-Addison Wesley

http://www.cs.princeton.edu/~wayne/kleinberg-tardos
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination
Last updated on 1/14/20 2:20 PM

Minimum cut application (RAND 1950s)
“Free world” goal. Cut supplies (if Cold War turns into real war).
rail network connecting Soviet Union with Eastern European countries
(map declassified by Pentagon in 1999)
Figure 2
2
From Harris and Ross [1955]: Schematic diagram of the railway network of the Western So-

Maximum flow application (Tolstoǐ 1930s)
Soviet Union goal. Maximize flow of supplies to Eastern Europe.
flow capacity
rail network connecting Soviet Union with Eastern European countries
(map declassified by Pentagon in 1999)
Figure 2
3
From Harris and Ross [1955]: Schematic diagram of the railway network of the Western So-

Max-flow and min-cut applications
Max-flow and min-cut problems are widely applicable model.
・Data mining.
・Open-pit mining.
・Bipartite matching.
・Network reliability.
・Baseball elimination.
・Image segmentation.
・Network connectivity.
・Markov random fields.
・Distributed computing.
・Security of statistical data.
・Egalitarian stable matching.
・Network intrusion detection. ・Multi-camera scene reconstruction. ・Sensor placement for homeland security. ・Many, many, more.
liver and hepatic vascularization segmentation
4

SECTION 7.5
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Matching
Def. Given an undirected graph G = (V, E), subset of edges M ⊆ E is a matching if each node appears in at most one edge in M.
Max matching. Given a graph G, find a max-cardinality matching.
6

Bipartite matching
Def. A graph G is bipartite if the nodes can be partitioned into two subsets L and R such that every edge connects a node in L with a node in R.
Bipartite matching. Given a bipartite graph G = (L ∪ R, E), find a max- cardinality matching.
1 1′
2 2′
3 3′
4 4′
L5 5’R matching: 1-1′, 2-2′, 3-4′, 4-5′
7

Bipartite matching: max-flow formulation
Formulation.
・Create digraph Gʹ = (L ∪ R∪ {s, t}, Eʹ ).
・Direct all edges from L to R, and assign infinite (or unit) capacity. ・Add unit-capacity edges from s to each node in L.
・Add unit-capacity edges from each node in R to t.
G′
1 ∞ 1′ 2 2′ 3 3′ 4 4′
5 5′
1
1
s
t
LR
8

Max-flow formulation: proof of correctness
Theorem. 1–1 correspondence between matchings of cardinality k in G and integral flows of value k in Gʹ.
P・f. ⇒
・Let M be a matching in G of cardinality k.
・Consider flow f that sends 1 unit on each of the k corresponding paths.
for each edge e: f(e) ∈ { 0, 1 }
f is a flow of value k. ▪
1 1′
2 2′
3 3′
4 4′
1
1 ∞ 1′
1
2 2′
3 3′ t
4 4′
s
G5 5′
5 5’G′
9

Max-flow formulation: proof of correctness
Theorem. 1–1 correspondence between matchings of cardinality k in G and integral flows of value k in Gʹ.
P・f. ⇐
Consider M = set of edges from L to R with f(e) = 1.
for each edge e: f(e) ∈ { 0, 1 }
・Let f be an integral flow in Gʹ of value k.

– ⎢M ⎢ = k : apply flow-value lemma to cut (L ∪ {s}, R ∪ {t}) ▪
each node in L and R participates in at most one edge in M
1
1∞1′ 1 1′ 1
2 2′ 2 2′
s 3 3′ t 3 3′
4 4′ 4 4′
G′5 5′ 5 5’G
10

Max-flow formulation: proof of correctness
Theorem. 1–1 correspondence between matchings of cardinality k in G and integral flows of value k in Gʹ.
Corollary. Can solve bipartite matching problem via max-flow formulation. P・f.
・Integrality theorem ⇒ there exists a max flow f * in G ʹ that is integral. 1–1 correspondence ⇒ f * corresponds to max-cardinality matching. ▪
1
1∞1′ 1 1′ 1
2 2′ 2 2′
s 3 3′ t 3 3′
4 4′ 4 4′
G′5 5′ 5 5’G
11

Network flow II: quiz 1
What is running time of Ford–Fulkerson algorithms to find a max- cardinality matching in a bipartite graph with ⎟ L⎟ = ⎟ R⎟ = n ?

A. O(m + n) B. O(mn) C. O(mn2) D. O(m2n)
12

Perfect matchings in bipartite graphs
Def. Given a graph G = (V, E), a subset of edges M ⊆ E is a perfect matching if each node appears in exactly one edge in M.
Q. When does a bipartite graph have a perfect matching?
Structure of bipartite graphs with perfect matchings.
・Clearly, we must have ⎟ L⎟ = ⎟ R⎟. ・Which other conditions are necessary? ・Which other conditions are sufficient?
13

Perfect matchings in bipartite graphs
Notation. Let S be a subset of nodes, and let N(S) be the set of nodes adjacent to nodes in S.
Observation. If a bipartite graph G = (L ∪ R, E) has a perfect matching, then ⎟N(S)⎟ ≥ ⎟S⎟ for all subsets S ⊆ L.
Pf. Each node in S has to be matched to a different node in N(S). ▪
S = { 2, 4, 5 } N(S) = { 2′, 5′ }
1
2
3
4
5
1′
2′
3′
4′
5′
no perfect matching
14

Hall’s marriage theorem
Theorem. [Frobenius 1917, Hall 1935] Let G = (L ∪ R, E) be a bipartite graph with ⎟L⎟ = ⎟R⎟. Then, graph G has a perfect matching iff ⎟ N(S)⎟ ≥ ⎟S⎟ for all subsets S ⊆ L.
Pf. ⇒ This was the previous observation.
S = { 2, 4, 5 } N(S) = { 2′, 5′ }
1
2
3
4
5
1′
2′
3′
4′
5′
no perfect matching
15

Hall’s marriage theorem
P・f. ⇐ Suppose G does not have a perfect matching.
・Formulate as a max-flow problem and let (A, B) be a min cut in Gʹ. ・By max-flow min-cut theorem, cap(A, B) < ⎟ L⎟. ・DefineLA=L∩A, LB=L∩B, RA=R∩A. ・cap(A,B) = ⎟LB⎟ + ⎟RA⎟ ⇒ ⎟RA⎟ < ⎟LA⎟. ・Min cut can’t use ∞ edges ⇒ N(LA) ⊆ RA. ⎟ A⎟ ⎟A⎟ ⎟A⎟ ・ N(L ) ≤ R < L . ChooseS=LA. ▪ G′ A 2 1 ∞ ∞ 1 3' 4' 5' 1 t 1 1' LA = {2, 4, 5}
 LB = {1, 3}
 RA = {2', 5'}
 N(LA) = {2', 5'} ∞ 2' s 4 5 1 3 16 Bipartite matching Problem. Given a bipartite graph, find a max-cardinality matching. year worst case technique discovered by 1955 O(m n) augmenting path Ford–Fulkerson 1973 O(m n1/2) blocking flow Hopcroft–Karp, Karzanov 2004 O(n2.378) fast matrix multiplication Mucha–Sankowsi 2013 Õ(m10/7) electrical flow Mądry 20xx running time for finding a max-cardinality matching in a bipartite graph with n nodes and m edges 17 Network flow II: quiz 2 Which of the following are properties of the graph G = (V, E)?
 A. G has a perfect matching. B. Hall’s condition is satisfied: ⎟ N(S)⎟ ≥ ⎟ S⎟ for all subsets S ⊆ V. C. Both A and B. D. Neither A nor B. 18 Nonbipartite matching P・roblem. Given an undirected graph, find a max-cardinality matching. ・Structure of nonbipartite graphs is more complicated. But well understood. [Tutte–Berge formula, Edmonds–Gallai] ・Blossom algorithm: O(n4). ・Best known: O(m n1/2). [Edmonds 1965] [Micali–Vazirani 1980, Vazirani 1994] PATHS, TREES, AND FLOWERS JACK EDMONDS 1. Introduction. A graph G for purposes here is a finite set of elements called vertices and a finite set of elements called edges such that each edge meets exactly two vertices, called the end-points of the edge. An edge is said to join its end-points. A matching in G is a subset of its edges such that no two meet the same vertex. We describe an efficient algorithm for finding in a given graph a match- ing of maximum cardinality. This problem was posed and partly solved by C. Berge; see Sections 3.7 and 3.8. Maximum matching is an aspect of a topic, treated in books on graph theory, which has developed during the last 75 years through the work of about a dozen authors. In particular, W. T. Tutte (8) characterized graphs which do not contain a perfect matching, or 1-factor as he calls it—that is a set of edges with exactly one member meeting each vertex. His theorem prompted attempts at finding an efficient construction for perfect matchings. This and our two subsequent papers will be closely related to other work on the topic. Most of the known theorems follow nicely from our treatment, though for the most part they are not treated explicitly. Our treatment is independent and so no background reading is necessary. Section 2 is a philosophical digression on the meaning of "efficient algorithm." COMBINATORICA Akad6miai Kiad6 - Springer-Verlag COMBINATORICA 14 (i) (1994) 71-109 A THEORY OF ALTERNATING PATHS AND BLOSSOMS FOR PROVING CORRECTNESS OF THE O(v/-VE) GENERAL GRAPH MAXIMUM MATCHING ALGORITHM VIJAY V. VAZIRANI1 Received December 30, 1989 Revised June 15, 1993 1. Introduction Finding a maximum matching in a graph is a classical problem in the study of algorithms. In this paper we present new algorithmically relevant combinatorial structure of matchings. This structure yields the first proof of correctness of the general graph matching algorithm of Mieali and Vazirani [14]; this is currently the 19 most efficient known matching algorithm. Berge's theorem [2], which says that matching M in graph G is a maximum This paper is based on investigations begun with G. B. Dantzig while at the RAND Combinatorial Symposium during the summer of 1961. I am Historical significance (Jack Edmonds 1965) indebted to many people, at the Symposium and at the National Bureau of Standards, who have taken an interest in the matching problem. There has been much animated discussion on possible versions of an algorithm. 2. Digression. An explanation is due on the use of the words "efficient algorithm." First, what I present is a conceptual description of an algorithm and not a particular formalized algorithm or "code." For practical purposes computational details are vital. However, my purpose is only to show as attractively as I can that there is an efficient algorithm. According to the dictionary, "efficient" means "adequate in opera- tion or performance." This is roughly the meaning I want—in the sense that it is conceivable for maximum matching to have no efficient algorithm. Perhaps a better word is "good." I am claiming, as a mathematical result, the existence of a good algorithm for finding a maximum cardinality matching in a graph. There is an obvious finite algorithm, but that algorithm increases in difficulty exponentially with the size of the graph. It is by no means obvious whether or not there exists an algorithm whose difficulty increases only algebraically with the size of the graph. The mathematical significance of this paper rests largely on the assumption that the two preceding sentences have mathematical meaning. I am not prepared to set up the machinery necessary to give them formal meaning, nor is the present context appropriate for doing this, but I should like to explain the idea a little further informally. It may be that since one is customarily concerned with existence, convergence, finiteness, and so forth, one is not in- 20 clined to take seriously the question of the existence of a better-than-finite HACKATHON PROBLEM H・ackathon problem. ・Hackathon attended by n Harvard students and n Princeton students. Each Harvard student is friends with exactly k > 0 Princeton students;
each Princeton student is friends with exactly k Harvard students. ・Is it possible to arrange the hackathon so that each Princeton student
pair programs with a different friend from Harvard?
Mathematical reformulation. Does every k-regular bipartite graph have a perfect matching?
Ex. Boolean hypercube.
2-regular bipartite graph
1 1ʹ 2 2ʹ 3 3ʹ 4 4ʹ

SECTION 7.6
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Edge-disjoint paths
Def. Two paths are edge-disjoint if they have no edge in common. Edge-disjoint paths problem. Given a digraph G = (V, E) and two nodes
s and t, find the max number of edge-disjoint s↝t paths. Ex. Communication networks.
25
s36t
digraph G
47
25

Edge-disjoint paths
Def. Two paths are edge-disjoint if they have no edge in common. Edge-disjoint paths problem. Given a digraph G = (V, E) and two nodes
s and t, find the max number of edge-disjoint s↝t paths. Ex. Communication networks.
2 5
s 3 6 t
digraph G
2 edge-disjoint paths
4 7
26

Edge-disjoint paths
Max-flow formulation. Assign unit capacity to every edge.
Theorem. 1–1 correspondence between k edge-disjoint s↝t paths in G and integral flows of value k in Gʹ.
Pf. ⇒
Let P1, …, Pk be k edge-disjoint s↝t paths in G. ・
1 􏰌􏰒􏰓􏰌 e 􏰔􏰕􏰖􏰗􏰎􏰘􏰎􏰔􏰕􏰗􏰌􏰍 􏰎􏰐 􏰍􏰑􏰙􏰌 􏰔􏰕􏰗􏰚 Pj 0 􏰑􏰗􏰚􏰌􏰖􏰛􏰎􏰍􏰌
Set f(e) =
・Since paths are edge-disjoint, f is a flow of value k. ▪
1
1111 1
s1 1t
1
1 1
1
1
1
27

Edge-disjoint paths
Max-flow formulation. Assign unit capacity to every edge.
Theorem. 1–1 correspondence between k edge-disjoint s↝t paths in G and integral flows of value k in Gʹ.
P・f. ⇐
・Let f be an integral flow in Gʹ of value k. Consider edge (s, u) with f(s, u) = 1.
– ・-
by flow conservation, there exists an edge (u, v) with f(u, v) = 1
continue until reach t, always choosing a new edge Produces k (not necessarily simple) edge-disjoint paths. ▪
1
1111 1
s1 1t
can eliminate cycles to get simple paths in O(mn) time if desired (flow decomposition)
1
1 1
1
1
1
28

Edge-disjoint paths
Max-flow formulation. Assign unit capacity to every edge.
Theorem. 1–1 correspondence between k edge-disjoint s↝t paths in G
and integral flows of value k in Gʹ.
Corollary. Can solve edge-disjoint paths problem via max-flow formulation. P・f.
・Integrality theorem ⇒ there exists a max flow f * in G ʹ that is integral. 1–1 correspondence ⇒ f * corresponds to max number of edge-disjoint
s↝t paths in G. ▪
1
1111 1
s1 1t
1
1 1
1
1
1
29

Network connectivity
Def. A set of edges F ⊆ E disconnects t from s if every s↝t path uses at least one edge in F.
Network connectivity. Given a digraph G = (V, E) and two nodes s and t, find min number of edges whose removal disconnects t from s.
25
s36t
47
30

Menger’s theorem
Theorem. [Menger 1927] The max number of edge-disjoint s↝t paths equals the min number of edges whose removal disconnects t from s.
Pf. ≤
・Suppose the removal of F ⊆ E disconnects t from s, and ⎟ F⎟ = k. ・Every s↝t path uses at least one edge in F.
・Hence, the number of edge-disjoint paths is ≤ k. ▪
25 25
s36ts36t
47 47
31

Menger’s theorem
Theorem. [Menger 1927] The max number of edge-disjoint s↝t paths equals the min number of edges whose removal disconnects t from s.
Pf. ≥
・Suppose max number of edge-disjoint s↝t paths is k.
・Then value of max flow = k.
・Max-flow min-cut theorem ⇒ there exists a cut (A, B) of capacity k. ・Let F be set of edges going from A to B.
・⎟ F⎟ = k and disconnects t from s. ▪
25 25
s36ts36t
A
47 47
32

Network flow II: quiz 3
How to find the max number of edge-disjoint paths in an undirected graph?

A. Solve the edge-disjoint paths problem in a digraph

(by replacing each undirected edge with two antiparallel edges).
B. Solve a max flow problem in an undirected graph.
C. Both A and B.
D. Neither A nor B.
33

Edge-disjoint paths in undirected graphs
Def. Two paths are edge-disjoint if they have no edge in common. Edge-disjoint paths problem in undirected graphs. Given a graph G = (V, E)
and two nodes s and t, find the max number of edge-disjoint s–t paths.
25
s36t
digraph G
47
34

Edge-disjoint paths in undirected graphs
Def. Two paths are edge-disjoint if they have no edge in common. Edge-disjoint paths problem in undirected graphs. Given a graph G = (V, E)
and two nodes s and t, find the max number of edge-disjoint s–t paths.
25
s36t
digraph G
(2 edge-disjoint paths)
47
35

Edge-disjoint paths in undirected graphs
Def. Two paths are edge-disjoint if they have no edge in common. Edge-disjoint paths problem in undirected graphs. Given a graph G = (V, E)
and two nodes s and t, find the max number of edge-disjoint s–t paths.
25
s36t
digraph G
(3 edge-disjoint paths)
47
36

Edge-disjoint paths in undirected graphs
Max-flow formulation. Replace each edge with two antiparallel edges and assign unit capacity to every edge.
Observation. Two paths P1 and P2 may be edge-disjoint in the digraph but not edge-disjoint in the undirected graph.
if P1 uses edge (u, v)
and P2 uses its antiparallel edge (v, u)
1
1111 1
s1 1t
1
1 1
1
1
1
37

Edge-disjoint paths in undirected graphs
Max-flow formulation. Replace each edge with two antiparallel edges and assign unit capacity to every edge.
Lemma. In any flow network, there exists a maximum flow f in which
for each pair of antiparallel edges e and eʹ : either f (e) = 0 or f (eʹ) = 0 or both. Moreover, integrality theorem still holds.
P・f. [ by induction on number of such pairs ]
・Suppose f (e) > 0 and f (eʹ) > 0 for a pair of antiparallel edges e and eʹ. ・Set f (e) = f (e) – δ and f (eʹ) = f (eʹ) – δ, where δ = min { f (e), f (eʹ) }.
f is still a flow of the same value but has one fewer such pair. ▪
1
1111 1
s1 1t
1
1 1
1
1
1
38

Edge-disjoint paths in undirected graphs
Max-flow formulation. Replace each edge with two antiparallel edges and assign unit capacity to every edge.
Lemma. In any flow network, there exists a maximum flow f in which
for each pair of antiparallel edges e and eʹ : either f (e) = 0 or f (eʹ) = 0 or both. Moreover, integrality theorem still holds.
Theorem. Max number of edge-disjoint s↝t paths = value of max flow. Pf. Similar to proof in digraphs; use lemma.
1
1111 1
s1 1t
1
1 1
1
1
1
39

More Menger theorems
Theorem. Given an undirected graph and two nodes s and t,
the max number of edge-disjoint s–t paths equals the min number of edges whose removal disconnects s and t.
Theorem. Given an undirected graph and two nonadjacent nodes s and t, the max number of internally node-disjoint s–t paths equals the min number of internal nodes whose removal disconnects s and t.
Theorem. Given a directed graph with two nonadjacent nodes s and t,
the max number of internally node-disjoint s↝t paths equals the min number of internal nodes whose removal disconnects t from s.
40

SECTION 7.7
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Network flow II: quiz 4
Which extensions to max flow can be easily modeled?

A. Multiple sources and multiple sinks.
B. Undirected graphs.
C. Lower bounds on edge flows.
D. All of the above.
42

Multiple sources and sinks
Def. Given a digraph G = (V, E) with edge capacities c(e) ≥ 0 and multiple source nodes and multiple sink nodes, find max flow that can be sent from the source nodes to the sink nodes.
flow network G
s1 9 6 t1 37
2
8
s2 1 5
10
4
t2
s3
43

Multiple sources and sinks: max-flow formulation
・Add a new source node s and sink node t.
・For each original source node si add edge (s, si) with capacity ∞.
For each original sink node tj, add edge (tj, t) with capacity ∞. Claim. 1–1 correspondence between flows in G and Gʹ.
flow network G′
s1 9 6 t1 37
∞2 8
s ∞ s2 1 5
∞ 10 s3
t
4∞ t2

44

Circulation with supplies and demands
Def. Given a digraph G = (V, E) with edge capacities c(e) ≥ 0 and n・ode demands d(v), a circulation is a function f (e) that satisfies: ・For each e ∈ E: 0 ≤ f (e) ≤ c(e) (capacity)
For each v ∈ V: f(e) f(e) = d(v) (flow conservation) e 􏰎􏰐 􏰗􏰑 v e 􏰑􏰜􏰗 􏰑􏰏 v
(supply node)
flow network G
−8
6 / 6
−6
2 / 4
4 / 10
6 / 7
1/ 7
flow capacity
7/ 9
4 / 4
−7 3 / 3
11
10 0
(demand node) (transshipment node)
45

Circulation with supplies and demands: max-flow formulation
・Add new source s and sink t.
・For each v with d(v) < 0, add edge (s, v) with capacity −d(v). For each v with d(v) > 0, add edge (v, t) with capacity d(v).

D = d(v) = d(v) Claim. G has circulation iff Gʹ has max flow of value D =
s
786
−8
supply
v: d(v)>0 v: d(v)<0 saturates all edges leaving s and entering t flow network G′ −6 77 106 4 9 −7 34 11 10 0 10 11 t demand 46 Circulation with supplies and demands Integrality theorem. If all capacities and demands are integers, and there exists a circulation, then there exists one that is integer-valued. Pf. Follows from max-flow formulation + integrality theorem for max flow. Theorem. Given (V, E, c, d), there does not exist a circulation iff there exists a node partition (A, B) such that Σv ∈ B d(v) > cap(A, B).
Pf sketch. Look at min cut in Gʹ.
demand by nodes in B exceeds supply of nodes in B plus
max capacity of edges going from A to B
47

Circulation with supplies, demands, and lower bounds
Def. Given a digraph G = (V, E) with edge capacities c(e) ≥ 0, lower bounds l(e) ≥ 0, and node demands d(v), a circulation f (e) is a function that satisfies:
・For each e ∈ E : l(e) ≤ f (e) ≤ c(e) (capacity)
・For each v ∈ V : f(e) f(e) = d(v) (flow conservation)
e 􏰎􏰐 􏰗􏰑 v e 􏰑􏰜􏰗 􏰑􏰏 v
Circulation problem with lower bounds. Given (V, E, l, c, d), does there exist a feasible circulation?
48

Circulation with supplies, demands, and lower bounds
M・ax-flow formulation. Model lower bounds as circulation with demands. ・Send l(e) units of flow along edge e.
Update demands of both endpoints.
lower bound upper bound
v [2, 9] w d(v) d(w)
flow network G
capacity
v 7 w
d(v) + 2 d(w) – 2
flow network G′
Theorem. There exists a circulation in G iff there exists a circulation in Gʹ. Moreover, if all demands, capacities, and lower bounds in G are integers, then there exists a circulation in G that is integer-valued.
Pf sketch. f (e) is a circulation in G iff f ʹ(e) = f (e) – l(e) is a circulation in Gʹ.
49

SECTION 7.8
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Survey design
・Design survey asking n consumers about n products. ・ 1 2
・Can survey consumer i about product j only if they own it. ・Ask consumer i between ci and ciʹ questions.
Ask between pj and pjʹ consumers about product j. Goal. Design a survey that meets these specs, if possible.
one survey question per product
Bipartite perfect matching. Special case when ci = ciʹ = pj = pjʹ = 1.
51

Survey design
Max-flow formulation. Model as a circulation problem with lower bounds.
・Add edge (i, j) if consumer j owns product i. ・Add edge from s to consumer j.
・Add edge from product i to t.
・Add edge from t to s.
・All demands = 0.
・Integer circulation ⟺ feasible survey design.
all supplies and demands are 0
[c1, c1ʹ]
[0, ∞]
1 [0, 1] 1ʹ 2 2ʹ
[p1, p1ʹ]
s 3 3ʹ t
4 4ʹ
consumers
products
52

SECTION 7.9
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Airline scheduling
Airline scheduling.
・Complex computational problem faced by airline carriers.
・Must produce schedules that are efficient in terms of equipment usage,
crew allocation, and customer satisfaction. ・One of largest consumers of high-powered
algorithmic techniques.
“Toy problem.”
even in presence of unpredictable events, such as weather and breakdowns
・Manage flight crews by reusing them over multiple flights.
・Input: set of k flights for a given day.
・Flight i leaves origin oi at time si and arrives at destination di at time fi. ・Minimize number of flight crews.
54

Airline scheduling
C・irculation formulation. [to see if c crews suffice]
・For each flight i, include two nodes u and v . ui = start of flight i i i v = end of flight i
・Add source s with demand −c, and edges (s, ui) with capacity 1. ・Add sink t with demand c, and edges (vi, t) with capacity 1. ・For each i, add edge (ui, vi) with lower bound and capacity 1.
if flight j reachable from i, add edge (vi, uj) with capacity 1. crew can begin day
crew can end day with any flight
i
with any flight
−c
u1 v1 [0, 1]
[0, 1]
c
use c crews
s u3v3t u2 [1, 1] v2
[0, 1]
flight 2 is performed
u4 v4 same crew can do flights 2 and 4
55

Airline scheduling: running time
Theorem. The airline scheduling problem can be solved in O(k3 log k) time. Pf.
・k = number of flights.
・c = number of crews (unknown). ・O(k) nodes, O(k2) edges.
・At most k crews needed.
・ ⇒ solve log2 k circulation problems.
binary search for min value c*
Value of any flow is between 0 and k.
・ ⇒ at most k augmentations per circulation problem.
Overall time = O(k3 log k).
Remark. Can solve in O(k3) time by formulating as minimum-flow problem.
56

Airline scheduling: postmortem
Remark. We solved a toy version of a real problem.
Real-world problem models countless other factors:
・Union regulations: e.g., flight crews can fly only a certain number of hours in a given time window.
・Need optimal schedule over planning horizon, not just one day. ・Deadheading has a cost.
・Flights don’t always leave or arrive on schedule. ・Simultaneously optimize both flight schedule and fare structure.
Message.
・Our solution is a generally useful technique for efficient reuse of limited resources but trivializes real airline scheduling problem.
・Flow techniques useful for solving airline scheduling problems ・(and are widely used in practice).
Running an airline efficiently is a very difficult problem.
57

SECTION 7.10
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Image segmentation
Image segmentation.
・Divide image into coherent regions. ・Central problem in image processing.
Ex. Separate human and robot from background scene.
59

Image segmentation
Foreground / background segmentation.
・Label each pixel in picture as belonging to foreground or background.
・V = set of pixels, E = pairs of neighboring pixels. ・ai ≥ 0 is likelihood pixel i in foreground.
・bi ≥ 0 is likelihood pixel i in background.
・pij ≥ 0 is separation penalty for labeling one of i
and j as foreground, and the other as background.
Goals.
・Accuracy: if ai > bi in isolation, prefer to label i in foreground. ・Smoothness: if many neighbors of i are labeled foreground, ・we should be inclined to label i as foreground.
ai + bj pij
iA j B (i,j )E |A{i,j }|=1
Find partition (A, B) that maximizes:
foreground background
60

Image segmentation
Formulate as min-cut problem.
・Maximization. ・No source or sink. ・Undirected graph.
Turn into minimization problem.
ai + bj pij iA j B (i,j )E
・Maximizing
・is equivalent to minimizing
a constant
|A{i,j }|=1 aj + bi + pij
|A{i,j }|=1

a a i i + + b b j j a a i i b b j j + +
iV j V iA j B (i,j )E
p p i i j j
・or alternatively
j B iA (i,j )E
|A{i,j }|=1
61

Image segmentation
Formulate as min-cut problem Gʹ = (Vʹ, Eʹ).
・Include node for each pixel. pij
edge in G
two antiparallel edges in G′
pij pij
Use two antiparallel edges instead of ・undirected edge.
・Add source s to correspond to foreground.
Add sink t to correspond to background.
s
G′
aj
i pij j
bi
t
62

Image segmentation
C・onsider min cut (A, B) in Gʹ. A = foreground.
cap(A,B) = aj + bi + pij j B iA (i,j )E
iA, jB ・Precisely the quantity we want to minimize.
if i and j on different sides, pij counted exactly once
s
G′
aj
i pij j
A bi
t
63

Grabcut image segmentation
Grabcut. [ Rother–Kolmogorov–Blake 2004 ]
“GrabCut” — Interactive Foreground Extraction using Iterated Graph Cuts
Carsten Rother∗ Vladimir Kolmogorov† Andrew Blake‡ Microsoft Research Cambridge, UK
Figure 1: Three examples of GrabCut . The user drags a rectangle loosely around an object. The object is then extracted automatically.
Abstract
The problem of efficient, interactive foreground/background seg- mentation in still images is of great practical importance in im- age editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) infor- mation, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a ro-
free of colour bleeding from the source background. In general, degrees of interactive effort range from editing individual pixels, at the labour-intensive extreme, to merely touching foreground and/or background in a few locations.
1.1 Previous approaches to interactive matting
In the following we describe briefly and compare several state of the art interactive tools for segmentation: Magic Wand, Intelligent Scissors, Graph Cut and Level Sets and for matting: Bayes Matting and Knockout. Fig. 2 shows their results on a matting task, together with degree of user interaction required to achieve those results.
64
Magic Wand starts with a user-specified point or region to com-

SECTION 7.11
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Project selection (maximum weight closure problem)
P・rojects with prerequisites.
・Set of possible projects P : project v has associated revenue pv. ・Set of prerequisites E : (v, w) ∈ E means w is a prerequisite for v.
A subset of projects A ⊆ P is feasible if the prerequisite of every project in A also belongs to A.
Project selection problem. Given a set of projects P and prerequisites E, choose a feasible subset of projects to maximize revenue.
MANAGEMENT SCIENCE Vol. 22, No. 11, July, 1976
can be positive or negative
Printed in U.SA.
MAXIMAL CLOSURE OF A GRAPH AND APPLICATIONS TO COMBINATORIAL PROBLEMS*t
JEAN-CLAUDE PICARD Ecole Polytechnique,Montreal
This paper generalizes the selection problem discussed by J. M. Rhys [12], J. D. Murchland [9], M. L. Balinski [1] and P. Hansen [4]. Given a directed graph G, a closure of G is defined as a subset of nodes such that if a node belongs to the closure all its successors also belong to the set. If a real number is associated to each node of G a maximal closure is defined as a closure of maximal value.
66
1. Introduction

Project selection: prerequisite graph
Prerequisite graph. Add edge (v, w) if w is a prerequisite for v.
w
w
vx
{ v, w, x } is feasible
vx
{ v, x } is infeasible
67

Project selection: min-cut formulation
M・in-cut formulation.
・Assign a capacity of ∞ to each prerequisite edge. ・Add edge (s, v) with capacity pv if pv > 0.
・Add edge (v, t) with capacity −pv if pv < 0. For notational convenience, define ps = pt = 0. u∞ pu∞∞ −pw s py y ∞ z −pz t pv ∞ −px w v ∞ ∞ x 68 Project selection: min-cut formulation C・laim. (A, B) is min cut iff A − { s } is an optimal set of projects. ・Infinite capacity edges ensure A − { s } is feasible. Max revenue because: cap(A,B)= p+(p) vv vB: pv>0 vA: pv<0 = pv pv a constant w v: pv>0 vA u
minimizing this is equivalent to maximizing revenue
pu
py y ∞ z −pz t



−pw
s
Apv∞ −px
v
∞ ∞
x
69

Open-pit mining
Open-pit mining. [studied since early 1960s]
・Blocks of earth are extracted from surface to retrieve ore. ・Each block v has net value pv = value of ore – processing cost. ・Can’t remove block v until both blocks w and x are removed.
w
70
x
v

SECTION 7.12
7. NETWORK FLOW II
‣ bipartite matching
‣ disjoint paths
‣ extensions to max flow ‣ survey design
‣ airline scheduling
‣ image segmentation
‣ project selection
‣ baseball elimination

Baseball elimination
72

Baseball elimination problem
Q. Which teams have a chance of finishing the season with the most wins?
Montreal is mathematically eliminated.
・Montreal finishes with ≤ 80 wins. ・Atlanta already has 83 wins.
Remark. This is the only reason sports writers appear to be aware of — conditions are sufficient but not necessary!
i
team
wins
losses
to play
ATL
PHI
NYM
MON
0
Atlanta
83
71
8

1
6
1
1
Philly
80
79
3
1

0
2
2
New York
78
78
6
6
0

0
3
Montreal
77
82
3
1
2
0

73

Baseball elimination problem
Q. Which teams have a chance of finishing the season with the most wins?
Philadelphia is mathematically eliminated.
・Philadelphia finishes with ≤ 83 wins.
・Either New York or Atlanta will finish with ≥ 84 wins.
Observation. Answer depends not only on how many games already won and left to play, but on whom they’re against.
i
team
wins
losses
to play
ATL
PHI
NYM
MON
0
Atlanta
83
71
8

1
6
1
1
Philly
80
79
3
1

0
2
2
New York
78
78
6
6
0

0
3
Montreal
77
82
3
1
2
0

74

Baseball elimination problem
C・urrent standings.
・Set of teams S.
・Distinguished team z ∈ S.
・Team x has won wx games already.
Teams x and y play each other rxy additional times.
Baseball elimination problem. Given the current standings, is there any outcome of the remaining games in which team z finishes with the most (or tied for the most) wins?
SIAM REVIEW Vol. 8, No. 3, July, 1966
POSSIBLE WINNERS IN PARTIALLY COMPLETED TOURNAMENTS* BENJAMIN L. SCHWARTZ
1. Introduction. In this paper, we shall investigate certain questions in tourna- ment scheduling. For definiteness, we shall use the terminology of baseball. We shall be concerned with the categorization of teams into three classes during the closing days of the season. A team may be definitely eliminated from pen- nant possibility; it may be in contention, or it may have clinched the champion- ship. It will be our convention that a team that can possibly tie for the pennant is considered still in contention. In this paper necessary and sufficient conditions are developed to classify any team properly into the appropriate category.
2. First example. We consider first an extremely simple example, but one that will prove instructive. Suppose one team, say the New York Mets, finds itself at one point in the season 37 games behind another, perhaps the Dodgers,
75
ght; see http://www.siam.org/journals/ojsa.php

Baseball elimination problem: max-flow formulation
Can team 4 finish with most wins?
・Assume team 4 wins all remaining games ⇒ w4 + r4 wins. Divvy remaining games so that all teams have ≤ w + r wins.
44
0–1
games left between 1 and 2
team 2 can still win 0–2 0 this many more games
0–3 1

1–2 ∞ 2 w4+r4–w2 t
s g12
1–3
2–3
3
team nodes
(each team other than 4)
game nodes
(each pair of teams other than 4)
76

Baseball elimination problem: max-flow formulation
Theorem. Team 4 not eliminated iff max flow saturates all edges leaving s. P・f.
Integrality theorem ⇒ each remaining game between x and y added to ・number of wins for team x or team y.
Capacity on (x, t) edges ensure no team wins too many games. ▪ 0–1
games left between 1 and 2
0–2 0
0–3 1
team 2 can still win this many more games
s g12

1–2 ∞ 2 w4+r4–w2 t
1–3
2–3
3
team nodes
(each team other than 4)
game nodes
(each pair of teams other than 4)
77

Baseball elimination: explanation for sports writers
Q. Which teams have a chance of finishing the season with the most wins?
AL East (August 30, 1996)
Detroit is mathematically eliminated.
・Detroit finishes with ≤ 76 wins.
・Wins for R = { NYY, BAL, BOS, TOR } = 278.
・Remaining games among { NYY, BAL, BOS, TOR } = 3 + 8 + 7 + 2 + 7 = 27. ・Average team in R wins 305/4 = 76.25 games.
i
team
wins
losses
to play
NYY
BAL
BOS
TOR
DET
0
New York
75
59
28

3
8
7
3
1
Baltimore
71
63
28
3

2
7
4
2
Boston
69
66
27
8
2

0
0
3
Toronto
63
72
27
7
7
0

0
4
Detroit
49
86
27
3
4
0
0

78

Baseball elimination: explanation for sports writers
Certificate of elimination.

!”#
# remaining games
T⊆S, w(T):= ∑wi , g(T):= i∈T
∑gxy , {x,y} ⊆ T
# wins
!$”$$#
Theorem. [Hoffman–Rivlin 1967] Team z is eliminated iff there exists a
subset T* such that

This exceeds the maximum number that team z can win. ▪ €
wz+gz < ・Suppose there exists T* ⊆ S such that wz +gz w(T*)+g(T*) |T*| P・f. ⇐ ・Then, the teams in T* win at least (w(T*) + g(T*)) / | T* | games on average. w(T*)+g(T*) < |T*| . 79 Baseball elimination: explanation for sports writers P・f. ⇒ ・Use max-flow formulation, and consider min cut (A, B). ・Let T* = team nodes on source side A of min cut. Observe that game node x–y ∈ A iff both x ∈ T* and y ∈ T*. - - infinite capacity edges ensure if x–y ∈ A, then both x ∈ A and y ∈ A if x ∈ A and y ∈ A but x–y ∉ A, then adding x–y to A decreases the capacity of the cut by gxy y ∞ s gxy x–y ∞ x wz+rz –wx t 80 Baseball elimination: explanation for sports writers Pf. ⇒ ・Use max-flow formulation, and consider min cut (A, B). ・Let T* = team nodes on source side A of min cut. ・Observe that game node x–y ∈ A iff both x ∈ T* and y ∈ T*. ・Since team z is eliminated, by max-flow min-cut theorem, ・Rearranging terms: € € g(S−{z}) g(S−{z}) > cap(A, B) > cap(A, B)
= =
= =
+ +
∑(w +g −w ) ∑(wz +gz −wx)
capacity of game edges leaving s
capacity of team edges entering t
capacity of game edges leaving s
capacity of team edges leaving s
wz + gz < w(T *) + g(T *) ▪ € |T*| !###"###$ !###"###$ g(S−{z})−g(T*) g(S−{z})−g(T*) g(S−{z})−g(T*) g(S−{z})−g(T*) !##"###$ !##"###$ capacity of game edges leaving s capacity of team edges leaving s ∑ x∈T* z z x x∈T* − w(T*) + |T*|(w +g ) − w(T*) + |T*|(wz +gz) z z 81