Introduction to Algorithms May 7, 2004
Massachusetts Institute of Technology 6.046J/18.410J
Professors Erik Demaine and Shafi Goldwasser Handout 25
Problem Set 7 Solutions
This problem set is due in recitation on Friday, May 7.
Reading: Chapter 22, 24, 25.1-25.2, Chapters 34, 35
There are five problems. Each problem is to be done on a separate sheet (or sheets) of paper.
Mark the top of each sheet with your name, the course number, the problem number, your recitation
section, the date, and the names of any students with whom you collaborated. As on previous
assignments, “give an algorithm” entails providing a description, proof, and runtime analysis.
Problem 7-1. Arbitrage
Arbitrage is the use of discrepancies in currency exchange rates to transform one unit of a currency
into more than one unit of the same currency. Suppose,
�
U.S. dollar bought ������� Euro, � Euro
bought
� � �
��� Japanese Yen, � Japanese Yen bought � � Turkish Lira, and one Turkish Lira bought�����
�
�
� U.S. dollars.
Then, by converting currencies, a trader can start with
�
U.S. dollar and buy ������� � ��������� � ��������
�
�
��� � ����� U.S. dollars, thus turning a � � profit. Suppose that we are given � currencies����������� ����� ����� and an � �!� table ” of exchange rates, such that one unit of currency �$# buys “&%(‘ �*)
+
units of currency �-, .
(a) Give an efficient algorithm to determine whether or not there exists a sequence of
currencies . ��#0/�����#213� ���$� ����#5436 such that:
“&%(‘ ��� ‘ �7+98 “&%(‘ ��� ‘;: +�8�8�8 “&%<';=3> ��� ‘;= +98 “&%(‘?= � ‘ �*+A@ � �
Solution: We can use the Bellman-Ford algorithm on a suitable weighted, directed
graph B C D;E ��FHG , which we form as follows. There is one vertex in E for each
currency, and for each pair of currencies �3# and �-, , there are directed edges DJI #K� I ,3G and
DJI ,L� I #JG . (Thus, M ENM�CO� and M F M CQP � �7R .)
To determine edge weights, we start by observing that
“&%(‘ ��� ‘ �7+S8 “&%(‘ ��� ‘;: +
8�8�8 “&%<';=3> ��� ‘;= +S8 “&%(‘?= � ‘ �*+A@ �
if and only if
�
“&%(‘ �T� ‘ �-+ 8
�
“&%(‘ ��� ‘;: + 8�8�8
�
“&%<';=3> ��� ‘;= + 8
�
“&%(‘?= � ‘ �*+VU
� �
2 Handout 25: Problem Set 7 Solutions
Taking logs of both sides of the inequality above, we express this condition as
��� �
“&%<' �T� ' � +��
��� �
"&%(' ��� ';: +��
��� �
"&%<';=3> ��� ‘;= +�� 8�8�8 �
��� �
“&%<';= � ' � + U � �
Therefore, if we define the weight of edge DJI #*� I ,TG as
� D I #K� I ,3G C ���
�
"&%<' �*)
+
C ��� "&%<' �K)�+ �
then we want to find whether there exists a negative-weight cycle in B with these edge
weights.
We can determine whether there exists a negative-weight cycle in B by adding an
extra vertex I�
with � -weight edges DJI�
� I # G for all I #�
E , running BELLMAN-FORD
from I�
, and using the boolean result of BELLMAN-FORD (which is TRUE if there are
no negative-weight cycles and FALSE if there is a negative-weight cycle) to guide our
answer. That is, we invert the boolean result of BELLMAN-FORD .
This method works because adding the new vertex I�
with � -weight edges from I�
to all other vertices cannot introduce any new cycles, yet it ensures that all negative-
weight cycles are reachable from I�
.
It takes �&D � � G time to create B , which has ��DJ� � G edges. Then it takes � DJ� : G time to
run BELLMAN-FORD . Thus, the total time is � D � : G .
Another way to determine whether a negative-weight cycle exists is to create B and,
without adding I�
and its incident edges, run either of the all-pairs shortest-paths al-
gorithms. If the resulting shortest-path distance matrix has any negative values on the
diagonal, then there is a negative-weight cycle.
(b) Give an efficient algorithm to print out such a sequence if one exists. Analyze the
running time of your algorithm.
Solution: Assuming that we ran BELLMAN-FORD to solve part (a), we only need
to find the vertices of a negative-weight cycle. We can do so as follows. First, relax
all the edges once more. Since there is a negative-weight cycle, the � value of some
vertex � will change. We just need to repeatedly follow the � values until we get
back to � . In other words, we can use the recursive method given by the PRINT-PATH
procedure of Section 22.2 in CLRS, but stop it when it returns to vertex � .
The running time is ��D � : G to run BELLMAN-FORD , plus ��D � G to print the vertices of
the cycle, for a total of ��D � : G time.
Problem 7-2. Bicycle Tour Planning
Handout 25: Problem Set 7 Solutions 3
You are in charge of planning cycling vacations for a travel agent. You have a map of � cities
connected by direct bike routes. A bike route connecting cities I and � has distance � D I � � G .
Additionally, it costs � D I G to stay in city I for a single night.
A customer will provide you with the following data:
- A starting city � .
- A destination city
�
.
- A trip length � .
- A daily maximum biking distance � D�� G , where �
% � � � + .
Your job is to plan a tour that takes exactly � days, such that the customer does not stay in the
same city two consecutive nights and does not bike more than � D�� G on day � . Your customer can
bike through several cities on a given day, i.e. there doesn’t have to be a direct route between the
cities you assign on day � and day � � � . You also want to minimize the total cost of staying at the
cities on your tour.
Give an � DJ� � D � � � G-G time algorithm to produce an � -city tour D�� C I
� I � ����� � � C I�� G with
minimum cost � � DJI # G , such that � D I # > ��� I #JG
� D ‘ G .
Solution: First, compute the all-pairs shortest path distances ��DJ’ �*) G among every pair of cities ‘
and ) . Store the values in a table.
Then construct a directed acyclic graph � with � D
� � � G nodes. The nodes are labeled as I #�� , for’
�� � � � � �$���-��� and �
�� � � � � �$������� . Each node I #�� corresponds to the option of staying at city ‘
on day � .
For any ‘ �*)
�� � � � � �$���-��� where ‘��C ) and �
�� � � ��������� , add the edge DJI #�� � > ��� � I ,��LG in the graph
� if �
D ‘ �K)
G� � D�� G . This means that the customer can bike from city ‘ to city ) without exceeding
the daily limit. Also, for each node I #�� , assign a node cost �T# to the node.
The lowest cost tour is simply the path from I!
to I#”$� where the total node cost is minimum. IfI�”$� is not reachable from I!
, no tour satisfying the requirements exists.
We can transform the minimum node cost path problem into a shortest path problem easily. Since
the graph is directed, for every edge D � � I G we can assign the cost of node I as its edge weight.
This reduces the problem to a shortest path problem, which can be computed using the shortest
path algorithm on DAGs (see Section 24.2 in CLRS).
Correctness: A path on the graph from I!
to I#”$� corresponds to a bicycle tour for the � days.
Specifically, each edge D I #%� � > ��� � I ,��LG along the path specifies a day trip from city ‘ to city ) on day
� . Note that the edge is present in � if and only if the corresponding day trip does not violate daily
mileage constraint. Also, since there is no edge between D I #�� � > ���*� I #�� G for any ‘ � � in the graph � ,
the customer will not stay at the same city on consecutive nights. The transformation of the min-
imum node-cost path problem into the shortest path problem preserves the cost of corresponding
paths (with an offest of the cost of the source node). Therefore the shortest path algorithm on the
transformed graph will compute the tour with the lowest cost.
4 Handout 25: Problem Set 7 Solutions
Running Time: The all-pairs shortest paths distances can be computed in ��D � : G time using the
Floyd-Warshall algorithm. The graph � contains � D � � G nodes and � DJ� � � G edges, so it takes
� DJ� � � G time to construct the graph. Since � is a DAG, the shortest path on � can be computed
in � DJ� � � G time. Therefore the overall running time is � DJ� : � � � � G C ��D � � D � � � G G .
Problem 7-3. P vs. NP
Suppose that
� �T� � �
���� and that � � U � � � . For each of the following statements, determine
whether it is true, false, or an open problem. Prove your answers.
(a) If
� �
�� , then � �
�� .
Solution: If � C ��� , then this is true. Otherwise, this is false: � � can be NP-
complete. Therefore this is an open question.
(b) If
� �
�� , then � �
�� .
Solution: True. Proof presented in lecture.
(c)
� � is either NP-complete or is in P.
Solution: Open. There exist problems that are in NP, but are not known NP-complete.
If � C ��� this is true. If � �C ��� then it is false.
(d) If
� � U � � � , then both � � and � � are NP-complete.
Solution: If � C ��� , then this is true, since then any problem in � is ��� -complete.
Otherwise, it is false: we can take
� �T� � �
�� , and they are not NP-complete. There-
fore, this is an open problem.
(e) If
� � and � � are both NP-complete, then � � U � � � .
Solution: True by definition of NP-completeness.
(f) Suppose there is a linear time algorithm that recognizes
� � . Then there exists a linear
time algorithm to recognize
� � .
Solution: False. The reduction from
� � to � � , although polynomial, may take more
than linear time.
Subtle Point: This does not necessarily imply that no linear time reduction from
� �
to
� � exists. Essentially, this problem is equivalent to asking whether there exist any
Handout 25: Problem Set 7 Solutions 5
problems with superlinear lower bounds in P. That happens to be true: there do exist
problems in P that cannot be solved in linear time.
This is beyond the context of 6.046. Students will get full credit for the correct answer
with incomplete justification.
Problem 7-4. NP-Completeness
(a) Suppose you are given an algorithm � to solve the CLIQUE decision problem. That
is, � D?B � � G will decide whether graph B has a clique of size � . Give an algorithm to
find the vertices of a � -clique in a graph B using only calls to � , if any such � -clique
exists.
Solution: Run � D?B � � G . If � returns “false”, then exit. Otherwise if � returns “true”,
remove an arbitrary vertex I from the graph. Run � D B � I � � � G . If � now returns
“false”, then I must be in all remaining � -cliques. If � returns “true”, there exists
some � -clique without I . Continue this process until � vertices are found that must
necessarily be in the � -clique.
(b) Prove that the following decision problem is NP-complete.
LARGEST-COMMON-SUBGRAPH: Given two graphs B � and B � and an integer � ,
determine whether there is a graph B with � � edges which is a subgraph of both B �
and B � . (Hint: reduce from CLIQUE.)
Solution:
The problem is in NP: to prove that B � C D?E ����F � G and B � C D;E ����FV��G share a subgraph
of size at least � , one can exhibit a graph � C D?E�� ��F � G with at least � edges, and
1-1 mappings � ��� E ���� E � and � � � E ���� E � . Thus the certificate has sizeM � M � M
� � M � M�� � M . If � is the size of the input M � M�C ��D � G , since � is a subgraph of the
given graphs, and M�� # M C ��D � G since each of them is just a list of vertices. Therefore,
the size of the certificate is polynomial. To verify this certificate, check that � has
at least � edges ( � DJ� G time), check that � � and � � are 1-1 ( � DJ� G time), and check
whether for any edge D � � I G
F � , D�� � D � GT� � � DJI G-G
F � and D�� � D � GT� � � D I G-G
FV�
( � D � � G time). Therefore the certificate takes polynomial time to verify.
Reduction: Given a graph B and an integer � , to determine whether B has a clique of
size � , ask whether B and
�= have a common subgraph with � D�� � G�� � edges, where
�= is the complete graph on � nodes.
(c) Suppose you are given an algorithm � to solve the LARGEST-COMMON-SUBGRAPH
decision problem. Give an algorithm to find a subgraph of size � that appears in both
graphs B � and B � , using only calls to � , if any such subgraph exists.
6 Handout 25: Problem Set 7 Solutions
Solution: Use an approach similar to the reduction from Decisional-Clique to Search-
Clique. Run � D?B �T� B ��� � G . If � says a subgraph exists, try removing an edge from
either B � or B � and re-running � . If � ’s output has changed, that edge must appear
in all remaining subgraphs. Otherwise, it can be excluded.
Problem 7-5. Maximum Coverage Approximation
Suppose you are given a set � of size M�� M�CO� , a collection � of � distinct subsets ��� �T��� ��� ����� ��� � �
where � #�� � , and a number � as input. We would like to pick � subsets from the collection
that cover the maximum number of elements in � . Give a greedy, polynomial-time approx-
imiation algorithm for this maximum coverage problem with ratio bound of �
�� � � ��
� , where
C������ # � M � # M � . Analyze the approximation ratio achieved by your algorithm.
Solution: The following greedy algorithm achieves the �
�� � � ��
� ratio bound.
GREEDY-MAXIMUM-COVERAGE( � � � � � ):
1 ��� �
2 ��� �
3 for ‘�� � to �
4 select a � #�
� that maximizes M � #�� � M
5 ��� � � #
6 ������� ��� # �
7 return �
The algorithm works as follows. At each stage, the algorithm picks a subset �A# and store it in
the collection � . The set � contains, at each stage, the set of uncovered elements. The greedy
approach at line 4 picks the subset that covers as many uncovered elements as possible.
The algorithm can easily be implemented in time polynomial in � , � and � . The number of
iterations on lines 3-6 is � . Each iteration can be implemented in � D � � � G time. Therefore there is
an implementation that runs in time � DJ� � � � G .
The algorithm has a �� � � � ��
� ratio bound. Let � be the solution given by the algorithm, and � �
be the optimal solution. If � covers the entire set � , then � is the optimal solution and we are done.
Therefore, we consider the case where � does not cover the entire � . At the first iteration, the
algorithm will pick the largest subset with size
. Therefore the number of elements covered by
� will be at least
. Also, at each stage, the algorithm will pick at subset that covers at least one
element that has not been previously covered. Therefore, the number of elements covered by �
will also be at least � . To sum up, the solution � will cover at least ����� � � ��
� elements.
Now consider the optimal solution � � . The optimcal solution contains � subsets, and each subset
contains at most
elements. Therefore the maximum number of elements covered by � � is �
. It
follows that the ratio bound of the greedy algorithm is:
Handout 25: Problem Set 7 Solutions 7
�
����� � � �
� C �� � � � �
� �