Algorithms and Data, Summer 1 Problem Set 2
Due May 25, 9PM
Write up the answers to part I and an explanation of your strategy for part II in a file called PS2.pdf. Zip this with Dijkstra.java and any other supplemental code for part II in a file called 4800_PS2_XX.zip, where XX are your initials, and submit on Blackboard. Be sure to name your partner in your PDF and code comments; write up your answers to part I individually from that partner; and both partners should submit zip files on Blackboard.
Part I – Designing Greedy Algorithms
1) You’re an online retailer, and you can ship any box that
weighs less than W for a flat fee. Given N items headed to
the same location of varying weights wi, you want to send
these items in as few boxes as possible. Furthermore, as a
matter of policy, items are always shipped in the order they
were ordered online1 (assume no ties), so we’re just looking
for some “dividing lines” in this existing ordering that represent closing up a box and starting a new one. Design a greedy algorithm that finds a packing with the fewest boxes in linear time, and prove it is optimal by arguing that the greedy algorithm with a “greedy stays ahead” argument. The weights are all less than W.
2) You are designing an AI for a League of Legends style game in which enemy forces may try to approach your base through one of a few different “lanes” or passageways. Your AI needs to set up watchtowers along a perimeter, such that every place where an enemy could cross the perimeter is within radius r of some watchtower in the same lane. (The radius is not necessarily sufficient for one watchtower to cover an entire lane by itself; a wide lane may need multiple watchtowers.) Design an efficient algorithm that will place watchtowers in a way that covers all passable parts of the perimeter using the fewest possible watchtowers. The algorithm should run in time linear in the length of the perimeter and the number of lanes; but you don’t need to analyze the running time. (Assume the perimeter is a straight line, segments of which are unnecessary to cover.) Assume you can find passable/impassable boundaries in constant time. Explain how you know your algorithm employs the minimum number of watchtowers.
Watchtower radius r Perimeter to
cover (a “lane”)
Watchtower
1 Without this restriction, the problem of using the fewest boxes is called “bin-packing” and is thought to not have a polynomial-time optimal solution.
2 items, 9.5 lbs
Impassable perimeter, no need to cover
3) Suppose the requests for Interval Scheduling are requests to do jobs daily instead of on one particular day, and the requested time intervals can go past midnight (and therefore maybe interfere with jobs that start early the next day). Design an O(N2) algorithm that can take an unsorted list of N requests and find a schedule that does not vary from day to day, has no jobs that overlap (including jobs from the previous day), and satisfies the maximum number of requests. Briefly explain your analysis of the running time and how you know your algorithm returns the optimal number of requests.
4) In the board game Ticket to Ride, players are given a map of a country and a list of D destinations that they’d like to connect on that map with train routes. Let’s assume players want to always make the entire list of destinations connected, and they want to do it using as few train pieces as possible. The map of the country is essentially an undirected weighted graph where each vertex is a city, and each edge weight is the number of train pieces necessary to directly connect two cities. The player therefore wants to essentially create a minimal spanning tree, only the minimal spanning tree does not need to touch everything in the graph — just the cities in the secret destination list.
Describe a reasonable greedy polynomial-time algorithm that produces a tree connecting the target destinations, and give its running time. Then draw or describe a case in which your algorithm is not optimal. (There is no known polynomial time algorithm for this that is optimal; it’s called the Steiner tree problem.) Your algorithm should employ at least one of Dijkstra’s algorithm, Kruskal’s algorithm, or Prim’s algorithm, and should take reasonable steps to be as close to optimal as possible.
most vertices aren’t destinations
yellow’s destinations
Sample end of the game
weight 3 edge (3 trains)
5) Some network designers are considering creating a “backbone” for network traffic that connects some cities. They have created an undirected graph G where the vertices are all the cities to be connected, edges represent potential network links they could build out, and edge weights represent the bandwidths that would be achievable on each link. The bandwidth that two users in different locations experience will be the minimum of the bandwidths of the links along the best path connecting them. (The best path is the one that maximizes this bottleneck speed.)
It turns out that it is not necessary in this scenario to have a backbone that is more dense than a tree in order to give every user the best bandwidth possible.
Describe an algorithm that takes an undirected weighted graph G and returns a spanning tree T such that every two vertices in the graph are connected in T by the maximum-bandwidth path in G. Explain how you know your algorithm is correct, and analyze its running time.
Part II – Dijkstra’s Algorithm with Online Updates
Dijkstra’s algorithm can be used by Internet routers to find the best paths to other routers. But the Internet is unreliable, and links can go down and come back up. Whenever either of these events happens, recomputing Dijkstra’s algorithm on the entire Internet graph may not be necessary.
Write a program, Dijkstra.java, that takes two filenames as command-line arguments. The first file should be a list of edges that make up a graph, one edge per line, in the format
vertexname1,vertexname2=distance
For example, “V1,V5=10” (no spaces). You may assume positive integer distances and alphanumeric vertex names. The vertex V0 represents our own location on the network. No vertex on the graph will have degree greater than 10.
The second file should be a list of updates to the graph where each update is of the form vertexname1,vertexname2=D
or
vertexname1,vertexname2=U
indicating that a link is down or up. These represent messages about the state of the network, updating the information about which edges are actually available. Messages may be redundant, telling you that a link is down when it already is down, but they won’t maliciously tell you about edges that didn’t exist in the original graph.
After reading the graph file, your code should print the minimum distances from node V0 to every other vertex in the graph, in the format vertex1=distance1,vertex2=distance2…vertexn=distancen
all on the same line; for example, “V1=5,V2=7,V3=10”. Vertices should be printed in lexicographic order (sort their strings). If V0 can’t reach a vertex, write “INF” for the distance.
Then, read each line of the update file, and print to System.out the shortest paths that are available after each update, using that same format. So, your program should ultimately output N+1 lines: one for the original graph’s shortest distances, and one for each update of the state of the network.
Use Dijkstra’s algorithm with a heap-based priority queue for original pass through the graph. You may not use the PriorityQueue Java class. You will need to implement a heap with methods for adding new elements and removing the min, fixing the heap after either operation. Don’t worry about adding multiple paths to the same vertex to the priority queue — you can add them all, on the assumption that you’ll pull the smallest one first, then ignore the worse nodes when they are pulled.
When a link goes down, determine in constant time whether this information is irrelevant because no optimal path will change, and do not recompute Dijkstra if this is the case. For some extra credit, you can try to also optimize the cases in which the new link information does change the optimal paths — think about what part of your old Dijkstra solution you might salvage and avoid recomputing. But you can just recompute Dijkstra’s algorithm from scratch when the optimal paths must change, and receive full credit.
Use either an adjacency list or adjacency matrix for the initial graph representation; choose whichever you think will be more asymptotically efficient. You can build other data structures as you like.
Print to System.err the elapsed time from the time you finish building the graph to the time you finish printing the final distances. (Yes, we’re timing IO here as well. That’s fine.)
We are providing a sample graph and update files for you to play with, graph.txt and updates.txt; you can find them as files on Blackboard near where you found this PDF. Sample results on these files are in results.txt.
In your PDF for this assignment, as well as in the comments of Dijkstra.java, explain under what circumstances you do not need to recompute the shortest path tree. (This text may be duplicated across partners, and from PDF to code.) Also explain any optimizations you have made in the hopes of extra credit.
Hints (these are suggestions, not requirements)
A good ordering for a project like this is to build your Graph representation (in Graph.java), then your PriorityQueue representation, then your Dijkstra implementation. You can check in Graph.java that you can successfully read a graph of the required format. Test these before moving on to Dijkstra.java; if you don’t want to write formal tests, you can write mains in these that run programs that will test them.
Recall that (“V0”, “V1”) and (“V1”, ”V0”) are the same edge. Find a way to check whether two edges are equal and stick with it. You could override Object.equals(), or just have a method that always checks both orderings, or you can always store the vertices in a determined order. At some point, you’ll also probably need a strategy for checking whether an edge is stored in a HashMap. Recall that using a new object you’ve defined as a key will not work because different instances of what is fundamentally the same object are treated as different keys by default. Hashing Strings as the keys will work — identical Strings are treated as identical keys
— as long the same pair of vertices always produces the same string. Or, if you have an Edge class, you can override hashCode() to any function that is different for different edges and the same for the same edge (xor of the two hashCodes, v1.hashCode() ^ v2.hashCode(), works since there are no self-loops).
The parsing doesn’t require anything more complicated than String.split(). line.split(“=“) on one of our input lines will return an array with two elements, one containing the string to the left of the = and one containing the stuff to the right.
System.err.println to print the status of your data structures as you run can be very helpful when tracking bugs. Remove these or comment them out before submission.
If you’re computing parents and children in an array for your heap, don’t forget that the rules we gave in lecture are 1-indexed. If you’re using those formulas, you need to both add 1 at the beginning and subtract 1 at the end. (You can also derive simplified formulas for the zero- indexed case.)