Scientific Computation Project 2 (last updated 25/2)¶
Clarifications¶
Part 2.2) (25/2) There was a sign error in the specification of model B. The suggested value for
𝛼
α
should be negative (e.g.
𝛼=−0.01
α
=
−
0.01
). If you have already done a tangible amount of work using positive
𝛼
α
, you should include this in your submission and briefly comment on how/why your results change when shifting to negative
𝛼
α
. Otherwise, please focus on cases with
𝛼<0
α
<
0
Part 1.1 (19/2): A simple example Alist for a 3-node, 2-link network: Alist = [[2,1],[0],[0]]
Part 1.2 (19/2): If no path exists between start and dest, return an empty list.
Part 1.3 (19/2): You may assume that at least 2 in-network stations have a cycle route between them.
Part 2.2i) (19/2): Aside from i at node x, all variables should be set to zero at t=0
You are not required to use latex or the provided latex template. Any pdf with a simple, clear presentation is perfectly fine.
Due date/time: 2/3/2020, 18:00 GMT
This project consists of two parts – in the first, you will be developing/analyzing algorithms on graphs, and in the second you will be analyzing dynamical processes on graphs.
Getting Started: Template files have been added to the course repo in the project2/ directory. You can also download the files directly from here: | part1_template.py | part2_template.py | project2_template.tex
Place these files in a folder, and create copies of the template files named part1_dev.py and part2_dev.py in this new directory. Ultimately, the final files that you submit should be titled part1.py, part2.py, and project2.pdf. Browse through the Python template files; there are a few function headers, and you will have to add code for each of these functions as described below. First, however, you should add your 8-digit college id to the docstring at the top of each file.
Part 1.¶
You are tasked with analyzing public transportation networks. You are allowed to use the collections module for part 1, please do not use any other external modules.
For each of the questions below, you should develop a correct, efficient algorithm, implement the algorithm in Python, and discuss the running time and efficiency of your implementation in your pdf. Your discussion should include an estimate of the asymptotic running time and a concise description of how you constructed this estimate. You should also include a brief description of each algorithm.
1) (2 pts) You are given a set of airports and connections corresponding to a air transportation network. Specifically, you know if there is a direct connection between any two airports in the network. Complete the function, flightLegs, so that it efficiently computes the minimum number of flights required to travel between two distinct airports on the network. The function should also efficiently determine the number of distinct routes between the two airports which require this minimum number of flights. The function is provided an N- element adjacency list, Alist, as input, and airports are numbered from 0 to N-1. The ith element of Alist is a list containing the airports which can be reached directly from airport i. The network is symmetric, so if i can be reached directly from j, then j can be reached directly from i. The starting (start) and destination (dest) airports are also provided. The function output should be a two-element list containing the minimum number of flights required to travel between start and dest and the number of distinct journeys between start and dest with this minimum number.
2) (5 pts) You will now analyze the journeys on a subway network. You are provided with a list of stations and direct routes between stations.
i) You are also provided with densities for the network where
𝑑
𝑖𝑗
d
i
j
corresponds to the average number of people on a direct journey from station i to j (no stations other than i and j are visited during a direct journey). For convenience, we will assume that
𝑑
𝑖𝑗
=
𝑑
𝑗𝑖
d
i
j
=
d
j
i
. A safety factor,
𝑆
𝑎𝑏
S
a
b
, for a journey between stations a and b (not necessarily a direct journey) is defined to be the maximum density encountered during the journey (lower S corresponds to higher safety). Given a starting and destination station, develop an algorithm to efficiently find the safest journey between the two stations. Implement your method in safeJourney in part1.py. Along with the starting and destination stations, the function is provided an adjacency list, Alist, for the network. The ith element of the list is a list of tuples. Each tuple contains a station which can be directly reached from i, and the density for travel between that station and i. The function should return a list containing the sequence of stations encountered during the safest journey and the safety factor for the journey. E.g., if the function returns [[1,5,3],10] then the full journey is 1 -> 5 -> 3 and the safety factor of the journey is 10. If multiple journeys produce the same minimum S, you can return any one of those journeys. If no path exists between the starting and destination stations, return an empty list.
ii) Now consider the fastest journeys between stations start and dest. You are provided with durations of journeys between stations i and j,
𝑡
𝑖𝑗
t
i
j
(rounded to the nearest minute and
𝑡
𝑖𝑗
=
𝑡
𝑗𝑖
t
i
j
=
t
j
i
), and if there are multiple shortest journeys, you should select the one with the least number of steps. Complete shortJourney so that it efficiently finds this fastest journey and returns the stations included in the journey (as a list) and the duration of the journey. If no path exists between the starting and destination stations, return an empty list. The overall setup of the function is very similar to safeJourney, and further details are provided in the function docstring.
3) (3 pts) You are provided a network of train stations and, for a given station outside of this network (x), you know the minimum fare for reaching each in-network station from x, and you also know the minimum fare for returning to x from any in-network station. You are also given a list of 2-way dedicated cycling routes connecting pairs of in-network stations. Complete the function, cheapCycling, so that it efficiently finds the cheapest cyling journey and returns the pair of distinct in-network stations corresponding to this journey. Here, a journey consists of 1) arrival at an in-network station from x, 2) some amount of cost-free cycling using dedicated routes, and 3) return to x from an in-network station which is distinct from the arrival station. The train stations are numbered from 0 to N-1, and the function is provided two N-element lists as input (Slist and Clist). The ith element of Slist contains two numbers – the minimum arrival and return fares for routes between stations i and x. Clist is an adjacency list. Its ith element contains a list of integers which correspond to in-network stations that can be reached directly by bicycle from station i.
Note for Part 1: You are not allowed to use heapq, however, if you determine that the best approach to a problem requires a binary heap, you should choose this approach, implement it without a heap, and then discuss the benefits of the heap (and its influence on the asymptotic running time) in your analysis.
Part 2.¶
1) (5 pts) For this question, you will investigate random walks on unweighted, undirected graphs. One step of a random walk is defined as follows. The probability of a walker moving from node i to node j is
𝑝
𝑖𝑗
=
𝐴
𝑖𝑗
/
𝑞
𝑖
p
i
j
=
A
i
j
/
q
i
where
𝐴
𝑖𝑗
A
i
j
is the adjacency matrix for the graph, and
𝑞
𝑖
q
i
is the degree of node i.
i) Complete rwgraph so that it efficiently simulates M Nt- step random walks on the NetworkX graph provided as input. All walkers should start at the same node with the this node set via input variable, i0. Provide a 1-2 sentence description of your algorithm for 1 simulation step in your pdf.
For the next two questions, you should use Barabasi-Albert graphs with n=2000 and m=4 (see NetworkX documentation for the definitions of m and n). You should also fix seed so that the same graph is generated each simulation.
ii) Analyze your simulation results as Nt and M are increased. The initial positions for all walkers should be the node with maximum degree. If multiple nodes have the same maximum degree, select the node with the smallest node number. Compare large-Nt results to the node degrees for the graph. Make 1-2 figures supporting your main findings, and place them along with accompanying discussion in your pdf. The code used to generate the figures should be placed in rwgraph_analyze1
iii) Recall that linear diffusion on an unweighted graph was defined using the graph Laplacian matrix,
𝐿=𝑄−𝐴
L
=
Q
−
A
where Q is a diagonal matrix with the graph degrees on the diagonal, and A is the adjacency matrix. The scaled Laplacian,
𝐿
𝑠
L
s
, is defined as
𝐿
𝑠
=𝐼−
𝑄
−1
𝐴
L
s
=
I
−
Q
−
1
A
. Consider the linear spreading produced by both this operator and
𝐿
𝑇
𝑠
L
s
T
. Investigate if any one of these three diffusion operators produces dynamics with significant similarities to your simulated random walk results (all generated using the same B-A graph). Identify 2-3 key points and provide explanations of what you have found in your pdf. These may be accompanied by figures as needed. Place any relevant code in rwgraph_analyze2. You are not required to establish quantitative correspondence between simulated diffusion and random walks on a time-step by time-step basis.
2. (5 pts) Consider the following two models for transport processes on graphs:
Model A:
𝑑
𝑖
𝑗
𝑑𝑡
=−𝛽
𝑖
𝑗
+
∑
𝑘=0
𝑁−1
𝛾
𝐴
𝑗𝑘
𝑖
𝑘
(1−
𝑖
𝑗
)
d
i
j
d
t
=
−
β
i
j
+
∑
k
=
0
N
−
1
γ
A
j
k
i
k
(
1
−
i
j
)
Model B:
𝑑
𝑠
𝑗
𝑑𝑡
=
𝑖
𝑗
𝑑
𝑖
𝑗
𝑑𝑡
=
∑
𝑘=0
𝑁−1
𝛼
𝐿
𝑗𝑘
(
𝑠
𝑘
−
𝑠
𝑗
)
d
s
j
d
t
=
i
j
d
i
j
d
t
=
∑
k
=
0
N
−
1
α
L
j
k
(
s
k
−
s
j
)
Here,
𝐿
𝑗𝑘
L
j
k
is the Laplacian matrix and
𝐴
𝑗𝑘
A
j
k
is the adjacency matrix for an N-node graph.
i) Complete modelA so that it computes simulations of modelA on the NetworkX graph provided as input. Complete the function, RHS, so that it computes and returns the right-hand side of the model equations (see the function docstring). You should assume that RHS will be called a very large number of times within one call to model A and construct your code accordingly. Construct an estimate of the number of operations required to compute
𝑑
𝑖
𝑗
/𝑑𝑡,𝑗=0,…,𝑁−1
d
i
j
/
d
t
,
j
=
0
,
.
.
.
,
N
−
1
in one call to RHS. Add this estimate to your pdf along with a concise explanation of how it was constructed. Add the needed code below RHS to simulate the model for Nt time steps from t=0 to t=tf with the initial condition
𝑖=
𝑖
0
i
=
i
0
at node x with x and
𝑖
0
i
0
provided as input, and finally return
𝑖
𝑗
(𝑡)
i
j
(
t
)
.
ii) Investigate the similarities and differences between the dynamics generated by model A, model B, and linear diffusion on Barabasi-Albert graphs. You can initially consider
𝛼=−0.01
α
=
−
0.01
(previously was
𝛼=0.01
α
=
0.01
),
𝛽=0.5
β
=
0.5
, and
𝛾=0.1
γ
=
0.1
, though you are free to vary these parameters as you like. You should initially (at t=0) set all variables to zero except for i at the node with maximum degree. You should design your own approach to the problem, and identify 3-4 key points, carefully discuss them (and provide explanations for highlighted trends), and produce a few figures illustrating numerical results related to the key points. Among the points you could consider is the initial spread of a perturbation placed at some node x. Add code for generating your figures to the function, transport and add the figures and accompanying discussion to your pdf. It is sufficient to consider B-A graphs with n=100 and m=5. Add code in the name==main portion of the code to call transport and generate the figures you are submitting with your assignment
Notes for Part 2:
• You may use numpy, scipy, matplotlib, and networkX for part2. Do not use any other modules without the instructor’s permission.
• You should design your codes to efficiently work with large complex networks. You may assume that
𝑁≪𝐿≪
𝑁
2
N
≪
L
≪
N
2
where N is the number of nodes and L is the number of links. You may also assume
that all graphs in this part consist of one connected part and that there are no self-loops.
• You do not need to account for the randomness of Barabasi-Albert graphs by, say, running simulations with several graphs and averaging the results.
Further Notes:
1. Marking will consider both the correctness and efficiency of your code as well as the soundness of your analysis. For part 2, consideration of efficiency will focus on computationally intensive tasks. Code for generating figures or processing simulation results are unlikely to be closely examined.
2. All figures created by your code should be well-made and properly labeled.
3. In order to assign partial credit, markers must be able to understand how your code is organized and what you are trying to do.
4. Lab 6 contains an exercise on simulation of 1-D random walks.
5. You should aim to limit your discussion for part 1 to 2 pages. For part 2, you should aim to fit your discussion within 5 pages including up to 6 figures. These constraints are provided for guidance, and marks will not be deducted for exceeding these limits.
6. Add additional functions as needed, but these functions should not use any external modules outside of those explicitly allowed for each part.
7. You may use codes provided in lectures and labs, but you should not assume that these codes are “efficient”.
8. Please be sensible with the amount of time you devote to the open-ended portions of the assignment. If it seems that your approach for, say, question 2.2 will require an extraordinary amount of time, it may be helpful to consult with the instructor.
Submitting the assignment¶
To submit the assignment for assessment, go to the course blackboard page, and find the link for “Project 2” Click on this link and then click on “Write Submission” and add the statement: “This is all my own unaided work unless stated otherwise.”
Click on “Browse My Computer” and upload your final files, part1.py, part2.py, and project2.pdf. Finally, click “Submit”.
M345SC2020
Navigation
Quick search