SOFT3410 Assignment 2
Part 1 – Task description
Due: 11:59pm Wednesday, 25 November 2020
This assignment is worth 20% of your final assessment
The objective is to write a program that will model the static heat distribution of a room with a fireplace using a stencil pattern. Although a room is 3 dimensional, we will be simulating the room with 2 dimensions. The room is 10 feet wide and 10 feet long with a fireplace along one wall as depicted in Figure 1. The fireplace is 4 feet wide and is centered along one wall (it takes up 40% of the wall, with 30% of the walls on either side). The fireplace emits 100◦C of heat (although in reality a fire is much hotter). The walls are considered to be 20◦C. The boundary values (the fireplace and the walls) are considered to be fixed temperatures. The room temperature is initialized to 20◦C.
We can find the temperature distribution by dividing the area into a fine mesh of points hi,j . The tem- perature at an inside point can be taken to be the average of the temperatures of the four neighboring points, as illustrated in Figure 2.
For this calculation, it is convenient to describe the edges by points adjacent to the interior points. The interior points of hi,j are where 0 < i, 0 < j < n [(n - 1) x (n - 1) interior points]. The edge points are when i = 0, i = n, j = 0, or j = n, and have fixed values corresponding to the fixed temperatures of the edges. Hence,thefullrangeofhi,j is0<=i<=n,0<=j<=n,andthereare(n+1)x(n+1)points. We can compute the temperature of each point by iterating the equation:
hi,j = (hi−1,j + hi+1,j + hi,j−1 + hi,j+1)/4, (0 < i < n, 0 < j < n)
1
for a fixed number of iterations or until the difference between iterations of a point is less than some very small, prescribed amount. Suppose the temperature of each point is held in an array h[i][j] and the boundary points h[0][x], h[x][0], h[n][x], and h[x][n] (0 <= x <= n) have been initialized to the edge temperatures. The calculation as sequential code could be:
for (iteration = 0; iteration < T; iteration++) {
for (i = 1; i < n; i++) {
for (j = 1; j < n; j++) {
g[i][j] = 0.25 * (h[i - 1][j] + h[i + 1][j]
+ h[i][j - 1] + h[i][j + 1]);
} }
/* update points */
for (i = 1; i < n; i++) {
for (j = 1; j < n; j++) {
h[i][j] = g[i][j];
}
} }
Using a fixed number of iterations. Notice that a second array g[][] is used to hold the newly computed values of the points from the old values. The array h[][] is updated with the new values held in g[][]. This is known as Jacobi iteration. Multiplying by 0.25 is done for computing the new value of the point rather than dividing by 4 because multiplication is usually more efficient than division.
Also, the above code can be improved to avoid actually copying the array into another array by using a 3-dimensional array of size N x N x 2, say A[i][j][x] and every iteration, alternate between x = 0 to x =1.
Q1 - 4 Points
Re-write the given sequential code above to avoid the unnecessary points updating (h[i][j] = g[i][j]) using a 3-dimensional array introduced above; and complete the C program to model the heat distri- bution of the room illustrated in Figure 1. Have N x N points and T iterations, where N and T are defined constants in the program. Test your C program on your computer with N = 100 and T= 5000 (Note that we also use same N = 100 and T = 5000 in Q2, Q3, and Q4); and in the report, print out the values of every 10 points from the final heat distribution matrix in a row-by-row fashion, i.e., forming the output as a 10*10 matrix.
Q2 - 2 Points
Parallelize the sequential code given above (with unnecessary points updating) and your new code from Q1 using OpenMP. Analyze the max absolute error of heat distribution matrix between your parallelized version of codes and the corresponding sequential version of codes given above.
SOFT3410
Page 2 of 10
Q3 - 2 Point
Profile the execution time for the four programs you got (2D sequential, 2D parallel, 3D sequential, and 3D parallel) using the default scheduling policy on 1, 2, 4, and 8 threads. Plot all your results in a chart.
Q4 - 2 Points
Try default, static, and dynamic scheduling policies on your parallel heat distribution 3D program with 8 threads and different chunk sizes (1, 4, 8). Report these results in a chart and discuss if any of these scheduling policies will cause false sharing and why. Note that it is possible that they don’t present significant performance differences. You have to summarize your findings.
Marking
Before you attempt to write optimised code, you should first complete a working unoptimised version. Once you have the basic algorithm correct you can then begin to implement various optimisations.
10 marks You will need to submit a report that attempts to answer the above questions and submit your code to a USYD Github repository.
For your submission, your code submission must be uploaded onto USYD Github under the repo name soft3410a2p1. We will only consider the last submission before 11:59pm, 25th November, 2020.
You must also submit your report to canvas portal by 11:59pm, 25th November, 2020.
Part 2 - Task description
In this assignment we will implement the basic PageRank algorithm in the C programming language using a variety of parallel programming techniques to ensure peak performance is achieved.
The PageRank algorithm was developed in 1996 by and when they were graduate students at Stanford University. Google and other search engines compare words in search phrases to words in web pages and use ranking algorithms to determine the most relevant results.
PageRank assigns a score to a set of web pages that indicates their importance. The underlying idea behind PageRank is to model a user who is clicking on web pages and following links from one page to another. In this framework, important pages are those which have incoming links from many other pages, or have incoming links from other pages with a high PageRank score, or both.
SOFT3410
Page 3 of 10
The PageRank algorithm
PageRank is an iterative algorithm that is repeated until a stopping criteria is met. The last iteration gives us the result of the search, which is a score per web page. A high score indicates a very relevant web page whereas a low score indicates a not so relevant web page for a search. Sorting the web pages by their scores in descending order gives us the order for the result list of a search query.
For describing the PageRank algorithm we introduce the following symbols:
• S is the set of all web pages that we are computing the PageRank scores for
• N = |S| is the total number of web pages
• P = [P1t, P2t, ..., PNt ] is the vector of PageRank scores
• d is a dampening factor for the probability that the user continues clicking on web pages • ε is the convergence threshold
• IN(p) is the set of all pages in S which link to page p
• OUT(p) is the set of all pages in S which page p links to
For the PageRank vector, we use the notation P(t) to represent the PageRank score for page p at p
iteration t. We initialise the scores for all pages to an initial value of N1 so that the sum equals 1. P(0) = 1
uN
During each iteration of the algorithm, the value of P is updated as follows:
(1)
1−d P(t)
P(t+1) = +d v (2)
SOFT3410
u N |OUT(v)| v∈IN(u)
The algorithm continues to iterate until the convergence threshold is reached; that is, the algorithm terminates when PageRank scores stop varying between iterations. The PageRank scores have converged when the following condition is met:
||P(t+1) − P(t)|| ≤ ε (3)
The vector norm is defined to be the standard Euclidean vector norm; that is, for some vector x = {x1,x2,...,xn}:
||x|| = x2i (4) i
Page 4 of 10
Example
Our example has four web pages: S = {A, B, C, D}. In this example, d = 0.85 and ε = 0.005. The referencing structure of the web pages A, B, C, and D is given in the graph below.
IN(A) =
IN(B) =
IN(C) =
IN(D) =
{B, D} {D} {B, D} ∅
OUT(A) = OUT(B) = OUT(C) = OUT(D) =
∅
{A, C}
∅
{A, B, C}
Each node represents a web page.
Edges in the graph indicate that the source of the edge is linking to the destination of the edge.
Initialise P(0) = N1 . Then perform the first iteration for each page.
(1) 1−0.85 P(0) P(0) 0.15 0.25 0.25
PA= +0.85 B + D = +0.85 + ≈0.214
SOFT3410
4 |{A,C}| |{A,B,C}| 4 2 3
(1) 1−0.85 P(0) 0.15 0.25 PB = +0.85 D = +0.85
≈ 0.108 PC= +0.85B+D =+0.85+≈0.214
4 |{A,B,C}| 4 3
(1) 1−0.85 P(0) P(0) 0.15 0.25 0.25
4 |{A,C}| |{A,B,C}| 4 2 3
P(1) = 1−0.85+0.85(0)= 0.15+0≈ 0.038
D44 The initialisation and first iteration result in the following values for P:
tABCD
0 0.250 0.250 0.250 0.250 1 0.214 0.108 0.214 0.038
Next, we check whether or not the algorithm has converged.
||P(1) − P(0)|| = ||{−0.036, −0.142, −0.036, −0.215}||
√
≈ 0.260
=
0.0677
Page 5 of 10
Since 0.260 ̸≤ 0.005 (ε), we perform another iteration.
(2) 1−0.85 P(1) P(1) PA= +0.85 B + D
4
0.15 0.108 0.038 = +0.85 +
≈0.094
≈ 0.048
SOFT3410
(2)
423 0.15 0.038
= +0.85
P(1) P(1) 0.15 0.108 0.038
PB = (2)
+0.85 D
4 |{A,B,C}| 4 3
|{A, C}| |{A, B, C}| 1−0.85 P(1)
1−0.85
PC= +0.85 B + D
≈0.094 1−0.85+0.85(0)= 0.15+0≈ 0.038
= +0.85 + 423
4 |{A, C}| |{A, B, C}| D44
tABCD
0 0.250 0.250 0.250 0.250 1 0.214 0.108 0.214 0.038 2 0.094 0.048 0.094 0.038
Again, we check whether or not the algorithm has converged:
||P(2) − P(1)|| ≈ 0.180 ̸≤ ε
Since convergence has not been reached, we perform another iteration.
(3) 1−0.85 P(2) P(2) PA= +0.85 B + D
4
P(2) =
Which leaves us with the following three iterations of P:
≈0.069
≈ 0.048
Again, we check whether or not the algorithm has converged:
||P(3) − P(2)|| ≈ 0.040 ̸≤ ε
0.15
= +0.85 +
0.048 423
0.038 0.15 0.038
(3)
1−0.85 P(2) +0.85 D
|{A, C}| |{A, B, C}|
= +0.85
4 |{A,B,C}| 4 3
PB = (3)
4 |{A, C}| |{A, B, C}| D44
1−0.85
PC= +0.85 B + D
P(2) P(2) 0.15 0.048 0.038
≈0.069 1−0.85+0.85(0)= 0.15+0≈ 0.038
= +0.85 + 423
P(3) =
Page 6 of 10
Since convergence has not been reached, we perform another iteration.
(4) 1 − 0.85 P(3) P(3) 0.15 0.048 0.038
PA= +0.85 B + D = +0.85 + ≈0.069
=
P(3) P(3) 0.15 0.048 0.038 1−0.85+0.85(0)= 0.15+0≈ 0.038
+0.85
SOFT3410
(4)
+0.85 D
4
|{A,C}| |{A,B,C}| 1 − 0.85 P(3)
4
4 |{A,B,C}| 4 3
≈ 0.048 PC= +0.85 B + D = +0.85 + ≈0.069
PB = (4)
Again, we check whether or not the algorithm has converged:
||P(4) − P(3)|| = 0 ≤ ε
We have now reached convergance, so the algorithm terminates and the values of P(4) are the final PageRank scores for each of the pages. If this were a query, the resulting ranks would be A, C, B, D where we arbitrarily rank page A before page C since they have the same score.
0.15
0.038
2 3
1 − 0.85
4 |{A,C}| |{A,B,C}| 4 2 3
P(4) = D44
Page 7 of 10
Implementation details
In the header file pagerank.h we have provided the function read_input that will process the input file for you. The read_input function will output error if the input is malformed in any way and will free any memory that has been previously allocated. We have also provided various structs in the header file that may be useful, including a struct for holding pages and a struct that is a linked list of pages. The function read_input returns the input data in these structs, with the following form:
/* singly linked list to store all pages */
struct list {
node* head; /* pointer to the head of the list */ node* tail; /* pointer to the tail of the list */ int length; /* length of the entire list */
};
/* struct to hold a linked list of pages */
struct node {
page* page; /* pointer to page data structure */ node* next; /* pointer to next page in list */
};
/* struct to hold a page */
struct page {
char name[21]; /* page name */
int index; /* index of the page */
int noutlinks; /* number of outlinks from this page */
list* inlinks; /* linked list of pages with inlinks to this page */
};
Warning: Donotaddotherfilesormodifytheincludedheaderfileormainfunction. We will replace these files with our own copy when marking your code.
SOFT3410
Page 8 of 10
Program input and output
…
Marking
Before you attempt to write optimised code, you should first complete a working unoptimised version. Once you have the basic algorithm correct you can then begin to implement various optimisations.
7 marks are assigned based on your code submission and the correctness of your program.
This component will use our own test cases that cover every aspect of the specification. To pass test cases your solution must produce the correct output within a reasonable time limit. With the given scaffold, do not modify sections that have been outlined.
8 marks are assigned based on the report of your solution and comparison to your single threaded implementation. Observe, document and explain the performance behaviours through your own benchmark suite.
For your submission, your code submission must be uploaded onto USYD Github under the repo name soft3410a2p2. We will only consider the last submission before 11:59pm, 25th November, 2020.
You must also submit your report to canvas portal by 11:59pm, 25th November, 2020.
Note: Part 2 contains 5 bonus points.
Academic declaration
By submitting this assignment you declare the following:
I declare that I have read and understood the University of Plagiarism: Coursework Policy and Procedure, and except where specifically acknowledged, the work contained in this assignment/project is my own work, and has not been copied from other sources or been previously submitted for award or assessment.
I understand that understand that failure to comply with the Student Plagiarism: Coursework Policy and Procedure can lead to severe penalties as outlined under Chapter 8 of the University of -Law 1999 (as amended). These penalties may be imposed in cases where any significant portion of my submitted work has been copied without proper acknowledgement from other sources, including published works, the internet, existing programs, the work of other students, or work previously submitted for other awards or assessments.
I realise that I may be asked to identify those portions of the work contributed by me and required to demonstrate my knowledge of the relevant material by answering oral questions or by undertaking supplementary work, either written or in the laboratory, in order to arrive at the final assessment mark.
I acknowledge that the School of Information Technologies, in assessing this assignment, may reproduce it entirely, may provide a copy to another member of faculty, and/or communicate a copy of this assignment to a plagiarism checking service or in-house computer program, and that a copy of the assignment may be maintained by the service or the School of IT for the purpose of future plagiarism checking.
SOFT3410
Page 10 of 10