MSBD5009 Parallel Programming
Assignment #3: CUDA Programming
Due on 12 May 2020 at 11:59pm
Instructions
• This assignment counts for 20 points.
• This is an individual assignment. You can discuss with others and search online resources, but your submission should be your own code.
• Fill your name, student ID and email in the first line of comments.
• Submit your assignment through Canvas before the deadline.
• Your submission will be compiled and tested on Azure NC6_Promo machines
• No late submissions will be accepted!
Assignment Description
The Smith-Waterman algorithm identifies the similar regions in two genome sequences. In this assignment, you will implement a CUDA version of the Smith-Waterman algorithm to measure the similarity between two strings.
The pseudocode of the Smith-Waterman algorithm is shown in Algorithm 1. Given string A = a1,a2,…,an andstringB=b1,b2,…,bm,wefirstconstructascoringmatrixH ofsize(n+1)∗ (m + 1) and set the first row and column to zero. Then, we iteratively compute the scoring matrix using the following dynamic programming formula:
Hi−1,j−1 +s(ai,bj) Hi−1,j −w
Hi,j−1 −w 0
where s is the substitution matrix s(ai,bj) =
(1≤i≤n,1≤j≤m)
Hij =max
mismatch score, and w is the gap penalty. The value of u, v and w are constant and are given as part of the input to the algorithm. Finally, we return the highest score in the scoring matrix as the similarity score between A and B.
Input and Output
The input file will be in the following format:
1. The first line contains two integers a_len and b_len (1 ≤ a_len ≤ 20000,1 ≤ b_len ≤
20000), separated by a space. They represent the length of string a and b, respectively.
2. The second line is a string a of length a_len, consisting of four letters A, T, G, C.
3. The third line is a string b of length b_len, consisting of four letters A, T, G, C.
1
u, ai=bj
v, a ̸=b , u is the match score, v is the
ij
Assignment 3 MSBD5009, 2020 Spring
1 2 3 4 5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
Algorithm 1: The Smith-Waterman Algorithm
Input : string A = a1,a2,…,an and B = b1,b2,…,bm, match score u, mismatch score v,
gap penalty w
Output: the similarity score between A and B int score[|A| + 1][|B| + 1];
fori←0to|A|do
score[i][0] ← 0; end
fori←0to|B|do score[0][i] ← 0;
end
int max_score ← 0; fori←1to|A|do
for j ← 1 to |B| do score[i][j] ← max(0,
score[i − 1][j] − w,
score[i][j − 1] − w,
score[i − 1][j − 1] + sub_mat(ai, bj ));
max_score ← max(max_score, score[i][j]); end
end
return max_score;
Procedure sub_mat(char x, char y)
if x == y then return u;
else
return v;
end
The output of the program consists of three lines:
1. The similarity score between a and b, i.e., the highest score in the scoring matrix. 2. The elapsed time in seconds of the execution.
3. The driver time in seconds of the execution.
We assume that the match score is 3, the mismatch score is -3, and the gap penalty is 2. Here is an example input/output for your reference:
Input:
Output:
Submission and Grading
The code skeleton cuda_smith_waterman_skeleton.cu is provided. Your task is to complete the following function in the code:
98 GGTTGACTA TGTTACGG
Max socre: 13
Elapsed time: 0.00001 s Driver time: 0.0000095 s
2
Assignment 3 MSBD5009, 2020 Spring
int smith_waterman(int blocks_per_grid, int threads_per_block, char *a, char *b, int
a_len, int b_len);
The description of the parameters is as follows: Parameter
int blocks_per_grid
int threads_per_block
char *a
char *b
int a_len
int b_len
Notes:
1. You will be given three files cuda_smith_waterman_skeleton.cu, main.cu and cuda_smith_ waterman.h. You only need to complete and submit the cuda_smith_waterman_skeleton.cu to the Canvas.
2. The sequential Smith-Waterman algorithm code is also provided for your reference. You can use it to learn the logic flow and verify the correctness of your CUDA version. You can also compare the sequential running time with your parallel program performance.
3. Youcanaddhelperfunctionsandvariablesasyouwishinthecuda_smith_waterman_skeleton .cu file, but keep the other two files: main.cu and cuda_smith_waterman.h, unchanged.
4. We will use different input files and specify different CUDA kernel launch parameters ( 13 ≤ block_per_grid ≤ 39 and 32 ≤ threads_per_block ≤ 1024 in ./cuda_smith_waterman
5. The correctness, elapsed time and speedup of your program will be considered in grading.
6. We will perform code similarity checks. In case a submission has code similarity issues, we may request clarification and deduct partial marks or full marks on a case-by-case basis.
Description
Number of blocks per grid Number of threads per block The string a
The string b
The length of string a The length of string b
3