SOFT3410 Tutorial 5
Threads
The goal of this lab is to act as an introduction to threads and encounter some parallel problems
Question 1: Welcome to threads
Up until now, most of your programs have involved only a single thread. Within this lab we will
explore initialising threads and writing separate thread functions. Given the following segment, you
should be able to observe the output from the program below.
Update the code to output thread id more than two times.
struct thread_data {
int tid;
};
void* work1(void* arg) {
struct thread_data* data = (struct thread_data*) arg;
printf(“Hello from thread %d\n”, data->tid);
return NULL;
}
int main() {
struct thread_data data = { { 1 }, { 2 } };
pthread_t threads[2];
pthread_create(threads, NULL, work1, &data);
pthread_create(threads+1, NULL, work2, &data);
printf(“main thread has created threads\n”);
pthread_join(threads[0], NULL);
pthread_join(threads[1], NULL);
printf(“program finished\n”);
return 0;
}
1
SOFT3410 Threads
Discuss with your class the following.
• How is data passed to the thread?
• How is the thread started
• How do we rejoin the thread with the main thread?
Question 2: Addressing False Sharing
Given the following segment of code, examine the performance, do you notice any issues when we
increase the number of threads. Does the increase of threads correspond with an increase of perfor-
mance or not?
struct thread_data {
int* result;
int result_index;
int* numbers;
int numbers_len;
};
void* worker(void* data) {
struct thread_data* d = (struct thread_data*) d;
for(int i = 0; i < d->numbers_len; i++) {
d->result[d->result_index] += d->numbers[i];
}
return NULL;
}
• What issues can arise when you share data among threads?
• If you were use a global variable instead of separate thread data objects, what issues could
arise?
• When data is shared between threads, do you notice a speed up with multiple threads?
Question 3: Addressing False Sharing with Padding
Given the information that a cache line is 64 bytes, how could we resolve the false sharing issue using
padding? How could we transform the data?
uint8_t padding[K];
Assign a value to K, Explain and discuss if this technique is effective or not.
Concurrency Page 2 of 6
SOFT3410 Threads
Question 4: Faster Blur
The following function performs a blur on an image. Break apart the function so a thread could op-
erate on a segment of the image instead of the whole image. Construct a series of large images to
benchmark your implementation and graph your implementation against a single threaded implemen-
tation.
void image_blur(int* inpixels, int* outpixels, size_t width,
size_t height) {
if(pixels != NULL) {
for(size_t y = 1; y < height; y += 3) {
for(size_t 1 = 0; x < width; x += 3) {
int red_sum = 0;
int green_sum = 0;
int blue_sum = 0;
int new_pixel = 0;
for(int i = -1; i < 2; i++) {
for(int j = -1; j < 2; j++) {
red_sum += (inpixels[((y+j) * width + (x+i)]
& 0xFF0000) >> 16;
green_sum += (inpixels[((y+j) * width + (x+i)]
& 0x00FF00 >> 8;
blue_sum += inpixels[((y+j) * width + (x+i)]
& 0x0000FF;
}
}
for(int i = -1; i < 2; i++) {
for(int j = -1; j < 2; j++) {
new_pixel = ((red_sum / 9) << 16) | ((green_sum / 9) << 8)
| ((blue_sum / 9))
outpixels[((y+j) * width + (x+i)] = new_pixel;
}
}
}
}
}
}
Question 5: Matrix-Multiplication and Threads
Using your homework implementation, try break up parts of your matrix-multiplication algorithm to
utilise threads. Consider different ways you can introduce threads into your multiplication function.
Similar to the previous question, use a variety of test data and graph your implementation.
Concurrency Page 3 of 6
SOFT3410 Threads
Question 6: Homework - Marching Squares
Marching squares is an algorithm that can be used generate contour lines from a bitmap image. Given
a specific threshold value, create a 1-bit image based on a pixel from this value. Afterwards en-
code each 3x3 segment into a 4-bit integer in a clockwise direction (top-left, top-right, bottom-right,
bottom-left, use the table below to construct your lookup table) for your fill operation.
This algorithm allows you to find the perimeter of distinct objects in an image. This kind of algorithm
is used for pixel selection (such as magic-wand in an image editor) or animating two-dimensional
amoebas.
You will need to implement the following functions. You will also need to implement a parallel
version that will use these functions.
Each 3x3 segment can be encoded with the following.
Concurrency Page 4 of 6
SOFT3410 Threads
1
2 /**
3 * @param input, input gray-scale image, 0x00 to 0xFF
4 * @param width, width of the image
5 * @param height, height of the image
6 * @param threshold, pixel value to generate a 1bit image
7 * @return uint8_t*, 1bit image of 0x00 or 0xFF with width*height pixels
8 */
9 uint8_t* msquare_1bit(uint8_t* input, size_t in_width, size_t in_height,
10 uint8_t threshold);
11
12 /**
13 * @param data, pixels of the image
14 * @param width, width of the image
15 * @param height, height of the image
16 * @param segments, different segments from 0x00 to 0xFF
17 * @param segments_length, number of segments inputted
18 */
19 void msquare_detect(uint8_t* data, size_t width, size_t height,
20 uint8_t* segments, size_t segments_length);
21 /**
22 * @param original,
23 * @param width, width of the image
24 * @param height, height of the image
25 * @param segments, the encoded segments 0000 -> 1111
26 * @param seg_length, number of segments
27 * @param pos_colour, replace 0xff with positive colour
28 * @param neg_colour, replace 0x00 with negative colour
29 */
30 void msquare_fill(uint8_t* new_image, size_t width, size_t height,
31 uint8_t* segments, size_t seg_length, uint8_t pos_colour, uint8_t neg_colour);
Concurrency Page 5 of 6
SOFT3410 Threads
The algorithm broken into the following steps.
1. Construct a 1-bit image using the threshold value.
2. Construct an array of segments from the 1-bit image.
3. Use the segments to paint the image using the above look up table.
There is an edge case regarding saddle segments. Consider calculating a mid-point during the fill
operation to conclude if the inside of the segment is connected or not.
• Construct a single-threaded solution and a multi-threaded solution
• Graph the differences between the two solutions, observe if the algorithm performs poorly as
compared to the single-threaded solution
• In the event you have observed the algorithm performs worse than single threaded, identify why
and address this issue. Once you have made improvements, attempt to minimise the overhead
caused from threading.
You will need to document your observations in your README.md text file. You will need to adhere
to the following rules.
• No VLAs
• No global or static variables
• No invalid memory access
For your submission, create a github repository with the following name soft3410hw5 and invite
your tutor to your repository.
You will need to supply the following as part of your submission.
• Place all .c source files in your src folder
• Test cases must be in a folder called test
• Supply a makefile or a script to compile your program
• You must supply a README.md file that specifies how to build your program and a small
description of your project and how it solves the problem
• You must ensure that it compiles and executes correctly on linux
You must make a submission to the master branch by 11:59pm, 11th October, 2020. Please make
sure your last commit and push occurs before this time.
Concurrency Page 6 of 6