CS计算机代考程序代写 concurrency cache algorithm SOFT3410 Tutorial 5

SOFT3410 Tutorial 5
Threads

The goal of this lab is to act as an introduction to threads and encounter some parallel problems

Question 1: Welcome to threads

Up until now, most of your programs have involved only a single thread. Within this lab we will
explore initialising threads and writing separate thread functions. Given the following segment, you
should be able to observe the output from the program below.

Update the code to output thread id more than two times.

struct thread_data {
int tid;

};

void* work1(void* arg) {
struct thread_data* data = (struct thread_data*) arg;
printf(“Hello from thread %d\n”, data->tid);
return NULL;

}

int main() {

struct thread_data data = { { 1 }, { 2 } };
pthread_t threads[2];

pthread_create(threads, NULL, work1, &data);
pthread_create(threads+1, NULL, work2, &data);
printf(“main thread has created threads\n”);

pthread_join(threads[0], NULL);
pthread_join(threads[1], NULL);

printf(“program finished\n”);

return 0;
}

1

SOFT3410 Threads

Discuss with your class the following.

• How is data passed to the thread?

• How is the thread started

• How do we rejoin the thread with the main thread?

Question 2: Addressing False Sharing

Given the following segment of code, examine the performance, do you notice any issues when we
increase the number of threads. Does the increase of threads correspond with an increase of perfor-
mance or not?

struct thread_data {
int* result;
int result_index;
int* numbers;
int numbers_len;

};

void* worker(void* data) {

struct thread_data* d = (struct thread_data*) d;

for(int i = 0; i < d->numbers_len; i++) {
d->result[d->result_index] += d->numbers[i];

}

return NULL;
}

• What issues can arise when you share data among threads?

• If you were use a global variable instead of separate thread data objects, what issues could
arise?

• When data is shared between threads, do you notice a speed up with multiple threads?

Question 3: Addressing False Sharing with Padding

Given the information that a cache line is 64 bytes, how could we resolve the false sharing issue using
padding? How could we transform the data?

uint8_t padding[K];

Assign a value to K, Explain and discuss if this technique is effective or not.

Concurrency Page 2 of 6

SOFT3410 Threads

Question 4: Faster Blur

The following function performs a blur on an image. Break apart the function so a thread could op-
erate on a segment of the image instead of the whole image. Construct a series of large images to
benchmark your implementation and graph your implementation against a single threaded implemen-
tation.

void image_blur(int* inpixels, int* outpixels, size_t width,
size_t height) {

if(pixels != NULL) {
for(size_t y = 1; y < height; y += 3) { for(size_t 1 = 0; x < width; x += 3) { int red_sum = 0; int green_sum = 0; int blue_sum = 0; int new_pixel = 0; for(int i = -1; i < 2; i++) { for(int j = -1; j < 2; j++) { red_sum += (inpixels[((y+j) * width + (x+i)] & 0xFF0000) >> 16;
green_sum += (inpixels[((y+j) * width + (x+i)]

& 0x00FF00 >> 8;
blue_sum += inpixels[((y+j) * width + (x+i)]

& 0x0000FF;
}

}
for(int i = -1; i < 2; i++) { for(int j = -1; j < 2; j++) { new_pixel = ((red_sum / 9) << 16) | ((green_sum / 9) << 8) | ((blue_sum / 9)) outpixels[((y+j) * width + (x+i)] = new_pixel; } } } } } } Question 5: Matrix-Multiplication and Threads Using your homework implementation, try break up parts of your matrix-multiplication algorithm to utilise threads. Consider different ways you can introduce threads into your multiplication function. Similar to the previous question, use a variety of test data and graph your implementation. Concurrency Page 3 of 6 SOFT3410 Threads Question 6: Homework - Marching Squares Marching squares is an algorithm that can be used generate contour lines from a bitmap image. Given a specific threshold value, create a 1-bit image based on a pixel from this value. Afterwards en- code each 3x3 segment into a 4-bit integer in a clockwise direction (top-left, top-right, bottom-right, bottom-left, use the table below to construct your lookup table) for your fill operation. This algorithm allows you to find the perimeter of distinct objects in an image. This kind of algorithm is used for pixel selection (such as magic-wand in an image editor) or animating two-dimensional amoebas. You will need to implement the following functions. You will also need to implement a parallel version that will use these functions. Each 3x3 segment can be encoded with the following. Concurrency Page 4 of 6 SOFT3410 Threads 1 2 /** 3 * @param input, input gray-scale image, 0x00 to 0xFF 4 * @param width, width of the image 5 * @param height, height of the image 6 * @param threshold, pixel value to generate a 1bit image 7 * @return uint8_t*, 1bit image of 0x00 or 0xFF with width*height pixels 8 */ 9 uint8_t* msquare_1bit(uint8_t* input, size_t in_width, size_t in_height, 10 uint8_t threshold); 11 12 /** 13 * @param data, pixels of the image 14 * @param width, width of the image 15 * @param height, height of the image 16 * @param segments, different segments from 0x00 to 0xFF 17 * @param segments_length, number of segments inputted 18 */ 19 void msquare_detect(uint8_t* data, size_t width, size_t height, 20 uint8_t* segments, size_t segments_length); 21 /** 22 * @param original, 23 * @param width, width of the image 24 * @param height, height of the image 25 * @param segments, the encoded segments 0000 -> 1111
26 * @param seg_length, number of segments
27 * @param pos_colour, replace 0xff with positive colour
28 * @param neg_colour, replace 0x00 with negative colour
29 */
30 void msquare_fill(uint8_t* new_image, size_t width, size_t height,
31 uint8_t* segments, size_t seg_length, uint8_t pos_colour, uint8_t neg_colour);

Concurrency Page 5 of 6

SOFT3410 Threads

The algorithm broken into the following steps.

1. Construct a 1-bit image using the threshold value.

2. Construct an array of segments from the 1-bit image.

3. Use the segments to paint the image using the above look up table.

There is an edge case regarding saddle segments. Consider calculating a mid-point during the fill
operation to conclude if the inside of the segment is connected or not.

• Construct a single-threaded solution and a multi-threaded solution

• Graph the differences between the two solutions, observe if the algorithm performs poorly as
compared to the single-threaded solution

• In the event you have observed the algorithm performs worse than single threaded, identify why
and address this issue. Once you have made improvements, attempt to minimise the overhead
caused from threading.

You will need to document your observations in your README.md text file. You will need to adhere
to the following rules.

• No VLAs

• No global or static variables

• No invalid memory access

For your submission, create a github repository with the following name soft3410hw5 and invite
your tutor to your repository.

You will need to supply the following as part of your submission.

• Place all .c source files in your src folder

• Test cases must be in a folder called test

• Supply a makefile or a script to compile your program

• You must supply a README.md file that specifies how to build your program and a small
description of your project and how it solves the problem

• You must ensure that it compiles and executes correctly on linux

You must make a submission to the master branch by 11:59pm, 11th October, 2020. Please make
sure your last commit and push occurs before this time.

Concurrency Page 6 of 6