Advanced C Programming
Spring 2020 ECE 264 :: Purdue University
Home
Schedule
Syllabus
Resources
Standards
Scores

Due 5/2
Parallel Adaptive Threshold
Goals
Overview
Adaptive Thresholding
Multi-threaded programming
Warm-up exercises
Opt out.
Start
Requirements
Compile
Pre-tester ●
Submit
How much work is this?
Bonus #1: Median filter (2 bonus points)
Bonus #2: Constant time median filter (6 bonus points)
Q&A
Updates
Goals
The goals of this assignment are as follows:
1. Learn to write parallel programs using multiple threads and the pthread library.
2. Increase your depth in image programming.
Overview
You will create binarize(…), a function that works with your code from EC01 to convert an image to monochrome (black & white) using the “adaptive threshold” method.
This assignment is optional. If you choose to do it, it will count the numerator—but not the denominator—in your homework average. Submitting it may help your grade, but will not hurt your grade (unless you plagiarize).
Adaptive Thresholding
Thresholding is a technique used to convert an image to monochrome (black & white). It is important for problems such as scanned document processing, OCR, image segmentation, and medical imaging. Pixels brighter than some threshold are set to white. The rest are set to black. The key question becomes this: What should the threshold be?
If the image is color, you will need to calculate what each pixel would be as grayscale. In the image below (taken with a smartphone), the image at left is actually color. Notice that there is a little tint from the lighting in the room. The image right is true grayscale. Your code should be able to handle either one.

color

gray
// Credit: Donald Knuth; image is a photo of The Art of Computer Programming: Fundamental Algorithms, volume 1, 2nd edition, p. 272; photo by Alex Quinn
In the simplest case, you could use 50% gray ( ) so that would be converted to . That method—often called fixed threshold because it uses the same threshold value for all pixels—generally works well for scanned documents, as long as the brightness and contrast were set properly when they were scanned. However, in cases of photographs taken in uneven lighting or certain medical applications, a fixed threshold provides poor results, as you see below.

gray

monochrome, fixed threshold
Even if we were to use 30% gray or 70% gray, we would still have similar problems. Because the lighting is different in different regions of the image, there is no one good threshold to use for every pixel. The solution is adaptive threshold. The basic idea is that for each pixel in an image, you may use a different threshold value to decide whether that pixel should be black or white.

gray

monochrome, adaptive threshold
There are many good ways to decide the threshold, none of them perfect. For this assignment, we will keep it simple:
For a neighborhood radius r, a pixel at (x0, y0) shall be white if and only if its intensity is greater than the average intensity of all pixels in the (2r + 1) × (2r + 1) neighborhood surrounding that pixel. That neighborhood shall consist of all pixels at (xi, yi), such that |xi – x0| ≤ r and |yi – y0| ≤ r, including the current pixel being thresholded. Pixels that are not white shall be black. In calculating the threshold, do not include pixels that are beyond the boundaries of the image. This means that for such edge pixels, the neighborhood will be smaller.
Example: Suppose we binarize the following 6×6 image with radius=1.
250
120
0
20
240
100
220
110
10
30
230
130
40
45
50
55
60
65
70
75
80
85
90
95
155
145
135
125
115
105
165
175
185
195
205
215
The pixel at (x=2, y=3) currently has intensity value of 80. The neighborhood (radius=1) consists of pixels having intensity values 45, 50, 55, 75, 80, 85, 145, 135, and 125. The average of those is 88⅓. Since 80 ≤ 88⅓, the pixel at (x=2, y=3) shall be black.
Always calculate the threshold based on only the values in the input image.
(Optional) More information about adaptive threshold:
• Adaptive Threshold (Prof. Robert Fisher, University of Edinburgh)
Multi-threaded programming
Calculating adaptive threshold can be computationally intensive, especially on high resolution images. Therefore, you will implement the adaptive threshold algorithm in parallel by using the pthread library. How can parallelism be implemented for this algorithm? Each pixel in the output image must have its threshold calculated and its intensity value set. Threads can be created such that each thread operates on a particular region of pixels in the image. How you choose to allocate pixels to a particular thread is up to you!
Your binarize(…) will create a copy, which makes your life a lot easier than it would be if they were modifying the image in place.
A thread is defined as an independent stream of instructions that can be scheduled to run by the operating system. Multi-threading allows multiple threads to exist within the context of a single process. These threads share a process’ resources but are able to execute simultaneously and/or independently. You are responsible for creation of num_threads threads and their management. See the man pages for pthread_create(…) and pthread_join(…) for details on their parameters and how to pass data to your worker function. You can refer to example#1 and example#2 covered in class. Also for more examples of thread management you can refer here.
Warm-up exercises
This assignment includes a warm-up exercise to help you get ready. This accounts for 10% of your score for EC02. Scoring will be relatively light, but the usual base requirements apply.
Zero an array. Create a function bool zero array(int✶ array, int array length, int num threads, const char✶✶ a error) that sets every element of the given array to zero using num_threads threads.
Opt out.
In a hurry, and don’t need the practice? This warm-up is here to help you learn what you need to succeed on the rest of this assignment—not to add additional work. Therefore, we give you an option. Those who feel that they do not need the practice may “opt out”. by modifying warmup.c so that it does nothing but print the following message exactly and then exit:
I already know this and do not need to practice.
If you do that, then your score for EC02 will be based solely on the rest of this assignment. If you leave the warmup.c undone, if you do not turn in a warmup.c, or if the message it prints does not match perfectly, then you will receive 0 for the warmup portion of this assignment (10%).
Start
To get the starter files, enter 264get ec02 in bash. You will get the header files (bmp.h and mtat.h), a skeleton warmup.c, and several image files. The image files with “_bw_” in the filename are the output of the binarize(…) function, using the given radius.
You are strongly encouraged to work with the smallest image (6×6), using only 1, 2, or 3 threads. Keeping things simple and small will make your development and debugging more manageable.
Requirements
1. Your submission must contain each of the following files, as specified:
2. file
3. contents
4. mtat.c
5. functions
6. binarize(const BMPImage✶ image, int radius, int num threads, const char✶✶ a error)
→ return type: BMPImage✶
Convert the given image to monochrome using the adaptive threshold algorithm described above with num_threads threads.
• image is a valid 24-bit BMP (v3) image.
• Return a new monochrome BMPImage on the heap.
◦ Each pixel must be either black (red==blue==green==0) or white (red==blue==green==255).
• ⚠ Do not use division in calculating the new color intensity. (See Q&A #2.)
• You may assume 1 ≤ num_threads ≤ 2,064,376.
◦ If the image contains fewer than num_threads pixels, then you may create image -> header.width_px × image -> header.height_px threads, instead.
• Handle run-time errors using the method described in EC01.
• Caller is responsible for freeing the returned image using free_bmp(…).
7. mtat.h
8. declarations
9. declaration of binarize(…)
10. bmp.h
11. declarations
12. same as previous assignment
13. bmp.c
14. functions
15. same as previous assignment
16. test_mtat.c
17. functions
18. main(int argc, char✶ argv[])
→ return type: int
Test your binarize(…).
• 100% code line coverage is required.
• Tests will run in a directory containing all of the EC02 starter files.
• Use your read_bmp(…), write_bmp(…),free_bmp(…) (from EC01), as needed.
• Use miniunit.h.
19. warmup.c
20. functions
21. zero array(int✶ array, int array length, int num threads, const char✶✶ a error)
→ return type: bool
Set every element of array to 0 using num_threads threads.
• If successful, return true and do not modify error.
• In case of the following run-time errors, return false and set *a_error to a descriptive error message.
◦ array_length <= 0
◦ array == NULL
◦ thread could not be created
• If array_length >= 1 and array != NULL, you may assume the array is valid and contains array_length int values
• If num_threads > array_length, change num_threads to array_length of the array.
• Message should be on the data segment.
22. You may copy your create_bmp(…) and/or set_pixel(…) from your test_bmp.c.
23. Do not turn in any image files.
24. You may assume that images contain no more than 16,777,216 total pixels (e.g., 4096×4096).
25. binarize(…) should not modify the image that is passed to it.
26. You may modify mtat.h but not bmp.h. Do not modify the function signature of binarize(…).
27. Only the following externally defined functions and constants are allowed in your .c files. (You may put the corresponding #include <…> statements in the .c file or in your mtat.h, at your option.) This does not apply to your bmp.c and bmp.h.
28. header
29. functions/symbols
30. allowed in…
31. assert.h
32. assert(…)
33. mtat.c, test_mtat.c, warmup.c
34. pthread.h
35. pthread_t, pthread_create, pthread_join
36. mtat.c, test_mtat.c, warmup.c
37. stdbool.h
38. true, false
39. mtat.c, test_mtat.c, warmup.c
40. stdlib.h
41. malloc(…), free(…), qsort(…), NULL, EXIT_SUCCESS, EXIT_FAILURE
42. mtat.c, test_mtat.c, warmup.c
43. stdio.h
44. FILE
45. mtat.c, test_mtat.c, warmup.c
46. stdio.h
47. fopen, fclose
48. test_mtat.c
49. string.h
50. strcat(…), strlen(…), strcpy(…), strcmp(…), strerror(…), memcpy(…)
51. mtat.c, test_mtat.c, warmup.c
52. errno.h
53. errno
54. mtat.c, warmup.c
55. If you need a simple math function like min(…), max(…), floor(…), or ceil(…), just write your own helper function.
56. Submissions must meet the code quality standards and the policies on homework and academic integrity.
Compile
This assignment relies on your code from EC01 but they must be in separate files. Do not copy your EC01 code into mtat.c or mtat.h. Your code should work with any working EC01 code. Your mtat.c must #include both mtat.h and bmp.h.
To compile thread that uses the pthread library, you must include the -pthread flag.
gcc -pthread -o test_mtat mtat.c bmp.c test_mtat.c
Pre-tester ●
The pre-tester for EC02 has not yet been released. As soon as it is ready, this note will be changed.
Submit
In general, to submit any assignment for this course, you will use the following command:
264submit ASSIGNMENT FILES…
For EC02, you will type 264submit ec02 warmup.c mtat.c mtat.h bmp.c bmp.h test_mtat.c from inside your ec02 directory.
You can submit as often as you want, even if you are not finished with the assignment. That saves a backup copy which we can retrieve for you if you ever have a problem.
How much work is this?
This assignment does not require a lot of code. However, writing code that uses threads is different from most of what you have done in ECE 264. In particular, it can be challenging to debug. Give yourself time to understand multi-threaded programming.
Bonus #1: Median filter (2 bonus points)
Add a multi-threaded median filter using a very similar technique to the adaptive threshold. Do not use bubble sort or any other n2 algorithm to calculate the median. You may however use qsort(…).
Your function signature will look like this:
BMPImage✶ median(const BMPImage✶ image, int radius, int num threads, const char✶✶ a error)
To indicate that you have completed bonus #1, include the following at the top of your mtat.h file.
Bonus #1 will be accepted for full credit up to 05/07/2020. Add #define BONUS_PARALLEL_MEDIAN_FILTER to your mtat.h to indicate that you have completed Bonus #1. Submit together with the rest of your EC02. Your test_mtat.c should test your median(…), as well as the rest of your EC02. The same testing requirements apply. As long as your tests verify the correctness in some reasonable fashion—and your code passes them—you will get full credit for this bonus. There is generally no partial credit on bonuses. Scoring may be done manually or by an automated testing script, depending on the number who attempt it.
Bonus #1 is recommended for anyone who has finished the main assignment and wishes to pick up some extra points at the end of the semester. The points are intended to be relatively generous.
Bonus #2: Constant time median filter (6 bonus points)
Implement the constant time median filter algorithm described in this article by Simon Perreault & Patrick Hébert.
An in-person WebEx code review with Prof. Quinn is required for Bonus #2. Email him by 05/07/2020 (with subject “schedule code review of EC02 bonus #2 – ECE 264”) to arrange a time. The walk-through could occur as late as Sun 5/10/2020. Submit together with the rest of your EC02 any time before your scheduled code review. Add #define BONUS_CONSTANT_TIME_MEDIAN_FILTER to your mtat.h.
Bonus #2 is intended for who are hungry for a challenge. In a past semester, 4-5 students attempted it; 2 succeeded.
Q&A
1. Why can’t we use division to calculate pixel intensities?
This is to ensure that results will be predictable and consistent. If people used division, there might be subtle differences between implementations based on how you structure the calculation.
2. How can I calculate the color intensity without using division?
First, let’s look at how you would do this conceptually. You would convert each pixel to grayscale by computing the average of its red, blue, and green channels. That would yield a grayscale intensity between 0 and 255 for each pixel in the image. Then, for a given pixel, you would compare its grayscale intensity to the average of the grayscale intensities of the pixels in its neighborhood. Of course, computing these averages would use division. So how can we do it without division? Note that the sum of r+g+b for each cell is just 3 times the grayscale intensity. So, instead of dividing by 3, we will simply work with the r+g+b sum for every pixel. Likewise, instead of calculating the average within a neighborhood, we will multiply the value for the single pixel of interest by the number of pixels in its neighborhood and compare that with the sum of the r+g+b sums for every pixel in the neighborhood.
Summary: Compare (r+g+b)*num_pixels_in_neighborhood with the sum of r+g+b over all cells in the neighborhood.
3. How can I compare two BMP files?
For your unit tests, you can create a helper function, probably with a for loop that compares each pixel, one by one.
For ad hoc checking, you can use command line tools. Suppose we have two files, expected.bmp and actual.bmp. The most basic thing to do would be to use xxd to store the hex dump of each and then diff them. xxd expected.bmp > expected.bmp.txt
4. xxd actual.bmp > actual.bmp.txt
5. diff expected.bmp.txt actual.bmp.txt
6.
We can do much better using a bash feature called process substitution, combined with a vim feature called vimdiff.
vimdiff <(xxd expected.bmp) <(xxd actual.bmp)
To skip the header, and compare just the pixel data, add -s +54 to each xxd command: vimdiff <(xxd -s +54 expected.bmp) <(xxd -s +54 actual.bmp)
To look at only the header, add -l 54 to each xxd command: vimdiff <(xxd -l 54 expected.bmp) <(xxd -l 54 actual.bmp)
To see the difference at the command line, instead of in vim, substitute diff in place of vimdiff in the above commands.
7. Why is the maximum number of threads 2,064,376?
It is a system constant. To see the value on any Linux system, type this command in bash: cat /proc/sys/kernel/threads-max
Updates
None.
© Copyright 2020 Alexander J. Quinn This content is protected and may not be shared, uploaded, or distributed.