ECS 150: Project #2 – User-level thread library
ECS 150: Project #2 – User-level thread library
Prof. Joël Porquet-Lupine
UC Davis, Winter Quarter 2021
Changelog
General information
Objectives of the project
Program description Introduction
Constraints
Assessment
Suggested work phases Phase 0: Skeleton code
Phase 1: queue API
Phase 2: uthread API
Phase 3: uthread_join()
Phase 4: preemption
Submission Content
Gradescope
Academic integrity
Changelog
The specifications for this project are subject to change at anytime for additional clarification. Make sure to always refer to the latest version.
v2: s/(void)queue/(void)q
v1: First publication
General information
Due before 11:59 PM, Friday, February 12th, 2020.
You will be working with a partner for this project.
The reference work environment is the CSIF.
Objectives of the project
The objectives of this programming project are:
Implementing one of the most used containers in system programming (a queue/list) as specified by a given API.
Learning how to test your code, by writing your own testers and maximizing the test coverage.
Understanding how multiple threads can run within the same process: from the creation of new threads to their termination, and including how to perform context switching during their concurrent execution.
Writing high-quality C code by following established industry standards.
Program description
Introduction
The goal of this project is to understand the idea of threads, by implementing a basic user-level thread library for Linux. Your library will provide a complete interface for applications to create and run independent threads concurrently.
Similar to existing lightweight user-level thread libraries, your library must be able to:
Create new threads
Schedule the execution of threads in a round-robin fashion
Provide a synchronization mechanism for threads to join other threads
Be preemptive, that is to provide an interrupt-based scheduler
Constraints
Your code must be written in C, be compiled with GCC and only use the standard functions provided by the GNU C Library (aka libc). All the functions provided by the libc can be used, but your program cannot be linked to any other external libraries.
Your source code should adopt a sane and consistent coding style and be properly commented when necessary. One good option is to follow the relevant parts of the Linux kernel coding style.
Assessment
Your grade for this assignment will be broken down in two scores:
Auto-grading: ~60% of grade
Running an auto-grading script that tests your code and checks the output against various inputs
Manual review: ~40% of grade
The manual review is itself broken down into different rubrics:
Submission : ~5%
Report file: ~30%
Makefile: ~10%
Queue implementation: ~10%
Queue testing: ~10%
Uthread library implementation: ~15%
Preemption: ~5%
Uthread library testing: ~5%
Code style: ~10%
Suggested work phases
Phase 0: Skeleton code
The skeleton code that you are expected to complete is available in /home/cs150jp/public/p2/. This code already defines most of the prototypes for the functions you must implement, as explained in the following sections.
$ cd /home/cs150jp/public/p2
$ tree
.
├── apps
│ ├── Makefile
│ ├── queue_tester_example.c
│ ├── uthread_hello.c
│ └── uthread_yield.c
└── libuthread
├── context.c
├── Makefile*
├── preempt.c*
├── private.h
├── queue.c*
├── queue.h
├── uthread.c*
└── uthread.h
The code is organized in two parts. In subdirectory apps, you will find a few test programs. The tester queue_tester_example.c focuses solely on the queue implementation, and can be run after Phase 1 is completed. The other two testers make use of the thread library and require Phase 2 to be compiled and run.
Subdirectory libuthread contains the files composing the thread library that you must complete. The files to complete are marked with a star.
You should have no reason to touch any of the headers which are not marked with a star (even if you think you do…).
Copy the skeleton code to your account.
Phase 1: queue API
In this first phase, you must implement a simple FIFO queue. The interface to this queue is defined in libuthread/queue.h and your code should be added into libuthread/queue.c.
You will find all the API documentation within the header file.
The constraint for this exercise is that all operations (apart from the iterate and delete operation) must be O(1). This implies that you must choose the underlying data structure for your queue implementation carefully.
Makefile
Complete the file libuthread/Makefile in order to generate a static library archive named libuthread/libuthread.a.
This library archive must be the default target of your Makefile, because your Makefile is called from the Makefile in the apps directory without any argument.
Note that at first, only the file libuthread/queue.c should be included in your library. You will add the other C files as you start implementing them in order to expand the API provided by your library.
Useful resources for this phase:
http://tldp.org/HOWTO/Program-Library-HOWTO/static-libraries.html
Testing
Add a new test program in the apps directory, called queue_tester.c, which tests your queue implementation. It is important that your tester should be as comprehensive as possible in order to ensure that your queue implementation is resistant. It will ensure that you don’t encounter bugs when using your queue later on.
A good approach for testing your queue implementation is unit testing. The basic idea behind unit testing is to invent all the possible usage scenarios that will trigger the different parts of the implementation, and all the edge cases, in order to guarantee that the implementation always matches the specifications.
Here are two examples to get you started. The first unit test simply checks that creating a queue succeeds, while the second checks a simple enqueue/dequeue scenario:
void test_create(void)
{
queue_t q;
q = queue_create();
assert(q != NULL);
}
void test_queue_simple(void)
{
queue_t q;
int data = 3, *ptr;
q = queue_create();
queue_enqueue(q, &data);
queue_dequeue(q, (void**)&ptr);
assert(ptr == &data);
}
You can find a comprehensive example of a queue tester in /home/cs150jp/public/p2/progs/queue_tester_example.c, which you can complete with more unit tests.
Hints
Most of the functions of this API should look very familiar if you have ever coded a FIFO queue (e.g. create, destroy, enqueue, dequeue, etc.). However, one function of the API stands out from typical interfaces: queue_iterate(). This function provides a generic way to call a custom function (i.e. a function provided by the caller) on each item currently enqueued in the queue.
For example, the following snippet of code shows you how a certain operation can be applied to every item of a queue, or how you can find a certain item in the queue and return it:
/* Callback function that increments integer items by a certain value (or delete
* item if item is value 42) */
static int inc_item(queue_t q, void *data, void *arg)
{
int *a = (int*)data;
int inc = (int)(long)arg;
if (*a == 42)
queue_delete(q, data);
else
*a += inc;
return 0;
}
/* Callback function that finds a certain item according to its value */
static int find_item(queue_t q, void *data, void *arg)
{
int *a = (int*)data;
int match = (int)(long)arg;
(void)q; //unused
if (*a == match)
return 1;
return 0;
}
void test_iterator(void)
{
queue_t q;
int data[] = {1, 2, 3, 4, 5, 42, 6, 7, 8, 9};
size_t i;
int *ptr;
/* Initialize the queue and enqueue items */
q = queue_create();
for (i = 0; i < sizeof(data) / sizeof(data[0]); i++)
queue_enqueue(q, &data[i]);
/* Add value '1' to every item of the queue, delete item '42' */
queue_iterate(q, inc_item, (void*)1, NULL);
assert(data[0] == 2);
assert(queue_length(q) == 9);
/* Find and get the item which is equal to value '5' */
ptr = NULL; // result pointer *must* be reset first
queue_iterate(q, find_item, (void*)5, (void**)&ptr);
assert(ptr != NULL);
assert(*ptr == 5);
assert(ptr == &data[3]);
}
Hopefully, you will find that this function can be useful when implementing the uthread API. One interesting usage may be, for example, to debug your queue(s) of threads by printing them! But other effective usages are possible too…
Phase 2: uthread API
In this second phase, you must implement most of the thread management (some is provided to you for free). The public interface to this thread API is defined in libuthread/uthread.h and your code should be added into libuthread/uthread.c.
You will find all the API documentation within the header file.
Thread definition
Threads are independent execution flows that run concurrently in the address space of a single process (and thus, share the same global variables, heap memory, open files, process identifier, etc.). Each thread has its own execution context, which mainly consists of:
an identifier, know as TID (Thread IDentifier)
a state (running, ready, blocked, etc.)
a backup of the CPU registers (for saving the thread upon descheduling and restoring it later)
a stack
The goal of a thread library is to provide applications that want to use multithreading an interface (i.e. a set of library functions) that the application may use to create and start new threads, terminate threads, or manipulate threads in different ways.
For example, the most well-known and wide-spread standard that defines the interface for threads on Unix-style operating systems is called POSIX thread (or pthread). The pthread API defines a set of functions, a subset of which we want to implement for this project. Of course, there are various ways in which the pthread API can be realized, and existing libraries have implemented pthread both in the OS kernel and in user mode. For this project, we aim to implement a few pthread functions at user level on Linux.
Public API
The API of the uthread library defines the set of functions that applications and the threads they create can call in order to interact with the library.
The first function an application has to call in order to kick off the uthread library uthread_start(). This function must initialize the uthread library by registering the so-far single execution flow of the application as the main thread that the library can later schedule for execution like any other thread.
After the library is initialized, the main thread can call uthread_create() to create a new user thread, running the function specified as argument, and register it to the library so that it can be scheduled for execution later. In this function, you will need to use the context API as discussed below.
This first user thread can in turn call uthread_create() in order to create more user threads, and so on.
For this step, we expect the scheduler to be non-preemptive. Threads must call the function uthread_yield() in order to ask the library’s scheduler to pick and run the next available thread. In non-preemptive mode, a non-compliant thread that never yields can keep the processing resource for itself.
For this first step, you don’t have to write uthread_join() the way it is defined by the documentation. At this point, we assume that uthread_join() is only called by the main function, mostly in order to wait for the threading system to terminate executing threads. You can therefore implement uthread_join() as follows:
Execute an infinite loop in which
If there are no more threads which are ready to run in the system, break the loop and return
Or simply yield to the next available thread
Once all user threads have completed, the main thread can call uthread_stop() in order to free all the library’s resources and resume a regular monothreaded execution.
Private data structure
In order to deal with the creation and scheduling of threads, you will need a data structure that can store information about a single thread.
This data structure will likely need to hold, at least, information mentioned above such as the TID, the context of the thread (its set of registers), information about its stack (e.g., a pointer to the thread’s stack area), and information about the state of the thread (whether it is running, ready to run, or has exited).
It will also need to contain more information when you implement the real version of uthread_join().
This data structure is often called a thread control block (TCB).
Internal context API
Some code located in libuthread/context.c, and which interface is defined in libuthread/private.h, is accessible for you to use. The four functions provided by this library allow you to:
Allocate a stack when creating a new thread (and conversely, destroy a stack when a thread is finally deleted)
Initialize the stack and the execution context of the new thread so that it will run the specified function
Switch between two execution contexts
Useful resources if you would like to further understand how the context API works internally:
https://www.gnu.org/software/libc/manual/html_mono/libc.html#System-V-contexts
Testing
Two programs can help test this phase:
uthread_hello: creates a single thread that displays “Hello world!”
uthread_yield: creates three threads in cascade and test the yield feature of the scheduler
Phase 3: uthread_join()
When a thread exits, its resources (such as its TCB) are not automatically freed. It must first be “joined” by another thread, which can then (or not) collect the return value of the dead thread. The concept is very similar to waitpid() which you have seen in Project 1.
In this phase, you need to remove the infinite loop that you used in the previous phase for uthread_join() and implement the proper behavior for this function.
When a thread T1 joins another thread T2 (usually, a parent joins its child), there are two possible scenarios:
If T2 is still an active thread, T1 must be blocked (i.e. it cannot be scheduled to run anymore) until T2 dies. When T2 dies, T1 is unblocked and collects T2.
If T2 is already dead, T1 can collect T2 right away.
Once T2 is collected, its resources can be finally entirely freed.
When T1 is unblocked, it should be scheduled after all the currently runnable threads.
Testing
For testing this phase, you should probably modify the two previous programs and add some proper joining from parents to children and see if the resulting synchronization corresponds to what is expected.
Phase 4: preemption
Up to this point, uncooperative threads could keep the processing resource for themselves if they never call uthread_yield().
In order to avoid such dangerous behaviour, you will add preemption to your library. The interface of the preemption API is defined in libuthread/private.h and your code should be added to libuthread/preempt.c.
This preemption API is not meant to be exposed to user threads, it should stay completely transparent for them. Whenever the user code of a thread is running, preemption should be enabled.
preempt_{start,stop}()
The function preempt_start() should be called when the uthread library is initializing in order to set up preemption, if required by the user. The setup is a two-step procedure:
Install a signal handler that receives alarm signals (of type SIGVTALRM)
Configure a timer which will fire an alarm (through a SIGVTALRM signal) a hundred times per second (i.e. 100 Hz)
Your signal handler, which acts as the timer interrupt handler, will force the currently running thread to yield, so that another thread can be scheduled instead.
If preemption was enabled, the function preempt_stop() should be called in uthread_stop(), once the multithreading phase of the application ends. It should restore the previous signal action, and restore the previous timer configuration.
Useful resources for this phase:
https://www.gnu.org/software/libc/manual/html_mono/libc.html#Signal-Actions
https://www.gnu.org/software/libc/manual/html_mono/libc.html#Setting-an-Alarm
It is mandatory to use sigaction() over signal().
preempt_{enable,disable}()
The two other functions that you must implement are meant to enable or disable preemption. For that, you will need to be able to block or unblock signals of type SIGVTALRM.
Useful resources for this phase:
https://www.gnu.org/software/libc/manual/html_mono/libc.html#Blocking-Signals
About disabling preemption…
Preemption is a great way to enable reliable and fair scheduling of threads, but it comes with some pitfalls.
For example, if the library is accessing sensitive data structures in order to add a new thread to the system and gets preempted in the middle, scheduling another thread of execution that might also manipulate the same data structures can cause the internal state of the library to become inconsistent.
Therefore, when manipulating shared data structures, preemption should be temporarily disabled so that such manipulations are guaranteed to be performed atomically, as critical sections.
However, avoid disabling preemption each time a thread calls the library. Try to disable preemption only when necessary. For example, the creation of a new thread can be separated between sensitive steps that need to be done atomically and non-sensitive steps that can safely be interrupted and resumed later without affecting the consistency of the shared data structures.
A good way to figure out whether preemption should be temporarily disabled while performing a sequence of operations is to imagine what would happen if this sequence was interrupted in the middle and another thread scheduled.
Testing
Add a new test program in the apps directory, called test_preempt.c, which tests the preemption. Explain in your report why this program demonstrates that your preemptive scheduler works.
The test program doesn’t have to be overly complicated…
Submission
Content
Your submission should contain:
The source code of your library in libuthread/.
The source code of your tester(s) in apps/.
AUTHORS.csv: student ID and email of each partner, one entry per line formatted in CSV (fields are separated with commas). For example:
$ cat AUTHORS.csv
00010001,jdupont@ucdavis.edu
00010002,mdurand@ucdavis.edu
$
REPORT.md: a description of your submission. Your report must respect the following rules:
It must be formatted in markdown language as described in this Markdown-Cheatsheet.
It should contain no more than 200 lines and the maximum width for each line should be 80 characters (check your editor’s settings to configure it automatically –please spare yourself and do not do the formatting manually).
It should explain your high-level design choices, details about the relevant parts of your implementation, how you tested your project, the sources that you may have used to complete this project, and any other information that can help understanding your code.
Keep in mind that the goal of this report is not to paraphrase the assignment, but to explain how you implemented it.
libuthread/Makefile: a Makefile that compiles your source code without any errors or warnings (on the CSIF computers), and builds a static library named libuthread.a.
The compiler should be run with the options -Wall -Wextra -Werror.
There should also be a clean rule that removes generated files and puts the directory back in its original state.
The Makefile should use all the advanced mechanisms presented in class (variables, pattern rules, automatic dependency tracking, etc.)
Your submission should be empty of any clutter files (such as executable files, core dumps, backup files, .DS_Store files, and so on).
Gradescope
Gradescope will be opened for submission on Wednesday, February 10th at 0:00. At that time, you will be able to submit your project as a Git repository.
There should be only one final submission per group, submitted by one of the two partners.
The other partner should be added to the submission as “group member”.
Academic integrity
Novelty
You are expected to write this project from scratch.
Therefore, you cannot use any existing source code available on the Internet, or even reuse your own code if you took this class before.
Authorship
You are also expected to write this project yourself.
Asking anyone someone else to write your code (e.g., a friend, or a “tutor” on a website such as Chegg.com) is not acceptable and will result in severe sanctions.
Sources
You must specify in your report any sources that you have viewed to help you complete this assignment. All of the submissions will be compared with MOSS to determine if students have excessively collaborated, or have used the work of past students.
Violation
Any failure to respect the class rules, both as explained above and in the syllabus, or the UC Davis Code of Conduct will automatically result in the matter being transferred to Student Judicial Affairs.
Copyright © 2017-2021 Joël Porquet-Lupine
Computer_Science_Logo