Signals and process collaboration
By doing project 11, you will learn about Unix signals and how to
customize their handling for a program. You will also learn a little bit
about how processes might collaborate, but in a simplified setting that
does not use functions such as pipe(), dup(), fork(), and execve(). (Please
do NOT use any of these four functions in this project, as incorrect use
of fork() can significantly impact a system like lectura.) You will also
learn about more generic reading and writing, and about checking system
call returns, as these calls can, and will, fail in this assignment. You will also get
even more comfortable studying man pages. Finally, you will be able to
show off your parsing skills that you have developed throughout this
semester.
You are to write two programs called master and slave that will be
used together through a pipe created for you by bash. Specifically, a
possible use is:
master | slave
another is:
master < test_file_01 | slave 1> test_file_01.stdout 2> test_file_01.stderr
For development, you can test with ex_slave with your master and
ex_master with your slave, and when we test we might do that,
including testing with errant versions of one or the other. Even though
you are writing both programs, you want to imagine that someone
could change the input or output of one of them, and you must be
ready for errant behavior from an errant partner. (This is practice in
preventing security holes).
What to hand in
A directory proj11 that has at least three .c files and at least one .h file
and a Makefile that supports “make all” (both master and slave), “make
master”, “make slave”, and “make clean”. Make clean must be able to be
called repeatedly without error even if things are already clean. Of
course, repeated calls to make for the other targets should do the right
thing also.
Error messages
Both master and slave will write error messages to standard error and
we need to know who is speaking. Thus, all error messages from the
master must begin with “Master: ” and all messages from the slave must
begin with “Slave: ”. The rest of the content is largely up to you, with
some suggestions noted below. Slightly different from previous
assignments, the two kinds of messages are considered separately.
Thus, if the reference yields an error message from the slave, your
program should yield one or more for the slave. One from the master
won’t count, and would be considered spurious if the master is not also
supposed to write an error message. The logic is the same as we have
been using so far, except that there are now TWO logical standard error
streams.
Initialization (master and slave agreeing on slave PID)
The slave program will begin by sleeping for two seconds, followed by
writing its PID (process id) to a file called slave_pid, overwriting anything
else in the file, followed by closing that file. This file should not be
deleted by either the current instance of master or slave. (When testing,
we will check for that file before trying the next test case). The slave will
then read instructions from stdin (which is the stdout of the master),
and perform actions in response to each line as described below. If EOF
is reached, then the slave cleans up as needed, and then exits with 0 if
no non-fatal errors have been recorded, and non-zero otherwise. If the
slave encounters a bad instruction, or runs into any other kind of
serious trouble (fatal errors) it will print a message on stderr and exit
with a non-zero return code.
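The slave's start-up step described above might look roughly like the following sketch. The function name write_pid_file is hypothetical, and storing the PID as a decimal number followed by a newline is an assumption, since the text does not pin down the file format.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the calling process's PID to `path`,
 * replacing any old contents. Returns 0 on success, -1 on failure
 * (after reporting the problem on stderr with the "Slave: " prefix). */
int write_pid_file(const char *path)
{
    FILE *fp = fopen(path, "w");   /* "w" truncates an existing file */
    if (fp == NULL) {
        perror("Slave: fopen");
        return -1;
    }
    /* Assumed format: decimal PID and a newline. */
    if (fprintf(fp, "%ld\n", (long)getpid()) < 0) {
        perror("Slave: fprintf");
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0) {         /* closing can fail too, so check it */
        perror("Slave: fclose");
        return -1;
    }
    return 0;
}
```

In the slave, a call such as write_pid_file("slave_pid") would come right after sleep(2), and a failure would be treated as a fatal error.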
The master begins by finding out the current time in a form that is
compatible with file timestamps. It will then read the PID from
“slave_pid”. However, a simple getline() will not suffice. First, the master
does not know if the slave managed to write that file before this
attempt to read it, and in fact, we have arranged it so that it probably
has not (two second sleep in the previous paragraph). This is because
we want to be sure that we are not reading a previous version of the
file. While this is somewhat of a hack, we will assume that the
information in the file is current if the modification time of the file is at least
one second later than the current time collected when the master
started. We also want to ensure that everything works even if the slave
is delayed for a few seconds.
One way to meet the above criteria is to have the master try to read the
line within a loop that starts with sleep(1), then checks if the file exists
and is young enough, and can be opened for reading. If so, then we exit
the loop. We should loop several times if needed, but the maximum
wait time should not exceed 10 seconds. If the master still cannot read
the file, it prints a suitable message to stderr, and exits with non-zero
status.
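One possible shape for the loop just described is sketched below. The name wait_for_pid_file and its parameters are hypothetical, and reading the PID with fscanf() is only one option. The sketch uses stat() to get the modification time and compares it with the start time the master recorded using time().

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns the PID read from `path`, or -1 if no
 * fresh, readable file shows up within `max_tries` one-second tries.
 * `start` is the time the master recorded when it began. */
long wait_for_pid_file(const char *path, time_t start, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        struct stat sb;
        sleep(1);
        if (stat(path, &sb) != 0)
            continue;                 /* file does not exist yet */
        if (sb.st_mtime < start + 1)
            continue;                 /* too old: left over from an earlier run */
        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            continue;                 /* cannot be opened for reading yet */
        long pid = -1;
        int n = fscanf(fp, "%ld", &pid);
        fclose(fp);
        if (n == 1 && pid > 0)
            return pid;               /* otherwise: empty or garbled, try again */
    }
    return -1;
}
```

With max_tries set to 10, the total wait stays within the ten-second limit. On a return of -1 the master would print a message starting with "Master: " and exit with non-zero status.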
Like the slave, the master will then proceed to read lines from stdin.
These lines will be processed, leading to lines being written to standard
output which will be a pipe to the slave. However, it first prepares for
the fact that it is writing to a pipe, and the slave might exit, leaving no
one to speak to. When this happens, the master gets the signal SIGPIPE,
which will terminate the master. But we do not want to die quite so
dramatically. So, we need to change what happens. One way (which is
required for consistency for the grading scripts) is to ignore the signal,
and pay attention to the return value of the write() call.
Read, parse, and execute
Assuming that the master was able to read the slave PID, it will read and
process lines from standard input until EOF. These lines can be of any
length. However, if the first non-blank character is ‘@’, then the line is a
command and needs to be handled as detailed shortly, but often the
action is to simply write it to standard output, provided that it is
syntactically correct. If the first non-blank character is not a ‘@’, then
the line, including any leading blanks, is written to stdout (a simple
echoing of the input). The slave will process the input stream similarly,
handling commands, and echoing non-command input.
If the first non-blank character is ‘@’, then we read the command up to
the next blank or end of line, followed by reading any arguments after
the blank. All commands are exactly one lower case letter. The syntax of
the commands follows one of these patterns.
@ALPHA
@ALPHA TEXT
@ALPHA NUM
@ALPHA NUM TEXT
Here, ALPHA is a single character, NUM is an integer between 1 and 31,
and TEXT is arbitrary text that might have a size limit depending on the
command. For all commands trailing space is OK, but should be ignored
(trimming makes sense). Parsing of commands proceeds similarly for
the master and the slave, but consequences can be different, including
handling of syntax errors. For syntax errors, both will write a message to
stderr, and set up for non-zero exit status, but the master will continue
processing lines, whereas the slave will clean up and exit (syntax errors
are fatal for the slave). Errors handled this way include ALPHA not being
a single character, or ALPHA not being a command. The set of allowable
commands is slightly different for the master and the slave.
Some of the details are easier to understand if you keep in mind that
the slave should only get instructions that are syntactically correct and
relevant to what it does, as it only gets input from the master (except
possibly in testing). Therefore, if the input it receives is not valid,
something is really wrong.
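The first steps of the parsing described above could be sketched as follows. All names here (line_kind, parsed, classify_line, parse_signum) are hypothetical. The sketch assumes that "blank" means a space or a tab and that leading zeros in NUM are acceptable. Deciding whether the command letter is allowed for the master or the slave is left to the caller.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum line_kind { LINE_TEXT, LINE_COMMAND, LINE_SYNTAX_ERROR };

struct parsed {
    char cmd;          /* command letter, valid when LINE_COMMAND is returned */
    const char *rest;  /* first non-blank character after the command */
};

/* Classify one input line. Leading blanks are skipped only to look
 * for '@'; a non-command line is meant to be echoed unchanged. */
enum line_kind classify_line(const char *line, struct parsed *out)
{
    const char *p = line;
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p != '@')
        return LINE_TEXT;
    p++;
    if (!islower((unsigned char)*p))
        return LINE_SYNTAX_ERROR;
    out->cmd = *p++;
    /* Exactly one letter: next must be a blank or the end of the line. */
    if (*p != '\0' && *p != '\n' && *p != ' ' && *p != '\t')
        return LINE_SYNTAX_ERROR;
    while (*p == ' ' || *p == '\t')
        p++;
    out->rest = p;
    return LINE_COMMAND;
}

/* Parse a signal number in 1..31 at *pp and advance *pp past it.
 * Returns the number, or -1 if there is no valid number there. */
int parse_signum(const char **pp)
{
    const char *p = *pp;
    char *end;
    if (!isdigit((unsigned char)*p))
        return -1;                       /* rejects signs and empty input */
    long v = strtol(p, &end, 10);
    if (v < 1 || v > 31)
        return -1;
    if (*end != '\0' && !isspace((unsigned char)*end))
        return -1;                       /* e.g. "7x" is not a number */
    *pp = end;
    return (int)v;
}
```

For a line such as "@s 10 hello", classify_line() yields the command 's', and parse_signum() returns 10 and leaves the pointer at the text that still needs trimming and a length check.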
Commands:
@c TEXT : The command ‘c’ is a comment. Everything
following, including nothing, is ignored. However, if there is some
non-blank text, it must be separated from the ‘@c’ by a blank. If
the syntax is OK, the master does nothing. However, for the slave,
@c is a fatal syntax error.
@k NUM : The command ‘k’ (“kill”—which is Unix speak for
“signal”) tells the master to send the slave the signal NUM. The ‘k’
must be followed by one or more spaces, and then exactly one
integer between 1 and 31. Extra spaces after the signal number
are OK, but ignored. For the master, non-conforming input
should not lead to any action, except an error message on stderr,
and eventual non-zero exit, but it continues processing
commands. If the input is conforming, then we do:
sync() (see below) to force what has been written into the
buffer to be sent
sleep(1) to give the slave a chance to process the
backlog
kill() to send the signal to the slave.
To be safe, we check the return. If kill() returns an
error, the relevant text must be reported on stderr
with the prefix “Master: ” — you might find perror()
especially useful here. Error return from kill() also
leads to a delayed non-zero error exit.
The slave does not recognize @k as a command,
and seeing it is a fatal error.
@s NUM TEXT : The command ‘s’ (“set”) is sent to the slave as
entered, provided that it is syntactically correct. In addition to the
usual criteria, the length of TEXT after trimming must be between
1 and 63 characters, so that it fits into a buffer of size 64.
Slave. If syntactically correct, the slave changes its action on
signal NUM to write TEXT to standard output, followed by a return.
Setting this behaviour is not always possible (e.g., for SIGKILL). If
the attempt to do so leads to an error return, then a message should
be written to stderr with the prefix “Slave: ”. You might find
perror() helpful here, even though what it says might not be very
informative. This error is not fatal, and the slave continues to
process commands, but when the slave does exit, it will do so
with non-zero status. (Note that this is not a fatal error because it
can result from an innocent user error, as opposed to program
bug or malicious user).
@i NUM : The command ‘i’ (“ignore”) is sent to the slave as
entered, provided that it is syntactically correct.
Slave. If syntactically correct, the slave changes its action on
signal NUM to ignore the signal. Again, this is not always possible
(e.g., for SIGKILL). If the attempt to do so leads to an error return
from sigaction(), then a message should be written to stderr with
the prefix “Slave: ”. This error is not fatal, and the slave continues
to process commands, but when the slave does exit, it will do so
with non-zero status.
@t NUM : The command ‘t’ (“terminate”) is sent to the slave as
entered, provided that it is syntactically correct.
Slave. If syntactically correct, the slave changes its action on
signal NUM to terminate after cleanup, with error status
dependent on whether it has encountered errors. Changing this
action is not always possible, and if the attempt to do so leads to an
error return from sigaction(), then a message should be written to
stderr with the prefix “Slave: ”. This error is not fatal, and the slave
continues to process commands, but when the slave does exit, it
will do so with non-zero status.
@r NUM : The command ‘r’ (“reset”) is sent to the slave as
entered (much like input without the leading ‘@’), provided that it
is syntactically correct.
Slave. If syntactically correct, the slave changes its action on
signal NUM to the default. Changing this action is not always
possible, and if the attempt to do so leads to an error return from
sigaction(), then a message should be written to stderr with the
prefix “Slave: ”. This error is not fatal, and the slave continues to
process commands, but when the slave does exit, it will do so
with non-zero status.
All other lines with ‘@’ as the first non-blank character, including
any whose command is more than a single character, are syntax
errors. As you would expect, for these errors, the master writes a
message to standard error, notes that it needs to exit with non-zero
status, and continues processing commands, whereas the
slave considers them fatal errors.
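For the ‘k’ command in the list above, the master's three steps (sync, sleep, kill) might be sketched as follows. The function name send_signal_to_slave is hypothetical, and the caller is assumed to have already validated NUM and obtained the slave PID.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: carry out a validated "@k NUM" request.
 * Returns 0 on success, or -1 if kill() failed, in which case the
 * caller records that the master must eventually exit non-zero. */
int send_signal_to_slave(pid_t slave_pid, int signum)
{
    sync();      /* first step required by the assignment */
    sleep(1);    /* give the slave time to work through its backlog */
    if (kill(slave_pid, signum) != 0) {
        perror("Master: kill");   /* error text carries the required prefix */
        return -1;
    }
    return 0;
}
```

A failure here is not fatal for the master: it keeps processing lines and only its final exit status changes.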
Additional requirements and suggestions (read
carefully).
When the master or the slave exits, it needs to free all memory.
Recommended reading: Class notes, man pages for write(),
getpid(), sleep(), time(), kill(), sigaction (here you find how to
implement many of the commands), and signal (section 7) which
is one place to find the signals themselves. You should start to
become more familiar with a few of the more common ones, such
as SIGKILL, SIGTERM, SIGINT, SIGBUS, SIGABRT, SIGSEGV. You
have already seen some of these in action, at least implicitly,
when the OS tells your program it is being bad.
The master and slave share some parsing logic, and this
should not be duplicated. You want to develop functions that
handle parsing for both. Note that the slave should check for
input errors, even though they would not occur if the master is
correct. This is because the master might have a bug, or it, or its
output stream, might have been tampered with. Your parsing
functions may need to know if they are being called by the slave
or the master, but this can be an argument to them. Thus you
need at least four code files: One .h file, and three .c files. A few
more might be warranted for a clean collection of files.
Writing to the pipe. For this assignment, when you write to the
pipe, you need to use the function write(). In general, you can use
other functions (e.g., fprintf()) to write to pipes, but there are
subtle differences that are hard to handle in our grading script. If
you use something else, you might miss some of our test cases.
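Because write() may transfer fewer bytes than requested, a small wrapper is a common way to use it. The sketch below is one such wrapper. The name write_all is hypothetical, and retrying on EINTR is a design choice made here, not something the assignment prescribes.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` to `fd`,
 * continuing after partial writes and retrying interrupted calls.
 * Returns 0 on success, -1 on error with errno set by write(). */
int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: try again */
            return -1;         /* e.g. EPIPE when the reader has gone away */
        }
        done += (size_t)n;
    }
    return 0;
}
```

The master could call write_all(STDOUT_FILENO, line, length) for each line it forwards and treat a return of -1 as the slave having stopped listening.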
Synchronization. The master and slave are running in parallel,
and just because the master has sent some data over the pipe
does not mean that the slave has had a chance to deal with it.
This means output (including any debugging output you are
writing) might be delayed or in unexpected order. To simplify
debugging, and to make what your program writes more suitable
for a grading script, we need some order. You should use sync()
(which takes void and returns void) to make sure data is not in a
buffer waiting to be sent once it has been checked for errors, etc.
Doing this as every line is processed (by either party) should
suffice, except for the ‘k’ command where we take an extra step
to further allow the slave to deal with its backlog (see above). If
you do choose to use C routines such as fprintf() to write the slave’s
output, then you need to use fflush(), and the sync() is still
recommended (depends on the implementation).
Recommendation: You do not need to implement functionality
in the order that it occurs, and you might want to leave certain
details until some of the more basic functionality is in place.
Regardless, try to move your code from one level of functionality
to another in small, tested, steps. Consider first ignoring having
the master and slave talk to each other, and skip the code that
gets the slave PID to the master. Instead, test the slave and
master (and thus the parser they share) by reading files from
stdin. Then, notice that you only need the slave pid for the “k”
command. So you can check a fair bit of the basics without that
command or reading the slave PID. You can also send the slave
the contents of a file (e.g., “testfile”) and then have it wait for you
using:
cat testfile /dev/stdin | slave
followed by figuring out the slave pid (try “ps -u”) in a different
terminal session, and then sending it signals with the system
version of kill. Finally, remember that you will have access to
ex_master and ex_slave, which might provide additional
opportunities for developing small pieces at a time.
Your code generally should be relatively generic with respect to
the signal number, but when you do need to refer to one, use the
symbols defined in signal oriented header files, e.g., SIGTERM.
Warning: Not reading this will cost you time!
We did not cover this in class, but when certain signals are
handled, IO is suspended, and the default is not to restart it. To
change this, you want to set a certain bit in the “sa_flags” field of
the sigaction structure when you set up your handler. To do this:
my_action.sa_flags = 0; // Or, perhaps it is already 0
// Written in a way that does not interfere with other flags.
my_action.sa_flags |= SA_RESTART;
Also, do not forget to set sa_mask with sigemptyset() as covered
in class.
You need a few signal handlers, but not one for each signal
number. The signal handler knows the signal that triggered it.
A sensible data structure for one of the handlers is an array of
char arrays, and you are allowed to use such a thing. Rejoice!