Directions
CIS 548 – Project 0 – Spring 2019
CIS 548 Project 0
penn-shredder: Turtles in Time
“Tonight I dine on turtle soup!”
– shredder, Teenage Mutant Ninja Turtles the Animated Series
CIS548 Staff
DUE: Friday, Jan. 25th @ 10pm
This is an individual assignment. You may not work with others. Please check Piazza regularly throughout this course for coding style guidelines and project specification clarifications. You are encouraged to version control your code, but do not work in public GitHub repositories. To observe course integrity policies, please do not publish this project at any time, even post-submission.
Overview
In this assignment you will implement a basic shell named penn-shredder that will restrict the run time of executed processes. Your shell will read input from the user and execute it as a new process, but if the process exceeds a timeout, it will be killed. You will complete this assignment using only a specified set of system calls and without the use of standard C library functions (e.g., printf(3), fgets(3), system(3), etc.). You may freely use any functions in string.h. Pay close attention to the exact contents of your output, especially spacing, as this year’s autograder takes whitespace into account when checking correctness.
1 Specification
At its core, a shell is a simple loop. Upon each iteration, the shell: prompts the user for a program to run; executes the program as a separate process; waits for the process to finish; and then, re-prompts the user completing the loop. The shell exits the loop when the user provides EOF (end of file) (i.e., Ctrl-D). Your shell, penn-shredder, will do all of that, but with a slight twist.
penn-shredder takes a command line option for specifying a program timeout. If any process runs for longer than the specified timeout, penn-shredder will kill the program and report to the user his snide catch-phrase, “Bwahaha … tonight I dine on turtle soup”. The exact whitespace usage in the catch-phrase is necessary for the autograder to correctly grade your assignment.
Compiled 2019-01-16 14:08
The following is an example of input that should invoke the catch-phrase:
bash# ./penn-shredder 10
penn-shredder# /bin/cat
Bwahaha … tonight I dine on turtle soup
penn-shredder# /bin/pwd
/home/yourusername
penn-shredder#
Here, penn-shredder was executed with a timeout argument of 10 seconds, and any program that runs longer than 10 seconds will be killed (or shredded). The cat program executed without arguments runs indefinitely, and thus was killed after exceeding the timeout. Conversely, pwd returns quickly and was not killed, upsetting shredder and his henchman.
1.1 read, fork, exec, wait, and repeat!
As described previously, a shell is just a loop performing the same procedure over and over again. Essentially, that procedure can be completed with these four system calls:
• read(2) : read input from stdin into a buffer
• fork(2) : create a new process that is an exact copy of the current running program
• execve(2) : replace the current running program with another
• wait(2) : wait for a child process to finish before proceeding
Your program, in pseudo-code, should look roughly like this:
while (1) {
    read(cmd, …);
    pid = fork();
    if (!pid) {
        execve(cmd, …);
    } else {
        wait();
    }
}
This may seem simple, but there are many, many things that can go wrong. You should spend some time carefully reading the entire man page for all four of these system calls.[1] To do so, in a terminal type:
bash# man 2 read
where 2 specifies the manual section. If you do not specify the manual section, you may get information for a different read command.
[1] Hint: You may want to pay attention to wait(2). Be sure you understand how waitpid (or wait) is called and checked in the example. Every year many students are surprised by its usage after grading.
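One iteration of the fork/execve/wait step might be sketched as follows. This is only an illustration under stated assumptions: `run_command` is a hypothetical helper name, the program is run with no arguments and an empty environment, and `_exit(2)` is used in the child so that a failed execve does not fall back into the shell loop. The point is that every call's return value is checked and the wait status is kept for inspection.

```c
#include <stdio.h>      /* perror(3) only */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with no arguments.
 * Returns the raw wait status, or -1 if fork or wait failed. */
int run_command(const char *path)
{
    char *const argv[] = { (char *)path, NULL };
    char *const envp[] = { NULL };     /* assumption: empty environment */
    int status;

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execve only returns if it failed. */
        execve(path, argv, envp);
        perror("execve");
        _exit(127);
    }
    /* Parent: block until the child changes state. */
    if (wait(&status) < 0) {
        perror("wait");
        return -1;
    }
    return status;
}
```

The caller can then examine the status with the WIFEXITED, WEXITSTATUS, and WIFSIGNALED macros described in wait(2) to tell a normal exit from a kill.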
1.2 Timing is Everything
To time a running program your shell will employ the alarm(2) system call. The alarm(2) system call simply tells the operating system to deliver a SIGALRM signal after a specified time. The SIGALRM signal must be handled by your shell; otherwise, your shell will exit.
To handle a signal, a signal-handling function must be registered with the operating system via the signal(2) system call. When the signal is delivered, the operating system will preempt your shell’s current operations (e.g., waiting for the program to finish) and run the registered handler. Experiment with a small SIGALRM test program on your own to best understand signal handling. Here are some questions you could try to answer: What happens if you remove the call to signal(2)? What happens if you provide different arguments to alarm(2)? What happens if you use the sleep(3) function instead of a busy wait?[2] You should also spend time carefully reading the entire man pages for these system calls and the references in APUE regarding signals and signal handling.
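A small program to experiment with could look like the sketch below. It is not the course's own example; the names `on_alarm`, `alarm_fired`, and `wait_for_alarm` are hypothetical. It registers a SIGALRM handler with signal(2), schedules the signal with alarm(2), and busy-waits until the handler has run.

```c
#include <signal.h>
#include <unistd.h>

/* Set by the handler. volatile sig_atomic_t is the type that is safe to
 * share between a signal handler and the interrupted code. */
static volatile sig_atomic_t alarm_fired = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Hypothetical helper: busy-wait until SIGALRM arrives `seconds` from now.
 * Returns 0 once the handler has run, -1 if it could not be installed.
 * Note that alarm(0) cancels any pending alarm, so seconds must be > 0
 * or the loop below never ends. */
int wait_for_alarm(unsigned int seconds)
{
    alarm_fired = 0;
    if (signal(SIGALRM, on_alarm) == SIG_ERR)
        return -1;
    alarm(seconds);
    while (!alarm_fired)
        ;   /* busy wait */
    return 0;
}
```

Removing the signal(2) call, or changing the argument to alarm(2), and observing what happens is a quick way to work through the questions above.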
1.3 Hit the kill-switch!
The kill(2) system call delivers a signal to a process. Despite its morbid name, it will only kill (or terminate) a program if the right signal is delivered. One such signal will always do just that, SIGKILL. The SIGKILL signal has the special property that it cannot be handled or ignored, so no matter the program your shell executed, it must heed the signal.[3]
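Combining alarm(2) and kill(2), a timeout could be sketched roughly as below. The names `shred_after`, `kill_child`, and `child_pid` are hypothetical, and the child simply spins as a stand-in for a program that never finishes. The wait is retried on EINTR because, depending on how signal(2) behaves on a given system, the handler may interrupt it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the running child, read by the SIGALRM handler. */
static volatile pid_t child_pid = 0;

static void kill_child(int signo)
{
    (void)signo;
    if (child_pid > 0)
        kill(child_pid, SIGKILL);   /* cannot be caught or ignored */
}

/* Hypothetical helper: fork a child that never exits on its own and kill
 * it after `timeout` seconds. Returns the wait status, or -1 on error. */
int shred_after(unsigned int timeout)
{
    int status;

    if (signal(SIGALRM, kill_child) == SIG_ERR)
        return -1;

    child_pid = fork();
    if (child_pid < 0)
        return -1;
    if (child_pid == 0) {
        for (;;)
            ;                       /* stand-in for a long-running program */
    }

    alarm(timeout);
    while (wait(&status) < 0) {
        if (errno != EINTR)         /* retry only if a signal interrupted us */
            return -1;
    }
    alarm(0);                       /* cancel the alarm if the child ended early */
    child_pid = 0;
    return status;
}
```

After the call, WIFSIGNALED(status) and WTERMSIG(status) show whether the child was killed, which is the point at which a shell would report the catch-phrase.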
1.4 Prompting and I/O
In programming your shell, you will only use system calls and nothing from the C standard library (except string.h). This includes input and output functions like printf(3) and fgets(3). Instead you will use the read(2) and write(2) system calls. Consult the manual pages for these functions’ specifications.
Your shell must prompt the user for input as follows:
penn-shredder#
Please note that the prompt has a whitespace after the octothorpe, so if a user begins typing, they would see:
penn-shredder# somestringhere
Following the prompt, your program will read input from the user. You may truncate user input to 1024 bytes, but your shell must gracefully handle input longer than that. By graceful, we mean providing some kind of reasonable behavior such as implicitly truncating to 1024 bytes and properly flushing extra bytes. You may assume that the 1024th byte is the null-termination byte.
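One way to get this behavior is sketched below, assuming hypothetical names `print_prompt`, `read_line`, and `INPUT_CAP`. It reads one byte at a time, which costs more system calls than reading in bulk but keeps the truncate-and-flush logic easy to follow: bytes past the cap are still consumed up to the newline, so they are not mistaken for the next command.

```c
#include <unistd.h>

#define INPUT_CAP 1024   /* buffer size, including the terminating '\0' */

/* Write the prompt, including the single space after the octothorpe. */
int print_prompt(void)
{
    static const char prompt[] = "penn-shredder# ";
    return write(STDOUT_FILENO, prompt, sizeof(prompt) - 1) < 0 ? -1 : 0;
}

/* Hypothetical helper: read one line from fd into buf (INPUT_CAP bytes).
 * The newline is dropped and the result is NUL-terminated. Returns the
 * stored length, or -1 on a read error or on EOF with nothing typed. */
ssize_t read_line(int fd, char *buf)
{
    ssize_t len = 0;
    ssize_t r;
    char c;

    while ((r = read(fd, &c, 1)) == 1) {
        if (c == '\n') {
            buf[len] = '\0';
            return len;
        }
        if (len < INPUT_CAP - 1)
            buf[len++] = c;     /* bytes past the cap are read and dropped */
    }
    if (r < 0 || len == 0)
        return -1;              /* error, or EOF (e.g., Ctrl-D) on an empty line */
    buf[len] = '\0';            /* EOF after a partial line */
    return len;
}
```

An empty line returns 0 while EOF returns -1, so the caller can re-prompt in the first case and leave the loop in the second.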
1.5 Argument Restrictions
Your shell is not required to parse additional program arguments. That is, a user can provide any number of arguments, but your shell may ignore them, simply executing the base program. However, your shell must still execute the program even if arguments are provided. For example:
penn-shredder# /bin/sleep 100
sleep: missing operand
Try ‘sleep --help’ for more information.
Due to this restriction, you may find it useful to write separate testing programs that exit after a specified time (perhaps based on the SIGALRM example above). Feel free to submit these test programs with your shell.[4]
[2] 1pt Extra Credit: What does the man page have to say about mixing calls to sleep(3) and alarm(2)? Answer this question in your README; it should include a sample program.
[3] 1pt Extra Credit: What other signal(s) have this property? Which signals will terminate the program if unhandled? Which man page has this information? Answer these questions in your README.
[4] 5pt Extra Credit: For extra credit, write a simple parser that will handle arbitrary arguments and pass them to execve appropriately. Because it’d make it far too easy, for this extra credit opportunity you cannot use any function from the C standard library. This includes everything from string.h, so don’t even think about using strdup(3) or strtok(3) or their variants.
1.6 Executing a Program
Your shell may only use the execve(2) system call to execute a program. This system call instructs the operating system to replace the current running program – that would be the child of your shell – with the specified program. Please refer to the manual for more details.
The execve(2) system call is the base of a larger collection of functions; however, you may not use those other functions. Explicitly, you may not use execl(3), execv(3), or any other function listed in the exec(3) manual page. As a result, the user of your shell must specify the entire path to a program to execute it. To learn where a program lives (i.e., its path), use the which program in your non-penn-shredder shell.
Forking, like all other system calls, is computationally expensive. You should not call a system call if it’s simple to determine from internal state that the system call would do nothing or fail, and you should monitor the status of any forked child process.
1.7 Ctrl-C behavior
Ctrl-C (SIGINT) is a very helpful signal often used from the shell to stop the current running program. Child processes started from penn-shredder should respond to Ctrl-C by following their normal behavior on SIGINT, but penn-shredder itself should not exit (even when Ctrl-C is invoked without a child process running).
bash# ./penn-shredder
penn-shredder# /bin/cat
^Cpenn-shredder# ^C/bin/pwd
penn-shredder#
1.8 Arguments to penn-shredder
Although within penn-shredder there is no requirement to handle arguments, the penn-shredder program itself must take an optional argument: the execution timeout. If no argument is provided, then penn-shredder imposes no time restriction on executing processes. For example, an optional argument of 10 results in a timeout of 10 seconds:
bash# ./penn-shredder 10
Omitting the optional argument results in no timeouts.
As mentioned before, with the exception of string.h you cannot use anything from the C standard library. However, to parse the integer command line argument you may make a single call to atoi(3) or strtol(3) to convert command line input into an integer. You should check for errors in this conversion, e.g., when a user provides a character instead of a number.[5]
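The error checks might look like the following sketch, which uses strtol(3) because, unlike atoi(3), it reports where parsing stopped. `parse_timeout` is a hypothetical helper name. Treating negative values as errors and accepting 0 are assumptions made here, not requirements stated in this handout.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>   /* strtol(3) */

/* Hypothetical helper: convert the command line argument to a timeout in
 * seconds. Returns 0 and stores the value on success; returns -1 if the
 * string has no digits, has trailing junk, is negative, or exceeds INT_MAX. */
int parse_timeout(const char *arg, unsigned int *out)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(arg, &end, 10);        /* the single permitted conversion call */
    if (end == arg || *end != '\0')     /* no digits at all, or trailing junk */
        return -1;
    if (errno == ERANGE || val < 0 || val > INT_MAX)
        return -1;
    *out = (unsigned int)val;
    return 0;
}
```

With this helper, "10" yields 10, while "abc", "10x", and "-5" are all rejected.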
2 Error Handling
All system call functions you use will report errors via the return value. As a general rule, if the return value is less than 0, an error occurred and errno is set appropriately. You must check your error conditions and handle or report errors. To expedite the error checking process, we allow you to use the perror(3) library function. Although you are allowed to use perror, that does not mean you should report every error at extreme verbosity. Instead, try to strike a balance between sparse and verbose reporting.
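As an illustration of this pattern, the sketch below wraps write(2) in a hypothetical `write_all` helper. It checks the return value on every call, retries when the call was merely interrupted by a signal or wrote fewer bytes than requested, and reports any other failure once with perror(3).

```c
#include <errno.h>
#include <stdio.h>    /* perror(3) only */
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of buf to fd.
 * Returns 0 on success, -1 after reporting the error with perror. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            perror("write");
            return -1;
        }
        buf += n;                   /* short write: advance and continue */
        len -= (size_t)n;
    }
    return 0;
}
```

Centralizing the check in one helper also keeps the reporting terse, since callers only need to test for -1.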
3 Code Organization
Sane code organization is critical for all software: Your code should not be all in one giant penn-shredder.c file. Rather, you should create reasonable modules and expose the relevant interfaces and constants in .h files. Make sure that your Makefile builds all relevant modules and links them into the final penn-shredder executable.
Your code should adhere to DRY (Don’t Repeat Yourself). If you are writing code that is used in more than one place, you should write a function or a macro. See the Piazza post for general coding conventions and tips.
4 Memory Errors
You are required to check your code for memory errors. This is a nontrivial task, but an extremely important one. Points will be deducted for code with memory leaks and memory violations. Fortunately, there is a very nice tool, valgrind, that is available to help you. It should be installed on the SpecLab cluster. valgrind is a tool and not a solution: you must still find and fix any bugs that valgrind locates, and there is no guarantee it will find all memory errors in your code, especially those that rely on user input!
5 Acceptable Library Functions
In this assignment you may use only the following system calls:
• execve(2)
• fork(2)
• wait(2)
• read(2)
• write(2)
• signal(2)
• alarm(2)
• kill(2)
• exit(2)
And you may use these non-system calls:
• malloc(3) or calloc(3)
• free(3)
• perror(3) for reporting errors
• atoi(3) or strtol(3) but just once!
• string(3) for string utility functions
Using any other library function than those specified above will affect your grade on this assignment. If you use the system(3) library function, you will receive a ZERO on this assignment.
[5] 3pt Extra Credit: For extra credit, implement your own version of atoi(3) that can handle arbitrary numbers within the 32 bit width of an integer.
6 Developing Your Code
In general, there are two environments for you to develop projects in this course: (1) a virtual machine powered by Vagrant or (2) the SpecLab servers for remote development. We recommend the first environment.
For the first choice, we provide a Vagrantfile for your convenience to create a course-standardized virtual machine. You can find the set-up instructions and video on Piazza. We highly recommend that you use it as your development environment because all grading will be done on it. Please do not develop on macOS as there are significant differences between macOS and Unix variants such as Ubuntu Linux. If you decide to develop on a different machine anyway, you must compile and test your penn-shredder on the course-standardized virtual machine to ensure your penn-shredder runs as you expect during grading.
The course-standardized virtual machine is powered by Vagrant. Vagrant has a synced folder feature, which can automatically sync your files to and from the guest machine. This gives you the flexibility to develop your code on the host machine with your familiar IDEs or tools instead of editing in the VM through a terminal-based editor or over SSH. Read the following link to learn more: https://www.vagrantup.com/intro/getting-started/synced_folders.html.
However, there is a case-sensitivity issue in the synced folder that you need to pay attention to. Windows and macOS file systems are case-insensitive while Linux is case-sensitive. The course-standardized virtual machine is basically an Ubuntu system. In this virtual machine, the file system is case-sensitive, except that the synced folder (/vagrant) has the same case sensitivity as the host machine. To be more specific, if your host machine is any Linux distribution, then your virtual machine has no case-sensitivity issue. However, if your host machine is Windows or macOS, your synced folder /vagrant is case-insensitive and the other directories are case-sensitive. To handle the case-sensitivity issue and avoid annoying compiling bugs when you collaborate with others, we highly recommend naming all your .h and .c files in camelCase. A good way to check this is to copy your project to folders other than the shared folder in the vagrant virtual machine, and run the code there.
For the second choice, SpecLab will be available to you as CIS-548 students; however, we discourage its usage unless your personal machine has super, super weak performance specifications, such as a single core machine with 2GB of RAM or less. SpecLab is a collection of older desktops that run the same Linux variant as eniac, and most importantly, you can crash them as much as you want without causing too much chaos. You access SpecLab remotely over ssh. There are roughly 10 SpecLab machines up at any time. When you are ready to develop, choose a random number between 01 and 50, or so, and then issue this command:
bash# ssh specDD.seas.upenn.edu
where DD is replaced with the number you chose. If that machine is not responding, then add one to that number and try again. Not all SpecLab machines are currently up, but most are. You must be on SEASnet to directly ssh into SpecLab. The RESnet (your dorms) is not on SEASnet, but the machines in the Moore computer lab are. If you are not on SEASnet, you may still remotely access SpecLab by first ssh-ing into eniac and then ssh-ing into SpecLab.
Do not develop your code on eniac. CIS 548 students are notorious for crashing eniac in creative ways, normally via a “fork-bomb.” This always seems to happen about two hours before a homework deadline, and the result is chaos, not just for our class, but for everyone else using eniac.
Students who are caught running penn-shredder directly on eniac will be penalized.
7 What to turn in
You must provide each of the following for your submission, no matter what. Failure to do so may reflect in your grade.
1. README file. In the README you will provide
• Your name and eniac username
• A list of submitted source files
• Extra credit answers
• Compilation Instructions
• Overview of work accomplished
• Description of code and code layout
• General comments and anything that can help us grade your code
2. Makefile. We should be able to compile your code by simply typing make. Your executable should be named penn-shredder. If you do not know what a Makefile is or how to write one, ask one of the TAs for help or consult one of the many on-line tutorials.
3. Your code. You may think this is a joke, but at least once a year, someone forgets to include it.
8 Submission
Submission should be done through Canvas. Please compress your files via the following:
tar -cvzf yourpennkey-project0.tar.gz inputfile1 inputfile2…
As explained above, please remember to include your README, your Makefile, any header files you have written, along with your main penn-shredder C program. Canvas will accept multiple submissions but we will grade only the most recent submission submitted before the submission deadline.
9 Grading Guidelines
• 10% Documentation
• 10% Proper use of system calls
• 10% Peer review about general code design
• 70% Functionality
Please note that general deductions may occur for a variety of programming errors including memory violations, lack of error checking, poor code organization, etc. Also, do not take the documentation lightly: it provides us (the graders) a road-map of your code, and without it, it is quite difficult to grade the implementation.
You may use any comment style you wish, but we highly suggest using Doxygen style comments as they promote fully documenting function inputs, outputs, and error cases. A useful guide to this comment style can be found here: https://www.stack.nl/~dimitri/doxygen/manual/docblocks.html
This year, we are adding a peer review component as a part of your evaluation. After the submission, you will evaluate another student’s code based on a given rubric using a profiling tool such as valgrind and sanitizers. Keep a lookout on Piazza for more details about this after your submission.
Your programs will be graded on the course-standardized virtual machines, and must execute as specified there. Although you may develop and test on your local machine, you should always test that your program functions properly there.
An autograder will be run against your submission, so whitespace matters. Specifically, be sure to match the “Bwahaha … tonight I dine on turtle soup” catch-phrase and include the single whitespace after the octothorpe in the prompt.
10 Attribution
This is a large and complex assignment, using arcane and compactly documented APIs. We do not expect you to be able to complete it without relying on some outside references. That said, we do expect you to struggle a lot, which is how you will learn about systems programming, by doing things yourself.
The primary rule to keep you safe from plagiarism/cheating in this project is to attribute in your documentation any outside sources you use. This includes both resources used to help you understand the concepts, and resources containing sample code you incorporated in your shell. The course text and APUE need only be cited in the latter case. You should also use external code sparingly. Using most of an example from the pipe(2) man page is ok; using the ParseInputAndCreateJobAndHandleSignals() function from Foo’s Handy Write Your Own Shell Tutorial is not (both verbatim, and as a closely followed template on structuring your shell).