Assignment 1
CSE 130: Principles of Computer System Design, Fall 2020
Due: Friday, October 30 at 9:00AM
Goals
The goal for Assignment 1 is to implement a simple single-threaded RPC server that will provide file services. The server will respond to a standard RPC protocol, which this assignment will define, providing the results of several file system functions (read, write, list, unlink, status) that the server will implement. You’ll run the server in a directory, and requests for files will be served from under that directory. Because you’ll be reading and writing files, the state of your server will persist across runs of the server. As usual, you must have a design document and writeup along with your README.md in your git repository. Your code must build rpcserver using make.
Programming assignment: RPC server
Design document
Before writing code for this assignment, as with every other assignment, you must write up a design document. Your design document must be called DESIGN.pdf, and must be in PDF (you can easily convert other document formats, including plain text, to PDF).
Your design should describe the design of your code in enough detail that a knowledgeable programmer could duplicate your work. This includes descriptions of the data structures you use, non-trivial algorithms and formulas, and a description of each function with its purpose, inputs, outputs, and assumptions it makes about inputs or outputs.
Write your design document before you start writing code. It’ll make writing code a lot easier. Also, if you want help with your code, the first thing we’re going to ask for is your design document. We’re happy to help you with the design, but we can’t debug code without a design any more than you can.
You must commit your design document before you commit the code specified by the design document. You’re welcome to do the design in pieces (e.g., overall design and detailed design of marshaling functions but not of the full server), as long as you don’t write code for the parts that aren’t well-specified. We expect you to commit multiple versions of the design document; your commit should specify why you changed the design document if you do this (e.g., “original approach had flaw X”, “detailed design for module Y”, etc.). If you commit code before it’s designed, or you commit your design a few minutes before the working code that the design describes, you will lose points. We want you to get in the habit of designing components before you build them.
Program functionality
You may only use system calls (read(), write(), send(), recv()) for input or output of user data. You may use string functions to manipulate user data, and you may use fprintf() or similar calls for error messages. Note that string functions like sprintf() and sscanf() don’t perform user input or output.
Your code may be either C or C++, but all source files must have a .cpp suffix and be compiled by clang++ with no errors or warnings using the following flags: -std=gnu++11 -Wall -Wextra -Wpedantic -Wshadow
RPC protocol
One part of your code is to implement the necessary functions for marshaling arguments to and from your RPC server. You must write all of this code yourself; you can’t use library functions for any of it.
Your RPC server may need to support the following argument types. Integer arguments are formatted “on the wire” as big-endian, also known as network order.
Argument type   Example value         Example on wire
uint8_t         0x12                  0x12
uint16_t        0x1234                0x12 0x34
uint32_t        0x12345678            0x12 0x34 0x56 0x78
uint64_t        0x123456789abcdef0    0x12 0x34 0x56 0x78 0x9a 0xbc 0xde 0xf0
uint8_t *       hello                 0x00 0x05 h e l l o
Assignment 1, CSE 130, Fall 2020 – 1 – © 2020 Ethan L. Miller
Signed versions of integer arguments are sent in the same way as unsigned arguments (big-endian), with the only difference being that the sign bit, the most significant bit of the first byte, is set to 1 for negative values.
Note that pointers can’t be sent directly, since the value to which the pointer points isn’t available on the destination. Instead, a “string” or any other array of bytes is sent as a uint16_t length followed by the data itself. Strings are not terminated by a NUL byte on the wire! In other words, sending the string “hello” requires 7 bytes: 2 bytes for the length, and 5 bytes for the characters in the string excluding the NUL terminator.
You’ll need to write functions to convert values to and from wire format. These functions aren’t difficult to write, but you should test them separately (unit testing). For integers, consider that the first byte to send (the most significant one) can be extracted by shifting the value right the correct number of bits and ANDing with 0xff. The same goes for the other bytes, each with a smaller shift. For strings, you’ll need to find the length of the string and copy the string data without the NUL at the end. None of these functions should be more than about 10–15 lines of code.
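The conversions described above can be sketched as follows. This is only one possible shape, not the required design: the names put_uint, get_uint, and put_string are hypothetical, and the sketch assumes the caller has already checked that the output buffer is large enough.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: write the low `nbytes` bytes of v into out,
// most significant byte first (big-endian / network order).
void put_uint(uint8_t *out, uint64_t v, size_t nbytes) {
    for (size_t i = 0; i < nbytes; i++) {
        size_t shift = 8 * (nbytes - 1 - i);   // first byte gets the largest shift
        out[i] = static_cast<uint8_t>((v >> shift) & 0xff);
    }
}

// Hypothetical helper: read `nbytes` big-endian bytes back into an integer.
uint64_t get_uint(const uint8_t *in, size_t nbytes) {
    uint64_t v = 0;
    for (size_t i = 0; i < nbytes; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

// Hypothetical helper: write a string as a 2-byte length followed by its
// bytes, without the NUL terminator. Returns the number of bytes written,
// or 0 if the string is too long for a uint16_t length.
size_t put_string(uint8_t *out, const char *s) {
    size_t len = strlen(s);
    if (len > 0xffff) {
        return 0;
    }
    put_uint(out, len, 2);
    memcpy(out + 2, s, len);
    return 2 + len;
}
```

Signed values can go through the same helpers by casting to uint64_t before sending and back to int64_t after receiving, for example static_cast<int64_t>(get_uint(buf, 8)).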
RPC function calls are sent as a uint16_t specifying the function to call, followed by a uint32_t identifier (more on this in a bit), followed by the arguments to the function. The RPC function call specifies the arguments to send.
The server responds with the uint32_t identifier that the client sent, followed by a one-byte error code, followed by whatever values the RPC call should return given the error code. If there’s a return value from the function, it goes first, followed by any values or buffers (arrays) that the RPC needs to return. The 32-bit identifier should uniquely identify the call that the client sent, so that it can match responses with the requests that it made. Typically, this is done by using a sequence counter that’s incremented for each call. No need to worry about wrapping around a 32-bit value for any assignment this quarter.
The error code must be 0x00 if the call succeeded; otherwise, we’ll use the error codes that Linux already uses. For example, if your code tries to read data from a file that doesn’t exist, it would return ENOENT because that’s the errno value open() sets for a non-existent file that you tried to read.
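As an illustration of the response layout just described, the fixed part of every response (identifier, then error code) could be written like this. The function name put_response_header is hypothetical, and the sketch assumes the errno value fits in one byte, which holds for common Linux error codes such as ENOENT and EINVAL.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: write the 4-byte identifier (big-endian) followed by
// the one-byte error code. `err` is 0 on success or an errno value.
// Returns the number of bytes written (always 5).
size_t put_response_header(uint8_t *out, uint32_t id, int err) {
    out[0] = static_cast<uint8_t>((id >> 24) & 0xff);
    out[1] = static_cast<uint8_t>((id >> 16) & 0xff);
    out[2] = static_cast<uint8_t>((id >> 8) & 0xff);
    out[3] = static_cast<uint8_t>(id & 0xff);
    out[4] = static_cast<uint8_t>(err);   // assumption: err is in 0..255
    return 5;
}
```

Any return value or byte array for the call would be appended after these five bytes.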
RPC functions
Your RPC server must support the following functions:
Math functions These functions are included so you can test your marshaling code with simple functions (the first value is the function code that specifies the function).
0x0101 int64_t add(int64_t a, int64_t b): Performs a+b, and returns the result. If the result would overflow, the error code returned should be EINVAL (22).
0x0102 int64_t sub(int64_t a, int64_t b): Performs a-b, and returns the result. If the result would overflow, the error code returned should be EINVAL (22).
0x0103 int64_t mul(int64_t a, int64_t b): Performs a × b, and returns the result. If the result would overflow, the error code returned should be EINVAL (22).
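Signed overflow is undefined behavior in C and C++, so the check has to happen without actually overflowing. A minimal sketch, assuming the checked-arithmetic builtins that clang and GCC provide (__builtin_add_overflow and friends) are acceptable for this assignment; the wrapper names rpc_add, rpc_sub, and rpc_mul are hypothetical:

```cpp
#include <cerrno>
#include <cstdint>

// Each wrapper stores the result in *out and returns 0 on success,
// or returns EINVAL if the mathematical result does not fit in int64_t.
int rpc_add(int64_t a, int64_t b, int64_t *out) {
    return __builtin_add_overflow(a, b, out) ? EINVAL : 0;
}

int rpc_sub(int64_t a, int64_t b, int64_t *out) {
    return __builtin_sub_overflow(a, b, out) ? EINVAL : 0;
}

int rpc_mul(int64_t a, int64_t b, int64_t *out) {
    return __builtin_mul_overflow(a, b, out) ? EINVAL : 0;
}
```

If you would rather not depend on compiler builtins, addition can be checked by hand instead: a + b overflows when b > 0 and a > INT64_MAX - b, or when b < 0 and a < INT64_MIN - b.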
File functions These functions operate on files stored at the server.
0x0201 int64_t read(char * filename, uint64_t offset, uint16_t bufsize, uint8_t * buffer): Read up to bufsize bytes into buffer starting at offset in filename, and return the number of bytes read. Note that the return value is a signed integer; this is so that the function can return a negative number on the client. We don’t use ssize_t because it might be a different number of bytes on the client and the server. The server will return the bytes to the client on the wire as a number of bytes followed by the bytes themselves, as described above.
0x0202 int64_t write(char * filename, uint64_t offset, uint16_t bufsize, uint8_t * buffer): Write bufsize bytes from buffer starting at offset into filename, and return the number of bytes written. Obviously, the buffer will have to be sent on the wire as a length followed by data. If the file doesn’t already exist, this must return an error. It’s OK to write data, even past the end of the file, however.
0x0210 int64_t create(char * filename): Create a new file (with zero bytes) called filename. Returns an error if the file already exists, or if the file couldn’t be created.
0x0220 int64_t filesize(char * filename): Return the size of the file whose name is passed. Your server can use the stat(2) system call to obtain the size of a file.
All integers sent and received by the RPC server are formatted “on the wire” in big-endian format, which we covered in class on Monday, October 12th.
Note that all of the file system calls take a file name as an argument. There’s no separate open or close call. Your server should open the file (for read or write as appropriate), use lseek() to go to the appropriate offset, and then perform the operation. Once it’s done, the file should be closed. Yes, it’s OK to write at an offset past the end of the file. If any of the calls (open, lseek, read/write) returns an error, stop there and return the error code (errno) as the error for the RPC call.
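The open, seek, operate, close sequence for a read might look like the sketch below. The function name read_at and its out-parameter convention are assumptions made for illustration; the caller is expected to cap `want` at the size of its buffer, and a short read (fewer bytes than requested) is reported as success.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read up to `want` bytes from `filename` starting at
// `offset`. On success returns 0 and stores the byte count in *got.
// On failure returns the errno value of the first call that failed.
int read_at(const char *filename, uint64_t offset, uint8_t *buf,
            size_t want, size_t *got) {
    int fd = open(filename, O_RDONLY);
    if (fd < 0) {
        return errno;               // e.g. ENOENT for a missing file
    }
    if (lseek(fd, static_cast<off_t>(offset), SEEK_SET) < 0) {
        int err = errno;            // save errno before close() can change it
        close(fd);
        return err;
    }
    ssize_t n = read(fd, buf, want);
    if (n < 0) {
        int err = errno;
        close(fd);
        return err;
    }
    close(fd);
    *got = static_cast<size_t>(n);  // may be less than `want`, or 0 at EOF
    return 0;
}
```

Saving errno into a local before calling close() matters, because close() may itself overwrite errno.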
A sample request message (for a write call) looks like this:
write(“foo”, 0, 4000, buffer)
would send the following, assuming the identifier was 0x01020304:
02 02 | 01 02 03 04 | 00 03 | 66 6f 6f | 00 00 00 00 00 00 00 00 | 0F A0 | data …
Note that the buffer size isn’t sent twice. Even though the argument to write() can take up to a 64-bit integer, the data can only be 2^16 bytes long. This is OK, because you don’t want to send more than 64 KiB in a single message anyway. If a read() request asks for more than 2^16 bytes, return an error. A write() request can never include more than 2^16 bytes of data (why not?).
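The sample write request above makes a convenient unit test for your marshaling code. The sketch below builds everything up to (but not including) the data bytes; the Message struct and the function names are hypothetical, and the fixed 64-byte array assumes a short file name such as “foo”.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical container for a small outgoing message.
struct Message {
    uint8_t bytes[64];
    size_t len;
};

// Append the low `nbytes` bytes of v in big-endian order.
static void append_uint(Message &m, uint64_t v, size_t nbytes) {
    for (size_t i = 0; i < nbytes; i++) {
        size_t shift = 8 * (nbytes - 1 - i);
        m.bytes[m.len++] = static_cast<uint8_t>((v >> shift) & 0xff);
    }
}

// Append a 2-byte length followed by the string bytes (no NUL).
static void append_string(Message &m, const char *s) {
    size_t n = strlen(s);
    append_uint(m, n, 2);
    memcpy(m.bytes + m.len, s, n);
    m.len += n;
}

// Build the header of a write request: function code 0x0202, identifier,
// file name, offset, and the length of the data that follows.
Message write_request_header(uint32_t id, const char *filename,
                             uint64_t offset, uint16_t bufsize) {
    Message m = Message();          // zero-initialize bytes and len
    append_uint(m, 0x0202, 2);
    append_uint(m, id, 4);
    append_string(m, filename);
    append_uint(m, offset, 8);
    append_uint(m, bufsize, 2);     // this length prefix is the only size sent
    return m;
}
```

Calling write_request_header(0x01020304, "foo", 0, 4000) yields 21 bytes that match the sample message byte for byte, ending in 0F A0 (4000 in hexadecimal).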
The response to a read request for 4 bytes looks like this (again, assuming identifier 0x01020304):
01 02 03 04 | 00 | 00 04 | 77 6F 72 6B
Here, again, the return value includes only the length of the byte array and doesn’t include a separate value for the number of bytes read because it’s not necessary. It does include an error code (0x00), however.
RPC server
Your server binary must be called rpcserver, and must be written in C or C++. It should be single-threaded, and will listen on a user-specified address and port, passed as command line arguments. The server will listen to this port, and respond to functions sent using the RPC protocol specified above.
Code for handling network setup and listening is posted on Canvas; we don’t expect you to have to figure this out on your own.
Your server should run in whatever directory the command is started. All file names are relative to this directory.
You may allocate no more than 16 KiB of buffer space for your program. Yes, you may have more than 16 KiB of data in a single read or write message, but you can use the same techniques for buffering that you used in Assignment 0.
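One way to stay under the buffer limit is a small fixed-size buffer that is refilled as it drains. The sketch below is a simplified, memory-to-memory version so that it can be unit tested on its own; the names BoundedBuffer, bb_put, and bb_get and the 4096-byte capacity are all assumptions. In a real server the bytes passed to bb_put would come from recv().

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

const size_t kCapacity = 4096;   // assumed size, well under the 16 KiB limit

struct BoundedBuffer {
    uint8_t data[kCapacity];
    size_t start;   // index of the first unread byte
    size_t count;   // number of unread bytes
};

// Add up to n bytes from src; returns how many were actually stored.
size_t bb_put(BoundedBuffer &b, const uint8_t *src, size_t n) {
    if (b.start > 0) {
        // Slide unread bytes to the front so free space is contiguous.
        memmove(b.data, b.data + b.start, b.count);
        b.start = 0;
    }
    size_t room = kCapacity - b.count;
    if (n > room) {
        n = room;
    }
    memcpy(b.data + b.count, src, n);
    b.count += n;
    return n;
}

// Remove up to n bytes into dst; returns how many were actually removed.
size_t bb_get(BoundedBuffer &b, uint8_t *dst, size_t n) {
    if (n > b.count) {
        n = b.count;
    }
    memcpy(dst, b.data + b.start, n);
    b.start += n;
    b.count -= n;
    return n;
}
```

Because bb_get can return fewer bytes than requested, the caller knows when to fetch more data from the socket before continuing to decode a message.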
Testing your code
You should test your code on your own system. You can run the server on localhost using a port number above 1024 (e.g., 9876). Come up with requests you can make of your server, and try them using the Python program we’ve posted on Canvas.
When you’re ready to submit your code, you should consider cloning a new copy of your repository (from gitlab.soe.ucsc.edu) to a clean directory to see if it builds properly, and runs as you expect. That’s an easy way to tell if your repository has all of the right files in it. You can then delete the newly-cloned copy of the directory on your local machine once you’re done with it.
We’re providing a service on gitlab.soe.ucsc.edu that will allow you to run a subset of the tests that we’ll run on your code for each assignment. You’ll be able to run this test from the gitlab.soe.ucsc.edu server at most thrice per day (days start at midnight), and (of course) you can only run it on commits that have been pushed to gitlab.soe.ucsc.edu. Running these tests is completely optional—there’s a video that shows you how to do it linked from Canvas.
The gitlab.soe.ucsc.edu test will cover at least half of the functionality points for this assignment, but there will be additional test cases not covered by this service, so you should still do your own testing.
README and Writeup
As for previous assignments, your repository must include README.md and WRITEUP.pdf. The README.md file should be short, and contain any instructions necessary for running your code. You should also list limitations or issues in README.md, telling a user if there are any known issues with your code.
Your WRITEUP.pdf is where you’ll describe the testing you did on your program and answer any short questions the assignment might ask. The testing can be unit testing (testing of individual functions or smaller pieces of the program) or whole-system testing, which involves running your code in particular scenarios.
For Assignment 1, please answer the following questions:
• About what fraction of your design and code are there to handle errors properly? How much of your time was spent ensuring that the server behaves “reasonably” in the face of errors?
• What happens in your implementation if, during an RPC call, the connection is closed before the message is finished? How could this happen in the first place?
Submitting your assignment
All of your files for Assignment 1 must be in the asgn1 directory in your git repository. Your repository must meet the Assignment Acceptance Criteria described on Canvas when you submit a commit for grading. It’s OK to commit and push a repository that doesn’t meet minimum requirements for grading. However, we’ll only grade a commit that meets these minimum requirements. You must submit the commit ID you want us to grade via Google Form, linked to the assignment page on Canvas. This must be done before the assignment deadline.
Hints
• Start early on the design. This is a more difficult program than bobcat!
• Attend section the week of October 19th for details on the code you need to set up a network connection. While you’ll need this code for your server (obviously), you can “include” it in your design with a simple line that says “set up the RPC server connection at address X and port Y”.
• You’ll need to use (at least) the system calls socket, bind, listen, accept, connect, send, recv, stat, open, read, write, close. The last four calls should be familiar from Assignment 0, and send and recv are very similar to write and read, respectively. You should read the man pages or other documentation for these functions. Don’t worry about the complexity of opening a socket; we’ll discuss it in section (see above). You may not use any calls for operating on files or network sockets other than those above.
• Write a set of functions to manage a bounded buffer, as we’ve covered in class. You should be able to add material to this buffer (by recving it) and be able to remove data from it as your functions need it. These two actions should be decoupled, so you can add more data when your functions have consumed the buffer. You’ll also want to track how much data is in the buffer so that your code can pull all of the existing data and ask for more. Your RPC functions can then deal with the bounded buffer so they don’t have to worry about network stuff. Abstraction and modularity FTW!
• Test your server using the RPC client we’re providing on Canvas. Make sure you test error conditions as well as “normal” operation.
• Aggressively check for and report errors via a response. If your server runs into a problem well into reading data, you may not be able to complete sending the data. However, since you’ve already sent the response header (including the error code!), you can’t shorten the response. Instead, just close the connection.
• If you need help, use online documentation such as man pages and documentation on Makefiles. If you still need help, ask the course staff. You should be familiar with the rules on academic integrity before you start the assignment.
• Read and follow the Assignment Acceptance Criteria on Canvas.
Grading
As with all of the assignments in this class, we will be grading you on all of the material you turn in, with the approximate distribution of points as follows: design document (35%); coding practices (15%); functionality (40%); writeup (10%).
If you submit a commit ID without a green checkmark next to it or modify .gitlab-ci.yml in any way, your maximum grade is 5%. Make sure you submit a commit ID with a green checkmark.