11/28/21, 9:57 AM Assignment 4 – File Systems
Assignment 4 – File Systems
Due: Dec 5 Dec 8, at 10 p.m. (Do not leave this assignment until the last few days, or you will NOT be able to complete it! Work together with your partner, do not split up the tasks since lots of common code may be reusable if you design your code together!)
Introduction
Copyright By PowCoder代写 加微信 powcoder
In this assignment, you will explore the implementation of a particular file system, ext2, and will write tools to modify ext2-format virtual disks. To do this work, you will need to be comfortable working with binary data and will need to learn about the ext2 filesystem. You will additionally need to demonstrate good synchronization practices, as a culmination of the practice you got with synchronization throughout this course.
You can work in pairs for this assignment. MarkUs will only create the appropriate A4 directory in your repository when you log into MarkUs and either invite a partner, or declare that you will work alone. As usual, please log into MarkUs well before the deadline to make sure you can access your repository. (Do not create an A4 the directory in your git repo, otherwise MarkUs won’t know about it and we won’t be able to see your work.)
For implementing any additional functionality which is not specified in the handout, or if you are unsure whether to handle some inode parameters in specific cases (after having read the documentation!), please ask on the discussion board.
Requirements
Your task is to implement an API (in C) of file system commands to operate on an ext2-formatted virtual disk concurrently. That means that your API implementation will need to ensure that all operations are synchronized correctly with respect to other concurrent instances of files system operations that may be performed on the same file system. The operations you will have to implement are described below, including their arguments.
Important: Remember to read the rest of the assignment, especially the sequential functionality, synchronization, starter code structure, simplifying assumptions, coding style and modularity, visualizing operation outcomes, other tips, etc., before you start your implementation (also revisit these sections as you develop your code).
The starter code can be found on cslinux in this path: /student/cslec/369/assignments/A4-starter- code/starter_code.tgz
ext2_mkdir: creating a directory (to be implemented in the ext2_fsal_mkdir function in ext2fsal_mkdir.c) This operation takes one argument: an absolute path on your ext2-formatted disk image. The operation should work like mkdir, creating the final directory on the specified path on the disk. If any component on the path to the location where the final directory is to be created does not exist (or is not a directory) then this operation should return an appropriate error: ENOENT. If the specified directory already exists, then this operation should return EEXIST. Note:
Please read the specifications to make sure you’re implementing everything correctly (e.g., directory entries should be aligned to 4B, entry names are not null-terminated, etc.).
When you allocate a new inode or data block, you *must use the next one available* from the corresponding bitmap (excluding reserved inodes, of course). Failure to do so will result in deductions, so please be careful about this requirement.
https://mcs.utm.utoronto.ca/~369/restricted/assignments/a4/a4.html
11/28/21, 9:57 AM Assignment 4 – File Systems
When adding a new directory entry to the parent, you must always append after the last entry in the last used block of the parent. That is, you should not be looking to “fill existing gaps” left by removed directory entries in the past.
Be careful to consider trailing slashes in paths. These will show up during testing so it’s your responsibility to make your code as robust as possible by capturing corner cases.
Here are just a few examples of how to handle some cases:
1. Argument: “/foo/bar/blah”, where both “foo” and “bar” exist but blah does not. This should succeed
in creating a new directory called “blah” with the given path in the image. Same thing if blah had a
trailing “/”.
2. Argument: “/foo/bar/blah”, where the path is valid up to “blah” and “blah” is an already existing
directory. This should return EEXIST. Same thing if blah had a trailing “/”.
3. Argument: “/foo/bar/blah/”, where both “foo” and “bar” exist but “blah” is an existing file. This should
return ENOENT.
4. Argument: “/foo/bar/blah”, where “foo” does not exist, or bar is a file. This should return ENOENT.
ext2_cp: copying a file (to be implemented in the ext2_fsal_cp function in ext2fsal_cp.c) This operation takes two arguments:
The first is the path to a file on your native operating system
The second is an absolute path on your ext2-formatted disk image.
The operation should work like cp, copying the file on your native file system onto the specified location on the disk. If the source file does not exist or the target is an invalid path, then this operation should return the appropriate error (ENOENT). If the target is a file with the same name that already exists, you should overwrite it (as cp would). If the target is a soft link with the same name that already exists, you should just return EEXIST.
Please read the specifications of ext2 carefully, some things you will not need to worry about (like permissions, gid, uid, etc.), while setting other information in the inodes may be important (e.g., i_dtime). When you allocate a new inode or data block, you *must use the next one available* from the corresponding bitmap (excluding reserved inodes, of course). Failure to do so will result in deductions, so please be careful about this requirement.
When adding a new directory entry to the parent, you must always append after the last entry in the last used block of the parent. That is, you should not be looking to “fill existing gaps” left by removed directory entries in the past.
Be careful to consider trailing slashes in paths. These will show up during testing so it’s your responsibility to make your code as robust as possible by capturing corner cases.
Here are just a few examples of how to handle some cases. We’ll use a source file called “doh” as the file being copied, although an absolute path works too, since it’ll be on your local file system anyway (make sure the file actually exists, otherwise you cannot copy it, and that it is not a directory).
1. Arguments: “doh”, “/foo/bar/blah”, where the target path is valid and blah does not already exist. This will copy the source file “doh” into a new file called “blah” in the image in the indicated path.
2. Arguments: “doh”, “/foo/bar/blah”, where either foo or bar does not exist, or either foo or bar is a file. This returns ENOENT.
3. Arguments: “doh”, “/foo/bar/blah”, where the target path is valid but file “blah” exists. This will overwrite the file in the image.
4. Arguments: “doh” “/foo/bar/blah”, where the target path is valid but “blah” is a directory. The file “doh” will be copied as “doh” under directory “blah”.
https://mcs.utm.utoronto.ca/~369/restricted/assignments/a4/a4.html
11/28/21, 9:57 AM Assignment 4 – File Systems
5. Arguments: “doh” “/foo/bar/blah/”, where the target path is valid and a file “doh” does not already exist under “blah”. The file “doh” will be copied under directory “blah”.
6. Arguments: “doh” “/foo/bar/blah/”, where the path is valid and a file “doh” already exists under “blah”. The file “doh” will be overwritten in the image.
ext2_ln_hl / ext2_ln_sl: creating a hard link and symbolic link respectively (to be implemented in the ext2_fsal_ln_sl and ext2_fsal_ln_hl functions in ext2fsal_ln_sl.c and ext2fsal_ln_hl.c respectively)
These operations take two arguments:
The source target – an absolute path on your ext2-formatted disk image. The link name – also an absolute path on your ext2-formatted disk image.
These operations should work like ln and ln -s respectively. – The ext2_ln_hl operation creates a hard link to the given source path. For example, if the operation gets the following arguments (in order): “/level1/level2/bar”, “/level1/level2/foo”
then it will create within the file system image a hard link called “foo” (aka the hard link) to the file called “bar” (aka the source file or target file), both of them being located in the /level1/level2 path. This operation should handle any exceptional circumstances, for example:
ENOENT: if the source contains an invalid path (for example, if level2 does not exist, or if level1 is a file, etc.),
ENOENT: if the source name does not exist (for example, if bar does not exist),
EISDIR: if the source refers to a directory (for example, if bar is a directory), etc.
EEXIST: if the hard link name is a file which already exists (for example, if foo already exists)
To simplify your work, we will slightly deviate from the behaviour of regular ln. If the hard link name “foo” exists and happens to be a directory, regular ln would actually create a hardlink with the name given by the source (“bar”), inside the directory “foo”. For your ext2_ln_hl implementation, you must simply return EISDIR if the target is an existing directory.
– The ext2_ln_sl operation must create a symlink instead. Note:
For symbolic links, you will see that the specs mention that if the path is short enough, it can be stored in the inode in the space that would otherwise be occupied by block pointers – these are called fast symlinks. Do *not* implement fast symlinks, just store the path in a data block regardless of length, to keep things uniform in your implementation, and to facilitate testing.
If in doubt about correct operation of links, use the ext2 specs and ask on the discussion board. Similarly to “ln -s”, for ext2_ln_sl the source path for a symbolic link doesn’t have to be a valid path. No error is expected, the invalid source path will just be stored in the target symlink. This is different from hard links where an invalid source path would result in an error.
Similarly to hard links, for ext2_ln_sl we’ll simplify your task for the case when the target is an existing directory, by just returning EISDIR, rather than creating a symlink inside the target directory.
Remember that for a symlink if the source is a directory, then the symlink can be created, unlike for hard links where this is not allowed.
Same observations from ext2_cp and ext2_mkdir apply with respect to using the next available inode and/or data block, and regarding appending new directory entries instead of trying to fill existing gaps.
ext2_rm: removing a file (to be implemented in the ext2_fsal_rm function in ext2fsal_rm.c)
This operation takes a single argument: an absolute path to a regular file or link (not a directory) on that disk. The operation should work like rm, removing the specified file from the disk, and should return the appropriate
https://mcs.utm.utoronto.ca/~369/restricted/assignments/a4/a4.html
11/28/21, 9:57 AM Assignment 4 – File Systems
error in exceptional circumstances. For example, if the path to the file is invalid or if it’s valid but the file does not exist, then ENOENT is the right error code. If the path does not end in a trailing “/”, but the last name in the path is actually a directory, then EISDIR must be returned. Once again, please read the specifications of ext2 carefully, to figure out what needs to actually happen when a file or link is removed (e.g., no need to zero out data blocks, must set i_dtime in the inode, removing a directory entry must adjust its predecessor to indicate the next valid directory entry, etc.).
Sequential functionality details and error checks
All of these tools should be minimalist in terms of the functionality. Don’t implement what isn’t specified: only provide the required functionality. For example, don’t implement wildcards. Also, can’t delete directories? Too bad! 🙂
However, it is your responsibility to make your code as robust as possible by capturing corner cases for the functionality that you need to implement. Here are a few examples of things to consider:
1. Be careful to consider trailing slashes in paths (e.g., ext2_cp, ext2_mkdir, etc.). Depending on whether the path has a trailing slash, the operation might behave differently so be careful to handle these.
2. If you need a new inode or a new data block but cannot obtain one due to lack of space on the disk image, then returning ENOSPC is expected.
3. A tricky corner case arises if a directory has more than one block and you are removing a file or directory which happens to be found in the first directory entry in a data block which does not have the “.” or “..” entries (any data block of the parent directory, other than its first data block). In this case, if you do not have “.” and “..” to change the rec_len for, there’s no “previous” entry to update the rec_len for. In order to avoid losing track of any potential subsequent entries, you should take the approach described in the specs. You should be able to infer from the specs that in this special case, your must zero the inode and adjust the rec_len to the next entry (or the end of the block, if there are no subsequent entries in that block).
Such things will show up in our tests, so you need to test your code properly! Test-driven development is something you’ve been covering since first-year programming courses, so you need to demonstrate your ability to think of corner cases and handle them.
Keep in mind that we don’t want to overload this handout to make it unreadable, so it is your responsibility to consider error checks where appropriate. As a general rule, whenever in doubt, check the ext2 documentation and confirm your doubts for specific corner cases by asking questions on the discussion board.
Synchronization
The tools you have to implement will be run concurrently on the file system images, exposing the possibility of race conditions. You must use synchronization primitives in the provided starter code in order to ensure that accesses to shared file system structures are correctly synchronized.
For instance, think of what happens when you have two ext2_cp commands running concurrently on the image, both copying in some file into the same directory. What if the file being copied in has the same name in both concurrent ext2_cp commands? As you can imagine, race conditions to the file system data structures, such as the inode table, the inode and data bitmaps, as well as races to the same directory data blocks to add or remove new entries concurrently, are not desirable and must be carefully handled to ensure both correctness and efficient execution of concurrent commands.
https://mcs.utm.utoronto.ca/~369/restricted/assignments/a4/a4.html
11/28/21, 9:57 AM Assignment 4 – File Systems
To help us understand your design decisions, you must first design your synchronization and show it to us in a documentation file called Synch.txt (or Synch.pdf, if you want to draw diagrams without resorting to ASCII-text. 🙂 We would recommend that you use a piece of paper and conceptualize what data structures you need to protect, thinking of the steps in each operation that you have to implement. Discuss this with your partner, if applicable. Once you have a clear and complete design that you are happy with, you are welcome to come to office hours and discuss your design with us to see if you are on the right track.
IMPORTANT: You may not discuss your synchronization design decisions on the discussion board, or with other classmates from another group, since you might come up with solutions that are too similar.
All the synchronization primitives you use to synchronize file system metadata and data, must be shared between all the tools, so they must be placed in the ext2fsal.h file. Any initialization of synchronization primitives must be done in the ext2_fsal_init() function and all cleanup of synchronization primitives must be done in the ext2_fsal_destroy() function, from the ext2fsal.c file. These functions are executed at the start of the file system server that receives file system operations from the user. This is also the spot where you should mmap the disk image (like you did in the tutorial exercises), and any other initial setup tasks that your code might need. More details on the structure of the starter code are provided in the following section.
Starter Code Structure
Overview: Your file systems operations will be run on a file system server that we created for you to use, and which receives files system commands (ext2_cp, ext2_rm, etc.) and carries them out concurrently. The file system server binary is provided to you along with a set of necessary libraries (which you do not need to worry about), and the five ext2_cp/ext2_rm/ext2_mkdir/ext2_ln_sl/ext2_ln_hl executables (which you do not need to touch). Your code will be written in the source files under the src/ directory, and built into a shared library using the provided Makefile.
Once again, you can get the starter code from cslinux from this path: /student/cslec/369/assignments/A4-starter- code/starter_code.tgz Once you untar the archive, in your repository you must keep the same directory structure, but without the starter_code directory itself. The starter code we give you has a subdirectory called out, which you should copy as is, into your repository.
The out directory has the following subdirectories:
1. bin: this is where the server binary ext2kmfs is located, and where the shared library libext2fsal.so, which you have to generate must be copied over.
ext2kmfs: run this server executable in a separate console, providing an existing ext2 image as its sole argument. Once you run it, the server will just wait for commands from the user, which will be applied to the image provided as a command line argument when you started the server. The server terminates when it receives a SIGINT (using Ctrl-C).
libext2fsal.so: the provided library is generated from the empty stubs from the src/ directory, so it won’t actually perform any operations in its current form. You must replace this shared object with your own libext2fsal.so library (the one generated when you run “make” in the src/ directory, after you’ve implemented the respective tools).
2. inc: this contains some header files which you do not need to modify.
3. lib: this contains a static library which is needed for the client operations to work correctly, and which you do
not need to modify.
4. src: this directory is where you will develop the file system tools.
ext2.h: this is the header file we gave you for the exercises as well, and which the definitions of ext2 file system structures.
https://mcs.utm.utoronto.ca/~369/restricted/assignments/a4/a4.html
11/28/21, 9:57 AM Assignment 4 – File Systems
ext2fsal*.c files: this is where you need to place the implementation of your file system tools. The file name and the comments in each file should prompt you about the role of each file. For instance, in ext2fsal.c you must perform any initializations of synchronization primitives and initial operations such as mmap-ing the image, as well as cleanup tasks. In the ext2fsal.h header file you’ll need to declare any synchronization primitives (e.g., locks) that you plan to use in your implementation of file system commands, as these primitives will have to be shared across all tools.
e2fs.c/.h: this is where you can add helper methods for your implementation of the file commands, and declare their prototypes in the header file.
5. util: this contains a set of binaries and a library, which you should not modify. The binaries: ext2umfs_cp
ext2umfs_mkdir ext2umfs_ln_hl ext2umfs_ln_sl ext2umfs_rm
represent programs which execute your ext2fsal_cp/ext2_ln_hl/ext2_ln_sl/ext2_rm/ext2_mkdir operations on the server side, by picking up your implementation of the corresponding API from the libext2fsal.so library (remember that this is generated
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com