Assignment 1: File Explorer
============================
## Introduction
For the rest of the assignments in this course, we will be building a distributed social communication platform. The program will allow you to send and receive messages with other students in the class. We will start with a command line interface and end with a graphical interface. At the heart of any communication tool, is the user, so the first thing we need to create is a tool that lets us manage user accounts.
### Summary of program requirements for this assignment:
1. Navigate a computer file system
2. Search for files by full name or suffix
3. Output content of text file
4. Load file contents
5. Create new file
### Learning Goals
1. Working with files and file systems.
2. Understand Recursion
3. Error handling
4. Tests
## Program Requirements
### Part 1
User input for this program will take the following format:
“`ipython3
[COMMAND] [INPUT] [[-]OPTION] [INPUT]
“`
For the first part of the assignment, you will support two commands:
“`{admonition} Program Feature
L – List the contents of the user specified directory.
Q – Quit the program.
“`
The ‘Q’ command is straightforward, when your program receives the ‘Q’ input it should cleanly stop operation. The ‘L’ command will instruct your program to list the contents of a directory. So for example, if we want to list the content of a particular directory in our file system:
“`ipython3
L c:\users\mark\a1
“`
which will print:
“`ipython3
c:\users\mark\a1\test.py
c:\users\mark\a1\readme.txt
c:\users\mark\a1\assets\
c:\users\mark\a1\doc\
“`
So by running this command, we discover that there are two files and two directories contained within the directory we specified. Notice that the results are sorted by files first and then directories. Your first attempt to print the contents of a directory will not look like this, so you will have to take care to ensure your code properly sorts results into files first, followed by directories.
“`{admonition} Program Feature
The results of the ‘L’ command must print results in a files first, directories last format.
“`
Now let’s add some options to refine our directory listing. There are four options that you will need your program to support:
“`{admonition} Program Feature
-r Output directory content recursively.
-f Output only files, excluding directories in the results.
-s Output only files that match a given file name.
-e Output only files that match a give file extension.
“`
Applying the ‘r’ option to the directory list command, will print the contents of the source directory as well as all subdirectories:
“`ipython3
L c:\users\mark\a1 -r
“`
And the resulting output:
“`ipython3
c:\users\mark\a1\test.py
c:\users\mark\a1\readme.txt
c:\users\mark\a1\assets\
c:\users\mark\a1\assets\image1.jpg
c:\users\mark\a1\assets\image2.jpg
c:\users\mark\a1\assets\logos\
c:\users\mark\a1\assets\logos\logo1.jpg
c:\users\mark\a1\assets\logos\logo2.jpg
c:\users\mark\a1\doc\
c:\users\mark\a1\doc\help.doc
c:\users\mark\a1\doc\readme.txt
“`
Notice how that in addition to the output of files within the ‘assets’ directory, the subdirectory ‘logos’ is also printed along with the files contained within.
Okay, so far so good. Now let’s see what happens when we apply the ‘f’ option:
“`ipython3
L c:\users\mark\a1 -f
“`
And the resulting output:
“`ipython3
c:\users\mark\a1\test.py
c:\users\mark\a1\readme.txt
“`
As you may have expected, by adding the ‘f’ option to the ‘L’ command, the program will only list the files in the specified directory.
The two remaining options, ‘s’ and ‘e’, allow us to further refine our results by specifying search parameters. Unlike the ‘r’ and ‘f’ options, these options will also need to accept a user input parameter. So let’s say we want to find all files in a given directory with the name ‘readme.txt’:
“`ipython3
L c:\users\mark\a1 -r -s readme.txt
“`
And the resulting output:
“`ipython3
c:\users\mark\a1\readme.txt
c:\users\mark\a1\doc\readme.txt
“`
Notice that we combined options here to recursively check all subdirectories as well, so our results return two instances of the file ‘readme.txt’ that exist within the ‘a1’ directory. If we had left the ‘r’ option off, the results would be slightly different:
“`ipython3
L c:\users\mark\a1 -s readme.txt
“`
And the resulting output:
“`ipython3
c:\users\mark\a1\readme.txt
“`
Now let’s try another search using the ‘e’ option. As I described earlier, the ‘e’ option allows us to filter results by file extension, so:
“`ipython3
L c:\users\mark\a1 -r -e jpg
“`
And the resulting output:
“`ipython3
c:\users\mark\a1\assets\image1.jpg
c:\users\mark\a1\assets\image2.jpg
c:\users\mark\a1\assets\logos\logo1.jpg
c:\users\mark\a1\assets\logos\logo2.jpg
“`
“`{note}
You can safely assume that when a program feature is combined with the ‘-r’ option, that the ‘-r’ option will alwyas come first.
“`
The results now only return files who have the extension or _suffix_ ‘jpg’. Okay, so once you have created a program that is able to execute all of the operations described in this section you will be ready for Part 2.
### Part 2
Now that we have a way to look for files in our file system, we need to actually do something with those files! So for this part of the assignment you will add three additional commands to your program:
“`{admonition} Program Feature
C – Create a new file in the specified directory.
D – Delete the file.
R – Read the contents of a file.
“`
Creating a new file is exactly as it sounds, when the user issues the C command, your program will read the input and create a new file:
“`ipython3
C c:\users\mark\a1 -n mark
“`
“`{admonition} Program Feature
-n Specify the name to be used for a new file.
“`
The ‘C’ command must also accept a single option ‘n’ that allows the user to specify the name of the file. Note that we do not require the use of an extension. Instead, since we will be using this program in the future to create users for our distributed social platform, the program will control the types of files it creates. So whenever a file is created, it must append the user specified name with a distributed social user extension, or ‘dsu’. So when the previous command is run, it will create a new file in the specified directory, with the specified name, and the ‘dsu’ extension. The program output should look like the following to confirm to the user that the file was created:
“`ipython3
c:\users\mark\a1\mark.dsu
“`
“`{admonition} Program Feature
New files created with the ‘C’ command must automatically be created with the ‘dsu’ extension or _suffix_.
“`
The ‘D’ command will allow the user to delete a DSU file. If the user specified file is not a DSU file, then the program should print ERROR and wait for corrected input from the user. Once the DSU file has been successfully deleted, the program should print a confirmation:
“`ipython3
D c:\users\mark\a1\mark.dsu
“`
And the resulting output:
“`ipython3
c:\users\mark\a1\mark.dsu DELETED
“`
“`{admonition} Program Feature
A the output after deleting a file should print the directory and file name with the word ‘DELETED’ at the end of the line.
“`
The ‘R’ command will print the contents of a DSU file. As with the ‘D’ command, if the user specified file is not a DSU file, the program should print ERROR and wait for corrected input from the user. If the file is empty, then the program should print EMPTY and wait for corrected input from the user:
“`ipython3
R c:\users\mark\a1\mark.dsu
“`
And the resulting output:
“`ipython3
EMPTY
“`
Otherwise, the program will print the contents of the file:
“`ipython3
R c:\users\mark\a1\mark.dsu
“`
And the resulting output:
“`ipython3
Hello World!
“`
“`{admonition} Program Feature
The program must handle errors gracefully. When an error occurs, the program should inform the user by printing ‘ERROR’ and wait for additional input from the user.
“`
Finally, and this applies to both Parts 1 and 2, your program should handle errors gracefully. If a user enters commands, inputs, or options that are not understood by the program, the program should simply print ERROR and wait for additional input from the user.
### A Few Rules
The majority of the work you will need to do for this assignment will center around the [**pathlib**](https://docs.python.org/3.8/library/pathlib.html) library. Start by familiarizing yourself with the various functions that the library provides. Path is a powerful type that will help you manipulate the file system, so use as much of it as you can to reduce the complexity of your program. There are, however, a few functions in the library that you will not be allowed to use for this assignment. **os.walk**, **os.fwalk**, **glob**, and **rglob** abstract away much of the functionality that we want you to learn through this assignment. Generally, if you find yourself using library functions to locate files rather than writing your own recursive function, then you are probably using the library inappropriately.
### Other Considerations
“`{note}
The following notes are adapted from the Alex Thornton’s notes on writing programs. It’s great advice, so rather than rewrite it, I prefer to “stand on the shoulders of giants” and let you learn from the source.
“`
“`{admonition} Alex Thornton’s Program Writing Tips
:class: tip
**The value of working incrementally**
You may be accustomed to solving relatively small, mostly self-contained problems that you can handle all at once. This program, while not giant by real-world standards, consists of more moving parts than you might be used to writing, so you will likely find that attempting to write this program all at once will lead you astray, even if an all-inclusive, everything-at-once approach worked well for you in ICS 31.
A better approach for this project — and one that will increase in importance this quarter, as the problems we solve get larger and more complex — is to look for stable ground as often as you can find it. Rather than trying to write the entire program, write some small part of it that you understand well, then find a way to verify that the part you wrote works as you expected (e.g., by writing a function and then calling it in the Python interpreter manually). When it does, you’ve reached stable ground, and you’re ready to choose the next small step you should take, again taking care to choose something that you’ll be able to verify after you’re done with it. Ideally, you’ll find yourself reaching stable ground quite often — sometimes, every few minutes, if things are going well — and this will help you to feel confident that you’re making progress.
If you find yourself stuck on one problem and have no idea how to move further, find another positive step you can take. For example, if you’re not sure how to find all of the files in a directory structure, write a function that simply returns a hard-coded list of file paths and use that temporarily, then move on and work on something else, eventually working your way back to the places that were causing you trouble. The goal is always to be making some kind of progress, and it’s not necessary to write the program in a particular order (e.g., the order in which things happen in the user interface).
Also, when you reach what you believe to be stable ground, it’s not a bad idea to make a copy of your Python module before proceeding. That way, if your next step doesn’t go as you planned, you maintain the option of “rolling back” to the previous, stable version and trying again. It also gives you something stable to turn in if you find that the deadline arrives and you’re not finished; it’s always better to turn in a partial program that does something correctly, rather than one that doesn’t run. Don’t feel like you need to keep every stable version, but it’s not a bad idea to at least have the most recent one or two. (One of the practical skills you’ll need to start thinking about, if you haven’t already, is staying organized. If you have a couple of versions of a file and find that you’re often not sure which is which, then you need a better organizational scheme — better file names, better directory names, or whatever helps it to be more obvious to you.)
**So, what steps should I take?**
There are a variety of ways to build this program from beginning to end, so don’t go looking for the “perfect way” to do it. Find a small step you can take, take it, and then find another. Of course, there are missteps you might take along the way, but the best way to learn how to write programs this way is simply to do it; you’ll learn as much from your missteps as you will from the ones that work out well.
As an example, though, think about the fact that the program is built around the core notion of “Find all of the unique files in a directory, its subdirectories, their subdirectories, and so on, and return a list of their paths.” That sounds like a pretty good step to start with, except that it’s actually a bigger step than it sounds like. So you could start even smaller: write a function that finds the files in a directory but ignores subdirectories. Once you can do that, add handling for one level of subdirectories (but assume they have no subdirectories inside of them). Then consider unbounded subdirectory depth and handle that scenario.
Whenever you’re working on something and you feel you’ve bitten off more than you can chew, think about ways to break the problem into smaller ones; eventually, you’ll be left with a problem you can think through and complete.
If you’re not sure what step to take next, feel free to ask us; we’ll help you find a task that will lead you to stable ground.
**Testing**
Testing this program is going to seem cumbersome, because it will require creating directory structures and files in various configurations, then running the program to see how it behaves. You might find that it’s worth your time to automate some scenarios, by writing short Python functions (perhaps in a separate module) that create directories and files in interesting combinations. You won’t likely find that it will require writing a lot of code, but it will pay you back (and then some!) as you test your program.
It’s quite common in real-world software development to write code whose sole use is as a development tool; it’s not part of the program, per se, and will not be given to users of the program, but is strictly meant to make it easier to build the program. You’ll be well-served to explore ways to use Python to lighten your testing burden; if there are five interesting scenarios you’ve identified, you should write five Python functions that can set them up for you automatically, so you can get them back any time you need to test them. If it takes a half-hour to write the test programs, but it takes five minutes to set up the tests manually, as soon as you’ve set up the tests six times, the time spent writing the test program will begin paying you back. Computers automate tasks that are otherwise cumbersome, and testing certainly falls into that category. (We’ll see this theme repeated throughout the course.)
“`
## Submitting Your Assignment
### Naming and organizational requirements
How you organize your program is up to you, but there are a couple of requirements that you’ll need to follow.
Your program must be written entirely in a single Python module, in a file named a1.py. Note that capitalization and spacing are important here (i.e., no letters in the filename are capitalized and there are no spaces); they’re part of the requirement.
Executing your a1 module — by, for example, pressing F5 or selecting Run Module from the Run menu in IDLE — must cause your program to read its input and then print its output. It must not be necessary to call a function manually to make your program run.
Other than that, anything goes; you can organize your solution in any way you’d like. Note that future assignments will take what we call “quality of solution” a lot more seriously, but the name of the game in this warm-up assignment is simply to submit a program that works.
### Validation-checking your output
In assignment 0, we referred to this tool as the sanity checker. Moving forward we will refer a tool that I write to validate your program structure a validity checker.
Assignment 1 Validity Checker v0.1
### How we will grade your submission
This assignment will be graded on a 15-point scale, with the 15 points being allocated completely to whether or not you submitted something that meets all of the above requirements. The following rubric will be used:
Requirements and Function | 13 pts
: Does the program do what it is supposed to do?
: Are there any bugs or errors?
Quality and Design | 2 pts
: Is the code well designed?
: Is the code clearly documented?
As this is your first assignment using the following grading rubric, we will make an effort to draw attention to where your program fails to meet the grading criteria. However, as we move on, you will be expected to learn from our feedback and address the concerns that we raise in your code.