CS61C Spring 2017 Project 1: philspel
TAs: David Cheng (originally by Nicholas Weaver)
Due 02/09/2017 at 23:59:59
Goals
The objective of this project is to serve as an introduction to the C language. This project will cover a lot of different features of the C language, including the C file input/output library, memory allocation, string manipulation, and string coercion to void pointers and vice versa. Although there is conceptually a lot to learn to complete this project, the actual code you need to write is relatively short.
Background
Philspel is a very simple and silly spelling checker. It accepts a single command line argument, the name of a dictionary to use. This dictionary consists of a list of valid words to use in checking the input. Philspel processes standard input and copies it to standard output. For each word (a sequence of letters unbroken by any non-letter character) in the input, it looks up three variations in the dictionary: the word itself, the word converted entirely to lowercase letters, and the word with all but the first letter converted to lowercase. If any of the three variations is found in the dictionary, the word is copied to standard output unchanged. Otherwise, the word is copied to standard output with the string " [sic]" appended (without the quotation marks but with the leading space). All other input is copied to standard output unchanged.
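As an illustration of the three variations described above, here is a minimal sketch. The helper name lowerFrom is hypothetical and is not part of the starter code; it only shows one way the lowercase variants could be produced with tolower from ctype.h.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not in the starter code): returns a newly allocated
 * copy of word in which every character at index >= from is lowercased.
 * The caller is responsible for freeing the result. */
char *lowerFrom(const char *word, size_t from) {
    size_t len = strlen(word);
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        /* tolower expects a value representable as unsigned char. */
        unsigned char c = (unsigned char) word[i];
        copy[i] = (i >= from) ? (char) tolower(c) : word[i];
    }
    copy[len] = '\0';
    return copy;
}
```

With this helper, the three lookups for a word w would use w itself, lowerFrom(w, 0), and lowerFrom(w, 1).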
Your assignment is to complete the implementation of philspel. To do this, you will need to implement 4 functions in philspel.c: stringHash(void *s), stringEquals(void *s1, void *s2), readDictionary(char *filename), and processInput().
Just as a reminder, this project is to be completed individually.
Setup
To get starter code for this project, we will be using git.
1. First, you need to create a PRIVATE repo with the format proj1-xxx, where xxx is replaced by your login. The process for this is very similar to what was done in lab. Please make sure this repo is NOT your work repo however. Furthermore, please set this to PRIVATE. Just to make sure, set this to PRIVATE. If you do not set this to PRIVATE, horrible things will happen to you and you will be severely punished. So before continuing, make sure your repo is PRIVATE. Not setting your repo to PRIVATE, even if by mistake, will be seen as an intention to cheat.
2. From your access management settings, give ‘cs61c-staff’ admin access to your repo. You also must do this from the very beginning to avoid any penalties placed upon your grade.
3. Did you set your repo to PRIVATE and give the ‘cs61c-staff’ admin access? If yes, enter into the directory of your class account that you would like to hold your proj1-xxx repository. Once in this directory, run the following: git clone https://mybitbucketusername@bitbucket.org/mybitbucketusername/proj1-xxx.git
4. cd proj1-xxx
5. git remote add proj1-starter https://github.com/61c-teach/sp17-proj1-starter.git
6. git fetch proj1-starter
7. git merge proj1-starter/master -m "merge proj1 skeleton code"
Note: in case we announce updates to the project starter code, you’ll need to merge the updates in with your current repo. To do this, simply re-run the last two steps from above:
$ git fetch proj1-starter
$ git merge proj1-starter/master -m "merge proj1 skeleton code"
Once you complete these steps, you will have the proj1 directory inside of your repository, which contains the files you need to begin working.
As you develop your code, you should make commits in your git repo and push them as you see fit. Be sure to commit often! This allows you to save your progress as you go along, as well as gives us an overview of how you approached the project. You should not have just a single giant commit of all your code when you finish the project.
Overview
After you have a copy of the starter code, make sure you have the following files.
• hashtable.c – Code for a generic hashtable.
• hashtable.h – Header file for hashtable.c. For more information on header files and how they are used, check out this link. You should not need to modify either hashtable file, though take a look at them and see how they work.
• philspel.c – Contains various functions for our spellchecker. You will need to implement 4 functions in philspel.c: stringHash(void *s), stringEquals(void *s1, void *s2), readDictionary(char *filename), and processInput().
• philspel.h – Header file defining functions in philspel.c. You may modify philspel.h if you wish to declare additional helper functions which you implement in philspel.c.
• Makefile – Makefile for compiling philspel, as well as for running tests. For more information on Makefiles and how they are used, check out this link.
• sampleDictionary – Sample dictionary. Another useful dictionary can be found in /usr/dict/words on the instructional machines.
• sampleInput – Sample input file.
• sampleOutput – Sample output file.
• testOutput – Test output file.
Spend some time looking over these files and figuring out what is in each file and what they are used for.
Instructions
As noted above, your task is to implement the 4 functions in philspel.c: stringHash(void *s), stringEquals(void *s1, void *s2), readDictionary(char *filename), and processInput(). More information about the purpose of each of these functions can be found inside philspel.c.
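Because the hashtable is generic, its hash and equality callbacks receive void pointers that must be cast back to strings. The following is a rough sketch of that coercion only. The function names and return types here are assumptions for illustration (check philspel.h for the real declarations), and the hash shown is the well-known djb2-style multiply-by-33 scheme, not necessarily the one you should use.

```c
#include <string.h>

/* Hypothetical name and return type: a djb2-style hash over the bytes
 * of the string that s points to. */
unsigned int exampleStringHash(void *s) {
    const unsigned char *str = (const unsigned char *) s;
    unsigned int hash = 5381;
    while (*str != '\0') {
        hash = hash * 33 + *str;
        str++;
    }
    return hash;
}

/* Hypothetical name and return type: nonzero when both pointers refer
 * to strings with identical contents. */
int exampleStringEquals(void *s1, void *s2) {
    return strcmp((char *) s1, (char *) s2) == 0;
}
```

Note that equality compares string contents with strcmp rather than comparing the pointers themselves, since two separately allocated copies of the same word must be treated as equal.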
You can type the following in your project 1 directory to compile and test your program against a sample set of inputs:
$ make test
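Your program’s standard output must exactly match the expected output, so do not print anything extra to stdout. You can, however, safely output all sorts of debugging information to stderr, as this will be ignored by our scripts and by the test routine provided in the Makefile.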
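For instance, a debugging message can be sent to stderr with fprintf, which keeps it out of the standard output that is being compared. The helper name debugWord below is hypothetical and only illustrates the idea.

```c
#include <stdio.h>

/* Hypothetical helper: writes a diagnostic line to stderr (never stdout)
 * and returns whatever fprintf returns, i.e. the number of characters
 * written or a negative value on error. */
int debugWord(const char *word, int found) {
    return fprintf(stderr, "debug: word=\"%s\" found=%d\n", word, found);
}
```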
While completing this project will require getting familiar with many parts of the C language (using the C file input/output library, allocating memory, manipulating strings, and coercing strings to void pointers and vice versa, just to name a few), your final solution should not be all that long.
Start early! You likely will have to learn and experiment a lot with C to complete this project, so make sure you budget enough time.
For grading purposes, we will be splitting the project into two parts.
1. When you are first writing your solution to the project, you can assume that both the dictionary and the input won’t contain words longer than 70 characters. Successfully doing this will be worth 80% of your grade on the project.
2. For the remaining 20% of your grade, we will test your code against both dictionaries and inputs that contain words of unbounded length. We suggest trying to get a solution with a bounded character count before attempting this part. We will also test against dictionaries that contain entries that are not valid “words”. Such entries (e.g. “Super31337-61c”) should not negatively affect your program.
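One common way to handle words of unbounded length is to read one character at a time into a heap buffer that grows with realloc when it fills up. The sketch below assumes a hypothetical helper named readWord. For brevity it skips non-letter characters, whereas philspel itself must copy them to standard output.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the next run of letters from in as a
 * heap-allocated string, doubling the buffer whenever it fills up.
 * Returns NULL if no letters remain or if allocation fails. The caller
 * must free the result. */
char *readWord(FILE *in) {
    int c;
    /* Skip leading non-letters (a real philspel would echo these). */
    while ((c = fgetc(in)) != EOF && !isalpha(c)) {
    }
    if (c == EOF) {
        return NULL;
    }
    size_t cap = 16;
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }
    while (c != EOF && isalpha(c)) {
        if (len + 1 >= cap) {
            /* Always keep one byte free for the terminating '\0'. */
            cap *= 2;
            char *tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[len++] = (char) c;
        c = fgetc(in);
    }
    if (c != EOF) {
        ungetc(c, in); /* Leave the non-letter for the next reader. */
    }
    buf[len] = '\0';
    return buf;
}
```

Doubling the capacity keeps the total copying cost linear in the word length, and assigning the realloc result to a temporary avoids leaking the old buffer if the call fails.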
Submissions
There are two steps required to submit proj1. Failure to perform both steps will result in loss of credit:
1. First, you must submit using the standard unix submit program on the instructional servers. This assumes that you followed the earlier instructions and did all of your work inside of your git repository. To submit, follow these instructions after logging into your cs61c-xxx class account:
$ cd proj1-xxx
$ submit proj1
Once you type submit proj1, follow the prompts generated by the submission system. It will tell you when your submission has been successful and you can confirm this by looking at the output of glookup -t. Remember, you only need to submit two files: philspel.c and philspel.h.
2. Additionally, you must submit proj1 to your bitbucket repository. To do so, follow these instructions after logging into your cs61c-xxx class account:
$ cd proj1-xxx
$ git add -u # should add all modified files in proj1 directory
$ git commit -m "Project 1 submission"
$ git tag -f "proj1-sub" # The tag MUST be "proj1-sub". Failure to do so will result in loss of credit.
$ git push origin master --tags # Note the "--tags" at the end. This pushes tags to bitbucket
Resubmitting
If you need to re-submit, you can follow the same set of steps that you would if you were submitting for the first time. The only exception to this is in the very last step, git push origin master --tags, where you may get an error like the following:
(21:28:08 Sun Feb 01 2015 cs61c-ta@hive12 Linux x86_64)
~/work $ git push origin master --tags
Counting objects: 22, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (19/19), done.
Writing objects: 100% (21/21), 9.73 KiB | 0 bytes/s, done.
Total 21 (delta 4), reused 0 (delta 0)
To git@bitbucket.com:cs61c-staff/cs61c-ta
bf20433..d1ff9ed proj1 -> proj1
! [rejected] proj1-sub -> proj1-sub (already exists)
error: failed to push some refs to 'git@bitbucket.com:cs61c-staff/cs61c-ta'
hint: Updates were rejected because the tag already exists in the remote.
If this occurs, simply run the following instead of git push origin master --tags:
$ git push -f origin master --tags
Note that in general, force pushes should be used with caution. They will overwrite your remote repository with information from your local copy. As long as you have not damaged your local copy in any way, this will be fine.
Grading
The grading for this project will be done almost entirely by automated scripts. Your output must exactly match the specified format, which makes correctness the primary goal of this project. Again, you are to do this work individually.