Homework 4 : Make a Website
(Deadline as per Coursera)
This assignment may take more time than previous assignments, so start as
early as you can on it.
This homework deals with the following topics:
● Reading and writing files
● Scraping and parsing information from a text file
● Very basic HTML
● Unit testing
General Problem Specification
The basic skeleton of a website is an HTML page. This HTML page is a text file with a certain format. Taking advantage of this fact, one can take an HTML template and create multiple pages with different values stored. This is exactly what we are going to do in this HW.
Many websites use these kinds of scripts to mass generate HTML pages from databases. We will use a sample text file as our ‘database’.
Our goal will be to take a resume in a simplistic text file and convert it into an HTML file that can be displayed by a web browser.
What is HTML?
You do not need to know much HTML to do this assignment. You can do the assignment by just understanding that HTML is the language that your browser interprets to display the page. It is a tag-based language where each set of tags provides some basic information for how the browser renders the text that you write between them.
For example,
<h1>The Beatles</h1>
will be rendered by your browser as a heading with a large font. <h1> indicates the beginning of the heading and </h1> indicates the ending of the heading.
An HTML webpage is typically divided into a head section and a body section. We have provided you with a basic website template. We want you to retain the head section and write only the body section via your Python program.
For more details about HTML, your best resource will be to use the w3schools website which can be found here: www.w3schools.com. This website has a ton of information and provides all of the common HTML you’ll need to know for this assignment.
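To make the idea of paired tags concrete, here is a minimal sketch of a helper that wraps text in an opening and a closing tag. The function name surround_block and the use of the h1 heading tag are our own assumptions for illustration; they are not part of the provided code.

```python
def surround_block(tag, text):
    """Wrap text in an opening and a closing HTML tag.

    Hypothetical helper for illustration only; the name and the
    arguments are assumptions, not part of the provided template code.
    """
    return "<" + tag + ">" + text + "</" + tag + ">"


# Build a heading element from plain text.
heading = surround_block("h1", "The Beatles")
print(heading)
```

A helper like this keeps the tag-writing logic in one place, so every opening tag is guaranteed to get its matching closing tag.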
Input Test File
We will read a simple text file that is supposed to represent a student’s resume. The resume has some key points but can be somewhat unstructured. In particular, the order of some of the information will definitely be different for different people.
The following are the things that you definitely DO know about the resume:
● Every resume will have a name, which will be written at the top. The top line in the text file will contain just the name.
● There will be a line in the file that contains an email address. It will be a single line with just the email address and nothing else.
● Every resume will have a list of projects. Projects are listed line by line below a heading called “Projects”. An example of what you might see in a resume file is something like this:
Projects
Worked on big data to turn it into bigger data
Applied deep learning to learn how to boil water
Styled web pages with blink tags.
Washed cars …
----------
● The list of projects ends with a single line that looks like ‘----------’. That is, it will have at least 10 minus signs. While this is a weird requirement, we are imposing it to actually make the assignment easier for you.
● Every resume will have a list of Courses. Courses are listed like:
Courses - CIT590, AB120
Courses :- Pottery, Making money by lying
● You are allowed to assume that every single resume will just have this comma separated list of courses with the word “Courses” being there in front of them. You do want to allow for some kind of punctuation mark(s) after the word “Courses”, and before the list of courses.
● So your program should be able to look at the above example and then extract the courses without including the ‘-’ sign or the ‘:-’ or any such punctuation that is in between the word “Courses” and the actual data we’re interested in.
The first step in this assignment is to create a sample resume that conforms to the format described above. We’re providing an example, but please do not just blindly copy our resume. Write your own sample file in addition to the one provided. It does not have to be totally accurate but the HW is likely to be more fun if you make it close to reality.
Functions for Parsing the File
At the very least, you need to write one function for each piece of information that you want to extract from the text file. Contrary to previous homework assignments, in this assignment we will not provide you with a strict outline for the functions or what arguments to pass to them.
When we grade, we’ll look at how modular your code is and how you decided to break up the functionality into separate functions. We’ll also look at how you named the functions, what arguments they take, and how well you unit test the functions.
Here’s a basic breakdown of the functionality required to read the file into memory, parse each section of the file to extract the relevant information, and write the final HTML-formatted information to a new file.
Reading the File
● Since the resume file is pretty small, write a function that reads the file and stores it in memory as a list of lines.
● Then, you can use list and string manipulations to do all of the other necessary work.
● Note: Do not prompt the user for a filename (or any other information, for that matter). This will break our automated unit testing. You can rely on the provided code, which includes hardcoded resume filenames to read.
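As a minimal sketch of the reading step, the function below loads a file into a list of lines. The name read_file_lines is an assumption for illustration; you are free to design your own functions and arguments.

```python
def read_file_lines(filename):
    """Return the contents of the file as a list of lines.

    Each line keeps its trailing newline character, so later steps
    should strip whitespace before inspecting a line.
    """
    with open(filename, "r") as resume_file:
        return resume_file.readlines()
```

Note that the filename is passed in as an argument rather than requested from the user, which keeps the function easy to unit test.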
Detecting the Name
● Detect and return the user’s name by extracting the first line.
● The one extra thing we want you to do, just for practice, is this: if the first character in the name string is not an uppercase letter (capital ‘A’ through ‘Z’), consider the name invalid and ignore it. In this case, use ‘Invalid Name’ as the user’s name.
● For example, a name whose first character is an uppercase letter is a valid name, but brandon Krakowsky is not a valid name, so your output HTML file will display ‘Invalid Name’ as the user’s name.
● Another thing to note is that the name on the first line could have leading or trailing whitespace.
● Note: Do not use the istitle() function in Python. This returns True if ALL words in a text start with an upper case letter, AND the rest of the characters in each word are lower case letters, otherwise False. This function will incorrectly identify a name like Edward jones as being an invalid name, when it’s actually valid.
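The name rules above can be sketched as follows. The function name detect_name, its argument (the list of lines read from the file), and the handling of an empty file are assumptions for illustration.

```python
def detect_name(lines):
    """Return the name from the first line, or 'Invalid Name'.

    Assumption: an empty file or a blank first line is also treated
    as an invalid name; the assignment does not spell this case out.
    """
    if len(lines) == 0:
        return "Invalid Name"
    name = lines[0].strip()  # remove leading and trailing whitespace
    if name == "":
        return "Invalid Name"
    # Only the first character matters; the rest of the name is not checked.
    if "A" <= name[0] <= "Z":
        return name
    return "Invalid Name"
```

Because only the first character is compared against ‘A’ through ‘Z’, a name like Edward jones is accepted, which istitle() would have rejected.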
Detecting the Email
● Detect and return the user’s e-mail address by looking for a line that has the ‘@’ character.
● Also make sure that the last four characters of the email are either ‘.com’ or ‘.edu’.
● Make sure the email string contains a lowercase alphabetic character after the ‘@’.
● There should be no digits in the email address.
● The email string could have leading or trailing whitespace. You will need to strip whitespace to properly handle these cases.
● These rules will accommodate but will not
accommodate or
● For example:
is a valid email is not a valid email is also not a valid email
● We are fully aware that these rules are inadequate. However, we want you to use these rules and only these rules.
● If an e-mail string is not found based on the given rules, consider the e-mail address to be missing. This means your function should return an empty string and your output resume file will not display an e-mail address.
● PLEASE DO NOT GOOGLE FOR A FUNCTION FOR THIS. Googling for solutions to your homework is an act of academic dishonesty and, in this particular case, you will get solutions involving crazy regular expressions, which is a topic we haven’t yet discussed in class. (In general, your code should never involve a topic that we have not discussed in class.) Plus, you can easily achieve the required functionality without the use of a regular expression.
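The e-mail rules above can be checked with plain string operations, as in the sketch below. The function name detect_email and the sample addresses in the usage note are hypothetical. Two interpretations are assumptions on our part: the lowercase letter is required immediately after the ‘@’, and a line that fails a rule is skipped so that later lines are still considered.

```python
def detect_email(lines):
    """Return the first line that passes all e-mail rules, or ''."""
    for line in lines:
        candidate = line.strip()  # handle leading/trailing whitespace
        if "@" not in candidate:
            continue
        # The last four characters must be '.com' or '.edu'.
        if candidate[-4:] not in (".com", ".edu"):
            continue
        # Assumption: "after the '@'" means the very next character.
        after_at = candidate[candidate.index("@") + 1:]
        if after_at == "" or not ("a" <= after_at[0] <= "z"):
            continue
        # No digits are allowed anywhere in the address.
        if any(ch.isdigit() for ch in candidate):
            continue
        return candidate
    return ""  # e-mail considered missing
```

With these assumptions, a hypothetical line such as "  jane@example.edu " yields "jane@example.edu", while "jane2@example.edu" (contains a digit) and "jane@example.org" (wrong ending) both yield an empty string.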
Detecting the Courses
● Detect and return the user’s courses as a list by looking for the word “Courses” in the list of lines and then extracting the line that contains that word.
● Then make sure you extract the correct courses. In particular, any random punctuation after the word “Courses” and before the first actual course needs to be ignored.
● You are allowed to assume that every course begins with a letter of the English alphabet.
● Note that the word “Courses”, the random punctuation, or individual courses in the list could have leading or trailing whitespace.
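One way to apply the course rules above is to skip every character after the word “Courses” until the first letter is reached, then split on commas. The sketch below assumes a hypothetical function name, detect_courses, and assumes that the first line containing “Courses” is the one we want.

```python
def detect_courses(lines):
    """Return the list of courses, or an empty list if none is found."""
    for line in lines:
        position = line.find("Courses")
        if position == -1:
            continue
        # Everything after the word "Courses" on that line.
        rest = line[position + len("Courses"):]
        # Skip punctuation and whitespace until the first letter,
        # since every course is assumed to begin with a letter.
        start = 0
        while start < len(rest) and not rest[start].isalpha():
            start += 1
        courses = []
        for course in rest[start:].split(","):
            course = course.strip()
            if course != "":
                courses.append(course)
        return courses
    return []
```

For the sample line "Courses :- Pottery, Making money by lying", this returns the two courses without the ‘:-’ punctuation.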
Detecting the Projects
● Detect and return the user’s projects as a list by looking for the word “Projects” in the list.
● Each subsequent line is a project, until you hit a line that looks like ‘----------’. This is NOT an underscore. It is (at least) ten minus signs put together. You have reached the end of the projects section if and only if you see a line that has at least 10 minus signs, one after the other.
● If you detect a blank line between project descriptions, ignore that line.
● Also note that certain projects could have leading or trailing whitespace.
Writing the HTML
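Before moving on, the project rules listed under Detecting the Projects can be sketched as follows. The function name detect_projects is hypothetical, and we assume the heading is the first line that contains the word “Projects”.

```python
def detect_projects(lines):
    """Return the project lines found between 'Projects' and the dashes."""
    projects = []
    in_projects = False
    for line in lines:
        stripped = line.strip()
        if not in_projects:
            # Assumption: the first line containing "Projects" is the heading.
            if "Projects" in stripped:
                in_projects = True
            continue
        # At least 10 consecutive minus signs end the section.
        if "-" * 10 in stripped:
            break
        # Blank lines between project descriptions are ignored.
        if stripped != "":
            projects.append(stripped)
    return projects
```

Checking for "-" * 10 inside the line accepts exactly ten minus signs as well as longer runs, which matches the "at least 10" rule.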
Once you have gathered all the pieces of information from the text file, we want you to programmatically write HTML. Here are the steps for that:
● Start by saving the file resume-template.html that is provided in the same directory as your code.
● Preview the file in a text editor that does HTML syntax highlighting (e.g. Sublime Text). Notice that there is an empty
(Note: you should not modify or overwrite the resume_template.html file. You should copy the information from this file, modify it as needed, and then write the modified information to a new file, resume.html. Think about how you can save the information from resume_template.html in program memory to help you accomplish this.)
● The template looks like the below, and we need to start by removing the last two lines of HTML (the closing </body> and </html> tags) in order to insert the resume content in the correct location.
random header stuff
lots of style rules we won’t worry about
● We want to put our resume content in between the body tags to make it look like this:
random header stuff
lots of style rules we won’t worry about
HTML-formatted resume content goes here
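The idea of inserting content between the body tags can be sketched as below. The function names and arguments are hypothetical. The sketch assumes that the last two lines of the template are exactly the closing </body> and </html> tags, with no blank lines after them, and it writes to a separate output file so the template itself is never modified.

```python
def read_template_without_closing_tags(template_filename):
    """Read the template into memory and drop its last two lines.

    Assumption: the last two lines are the closing body and html tags.
    """
    with open(template_filename, "r") as template_file:
        lines = template_file.readlines()
    return lines[:-2]


def write_resume(template_filename, output_filename, body_html_lines):
    """Write template + resume content + closing tags to a new file."""
    html_lines = read_template_without_closing_tags(template_filename)
    html_lines.extend(body_html_lines)  # the HTML-formatted resume content
    html_lines.append("</body>\n")      # restore the two removed lines
    html_lines.append("</html>\n")
    with open(output_filename, "w") as output_file:
        output_file.writelines(html_lines)
```

For example, calling write_resume with the template filename, "resume.html", and a list such as ["<h1>Jane Doe</h1>\n"] would produce a new page with that heading inside the body section.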
● More specifically, your Python code will do the following:
Open and read resume-template.html
Read every line of HTML into program memory
Remove the last 2 lines of HTML (the and