3010 Assignment 2
Summer 2020
Remember to complete the Honesty Declaration in UMLearn
As a reminder, copying material (text or code) is plagiarism and academic dishonesty (see the Science links in the ROASS page for details on what constitutes academic dishonesty in written/research work).
Due July 24, 2020, 11:59pm.
Part 1 Screen scraping, two ways
Write a screen scraper for the invitation system you wrote for assignment 1.
Your screen scraper should accept a command-line argument to indicate who you are and what you would like to reply to the invitation. For instance:
./a2 Rick yes
./a2 Morty no
Write screen scrapers in both C and Python. Use an assert statement to verify that the invitation was accepted properly, by requesting the initial invitation link a second time and checking the results of the CGI script. You must use raw sockets (as seen in class) to fetch data from the web server. You may use the example from class as a starting point.
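As a rough illustration of the socket-based fetch described above, here is a minimal Python sketch. It is not the example from class: the function names, the host, and the CGI path are all hypothetical, and it assumes the server honours "Connection: close" so that reading until recv() returns nothing yields the whole response.

```python
import socket

def build_get_request(host, path):
    """Build a minimal HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close so recv() eventually returns b""
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def split_response(raw):
    """Split raw response bytes into (status_code, body_text)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    status_code = int(status_line.split()[1])
    return status_code, body.decode("utf-8", errors="replace")

def fetch(host, port, path):
    """Send the request over a plain TCP socket and read until the server closes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect((host, port))
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    finally:
        sock.close()
    return split_response(b"".join(chunks))
```

A scraper built this way could fetch the reply URL once, fetch the invitation page again, and then assert that the expected name and answer appear in the second body. The exact URL and the text to look for depend on your own assignment 1 script.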
Part 2 XHR, front-end processing
Create a website that is hosted on your CS unix homepage that is a chat system. The chat page should never refresh. Use XHR requests to fetch new
You must use XMLHttpRequest directly; do not use fetch or any libraries that make the data requests for you.
To request updates, use setInterval, which will repeat code for you, based on a delay. There is an example here.
Use JSON to transfer all of the event information. You can use any structure for storing the data on the server, such as simple files, JSON, etc. Be sure to include a brief description of your design in your readme.md file.
Use server-side processing as you deem necessary, but note that you will be marked based on your overall design (excessive and unnecessary JavaScript will result in a loss of marks). The interface does not need to be beautiful: a simple text box that sends the text when a button is pressed or, optionally, when the user presses the "return/enter" key is enough.
You are permitted to use the cgi module in python, though it is not required.
Some notes
Your POST will have a JSON body in your XHR request. JavaScript can convert to and from JSON.
From w3schools:
var myObj = {name: "John", age: 31, city: "New York"};
var myJSON = JSON.stringify(myObj);
var myJSON = '{"name":"John", "age":31, "city":"New York"}';
var myObj = JSON.parse(myJSON);
So can Python.
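For comparison, a minimal sketch of the same round trip using Python's standard json module. The variable names are illustrative only.

```python
import json

my_obj = {"name": "John", "age": 31, "city": "New York"}
my_json = json.dumps(my_obj)      # dict -> JSON string
round_trip = json.loads(my_json)  # JSON string -> dict
```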
You can send a JSON body with XHR's .send(), which accepts a string as its parameter.
Testing your CGI script on the server
Assuming test.txt contains
{"name":"Rob", "message":"new post"}
Then we can do the following, assuming you're using tcsh or bash for your shell:
cat test.txt | env REQUEST_METHOD=POST CONTENT_LENGTH=37 ./chatserver.cgi
Where env places key/value pairs as environment variables.
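On the receiving side, a CGI script can read the body using those same environment variables. The sketch below is one possible shape, not a required design: read_json_body is a hypothetical helper, and the fake environment and stdin stand in for os.environ and sys.stdin.buffer. The value 37 matches the sample line if test.txt uses plain ASCII quotes and ends with a newline (36 characters plus the newline).

```python
import io
import json

def read_json_body(environ, stdin):
    """Read exactly CONTENT_LENGTH bytes from stdin and decode them as JSON."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = stdin.read(length)
    return json.loads(raw.decode("utf-8"))

# In a real CGI script this would be: read_json_body(os.environ, sys.stdin.buffer)
fake_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "37"}
fake_stdin = io.BytesIO(b'{"name":"Rob", "message":"new post"}\n')
post = read_json_body(fake_env, fake_stdin)
```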
Part 3 Web server
All socket programming will be done without any advanced objects (such as SocketServer in Python); I want you to work with actual sockets here so you get a good idea how they work (you must use the system calls detailed in class). Any packages, frameworks, classes that provide any web/http related services or hide communications are not allowed.
The following Python packages can be used to complete this question: os, sys, socket, subprocess.
In order to ensure that you don't use ports that are being used by someone else (e.g. a fellow student), you have all been assigned a unique port to use for the programs in this assignment. Please see the Port Assignment module to find your port number.
Yes, there are plenty of open source implementations available (and previous sample solutions) and we know all about them; using any, even as a reference, would be bad. Part of the fun is being able to implement an http server, from scratch, on your own (if a Physicist can do it, so can you!).
The full set of requirements for the protocol can be found in your course notes outlining http (what CGI scripts expect is what you have to provide).
Implement an http server using stream sockets. It will listen for connections on your assigned port, so if you're running on mouse with port 15000 you will access the server as mouse.cs.umanitoba.ca:15000 in your browser. Your server must support the HEAD, GET, and POST methods (handling parameters passed via GET or POST), the sending of cookies, and the execution of server-side scripts. You can assume that scripts have the .cgi extension only (everything else is just a page). To execute a script you will need to use the subprocess module's check_output(), Popen() and communicate() calls (depending on whether you have a POST or a GET).
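Before a request can be dispatched by method, the bytes read from the socket have to be split into their parts. The following is a minimal sketch of that step only; parse_request is a hypothetical name, and it assumes the whole request has already been received and is well formed (no error handling, no partial reads).

```python
def parse_request(raw):
    """Split raw request bytes into (method, path, query, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    path, _, query = target.partition("?")  # GET parameters follow the "?"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, path, query, headers, body
```

With the pieces separated, the method decides whether a body is sent back (HEAD returns headers only), and the path decides whether the target is a page or a .cgi script.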
Consider the resources that have been available to you in the CGI scripts that you have written. How have you gotten data? We are recreating this environment. We need to pass GET query data, and POST data in a body.
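One way to picture this environment: a CGI script reads GET data from the QUERY_STRING variable and POST data from standard input, with CONTENT_LENGTH telling it how much to read. The sketch below sets those standard CGI variables and runs a child process. build_cgi_env and run_cgi are hypothetical helpers, the headers dictionary is assumed to use lowercase names, and a one-line Python child stands in for a real .cgi script.

```python
import os
import subprocess
import sys

def build_cgi_env(method, query, headers, body):
    """Return a copy of the environment with the CGI variables a script reads."""
    env = dict(os.environ)
    env["REQUEST_METHOD"] = method
    env["QUERY_STRING"] = query              # GET parameters, the text after "?"
    env["CONTENT_LENGTH"] = str(len(body))   # size of the POST body in bytes
    if "content-type" in headers:
        env["CONTENT_TYPE"] = headers["content-type"]
    if "cookie" in headers:
        env["HTTP_COOKIE"] = headers["cookie"]
    return env

def run_cgi(argv, env, body):
    """Run the script, feed the POST body on stdin, and return its stdout bytes."""
    proc = subprocess.Popen(argv, env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(input=body)
    return out

# Stand-in for a .cgi script: a Python child that echoes what it was given.
child = [sys.executable, "-c",
         "import os, sys; sys.stdout.write(os.environ['QUERY_STRING'] + '|' + sys.stdin.read())"]
env = build_cgi_env("POST", "room=1", {"cookie": "user=Rick"}, b"hello")
output = run_cgi(child, env, b"hello")
```

The script's output (its own headers, a blank line, then the content) still has to be wrapped in a status line by the server before it goes back to the client.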
There will be examples of pages that your server must be able to support, initially just html, with CGI scripts added after A1 is due. The server can access these pages by having it run in the same directory (with sub-directories being handled as expected). Note that if someone accesses a directory instead of a page the server should default to accessing index.html within that directory. Further, if the user tries to access a page that doesn't exist the server must return an appropriate 404 response (e.g. trying to access mouse.cs.umanitoba.ca:15000/ but there is no index.html present in the server's working directory).
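The directory and 404 rules above can be sketched as a small path lookup. resolve_path is a hypothetical helper that returns None when the server should answer 404. The check that the result stays inside the document root is an extra precaution added here, not something the handout asks for.

```python
import os

def resolve_path(doc_root, url_path):
    """Map a URL path to a file under doc_root, or return None for a 404."""
    root = os.path.realpath(doc_root)
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # Refuse anything that escapes the document root (e.g. "/../secret").
    if os.path.commonpath([root, candidate]) != root:
        return None
    # A directory request falls back to index.html inside that directory.
    if os.path.isdir(candidate):
        candidate = os.path.join(candidate, "index.html")
    return candidate if os.path.isfile(candidate) else None
```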
Run your servers on one of the Mac machines (make sure you use the correct host name when connecting to your server). Note that there is a firewall to the Linux machines, and therefore we must use the Mac lab. Login to rodents.cs.umanitoba.ca; you will be redirected to another machine. This will ensure that we don't have everyone using the same system to run their servers. Make sure you note the machine name to ensure that you point your browser to the right one when accessing your server. You can also use curl (or equivalent) to test your implementation from the command line. In particular, use the --head option to test your support for the HEAD method.
Some hints
CGI scripts are executable. Python can execute other programs via the subprocess module. There are a few ways to run programs; the more complicated (but very useful) version is shown below:
#!/usr/bin/python3
import subprocess
# Start tee with pipes attached to its stdin and stdout.
theProcess = subprocess.Popen(["/usr/bin/tee", "output.txt"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
theInputToTheProcess = b"This will get echoed, and written to a file"
# communicate() sends the input, waits for the process to exit, and returns (stdout, stderr).
(processStdOut, processStdErr) = theProcess.communicate(input=theInputToTheProcess)
print(processStdOut)
Bonus +50% of Part 3's weight
Do this part in C instead.
Hand in
Include a readme.md in your submission, describing how to use your programs. Use the handin command:
handin 3010 a2 my_assignment_2_folder
Please have part1, part2, and part3 folders.