_____________________________________
CIS 455 / 555: Internet and Web Systems
Midterm 1 October 14, 2020
_____________________________________
Instructions (READ CAREFULLY):
• Time: You will have 24 hours to complete this exam and turn it in to Canvas. Your answers
should be submitted in plaintext form. No late exams will be accepted (not even if there
are technical difficulties), so finish and upload early!
• Materials: You can use slides, lecture recordings, notes, and any supplemental readings
from the course website. In addition, you can access the following websites:
o piazza.com
o canvas.upenn.edu
o cis.upenn.edu/~cis455
o tools.ietf.org/html/
NO OTHER WEBSITES ARE ALLOWED!! That includes (but is not limited to) Google and Stack Overflow. Use or suspected use of any other material will be considered cheating and will result in an automatic zero and referral for disciplinary action.
• Similarly, DO NOT communicate with any other student during the entire 24 hour period. Any suspected collaboration will be considered cheating for all students involved.
• Piazza: no public posts during the 24 hour period, including about projects. If you need to ask a clarification question, please use a private note and please ask it early. We will be using Piazza to post public clarifications as well, so please check that post often.
• Because this is a take-home exam, many of the questions are open-ended and have many possible answers. More important than your final solutions is demonstrating that you understand the concepts deeply and can apply and synthesize them.
• Fill in what you know to get partial credit! If you find a question ambiguous, write down any assumptions you make.
• Last, but not least, at the top of the Canvas text submission include the following:
Name:
PennKey:
I,
1. [12 points] The web server you built in HW1 is an example of a Computer System, as defined by Butler Lampson. Pick three of your favorite pieces of advice from his paper on System Design and, for each one, describe how components of your HW1 web server follow that advice. Alternatively, if your system does not follow a piece of advice, you may describe how your web server should be designed instead.
2. a. [14 points] Write an XML Schema for a B+ tree of order 3. We have supplied the first two lines of your implementation.
b. [4 points] Write a corresponding .xml document for the Schema you defined in (a). Your B+ tree must have at least 3 layers.
3. [8 points] In HttpParsing.java (lines 117 – 124) a missing piece of functionality is the ability to handle multi-line headers. Update the code to be able to handle multi-line headers correctly.
4. [10 points] Consider the following two web server designs:
A: A web server deployed using a Linux 4.0.11 container and Java 8. The host has Linux 4.0.11 and Java 11, but not Java 8. The web server has a single listener thread that spawns a new thread for every incoming request.
B: A web server deployed in a Linux VM with Java 8. The host is running Windows and Java 11. The server has a single listener thread and a pool of worker threads; the listener places tasks on a queue, and the workers pull them off whenever they are ready for the next task.
Compare and contrast the two designs. When will A outperform B? When will B outperform A? A complete answer will have at least 1 paragraph of text (4-6 sentences).
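For reference, the two threading models described in A and B can be sketched as follows. This is only a minimal illustration, not the HW1 code: the class name DispatchSketch, both method names, and the use of a counter increment as a stand-in for request handling are assumptions made for this sketch, and no sockets, containers, or VMs are involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: names and the fake "request" are assumptions.
public class DispatchSketch {

    // Design A style: the listener starts a fresh thread for every request.
    static int threadPerRequest(int requests) {
        AtomicInteger handled = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            // Incrementing the counter stands in for handling one request.
            Thread t = new Thread(handled::incrementAndGet);
            threads.add(t);
            t.start();
        }
        joinAll(threads);
        return handled.get();
    }

    // Design B style: a fixed set of workers pulls tasks from a shared queue.
    static int workerPool(int requests, int workers) {
        AtomicInteger handled = new AtomicInteger();
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        Runnable poison = () -> { };  // sentinel that tells one worker to exit
        List<Thread> pool = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Runnable task = queue.take();  // blocks until a task arrives
                        if (task == poison) {
                            return;
                        }
                        task.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool.add(t);
            t.start();
        }
        // The "listener" only enqueues work; it never creates threads here.
        for (int i = 0; i < requests; i++) {
            queue.add(handled::incrementAndGet);
        }
        for (int w = 0; w < workers; w++) {
            queue.add(poison);  // one sentinel per worker
        }
        joinAll(pool);
        return handled.get();
    }

    // Waits for every thread in the list to finish.
    private static void joinAll(List<Thread> threads) {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
    }
}
```

In the first method the number of threads grows with the number of requests; in the second it is fixed by the workers argument, and requests wait in the queue when all workers are busy.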
5. [12 points] Before search engines, the only way to browse the web was through web directories, which were hierarchical repositories of webpages categorized by subject. For example, the oldest website for which we have a copy is http://info.cern.ch/, created by the inventor of HTTP. It contains a web directory with the top-level Subjects: Aeronautics, Astronomy and Astrophysics, Bio Sciences, Computing, Geography, etc. The current largest web directory contains around 3.3M sites in 1M categories/subcategories.
Imagine a world where search engines were never invented and we only interacted with the web using directories. What are the pros and cons of this design? A complete answer will have at least 3 pros and 3 cons.
6. Consider a server with a route:
post("/session/:key/:value", (req, res) -> {
    String key = req.params(":key");
    String value = req.params(":value");
    req.session().attribute(key, value);
    return "done";
});
a. [6 points] Describe how an off-path attacker might be able to modify the session state of a victim. Be as detailed as possible, including but not limited to writing out the HTTP messages the attacker would send.
b. [2 points] Briefly describe how a server might defend against your attack in (a). A complete answer can have as little as a single sentence.