School of Computing and Information Systems
COMP30023: Computer Systems

Practical Week 8

1 HTTP and HTML
In this section we will use our browser to view HTTP headers.
Instructions will be provided for Firefox and Chromium-based browsers. You are also welcome to complete the
lab using a different browser (e.g. Safari, non-Chromium Edge), though your demonstrator will not be able to
help you in this regard.

1. Firefox: From the menu in the top-right of the browser, select “Web Developer” near the bottom, and
then “Toggle Tools” near the top of the opened menu.

Chrome: From the menu in the top-right of the browser, hover over “More Tools” near the bottom, and
then select “Developer tools” from the bottom.

2. Select the “Network” tab from the top bar, and visit the URL https://www.google.com.au.

3. This will show one line per web request/response. Select one; the request and response headers will be
shown at the bottom right.

1.1 Using cURL
If you prefer using the command line, you can also use the curl tool.
First, you may have to install the curl tool: $ sudo apt install curl

You can perform a GET request by issuing the following command:

$ curl -s -vv google.com
* Trying 142.250.66.238:80...
* TCP_NODELAY set
* Connected to google.com (142.250.66.238) port 80 (#0)
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.68.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 12 Mar 2021 02:58:57 GMT
< Expires: Sun, 11 Apr 2021 02:58:57 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
301 Moved

301 Moved

The document has moved
here.


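To make the request lines in the trace above less mysterious, here is a minimal Python sketch of what a client such as curl puts on the wire for a GET request. The function names build_get_request and fetch, the User-Agent value and the added "Connection: close" header are assumptions made for this illustration (curl does not send that last header in the trace above; it is used here so that the server closes the connection and the read loop terminates).

```python
import socket


def build_get_request(host: str, path: str = "/",
                      user_agent: str = "example-client/0.1") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",       # request line: method, path, version
        f"Host: {host}",              # mandatory in HTTP/1.1
        f"User-Agent: {user_agent}",  # hypothetical client name
        "Accept: */*",
        "Connection: close",          # assumption: ask the server to close afterwards
    ]
    # Each line ends with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Send the request over plain TCP and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:              # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Printing the request needs no network access.
    print(build_get_request("google.com").decode("ascii"))
```

Calling fetch("google.com") would need network access and returns the status line, headers and body as one byte string; it speaks plain HTTP on port 80 only, not HTTPS.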
The “>” symbol at the start of a line represents the request header that is being sent to the destination server,
while the “<” symbol represents the response header that is being received on your host.
Using cURL is very common practice when trying to debug HTTP API calls or solving any protocol messaging issues that rely on HTTP.

You will notice that the response body does not actually contain the HTML page for google.com due to a 301 Moved response code.
You can ask cURL to follow redirects:

$ curl -s -vv -L google.com

The -L flag tells cURL to follow redirects.
The -vv flag increases the verbosity of cURL to provide more information.
The -s flag suppresses the request progress bar.

The output of this command is shown below. You can see that it now contains the body, but does not fetch the
components of the page that are not part of the HTML.

$ curl -s -vv -L google.com
* Trying 142.250.66.238:80...
* TCP_NODELAY set
* Connected to google.com (142.250.66.238) port 80 (#0)
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.68.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 12 Mar 2021 03:01:08 GMT
< Expires: Sun, 11 Apr 2021 03:01:08 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
* Ignoring the response-body
* Connection #0 to host google.com left intact
* Issue another request to this URL: 'http://www.google.com/'
* Trying 142.250.76.100:80...
* TCP_NODELAY set
* Connected to www.google.com (142.250.76.100) port 80 (#1)
> GET / HTTP/1.1
> Host: www.google.com
> User-Agent: curl/7.68.0
> Accept: */*
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Fri, 12 Mar 2021 03:01:08 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
< P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
< Server: gws
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
< Set-Cookie: 1P_JAR=2021-03-12-03; expires=Sun, 11-Apr-2021 03:01:08 GMT; path=/; domain=.google.com; Secure
< Set-Cookie: NID=211=oXUedXlqFeEqBXzfzqMBFhIVdnG9-J5jiAJ6iYqB3v6eGM6iCn7cXE7o5fDsOLT8i1uD9drWjf4pfjW-jInJJag-eEidIiLLoayv0yrJ8Eizu6tPQZe6fOdOwg8wAYzOH6ljcc0p9Pf8WF69mRflhVHrlGOxPzaGnpwzKvOUsCQ; expires=Sat, 11-Sep-2021 03:01:08 GMT; path=/; domain=.google.com; HttpOnly
< Accept-Ranges: none
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
… Rest of HTML body
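If you save a verbose trace like the one above to a file, the “>” and “<” prefixes make it easy to pull the redirect chain out of it. The following Python sketch shows one possible way; the function name parse_verbose and the shortened SAMPLE trace are assumptions made for this illustration, and the parser deliberately ignores “*” information lines and any body text.

```python
def parse_verbose(trace: str) -> list[dict]:
    """Group the '>' and '<' lines of a curl -v trace into request/response pairs."""
    exchanges: list[dict] = []
    for raw in trace.splitlines():
        line = raw.strip()
        if line.startswith("> "):
            text = line[2:]
            parts = text.split()
            if len(parts) == 3 and parts[2].startswith("HTTP/"):
                # A request line such as "GET / HTTP/1.1" starts a new exchange.
                exchanges.append({"request_line": text, "request_headers": {},
                                  "status": None, "response_headers": {}})
            elif exchanges and ":" in text:
                name, _, value = text.partition(":")
                exchanges[-1]["request_headers"][name.strip().lower()] = value.strip()
        elif line.startswith("< ") and exchanges:
            text = line[2:]
            if text.startswith("HTTP/"):
                # Status line such as "HTTP/1.1 301 Moved Permanently".
                exchanges[-1]["status"] = int(text.split()[1])
            elif ":" in text:
                # Simplification: a repeated header (e.g. Set-Cookie) keeps only
                # its last value.
                name, _, value = text.partition(":")
                exchanges[-1]["response_headers"][name.strip().lower()] = value.strip()
    return exchanges


# Shortened copy of the trace shown above.
SAMPLE = """\
* Connected to google.com (142.250.66.238) port 80 (#0)
> GET / HTTP/1.1
> Host: google.com
> Accept: */*
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Length: 219
* Issue another request to this URL: 'http://www.google.com/'
> GET / HTTP/1.1
> Host: www.google.com
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=ISO-8859-1
"""

if __name__ == "__main__":
    for ex in parse_verbose(SAMPLE):
        print(ex["request_headers"]["host"], ex["status"],
              ex["response_headers"].get("location"))
```

For the sample, the first exchange reports status 301 with a Location header and the second reports 200 with none, which is the redirect that -L followed.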

You are welcome to explore cURL by running $ man curl

1.2 Using wget
Similarly, you can use wget to download web resources.
e.g. $ wget https://www.google.com

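wget can also download everything needed to display a page correctly: -p fetches the page requisites (such as
images and stylesheets) and -k converts the links in the saved files so that they work locally.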
e.g. $ wget -p -k https://cis.unimelb.edu.au
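To get a feel for what "page requisites" means, the Python sketch below scans an HTML document for resources a browser would need to fetch in addition to the HTML itself. It is only an illustration of the idea, not how wget is implemented: the class name RequisiteFinder, the example URLs and the small set of tags it checks are all assumptions, and real tools handle many more cases.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RequisiteFinder(HTMLParser):
    """Collect absolute URLs of images, scripts, stylesheets and icons."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            # Relative references are resolved against the page URL.
            self.urls.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "link" and attributes.get("href"):
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel or "icon" in rel:
                self.urls.append(urljoin(self.base_url, attributes["href"]))
        # Ordinary <a href="..."> links are other pages, not requisites.


# Hypothetical page used only for this example.
PAGE = """<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="https://cdn.example.org/app.js"></script>
</head><body>
<img src="images/logo.png"><a href="/about">About</a>
</body></html>"""

finder = RequisiteFinder("https://www.example.com/index.html")
finder.feed(PAGE)

if __name__ == "__main__":
    for url in finder.urls:
        print(url)
```

Each URL found this way corresponds to one extra request, which is why the Network tab shows many lines for a single page.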

1.3 Questions to ask yourself
In a new tab, look up the meaning of each header.
You may find the site https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers useful, or you can
search with your favourite search engine.

Look at the page. How many bytes do you think it would take to implement this page? Check the sum of the
sizes of the requests.
Note how many requests there are for such a simple page. Can you identify what each request is for? Which
ones do you think are cacheable?
How many different domain names are used to build this page?
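If you would like to check your answers programmatically, the Python sketch below works through the same questions on a hand-made list of requests. Everything in REQUESTS is hypothetical data in the shape you might copy out of the Network tab, and fresh_from_cache is only a rough heuristic based on the Cache-Control response header (the full HTTP caching rules consider more than this one header).

```python
from urllib.parse import urlsplit

# Hypothetical entries; replace them with values from your own Network tab.
REQUESTS = [
    {"url": "https://www.example.com/", "bytes": 52000,
     "cache_control": "private, max-age=0"},
    {"url": "https://www.example.com/logo.png", "bytes": 6000,
     "cache_control": "public, max-age=31536000"},
    {"url": "https://static.example.net/app.js", "bytes": 150000,
     "cache_control": "public, max-age=86400"},
    {"url": "https://api.example.org/log", "bytes": 0,
     "cache_control": "no-store"},
]


def parse_cache_control(value: str) -> dict:
    """Split 'public, max-age=60' into {'public': None, 'max-age': '60'}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def fresh_from_cache(value: str) -> bool:
    """Heuristic: may a cache reuse the response without asking the server again?"""
    directives = parse_cache_control(value)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) > 0


total_bytes = sum(r["bytes"] for r in REQUESTS)
domains = {urlsplit(r["url"]).hostname for r in REQUESTS}
cacheable = [r["url"] for r in REQUESTS if fresh_from_cache(r["cache_control"])]

if __name__ == "__main__":
    print("total bytes:", total_bytes)
    print("domain names:", sorted(domains))
    print("reusable from cache:", cacheable)
```

With the sample data, the four requests use three domain names, and only the image and the script carry a positive max-age.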

1.4 Viewing the HTML
In a new window, open https://www.google.com again. Press CTRL+U. This will bring up the page source
in a new tab.
Search for the string
