COMP30023 – Computer
Application Layer – HTTP and HTML
• History of the internet
• Network Protocol Models (stacks)
• OSI vs TCP/IP
• Acknowledgement:
• These slides are minor modifications of those prepared by
University of Melbourne
• Top-down approach
– We’ll gradually peel away the layers over the coming weeks
• Application Layer
– HTTP (the web protocol), and in relation to it, HTML
• Wireshark – viewing network protocols in real-time
University of Melbourne
• Sir Tim Berners-Lee
– 1984 return to CERN (TCP/IP installed)
– Saw many online databases with different access
mechanisms (FTP, Gopher, …)
– 1989 wrote the proposal “a large hypertext
database with typed links” (No takers)
– by 1990, had designed and built: HTTP, HTML,
httpd, WorldWideWeb (browser)
– 1994 left for MIT, after CERN IT Head described it
as a misallocation of resources
• Hypertext
– Ted Nelson coined the term in 1963
– Creation and use of linked content
World Wide Web – A Short History
• The vision was that HTTP would be the “glue” between data on different existing
– e.g., FTP (file transfer protocol) – many files available for download
• GOPHER – distributed database developed at U. Minnesota in 1991
– Hierarchical file structure
– More suited for text interfaces – lower network overhead
– February 1993: University of Minnesota announced licence fees for its Gopher server
• May 1994 first International WWW Conference (at CERN)
• September 1994 W3C formed (DARPA & European Community)
– Standardisation of web technologies – royalty free
• Browser wars 1994-1998 (Microsoft vs. Netscape)
• 1999 – 2001 .com boom
• 2002+ Ubiquitous web
• Web 2.0 – semantic web, social media
World Wide Web – A Short History
• Client – typically a browser based access to pages
• Server – daemon based content delivery of pages
• URL ≈ Protocol + DNS Name + file name
WWW – Components
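The decomposition of a URL into protocol, DNS name and file name can be checked with Python's standard `urllib.parse` module. This is a minimal sketch; the URL is a made-up example, not one taken from the slides.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only to illustrate the three parts.
url = "http://www.example.com/somedir/page.html"

parts = urlsplit(url)
protocol = parts.scheme    # which protocol to speak
dns_name = parts.hostname  # which server to contact (resolved via DNS)
file_name = parts.path     # which resource to ask that server for

print(protocol, dns_name, file_name)
```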
TN 4th 7-19
WWW – Architecture
TN 5th 7-18
• HyperText Transfer Protocol
– Defined everything needed for the web
• TCP/IP Model vs OSI Model
– Application layer (except compression/encoding – Presentation)
• Resources are referred to by URLs
HTTP – Overview
• Uniform Resource Locator
– Sir Tim called it the “universal resource locator”
– Defined in original HTTP specification
– An address for a resource
– Can be relative “./nextpage.html” or absolute “http://www.google.com”
• Separate specification for URI published in 1998 (RFC 2396, an IETF document)
– Uniform Resource Identifier
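A relative URL only makes sense against the URL of the page that contains it. The sketch below uses `urllib.parse.urljoin` to show how a client might resolve both kinds of reference; the base URL is a hypothetical current page.

```python
from urllib.parse import urljoin

# Hypothetical URL of the page currently being displayed.
base = "http://www.example.com/docs/index.html"

# A relative reference is resolved against the base URL.
resolved = urljoin(base, "./nextpage.html")

# An absolute reference already names its own protocol and host.
unchanged = urljoin(base, "http://www.google.com")

print(resolved)
print(unchanged)
```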
• Overview:
– Client initiates TCP connection (creates socket) to server, port 80
– Server accepts TCP connection from client
– HTTP messages (application-layer protocol messages) exchanged
between browser (HTTP client) and Web server (HTTP server)
– TCP connection closed
• Connections:
– HTTP/1.0 – single-use connection
– HTTP/1.1 – persistent connections, additional headers
– HTTP/2 – 2015 – Further speed improvements (origins in SPDY)
– HTTP/3 – 2022 (RFC 9114) – Allows more parallelism in data loading (runs over QUIC)
HTTP – Protocol Overview
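The four overview steps (connect, accept, exchange messages, close) can be traced with a raw socket. This is a sketch under stated assumptions: to avoid depending on an external site, it starts a small stand-in server from Python's `http.server` on the loopback address and a free port rather than port 80. `HelloHandler` and `fetch_once` are illustrative names.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in web server so the sketch needs no external network."""
    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

def fetch_once(host, port, path):
    # 1. Client initiates the TCP connection (creates a socket).
    with socket.create_connection((host, port), timeout=5) as sock:
        # 2. Send the HTTP request; "Connection: close" asks for single use.
        request = (f"GET {path} HTTP/1.1\r\n"
                   f"Host: {host}\r\n"
                   "Connection: close\r\n\r\n")
        sock.sendall(request.encode("ascii"))
        # 3. Read the HTTP response until the server closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # 4. Leaving the with-block closes the TCP connection.
    return b"".join(chunks)

server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
response = fetch_once("127.0.0.1", server.server_address[1], "/")
server.shutdown()
server.server_close()
print(response.split(b"\r\n", 1)[0].decode())  # the status line
```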
© 4/11/22 12
Non-persistent HTTP
• Non-persistent:
– requires 2 round-trip times (RTTs) per object (one to initiate the TCP connection and one
for the HTTP request and the start of the response) + file transmission time
– OS overhead for each TCP connection
– browsers often open parallel TCP connections to fetch referenced objects
• Persistent:
– server leaves connection open after sending response
– subsequent HTTP messages between same client/server sent over
open connection
– client sends requests as soon as it encounters a referenced object,
reducing overall response time
Persistent vs. Non-persistent
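The cost difference can be estimated with a deliberately simplified model: requests are sent one after another, with no parallel connections and no pipelining. The figures (100 ms RTT, 10 ms transmission time, 10 objects) are hypothetical, and both function names are illustrative.

```python
def non_persistent_time(n_objects, rtt, transmit):
    """Each object pays 2 RTTs (TCP setup + request) plus its transmission time."""
    return n_objects * (2 * rtt + transmit)

def persistent_time(n_objects, rtt, transmit):
    """One RTT to open the connection, then 1 RTT plus transmission per object."""
    return rtt + n_objects * (rtt + transmit)

# Hypothetical figures, in milliseconds.
RTT_MS = 100
TRANSMIT_MS = 10
OBJECTS = 10

slow = non_persistent_time(OBJECTS, RTT_MS, TRANSMIT_MS)
fast = persistent_time(OBJECTS, RTT_MS, TRANSMIT_MS)
print(slow, fast)
```

Under these assumptions the persistent connection saves one RTT per object after the first, which is where most of the gain comes from when the RTT dominates the transmission time.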
• HTTP with (a) multiple connections and sequential requests.
(b) A persistent connection and sequential requests.
(c) A persistent connection and pipelined requests.
HTTP Request Connection
TN 6th 7-29
• Steps that occur when a link is selected:
– Browser determines the URL
– Browser asks DNS for the IP address of the server (resolving the host name in the URL)
– DNS replies
– The browser makes a TCP connection
– Sends HTTP request for the page
– Server sends the page as HTTP response
– Browser fetches other URLs as needed
– The browser displays the page (progressively, as content arrives)
– The TCP connections are released
HTTP – Summary of key steps
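The step "browser fetches other URLs as needed" implies that the browser scans the returned page for referenced objects. A minimal sketch of that scan, using the standard `html.parser` module, follows. The page content, base URL and the `ReferenceCollector` class are assumptions for illustration, and a real browser recognises many more kinds of reference.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ReferenceCollector(HTMLParser):
    """Collects URLs of objects a page refers to (images, scripts, stylesheets)."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.urls.append(urljoin(self.base_url, attrs["href"]))

# Hypothetical page, as it might arrive in the HTTP response.
page = ('<html><head><link rel="stylesheet" href="style.css"></head>'
        '<body><img src="/img/logo.png"></body></html>')

collector = ReferenceCollector("http://www.example.com/somedir/page.html")
collector.feed(page)
for url in collector.urls:
    print(url)  # each of these would trigger a further HTTP request
```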
• Idempotent – multiple identical requests have same effect
• Safe – Only for information retrieval, should not change state
HTTP Method  Safe  Idempotent  Cacheable
GET          Yes   Yes         Yes
HEAD         Yes   Yes         Yes
POST         No    No          Yes/No
PUT          No    Yes         No
DELETE       No    Yes         No
CONNECT      No    No          No
OPTIONS      Yes   Yes         No
TRACE        Yes   Yes         No
PATCH        No    No          No
HTTP – Request Methods
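One practical use of these properties is deciding whether a client may resend a request after a timeout. The sketch below encodes the Safe and Idempotent columns of the table as data; `METHODS` and `safe_to_retry` are illustrative names, not part of any library.

```python
# method: (safe, idempotent), copied from the table above
METHODS = {
    "GET":     (True,  True),
    "HEAD":    (True,  True),
    "POST":    (False, False),
    "PUT":     (False, True),
    "DELETE":  (False, True),
    "CONNECT": (False, False),
    "OPTIONS": (True,  True),
    "TRACE":   (True,  True),
    "PATCH":   (False, False),
}

def safe_to_retry(method):
    """Repeating an idempotent request has the same effect as sending it once."""
    _, idempotent = METHODS[method.upper()]
    return idempotent

print(safe_to_retry("PUT"), safe_to_retry("POST"))
```

Every safe method in the table is also idempotent, but not the reverse: PUT and DELETE change state yet can be repeated without further effect.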
Wireshark Example
HTTP Request Example
GET /somedir/page.html HTTP/1.1
Host: www.somesite.com.au
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
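(blank line)
• Request line (GET, POST, … commands): the first line
• Header lines: the lines that follow
• Blank line: indicates the end of the header section; for a request without a body,
this is also the end of the message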
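The layout of the request message can be made concrete by splitting it back into its parts. This is a minimal sketch that assumes CRLF line endings and a well-formed message; `parse_request` is an illustrative helper, not a library function.

```python
raw_request = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.somesite.com.au\r\n"
    "User-agent: Mozilla/4.0\r\n"
    "Connection: close\r\n"
    "Accept-language: fr\r\n"
    "\r\n"
)

def parse_request(text):
    # The blank line separates the header section from any body.
    head, _, body = text.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return method, target, version, headers, body

method, target, version, headers, body = parse_request(raw_request)
print(method, target, version, headers["host"])
```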
Code  Meaning       Examples
1xx   Information   100 = server agrees to handle client’s request
2xx   Success       200 = request succeeded; 204 = no content present
3xx   Redirection   301 = page moved; 304 = cached page still valid
4xx   Client error  403 = forbidden page; 404 = page not found
5xx   Server error  500 = internal server error; 503 = try again later
HTTP Response Codes
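Because the first digit determines the class, a client can react sensibly even to a code it does not recognise. The sketch below combines the class names from the table with the reason phrases in Python's `http.HTTPStatus`; `CLASSES` and `describe` are illustrative names.

```python
from http import HTTPStatus

# Class names as used in the table above, keyed by the first digit.
CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def describe(code):
    """Return the code, its standard reason phrase, and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({CLASSES[code // 100]})"

for code in (200, 304, 404, 503):
    print(describe(code))
```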
HTTP – Response
HTTP/1.1 200 OK
Connection: close
Date: Thu, 06 Aug 2009 12:00:15 GMT
Server: Apache/2.2.11 (Unix)
Last-modified: Mon, 22 Jun 2009
Content-Length: 6821
Content-Type: text/html
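• Status line (protocol, status code, status phrase): the first line
• Header lines: the lines that follow
• Data (the message body, e.g., an HTML page): follows the blank line after the headers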
HTTP Headers
Header Type Description
User-Agent Request Information about the browser and its platform
Accept Request The type of pages the client can handle
Accept-Charset Request The character sets that are acceptable to the client
Accept-Encoding Request The compression formats the client can handle
Accept-Language Request The natural languages the client can handle
If-Modified-Since Request Time and date to check freshness
If-None-Match Request Previously sent tags to check freshness
Host Request The server’s DNS name
Authorization Request A list of the client’s credentials
Referer Request The previous URL from which the request came
Cookie Request Previously set cookie sent back to the server
Set-Cookie Response Cookie for the client to store
Server Response Information about the server
HTTP Headers
Header Type Description
Content-Encoding Response How the content is encoded (e.g., gzip)
Content-Language Response The natural language used in the page
Content-Length Response The page’s length in bytes
Content-Type Response The page’s MIME type
Content-Range Response Identifies a portion of the page’s content
Last-Modified Response Time and date the page was last changed
Expires Response Time and date when the page stops being valid
Location Response Tells the client where to send its request
Accept-Ranges Response Indicates the server will accept byte range requests
Date Both Date and time the message was sent
Range Both Identifies a portion of a page
Cache-Control Both Directives for how to treat cache
ETag Both Tag for the contents of the page
Upgrade Both The protocol the sender wants to switch to
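Several of these headers work in pairs. For example, a server labels a page with `ETag`, and the client later sends that tag back in `If-None-Match` to ask whether its cached copy is still valid. The sketch below assumes the tag is a truncated content hash, which is one possible scheme rather than a requirement; `make_etag` and `respond` are illustrative names.

```python
import hashlib

def make_etag(body):
    # Assumption: derive the tag from a hash of the content.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body, request_headers):
    """Return (status, headers, body) for a GET of a page with this content."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""  # cached copy still valid, no body sent
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body

page_v1 = b"<p>version 1</p>"
status1, headers1, _ = respond(page_v1, {})  # first visit
status2, _, body2 = respond(page_v1, {"If-None-Match": headers1["ETag"]})
status3, _, body3 = respond(b"<p>version 2</p>", {"If-None-Match": headers1["ETag"]})
print(status1, status2, status3)
```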
• Plugins/Extensions – integrated software modules which
execute inside the browser
– direct access to online context
• Helper – separate program which can be instantiated by the
browser, but can only access local cache of file content
– application/pdf
– application/msword
Client side processing
TN 5th 7-20
• 5 step process:
– Accept TCP Connection from client (browser)
– Identify the file requested
– Get the specified file from the local storage (disk, RAM, …)
– Send the file to the client
– Release the TCP connection
Server side processing – static page
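The five steps map almost line for line onto socket code. The following is a teaching sketch, not a usable server: it handles a single connection, assumes the whole request arrives in one `recv`, and does no path sanitising. `serve_one`, the temporary directory and `index.html` are all assumptions for illustration.

```python
import socket
import tempfile
import threading
from pathlib import Path

def serve_one(listener, root):
    conn, _ = listener.accept()                    # 1. accept TCP connection
    with conn:
        request = conn.recv(4096).decode("ascii", "replace")
        target = request.split(" ")[1]             # 2. identify the file requested
        path = Path(root) / target.lstrip("/")     # (no sanitising: sketch only)
        if path.is_file():
            body = path.read_bytes()               # 3. get the file from local storage
            status = "200 OK"
        else:
            body, status = b"not found", "404 Not Found"
        head = f"HTTP/1.0 {status}\r\nContent-Length: {len(body)}\r\n\r\n"
        conn.sendall(head.encode("ascii") + body)  # 4. send the file to the client
    # 5. leaving the with-block releases the TCP connection

# Demonstration on the loopback interface with a temporary document root.
root = tempfile.mkdtemp()
Path(root, "index.html").write_text("<p>hi</p>")
listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
worker = threading.Thread(target=serve_one, args=(listener, root))
worker.start()

with socket.create_connection(listener.getsockname()) as client:
    client.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")
    reply = b""
    while chunk := client.recv(4096):
        reply += chunk

worker.join()
listener.close()
print(reply.split(b"\r\n", 1)[0].decode())
```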
• A multithreaded Web server with a front end and processing modules
Multi-threaded Web Server
TN 6th 7-22
• A processing module performs a series of steps:
– Resolve name of Web page requested.
– Perform access control on the Web page.
– Check the cache.
– Fetch requested page from disk or run program
– Determine the rest of the response
– Return the response to the client.
– Make an entry in the server log.
Multi-threaded Web Server –
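The steps performed by a processing module can be sketched as a single function. Everything here is a stand-in: `pages` plays the role of the disk, `allowed` is a toy access-control list, and `cache` and `log` are plain Python containers. A real server would run many such modules in separate threads.

```python
def process(path, user, pages, allowed, cache, log):
    """Hypothetical processing module operating on in-memory stand-ins."""
    name = "/index.html" if path == "/" else path        # resolve name of page
    restricted_to = allowed.get(name)
    if restricted_to is not None and user not in restricted_to:
        status, body = 403, b""                          # access control failed
    elif name in cache:
        status, body = 200, cache[name]                  # cache hit
    elif name in pages:
        status, body = 200, pages[name]                  # fetch from "disk"
        cache[name] = body
    else:
        status, body = 404, b""
    headers = {"Content-Length": str(len(body))}         # rest of the response
    log.append((user, name, status))                     # entry in the server log
    return status, headers, body                         # response for the client

pages = {"/index.html": b"<p>home</p>", "/admin.html": b"<p>admin</p>"}
allowed = {"/admin.html": {"alice"}}
cache, log = {}, []

home = process("/", "bob", pages, allowed, cache, log)
denied = process("/admin.html", "bob", pages, allowed, cache, log)
missing = process("/nope.html", "bob", pages, allowed, cache, log)
print(home[0], denied[0], missing[0])
```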
• Goal: satisfy client request without involving origin server –
reduce response time.
TN 6th 7-28
• Used for caching, security and IP address sharing
• The browser sends all HTTP requests to the proxy. The proxy
returns objects in its cache or else the proxy requests object
from origin server, then returns object to client.
• Note: the proxy server acts as both client and server.
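The proxy's dual role can be sketched as a small class: it is a server to the browser (`get`) and a client to the origin (`fetch_from_origin`). This sketch ignores expiry and revalidation, which a real cache must handle; `CachingProxy` and the in-memory "origin" are assumptions for illustration.

```python
class CachingProxy:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> bytes
        self.cache = {}
        self.origin_requests = 0

    def get(self, url):
        """Serve from cache if possible, otherwise ask the origin server."""
        if url not in self.cache:
            self.origin_requests += 1               # proxy acts as a client here
            self.cache[url] = self.fetch_from_origin(url)
        return self.cache[url]                      # proxy acts as a server here

# Stand-in origin server: a dictionary lookup instead of a network request.
origin = {"http://www.example.com/": b"<p>home</p>"}
proxy = CachingProxy(origin.__getitem__)

first = proxy.get("http://www.example.com/")
second = proxy.get("http://www.example.com/")
print(proxy.origin_requests)
```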
TN 6th 7-44
• The network stores no state about web sessions
• Cookies can place a small amount (<4 KB) of information on
the user’s computer and re-use it deterministically (RFC 2109)
• Cookies have 5 fields
– domain, path, content, expiry, security
• How to keep state – maintain state at sender/receiver over
multiple transactions; http messages carry “state”
• Questionable mechanism for tracking users (invisibly
perhaps) and learning about user behaviour
– e.g., competitor snooping, undesirable content etc.
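The way HTTP messages carry state can be seen in the two cookie headers. The sketch below uses Python's `http.cookies` to build a `Set-Cookie` value on the server side and to parse a `Cookie` value sent back by the client; the cookie names, values and domain are hypothetical.

```python
from http.cookies import SimpleCookie

# Server side: create a cookie and set some of its fields.
jar = SimpleCookie()
jar["session-id"] = "abc123"
jar["session-id"]["domain"] = ".example.com"
jar["session-id"]["path"] = "/"
jar["session-id"]["secure"] = True
set_cookie_header = jar["session-id"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Later request: the client returns its stored cookies in one Cookie header.
received = SimpleCookie()
received.load("session-id=abc123; theme=dark")
session = received["session-id"].value
print("session:", session)
```

The server holds whatever state it associates with `abc123`; the cookie itself is only the key that lets successive stateless requests be linked together.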
Example Cookies
Name             Value                    Domain                Expires      HTTPOnly  Secure
Session-id       356-7554479-             .amazon.com.au        2036-01-01.
Session-id-time  2082787201l              .amazon.com.au        2036-01-01.
ad-id            A3kfU1c7DE3Wqz474A25Zfs  .amazon.adsystem.com  2037-01-01.

Name             Value                    Domain                Expires      HTTPOnly  Secure
ad-id            A3kfU1c7DE3Wqz474A25Zfs  .amazon.adsystem.com  2037-01-01.

amazon.com.au
nytimes.com
• HTML - Hypertext Markup Language
– a simple language designed to encode both content and
presentational information
– Plain text encoding, with browser based rendering
– Early HTML was restricted to the ISO-8859-1 (Latin-1) character set; internationalisation
arrived with HTML 4.0, which adopted Unicode (ISO 10646) as its document character set
• Web Page Components
– Structural divisions:
• Head
• Body …
– Syntactically Restricted Tag Sets
– Attributes & Values
Static web documents
• HTML was originally an instance of SGML
– standard generalized markup language
• People wanted an HTML-like language to describe data that is not hypertext
– but SGML is too general / “heavy”
• XML (Extensible Markup Language)
& XSL (Extensible Stylesheet Language)
– Primary feature: separation of content and presentational markup
– Stringent validation requirements
• XHTML
– Essentially an expression of HTML 4.0 as valid XML
– Major differences to HTML 4.0 are the requirements for conformance, case folding,
well-formedness, attribute specification, nesting and embedding, and inclusion of a
document type identifier
Beyond HTML
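The well-formedness requirement is the easiest of these differences to demonstrate: an XML parser rejects markup that an HTML 4.0 browser would accept. The sketch uses the standard `xml.etree.ElementTree` parser; the two fragments and the `is_well_formed` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

html_style = "<p>line one<br>line two</p>"     # <br> never closed: fine in HTML 4.0
xhtml_style = "<p>line one<br />line two</p>"  # every element closed, as XML requires

print(is_well_formed(html_style), is_well_formed(xhtml_style))
```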
Dynamic Content
TN 5th 7-35
Dynamic Content