
COMP30023 – Computer

Application Layer – HTTP and HTML

• History of the internet
• Network Protocol Models (stacks)
• OSI vs TCP/IP

• Acknowledgement:
• These slides are minor modifications of those prepared by

University of Melbourne

• Top-down approach
– We’ll gradually peel away the layers over the coming weeks

• Application Layer
– HTTP (the web protocol), and in relation to it, HTML

• Wireshark – viewing network protocols in real-time


• Sir Tim Berners-Lee
– 1984 return to CERN (TCP/IP installed)
– Saw many online databases with different access mechanisms (FTP, Gopher, …)
– 1989 wrote the proposal “a large hypertext database with typed links” (No takers)
– by 1990, had designed and built: HTTP, HTML, httpd, WorldWideWeb (browser)
– 1992 left for MIT, after CERN IT Head described it as a misallocation of resources
• Hypertext
– Term coined in 1963
– Creation and use of linked content

World Wide Web – A Short History


• The vision was that HTTP would be the “glue” between data on different existing
– e.g., FTP (file transfer protocol) – many files available for download

• GOPHER – distributed database developed at U. Minnesota in 1991
– Hierarchical file structure
– More suited for text interfaces – lower network overhead
– February 1993, charging for server

• May 1994 first International WWW Conference (at CERN)
• September 1994 W3C formed (DARPA & European Community)

– Standardisation of web technologies – royalty free
• Browser wars 1994-1998 (Microsoft vs. Netscape)
• 1999 – 2001 .com boom
• 2002+ Ubiquitous web
• Web 2.0 – semantic web, social media

World Wide Web – A Short History


• Client – typically a browser based access to pages
• Server – daemon based content delivery of pages
• URL ≈ Protocol + DNS Name + file name

WWW – Components
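The URL decomposition above can be sketched with Python's standard `urllib.parse` module. This is a minimal illustration, not part of the original slides; the function name `url_components` and the dictionary keys are assumptions chosen to mirror the slide's wording.

```python
from urllib.parse import urlsplit


def url_components(url: str) -> dict:
    """Split a URL into the three parts named above (illustrative helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,         # e.g. "http"
        "dns_name": parts.hostname,       # e.g. "www.somesite.com.au"
        "file_name": parts.path or "/",   # an empty path means the root page
    }
```

For example, `url_components("http://www.somesite.com.au/somedir/page.html")` separates the protocol `http`, the DNS name `www.somesite.com.au`, and the file name `/somedir/page.html`.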


TN 4th 7-19

WWW – Architecture


TN 5th 7-18

• HyperText Transfer Protocol
– Defined everything needed for the web

• TCP/IP Model vs OSI Model
– Application layer (except compression/encoding – Presentation)

• Resources are referred to by URLs

HTTP – Overview


• Uniform Resource Locator
– Sir Tim called it the “universal resource locator”
– Defined in original HTTP specification
– An address for a resource
– Can be relative “./nextpage.html” or absolute “http://www.google.com”

• Separate specification by W3C in 1998 for URI
– Uniform Resource Identifier
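How a relative URL is turned into an absolute one can be illustrated with `urllib.parse.urljoin`. This is a sketch only; the base page `http://www.google.com/docs/index.html` and the wrapper name `resolve` are assumptions made for the example.

```python
from urllib.parse import urljoin


def resolve(base_url: str, reference: str) -> str:
    """Resolve a relative or absolute reference against the current page."""
    return urljoin(base_url, reference)


# Hypothetical page that contains the links being resolved.
BASE = "http://www.google.com/docs/index.html"
```

With this base, `resolve(BASE, "./nextpage.html")` names a sibling of `index.html`, while an absolute reference is returned unchanged because it already carries its own protocol and host.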


• Overview:
– Client initiates TCP connection (creates socket) to server, port 80
– Server accepts TCP connection from client
– HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
– TCP connection closed

• Connections:
– HTTP 1.0 – single use connection
– HTTP 1.1 – persistent connections, additional headers
– HTTP/2 – 2015 – Further speed improvements (origins in SPDY)
– HTTP/3 – 2022 (RFC 9114) – Allow more parallelism in data loading (runs over QUIC)

HTTP – Protocol Overview


Non-persistent HTTP

• Non-persistent:
– requires 2 “response times” (round trips: one to initiate the TCP connection and one for the initial HTTP request) per object + file transmission time
– OS overhead for each TCP connection
– browsers often open parallel TCP connections to fetch referenced objects

• Persistent:
– server leaves connection open after sending response
– subsequent HTTP messages between same client/server sent over open connection
– client sends requests as soon as it encounters a referenced object, reducing overall response time

Persistent vs. Non-persistent
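The cost comparison above can be worked through with a small timing model. This is a simplified sketch under stated assumptions: objects are fetched strictly one after another (no parallel connections, no pipelining), every object has the same transmission time, and the persistent case pays one round trip to open the connection and then one round trip per request. The function names are illustrative.

```python
def non_persistent_time(n_objects: int, rtt: int, transmit: int) -> int:
    """Each object costs a TCP handshake (1 RTT) plus request/response (1 RTT)."""
    return n_objects * (2 * rtt + transmit)


def persistent_time(n_objects: int, rtt: int, transmit: int) -> int:
    """One handshake for the whole session, then 1 RTT per object."""
    return rtt + n_objects * (rtt + transmit)
```

For a page with 11 objects, a 100 ms round trip and 10 ms transmission time per object, the model gives 11 × 210 = 2310 ms for non-persistent and 100 + 11 × 110 = 1310 ms for persistent connections.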


• HTTP with (a) multiple connections and sequential requests.
(b) A persistent connection and sequential requests.
(c) A persistent connection and pipelined requests.

HTTP Request Connection


TN 6th 7-29

• Steps that occur when a link is selected:
– Browser determines the URL
– Browser asks DNS for the IP address of the server (Resolving URL)
– DNS replies
– The browser makes a TCP connection
– Sends HTTP request for the page
– Server sends the page as HTTP response
– Browser fetches other URLs as needed
– The browser displays the page (progressively, as content arrives)
– The TCP connections are released

HTTP – Summary of key steps
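The key steps above can be sketched with Python's `socket` module. This is a minimal illustration, not a full HTTP client: the function name `fetch` is an assumption, only plain `http://` URLs are handled, and the request asks the server to close the connection so that end-of-stream marks the end of the response.

```python
import socket
from urllib.parse import urlsplit


def fetch(url: str) -> tuple:
    """Return (status_line, body) for a plain-HTTP GET; illustrative only."""
    parts = urlsplit(url)                          # browser determines the URL
    host, port = parts.hostname, parts.port or 80
    # Ask the resolver (DNS for real host names) for the server's address.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.connect(addr)                         # make the TCP connection
        request = (f"GET {parts.path or '/'} HTTP/1.1\r\n"
                   f"Host: {parts.netloc}\r\n"
                   "Connection: close\r\n\r\n")
        sock.sendall(request.encode("ascii"))      # send the HTTP request
        chunks = []
        while data := sock.recv(4096):             # read until the server closes
            chunks.append(data)
    # Leaving the with-block releases the TCP connection.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body
```

A real browser would go on to parse the returned page and fetch any further URLs it references; that step is omitted here.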


• Idempotent – multiple identical requests have the same effect as a single request
• Safe – only for information retrieval; should not change state on the server

HTTP Method   Safe   Idempotent   Cacheable
GET           Yes    Yes          Yes
HEAD          Yes    Yes          Yes
POST          No     No           Yes/No
PUT           No     Yes          No
DELETE        No     Yes          No
CONNECT       No     No           No
OPTIONS       Yes    Yes          No
TRACE         Yes    Yes          No
PATCH         No     No           No

HTTP – Request Methods
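One practical use of these properties is deciding whether a client may resend a request after a failure. The sketch below encodes the Safe and Idempotent columns of the table as data; the names `MethodProps`, `METHODS` and `can_retry_automatically` are assumptions for illustration, and the retry rule (resend only idempotent methods) is a common convention rather than something stated on the slide.

```python
from typing import NamedTuple


class MethodProps(NamedTuple):
    safe: bool
    idempotent: bool


# Safe / Idempotent columns copied from the table above.
METHODS = {
    "GET": MethodProps(True, True),
    "HEAD": MethodProps(True, True),
    "POST": MethodProps(False, False),
    "PUT": MethodProps(False, True),
    "DELETE": MethodProps(False, True),
    "CONNECT": MethodProps(False, False),
    "OPTIONS": MethodProps(True, True),
    "TRACE": MethodProps(True, True),
    "PATCH": MethodProps(False, False),
}


def can_retry_automatically(method: str) -> bool:
    """Resending is harmless only if repeating the request changes nothing more."""
    return METHODS[method.upper()].idempotent
```

Note that every safe method in the table is also idempotent, but not the reverse: PUT and DELETE change server state yet can be repeated without further effect.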


Wireshark Example


HTTP Request Example


GET /somedir/page.html HTTP/1.1
Host: www.somesite.com.au
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr

(extra new line)

• Request line (GET, POST, …): the first line
• Header lines: the lines that follow the request line
• Blank line (two consecutive line breaks; CRLF CRLF on the wire): indicates end of message
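The structure of the request message (request line, header lines, blank line) can be made concrete with a small parser. This is a minimal sketch assuming a well-formed request with CRLF line endings; `parse_request` is an illustrative name, and header names are lower-cased because HTTP header names are case-insensitive.

```python
def parse_request(raw: bytes) -> tuple:
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")      # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)  # the request line
    headers = {}
    for line in lines[1:]:                          # the header lines
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# The example request from the slide, with explicit CRLF line endings.
EXAMPLE = (b"GET /somedir/page.html HTTP/1.1\r\n"
           b"Host: www.somesite.com.au\r\n"
           b"User-agent: Mozilla/4.0\r\n"
           b"Connection: close\r\n"
           b"Accept-language: fr\r\n"
           b"\r\n")
```

Calling `parse_request(EXAMPLE)` yields the method `GET`, the target `/somedir/page.html`, the version `HTTP/1.1`, four headers, and an empty body.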

Code   Meaning        Examples
1xx    Information    100 = server agrees to handle client’s request
2xx    Success        200 = request succeeded; 204 = no content present
3xx    Redirection    301 = page moved; 304 = cached page still valid
4xx    Client error   403 = forbidden page; 404 = page not found
5xx    Server error   500 = internal server error; 503 = try again later

HTTP Response Codes
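Because the first digit alone identifies the class of a response, a client can handle unknown codes by class. The sketch below uses the standard library's `http.HTTPStatus` for the reason phrases; the names `CLASSES` and `describe` are illustrative assumptions.

```python
from http import HTTPStatus

# Class names taken from the table above, keyed by the first digit.
CLASSES = {1: "Information", 2: "Success", 3: "Redirection",
           4: "Client error", 5: "Server error"}


def describe(code: int) -> str:
    """Return e.g. '404 Not Found (Client error)' for a status code."""
    category = CLASSES[code // 100]
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:          # a code the standard library does not list
        phrase = "unknown"
    return f"{code} {phrase} ({category})"
```

An unlisted code such as 599 still falls into the "Server error" class, which is how clients are expected to treat codes they do not recognise.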


HTTP – Response


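HTTP/1.1 200 OK
Connection: close
Date: Thu, 06 Aug 2009 12:00:15 GMT
Server: Apache/2.2.11 (Unix)
Last-modified: Mon, 22 Jun 2009
Content-Length: 6821
Content-Type: text/html

…

• Status line (protocol, status code, status phrase): the first line
• Header lines: the lines that follow the status line
• Data: the message body after the blank line (shown as … above)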

HTTP Headers
Header              Type       Description
User-Agent          Request    Information about the browser and its platform
Accept              Request    The type of pages the client can handle
Accept-Charset      Request    The character sets that are acceptable to the client
Accept-Encoding     Request    The compression formats the client can handle
Accept-Language     Request    The natural languages the client can handle
If-Modified-Since   Request    Time and date to check freshness
If-None-Match       Request    Previously sent tags to check freshness
Host                Request    The server’s DNS name
Authorization       Request    A list of the client’s credentials
Referer             Request    The previous URL from which the request came
Cookie              Request    Previously set cookie sent back to the server
Set-Cookie          Response   Cookie for the client to store
Server              Response   Information about the server

HTTP Headers
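The Set-Cookie and Cookie headers listed above can be illustrated with the standard library's `http.cookies` module. This is a sketch only: the helper names, the cookie name `session-id`, and the domain `.example.com` are hypothetical values chosen for the example.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, domain: str,
                    path: str = "/", max_age: int = 3600) -> str:
    """Build the value of a Set-Cookie response header (server side)."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["domain"] = domain
    cookie[name]["path"] = path
    cookie[name]["max-age"] = max_age   # lifetime in seconds
    cookie[name]["secure"] = True       # only send over encrypted connections
    return cookie[name].OutputString()


def parse_cookie_header(header_value: str) -> dict:
    """Parse the value of a Cookie request header (what the client sends back)."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}
```

The server sends the string from `make_set_cookie("session-id", "abc123", ".example.com")` once; on later requests the browser returns only the name/value pairs, which `parse_cookie_header` recovers.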


Header             Type       Description
Content-Encoding   Response   How the content is encoded (e.g., gzip)
Content-Language   Response   The natural language used in the page
Content-Length     Response   The page’s length in bytes
Content-Type       Response   The page’s MIME type
Content-Range      Response   Identifies a portion of the page’s content
Last-Modified      Response   Time and date the page was last changed
Expires            Response   Time and date when the page stops being valid
Location           Response   Tells the client where to send its request
Accept-Ranges      Response   Indicates the server will accept byte range requests
Date               Both       Date and time the message was sent
Range              Both       Identifies a portion of a page
Cache-Control      Both       Directives for how to treat cache
ETag               Both       Tag for the contents of the page
Upgrade            Both       The protocol the sender wants to switch to
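The freshness headers in these tables (ETag with If-None-Match) work together as a conditional GET: the client presents the tag it already holds, and the server answers 304 with no body if the page is unchanged. The sketch below shows the server-side decision only; deriving the tag from a SHA-256 hash of the body is an assumption, since servers are free to choose any tag scheme, and the function names are illustrative.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a tag from the content (hash-based scheme assumed for illustration)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body), honouring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""          # cached copy still valid
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body
```

A first request without `If-None-Match` receives the full page and its tag; repeating the request with that tag receives 304 and an empty body until the content changes.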

• Plugins/Extensions – integrated software module which executes inside the browser
– direct access to online context

• Helper – separate program which can be instantiated by the browser, but can only access local cache of file content
– application/pdf
– application/msword

Client side processing


TN 5th 7-20

• 5 step process:
– Accept TCP Connection from client (browser)
– Identify the file requested
– Get the specified file from the local storage (disk, RAM, …)
– Send the file to the client
– Release the TCP connection

Server side processing – static page
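The five steps above map directly onto a few socket calls. The following is a minimal sketch that serves exactly one request and is not production code: `serve_one` is an illustrative name, the request is assumed to arrive in a single `recv` and to be well formed, and the response uses HTTP/1.0 so the connection closes after the file is sent.

```python
import socket
from pathlib import Path


def serve_one(listener: socket.socket, root: Path) -> None:
    """Handle a single request for a static file under `root`."""
    conn, _ = listener.accept()                     # 1. accept TCP connection
    with conn:
        request = conn.recv(4096).decode("iso-8859-1")
        target = request.split(" ", 2)[1]           # 2. identify the file requested
        path = (root / target.lstrip("/")).resolve()
        # Refuse paths that escape the document root (e.g. "/../secret").
        if root.resolve() in path.parents and path.is_file():
            body = path.read_bytes()                # 3. get the file from storage
            status = "200 OK"
        else:
            body, status = b"not found", "404 Not Found"
        head = f"HTTP/1.0 {status}\r\nContent-Length: {len(body)}\r\n\r\n"
        conn.sendall(head.encode("ascii") + body)   # 4. send the file
    # 5. leaving the with-block releases the TCP connection
```

The caller is expected to create the listening socket (bind and listen) and to call `serve_one` in a loop; a multi-threaded server would hand each accepted connection to a worker instead.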


• A multithreaded Web server with a front end and processing modules

Multi-threaded Web Server


TN 6th 7-22

• A processing module performs a series of steps:
– Resolve name of Web page requested.
– Perform access control on the Web page.
– Check the cache.
– Fetch requested page from disk or run program
– Determine the rest of the response
– Return the response to the client.
– Make an entry in the server log.

Multi-threaded Web Server – dynamic


• Goal: satisfy client request without involving origin server – reduce response time.


TN 6th 7-28

• Used for caching, security and IP address sharing
• The browser sends all HTTP requests to the proxy. The proxy returns objects in its cache or else the proxy requests object from origin server, then returns object to client.

• Note: the proxy server acts as both client and server.
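The proxy behaviour described above (answer from the cache, otherwise ask the origin server and remember the result) can be sketched as follows. The class name `CachingProxy` and the injected `fetch_from_origin` function are assumptions for illustration; a real proxy would also honour expiry and validation headers, which this sketch deliberately omits.

```python
from typing import Callable, Dict


class CachingProxy:
    """Acts as a server towards the browser and as a client towards the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0        # how often the origin was contacted

    def get(self, url: str) -> bytes:
        if url in self._cache:          # cache hit: origin server not involved
            return self._cache[url]
        self.origin_requests += 1       # cache miss: proxy acts as a client
        body = self._fetch(url)
        self._cache[url] = body
        return body
```

Requesting the same URL twice contacts the origin only once, which is where the reduction in response time comes from.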



TN 6th 7-44

• The network stores no state about web sessions
• Cookies can place a small amount (<4 KB) of information on the user's computer and re-use it deterministically (RFC 2109)
• Cookies have 5 fields – domain, path, content, expiry, security
• How to keep state – maintain state at sender/receiver over multiple transactions; HTTP messages carry “state”
• Questionable mechanism for tracking users (invisibly perhaps) and learning about user behaviour – e.g., competitor snooping, undesirable content etc.

Example Cookies

Name              Value                      Domain                 Expires       HTTPOnly   Secure
Session-id        356-7554479-               .amazon.com.au         2036-01-01.
Session-id-time   2082787201l                .amazon.com.au         2036-01-01.
ad-id             A3kfU1c7DE3W qz474A25Zfs   .amazon.adsystem.com   2037-01-01.

Name              Value                      Domain                 Expires       HTTPOnly   Secure
ad-id             A3kfU1c7DE3W qz474A25Zfs   .amazon.adsystem.com   2037-01-01.

amazon.com.au   nytimes.com

• HTML - Hypertext Markup Language
– a simple language designed to encode both content and presentational information
– Plain text encoding, with browser based rendering
– Restricted to ISO-8859 Latin-1 character set (internationalisation not introduced until XHTML with UTF encodings)

• Web Page Components
– Structural divisions:
• Head …
• Body …

– Syntactically Restricted Tag Sets
– Attributes & Values

Static web documents
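The page components named above (head and body divisions, tags, attributes and values) can be seen by running a small page through Python's standard `html.parser`. The page text `PAGE` and the names `TagCollector` and `collect_tags` are hypothetical, introduced only for this illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag together with its attributes and values."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


# A minimal static page with the two structural divisions.
PAGE = """<html>
<head><title>Example</title></head>
<body><a href="./nextpage.html">Next</a></body>
</html>"""


def collect_tags(html_text: str) -> list:
    collector = TagCollector()
    collector.feed(html_text)
    return collector.tags
```

For `PAGE`, the collected tags are `html`, `head`, `title`, `body` and `a`, and only the `a` tag carries an attribute (`href`) with a value.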


• HTML was originally an instance of SGML
– standard generalized markup language

• People wanted an HTML-like language to describe data that is not hypertext
– but SGML is too general / “heavy”

• XML (Extensible Markup Language) & XSL (Extensible Stylesheet Language)
– Primary feature: separation of content and presentational markup
– Stringent validation requirements

• XHTML (Extensible Hypertext Markup Language)
– Essentially an expression of HTML 4.0 as valid XML
– Major differences to HTML 4.0 are the requirements for conformance, case folding, well-formedness, attribute specification, nesting and embedding, and inclusion of a document type identifier

Beyond HTML
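The well-formedness and case-folding requirements mentioned above can be demonstrated with an XML parser, which rejects markup that a tolerant HTML browser would accept. This sketch uses the standard `xml.etree.ElementTree` module; it checks well-formedness only, not validity against a document type, and `is_well_formed` is an illustrative name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """True if the markup parses as XML (every tag closed and properly nested)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

For instance, `<p>line<br/></p>` is well-formed, whereas `<p>line<br></p>` (unclosed `br`) and `<P>text</p>` (tag names differing in case) are not.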


Dynamic Content


TN 5th 7-35

Dynamic Content


Please enter your name:

Please enter your age:

Reply:

Hello .
Prediction: next year you will be

Reply:

Hello Barbara.
Prediction: next year you will be

From TN 5th Edition, Figure 7-30
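The form-and-reply example above is the essence of server-side dynamic content: the server computes the page from the submitted fields instead of reading it from a file. The sketch below shows that computation in Python; the field names `name` and `age`, the function name `reply`, and the sample age are assumptions, since the figure's original script is not reproduced in these notes.

```python
from html import escape
from urllib.parse import parse_qs


def reply(form_data: str) -> str:
    """Build the reply page from URL-encoded form data."""
    fields = parse_qs(form_data)
    name = fields["name"][0]
    age = int(fields["age"][0])
    # Escape user input so a name containing markup is shown as text.
    return (f"Reply:\n"
            f"Hello {escape(name)}.\n"
            f"Prediction: next year you will be {age + 1}")
```

Submitting `name=Barbara&age=32` (a hypothetical input) produces a page greeting Barbara and predicting an age of 33.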


TN 6th 7-24

• Technologies for producing interactive web applications

• JavaScript
• Java Applets – compiled Java code (platform independent)
• ActiveX – compiled code for Windows

• AJAX (Asynchronous JavaScript and XML)
– HTML and CSS: present information as pages.
– DOM: change parts of pages while they are viewed.
– XML: let programs exchange data with the server.
– An asynchronous way to send and retrieve XML data.
– JavaScript as a language to bind all this together

Client-side Scripting


• Tracking with cookies is well known
• Tracking companies have expanded beyond simple cookies

– Plug-in, browser fingerprinting
• https://coveryourtracks.eff.org/ Project to research tracking techniques in browsers
• How unique is your browser (by EFF): https://coveryourtracks.eff.org/static/browser-uniqueness.p

And finally…


https://coveryourtracks.eff.org/
https://panopticlick.eff.org/static/browser-uniqueness.pdf

• The slides were based on slides prepared by ,
based on material developed previously by: ,
, , and .

• Some of the images included in the notes were supplied as
part of the teaching resources accompanying the text books
listed on the previous slides.
– (And also) Computer Networks, 6th Edition, Tanenbaum A., Wetherall. D.

https://ebookcentral.proquest.com/lib/unimelb/detail.action?docID=6481879

• Textbook Reference: Section 2.1 and 2.2, and related topics
from pp.199-210

Acknowledgement


Application Layer – HTTP and HTML
World Wide Web – A Short History
World Wide Web – A Short History (2)
WWW – Components
WWW – Architecture
HTTP – Overview
HTTP – Protocol Overview
Non-persistent HTTP
Persistent vs. Non-persistent
HTTP Request Connection
HTTP – Summary of key steps
HTTP – Request Methods
Wireshark Example
HTTP Request Example
HTTP Response Codes
HTTP – Response
HTTP Headers
HTTP Headers (2)
Client side processing
Server side processing – static page
Multi-threaded Web Server
Multi-threaded Web Server – dynamic
Example Cookies
Static web documents
Beyond HTML
Dynamic Content
Dynamic Content (2)
Client-side Scripting
And finally…
Acknowledgement
