COMP30023 – Computer Systems
Application Layer – HTTP and HTML

History of the internet

Network Protocol Models (stacks) OSI vs TCP/IP
Acknowledgement:
These slides are minor modifications of those prepared by Dr
© 4/11 University of Melbourne /22

Top-down approach
We’ll gradually peel away the layers over the coming weeks
Application Layer
– HTTP (the web protocol), and in relation to it, HTML
Wireshark – viewing network protocols in real-time

World Wide Web – A Short History
• Sir Tim Berners-Lee
– 1984 return to CERN (TCP/IP installed)
– Saw many online databases with different access mechanisms (FTP, Gopher, …)
– 1989 wrote the proposal “a large hypertext database with typed links” (No takers)
– by 1990, had designed and built: HTTP, HTML, httpd, WorldWideWeb (browser)
– 1992 left for MIT, after CERN IT Head described it as a misallocation of resources
• Hypertext
– Ted Nelson coined the term in 1963
– Creation and use of linked content

World Wide Web – A Short History
• The vision was that HTTP would be the “glue” between data on different existing protocols
– e.g., FTP (file transfer protocol) – many files available for download
• GOPHER – distributed database developed at U. Minnesota in 1991
– Hierarchical file structure
– More suited for text interfaces – lower network overhead
– February 1993, charging for server
• May 1994 first International WWW Conference (at CERN)
• September 1994 W3C formed (DARPA & European Community)
– Standardisation of web technologies – royalty free
• Browser wars 1994-1998 (Microsoft vs. Netscape)
• 1999 – 2001 .com boom
• 2002+ Ubiquitous web
• Web 2.0 – semantic web, social media

WWW – Components
• Client – typically a browser based access to pages
• URL ≈ Protocol + DNS Name + file name
• Server – daemon based content delivery of pages
TN 4th 7-19

WWW – Architecture
TN 5th 7-18

HTTP – Overview
HyperText Transfer Protocol
– Defined everything needed for the web
TCP/IP Model vs OSI Model
– Application layer (except compression/encoding - Presentation)
Resources are referred to by URLs

Uniform Resource Locator
– Sir Tim called it the “universal resource locator”
– Defined in original HTTP specification
– An address for a resource
– Can be relative “./nextpage.html” or absolute “http://www.google.com”
– Uniform Resource Identifier
Separate specification by W3C in 1998 for URI
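A minimal sketch of how relative and absolute URLs relate, using Python's standard `urllib.parse`. The base page `http://www.google.com/search/index.html` is a made-up example chosen only so that the relative reference has something to resolve against.

```python
from urllib.parse import urljoin, urlsplit


def describe(url):
    """Split an absolute URL into protocol, DNS name and file name."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


# Hypothetical page that contains the relative link "./nextpage.html".
base = "http://www.google.com/search/index.html"

print(describe(base))
# A relative URL is resolved against the URL of the page it appears in.
print(urljoin(base, "./nextpage.html"))
```

The relative form only has meaning together with a base URL; the absolute form is self-contained.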

HTTP – Protocol Overview
– Client initiates TCP connection (creates socket) to server, port 80
– Server accepts TCP connection from client
– HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
– TCP connection closed
Connections:
– HTTP 1.0 – single use connection
– HTTP 1.1 – persistent connections, additional headers
– HTTP/2 – 2015 – Further speed improvements (origins in SPDY)
– HTTP/3 (RFC 9114, 2022) – Allow more parallelism in data loading (QUIC)
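The connect / accept / exchange / close sequence can be sketched with plain sockets. This is an illustrative toy, not a real server: it runs both ends on the loopback address, uses an ephemeral port instead of port 80, speaks HTTP/1.0-style single-use connections, and assumes the whole request arrives in one `recv`. The function names are hypothetical.

```python
import socket
import threading


def serve_once(listener):
    """Server side: accept one TCP connection, answer one request, close."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # read the request (assumed to fit in one recv)
        body = b"<html>hello</html>"
        head = b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        conn.sendall(head + body)
    # Leaving the "with" block closes the connection (single use).


def fetch(host, port, path):
    """Client side: open a TCP connection, send a GET, read until close."""
    with socket.create_connection((host, port)) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    reply = fetch("127.0.0.1", port, "/index.html")
    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo().decode("ascii"))
```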

Non-persistent HTTP

Persistent vs. Non-persistent
Non-persistent:
– requires 2 “response times” (one to initiate TCP connection and one for initial HTTP request) per object + file transmission time
– OS overhead for each TCP connection
– browsers often open parallel TCP connections to fetch referenced objects
Persistent:
– server leaves connection open after sending response
– subsequent HTTP messages between same client/server sent over open connection
– client sends requests as soon as it encounters a referenced object, reducing overall response time
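A rough back-of-the-envelope model of the difference, under simplifying assumptions that are not from the slides: objects are fetched strictly one after another (no parallel connections, no pipelining), every object has the same transmission time, and one "response time" is one round trip. The function names and the sample numbers are illustrative only.

```python
def non_persistent_ms(n_objects, rtt_ms, transmit_ms):
    """Each object pays one round trip for the TCP handshake and one for
    the HTTP request/response, plus its own transmission time."""
    return n_objects * (2 * rtt_ms + transmit_ms)


def persistent_ms(n_objects, rtt_ms, transmit_ms):
    """The handshake is paid once; each object then costs one round trip
    for the request/response plus its transmission time."""
    return rtt_ms + n_objects * (rtt_ms + transmit_ms)


# Example: a page plus 10 referenced objects, 100 ms round trip,
# 10 ms to transmit each object.
print(non_persistent_ms(11, 100, 10))
print(persistent_ms(11, 100, 10))
```

With these numbers the non-persistent fetch takes 2310 ms and the persistent one 1310 ms; the saving is one round trip per object after the first.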

HTTP Request Connection
TN 6th 7-29
• HTTP with (a) multiple connections and sequential requests. (b) A persistent connection and sequential requests.
(c) A persistent connection and pipelined requests.

HTTP – Summary of key steps
Steps that occur when a link is selected:
– Browser determines the URL
– Browser asks DNS for the IP address of the server (Resolving URL)
– DNS replies
– The browser makes a TCP connection
– Sends HTTP request for the page
– Server sends the page as HTTP response
– Browser fetches other URLs as needed
– The browser displays the page (progressively, as content arrives)
– The TCP connections are released
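The first steps (determine the URL, work out which host and port to contact, resolve the name) can be sketched as follows. The helper names `plan_request` and `resolve` are hypothetical, and the defaults of port 80 for `http` and 443 for `https` are the usual conventions rather than something stated on the slide.

```python
import socket
from urllib.parse import urlsplit


def plan_request(url):
    """Work out where to connect and what to ask for."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    return parts.hostname, port, path


def resolve(host, port):
    """Ask the resolver for the server's IP address(es)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


host, port, path = plan_request("http://www.somesite.com.au/somedir/page.html")
print(host, port, path)
# A numeric address needs no DNS lookup, so this line works offline.
print(resolve("127.0.0.1", 80))
```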

HTTP – Request Methods
(Table columns: HTTP Method, Idempotent, Safe)
• Idempotent – multiple identical requests have the same effect as a single request
• Safe – Only for information retrieval, should not change state
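As an illustration, the two properties can be written down as a small lookup table. The classification below follows the standard HTTP method definitions (RFC 9110); the table layout and helper name are just a sketch.

```python
# method -> (safe, idempotent), following the standard HTTP definitions
METHOD_PROPERTIES = {
    "GET":     (True,  True),
    "HEAD":    (True,  True),
    "OPTIONS": (True,  True),
    "TRACE":   (True,  True),
    "PUT":     (False, True),
    "DELETE":  (False, True),
    "POST":    (False, False),
    "PATCH":   (False, False),
}


def can_retry_blindly(method):
    """An idempotent request may be re-sent after a lost response without
    changing the outcome; a non-idempotent one (e.g. POST) may not."""
    _safe, idempotent = METHOD_PROPERTIES[method.upper()]
    return idempotent


print(can_retry_blindly("PUT"))
print(can_retry_blindly("POST"))
```

Every safe method is also idempotent, but not the other way round: PUT and DELETE change state, yet repeating them leaves the server in the same state.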

Wireshark Example

HTTP Request Example
Request line (GET, POST, HEAD):
GET /somedir/page.html HTTP/1.1
Header lines:
Host: www.somesite.com.au
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
(extra new line)
Blank line (2 LF or 2 CR/LF) indicates end of message
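A small sketch of how such a message is assembled and taken apart again. The helper names are hypothetical, and the parser assumes CR/LF line endings and a request with no body.

```python
def build_request(method, path, headers):
    """Request line, then one line per header, then a blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request(text):
    """Split a request into method, path, version and a header dict."""
    head, _, _body = text.partition("\r\n\r\n")  # blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return method, path, version, headers


message = build_request("GET", "/somedir/page.html", {
    "Host": "www.somesite.com.au",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
print(message)
print(parse_request(message))
```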

HTTP Response Codes
Code  Meaning       Examples
1xx   Information   100 = server agrees to handle client’s request
2xx   Success       200 = request succeeded; 204 = no content present
3xx   Redirection   301 = page moved; 304 = cached page still valid
4xx   Client error  403 = forbidden page; 404 = page not found
5xx   Server error  500 = internal server error; 503 = try again later
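Because the class of a response is carried by the first digit of the code, a client can react sensibly even to a code it has never seen. A minimal sketch (the function name is illustrative):

```python
STATUS_CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Classify a status code by its hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


for code in (100, 204, 304, 404, 503):
    print(code, status_class(code))
```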

HTTP – Response
Status line (protocol, status code, status phrase):
HTTP/1.1 200 OK
Header lines:
Connection: close
Date: Thu, 06 Aug 2009 12:00:15 GMT
Server: Apache/2.2.11 (Unix)
Last-modified: Mon, 22 Jun 2009
Content-Length: 6821
Content-Type: text/html

Data, e.g., requested HTML file
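On the receiving side, the blank line separates the headers from the data, and `Content-Length` says how many bytes of data belong to this response. The sketch below assumes the whole response is already in memory; `parse_response` is a hypothetical helper, not a library function.

```python
def parse_response(raw):
    """Return (status code, status phrase, headers, body) from raw bytes."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length frames the body; without it, take everything received.
    length = int(headers.get("content-length", len(rest)))
    return int(code), phrase, headers, rest[:length]


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Connection: close\r\n"
       b"Content-Length: 5\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"hello")
print(parse_response(raw))
```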

HTTP Headers
Header             Type      Description
User-Agent         Request   Information about the browser and its platform
Accept             Request   The type of pages the client can handle
Accept-Charset     Request   The character sets that are acceptable to the client
Accept-Encoding    Request   The compression formats the client can handle
Accept-Language    Request   The natural languages the client can handle
If-Modified-Since  Request   Time and date to check freshness
If-None-Match      Request   Previously sent tags to check freshness
Host               Request   The server’s DNS name
Authorization      Request   A list of the client’s credentials
Referer            Request   The previous URL from which the request came
Cookie             Request   Previously set cookie sent back to the server
Set-Cookie         Response  Cookie for the client to store
Server             Response  Information about the server

HTTP Headers
Header            Type      Description
Content-Encoding  Response  How the content is encoded (e.g., gzip)
Content-Language  Response  The natural language used in the page
Content-Length    Response  The page’s length in bytes
Content-Type      Response  The page’s MIME type
Content-Range     Response  Identifies a portion of the page’s content
Last-Modified     Response  Time and date the page was last changed
Expires           Response  Time and date when the page stops being valid
Location          Response  Tells the client where to send its request
Accept-Ranges     Response  Indicates the server will accept byte range requests
Date              Both      Date and time the message was sent
Range             Both      Identifies a portion of a page
Cache-Control     Both      Directives for how to treat cache
ETag              Both      Tag for the contents of the page
Upgrade           Both      The protocol the sender wants to switch to
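Several of these headers work in pairs. As one example, a sketch of the freshness check built from `ETag` and `If-None-Match`: the server tags each version of a page, the client sends the tag back, and the server answers 304 with no body if the tag still matches. The function and the sample tag values are illustrative assumptions, and real servers also handle lists of tags and weak validators, which this sketch ignores.

```python
def respond(request_headers, current_etag, body):
    """Return (status, headers, body) for a possibly conditional GET."""
    if request_headers.get("If-None-Match") == current_etag:
        # Cached copy is still valid: no need to send the page again.
        return 304, {"ETag": current_etag}, b""
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return 200, headers, body


page = b"<html>version 2</html>"
print(respond({}, '"v2"', page))                        # first visit
print(respond({"If-None-Match": '"v2"'}, '"v2"', page))  # cached copy is fresh
print(respond({"If-None-Match": '"v1"'}, '"v2"', page))  # cached copy is stale
```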

Client side processing
Plugins/Extensions – integrated software module which executes inside the browser
– direct access to online context
Helper – separate program which can be instantiated by the browser, but can only access local cache of file content
– application/pdf
– application/msword
TN 5th 7-20

Server side processing – static page
5 step process:
– Accept TCP connection from client (browser)
– Identify the file requested
– Get the specified file from the local storage (disk, RAM, …)
– Send the file to the client
– Release the TCP connection
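The middle three steps (identify the file, get it, build what is sent back) can be sketched as a single function; accepting and releasing the TCP connection are left to the caller. This is a simplified illustration: `serve_static` and the document-root layout are assumptions, only GET is handled, and the check that the file lies inside the document root is a basic guard rather than a complete security measure.

```python
import mimetypes
from pathlib import Path


def serve_static(request_line, docroot):
    """Map a request line to a file under docroot and build the response."""
    method, target, _version = request_line.split(" ")
    if method != "GET":
        return b"HTTP/1.0 501 Not Implemented\r\nContent-Length: 0\r\n\r\n"

    # Identify the file requested, refusing paths that escape docroot.
    root = Path(docroot).resolve()
    wanted = (root / target.lstrip("/")).resolve()
    if root not in wanted.parents or not wanted.is_file():
        return b"HTTP/1.0 404 Not Found\r\nContent-Length: 0\r\n\r\n"

    # Get the file from local storage and prepare it for sending.
    body = wanted.read_bytes()
    ctype = mimetypes.guess_type(wanted.name)[0] or "application/octet-stream"
    head = (f"HTTP/1.0 200 OK\r\nContent-Type: {ctype}\r\n"
            f"Content-Length: {len(body)}\r\n\r\n")
    return head.encode("ascii") + body
```

A caller would read the request line from an accepted connection, pass it to `serve_static`, send the returned bytes, and then close the connection.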

Multi-threaded Web Server
TN 6th 7-22
A multithreaded Web server with a front end and processing modules.

Multi-threaded Web Server – dynamic
A processing module performs a series of steps:
– Resolve name of Web page requested.
– Perform access control on the Web page.
– Check the cache.
– Fetch requested page from disk or run program
– Determine the rest of the response
– Return the response to the client.
– Make an entry in the server log.
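A toy version of the front end plus processing modules, using a thread pool. Everything here is a stand-in chosen for illustration: a dictionary plays the role of the disk, a set of paths plays the role of access control, and the class and method names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ProcessingModule:
    def __init__(self, pages, forbidden=()):
        self.pages = pages                # stands in for files on disk
        self.forbidden = set(forbidden)   # stands in for access control rules
        self.cache = {}
        self.log = []
        self.lock = threading.Lock()      # cache and log are shared by threads

    def handle(self, path):
        name = "/index.html" if path == "/" else path   # resolve the name
        if name in self.forbidden:                      # access control
            status, body = 403, b""
        else:
            with self.lock:
                body = self.cache.get(name)             # check the cache
            if body is None:
                body = self.pages.get(name)             # fetch from "disk"
                if body is not None:
                    with self.lock:
                        self.cache[name] = body
            status = 200 if body is not None else 404   # rest of the response
            body = body or b""
        with self.lock:
            self.log.append((name, status))             # server log entry
        return status, body                             # goes back to the client


module = ProcessingModule({"/index.html": b"home"}, forbidden=["/secret.html"])
# Front end: hand each incoming request to a worker thread.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(module.handle, ["/", "/secret.html", "/nope.html"]))
print(results)
```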

Goal: satisfy client request without involving origin server – reduce response time.
TN 6th 7-28

Used for caching, security and IP address sharing
The browser sends all HTTP requests to the proxy. The proxy returns objects in its cache or else the proxy requests object from origin server, then returns object to client.
Note: the proxy server acts as both client and server.
TN 6th 7-44
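The proxy's behaviour can be sketched in a few lines. The class below is an illustration only: `fetch_from_origin` is a hypothetical callable standing in for the proxy's own HTTP client, and the cache never expires entries, which a real proxy would do using headers such as Expires and Cache-Control.

```python
class CachingProxy:
    """Acts as a server towards the browser and a client towards the origin."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # hypothetical origin client
        self.cache = {}
        self.origin_requests = 0

    def get(self, url):
        if url not in self.cache:
            # Cache miss: request the object from the origin server.
            self.origin_requests += 1
            self.cache[url] = self.fetch_from_origin(url)
        # Cache hit (or freshly filled): answer without involving the origin.
        return self.cache[url]


proxy = CachingProxy(lambda url: f"<contents of {url}>")
proxy.get("http://www.somesite.com.au/somedir/page.html")
proxy.get("http://www.somesite.com.au/somedir/page.html")
print(proxy.origin_requests)
```

The second request is answered from the cache, so only one request reaches the origin server.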

The network stores no state about web sessions
Cookies can place a small amount (<4 KB) of information on the user’s computer and re-use it deterministically (RFC 2109)
Cookies have 5 fields: domain, path, content, expiry, security
How to keep state – maintain state at sender/receiver over multiple transactions; HTTP messages carry “state”
Questionable mechanism for tracking users (invisibly perhaps) and learning about user behaviour
– e.g., competitor snooping, undesirable content etc.

Example Cookies
amazon.com.au
Session-id  356-7554479-6471342  .amazon.com.au  2036-01-01.
Session-id-time  2082787201l  .amazon.com.au  2036-01-01.
A3kfU1c7DE3W qz474A25Zfs  .amazon.adsystem.com  2037-01-01.
nytimes.com
A3kfU1c7DE3W qz474A25Zfs  .amazon.adsystem.com  2037-01-01.

Static web documents
HTML – Hypertext Markup Language
– a simple language designed to encode both content and presentational information
– Plain text encoding, with browser based rendering
– Restricted to ISO-8859 Latin-1 character set (internationalisation not introduced until XHTML with UTF encodings)
Web Page Components
– Structural divisions:
• Head…
• Body…
– Syntactically Restricted Tag Sets
– Attributes & Values


Beyond HTML
• HTML was originally an instance of SGML
– standard generalized markup language
• People wanted an HTML-like language to describe data that is not hypertext
– but SGML is too general / “heavy”
• XML (Extensible Markup Language) & XSL (Extensible Stylesheet Language)
– Primary feature: separation of content and presentational markup
– Stringent validation requirements
• XHTML
– Essentially an expression of HTML 4.0 as valid XML
– Major differences to HTML 4.0 are the requirements for conformance, case folding, well-formedness, attribute specification, nesting and embedding, and inclusion of a document type identifier
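The well-formedness and case requirements are easy to see with an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the markup snippets are made-up examples. Note that it checks well-formedness only, not validity against a document type.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Tolerated by HTML browsers, but not well-formed XML: <br> is never closed.
print(is_well_formed("<p>Hello<br></p>"))
# XML style: every element is closed, empty elements use <br/>.
print(is_well_formed("<p>Hello<br/></p>"))
# XML names are case-sensitive, so <P> ... </p> does not match.
print(is_well_formed("<P>Hello</p>"))
```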

Dynamic Content
TN 5th 7-35

Dynamic Content

Please enter your name:

Please enter your age:

Reply:

Hello .
Prediction: next year you will be
(b)

Reply:

Hello Barbara.
Prediction: next year you will be 33
From TN 5th Edition, Figure 7-30
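The textbook figure uses a server-side script to produce the reply page; the same idea is sketched here in Python. The form field names `name` and `age`, the function name, and the exact HTML of the reply are assumptions for illustration, not the figure's original code.

```python
import html
from urllib.parse import parse_qs


def reply_page(form_data):
    """Build the reply page from URL-encoded form data."""
    fields = parse_qs(form_data)
    name = html.escape(fields["name"][0])  # never echo user input unescaped
    age = int(fields["age"][0])
    return ("<html><body><h1>Reply:</h1>"
            f"Hello {name}. "
            f"Prediction: next year you will be {age + 1}"
            "</body></html>")


print(reply_page("name=Barbara&age=32"))
```

The page sent to the browser is generated per request from the submitted values, which is what distinguishes dynamic content from a static file.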

TN 6th 7-24

Client-side Scripting
Technologies for producing interactive web applications include:
JavaScript
Java Applets – compiled Java code (platform independent)
ActiveX – compiled code for Windows
AJAX (Asynchronous JavaScript and XML):
– HTML and CSS: present information as pages.
– DOM: change parts of pages while they are viewed.
– XML: let programs exchange data with the server.
– An asynchronous way to send and retrieve XML data.
– JavaScript as a language to bind all this together

And finally…
Tracking with cookies is well known
Tracking companies have expanded beyond simple cookies
– Plug-in, browser fingerprinting
https://coveryourtracks.eff.org/ – Project to research tracking techniques in browsers
How unique is your browser (by EFF):
https://coveryourtracks.eff.org/static/browser-uniqueness.pdf

Acknowledgement
The slides were based on slides prepared by , based on material developed previously by: , , , and .
Some of the images included in the notes were supplied as part of the teaching resources accompanying the text books listed on the previous slides.
Textbook Reference: Section 2.1 and 2.2, and related topics from pp.199-210
– (And also) Computer Networks, 6th Edition, Tanenbaum A., Wetherall D.
https://ebookcentral.proquest.com/lib/unimelb/detail.action?docID=6481879
