Introduction(Protocol Layering, Security)
&
Application Layer (Principles, Web)
Computer Networks and Applications
Week 2
COMP 3331/COMP 9331
Reading Guide: Chapter 1, Sections 1.5 – 1.7
Chapter 2, Sections 2.1 – 2.2
1. Introduction: roadmap
1.1 what is the Internet?
1.2 network edge
§ end systems, access networks, links
1.3 network core
§ packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history
Self study
Three (networking) design steps
v Break down the problem into tasks
v Organize these tasks
v Decide who does what
Tasks in Networking
v What does it take to send packets across?
v Simplistic decomposition:
§ Task 1: send along a single wire
§ Task 2: stitch these together to go across country/globe
v This gives an idea of what I mean by decomposition
Tasks in Networking (bottom up)
v Bits /Packets on wire
v Deliver packets within local network
v Deliver packets across global network
v Ensure that packets get to the destination
process
v Do something with the data
Resulting Modules
v Bits / Packets on wire (Physical)
v Deliver packets within local network (Datalink)
v Deliver packets across global network (Network)
v Ensure that packets get to the dst process.
(Transport)
v Do something with the data (Application)
This is decomposition…
Now, how do we organize these tasks?
Inspiration…
v CEO A writes letter to CEO B (“Dear John, Your days are numbered. –Pat”)
§ Folds letter and hands it to administrative aide
v Aide:
§ Puts letter in envelope with CEO B’s full name
§ Takes to FedEx
v FedEx Office
§ Puts letter in larger envelope
§ Puts name and street address on FedEx envelope
§ Puts package on FedEx delivery truck
v FedEx delivers to other company
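The Path of the Letter
[Figure: CEO, Aide and FedEx stacked on each side; each level exchanges its own unit with its peer]
v CEO to CEO: Letter (semantic content)
v Aide to Aide: Envelope (identity)
v FedEx to FedEx: FedEx Envelope (FE) (location)
v “Peers” on each side understand the same things
v No one else needs to (abstraction)
v Lowest level has most packaging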
The Path Through FedEx
[Figure: the FedEx Envelope (FE) passes through Truck, Sorting Office and Airport stages; it travels inside a Crate between airports and is repacked into a new Crate at an intermediate sorting office]
v Higher “stack” at ends
v Partial “stack” during transit
v Deepest packaging (Envelope+FE+Crate) at the lowest level of transport
v Highest level of “transit stack” is routing
In the context of the Internet
Applications
…built on…
Reliable (or unreliable) transport
…built on…
Best-effort global packet delivery
…built on…
Best-effort local packet delivery
…built on…
Physical transfer of bits
Internet protocol stack
v application: supporting network
applications
§ FTP, SMTP, HTTP, Skype, ..
v transport: process-process data
transfer
§ TCP, UDP
v network: routing of datagrams
from source to destination
§ IP, routing protocols
v link: data transfer between
neighboring network elements
§ Ethernet, 802.11 (WiFi), PPP
v physical: bits “on the wire”
Three Observations
v Each layer:
§ Depends on layer below
§ Supports layer above
§ Independent of others
v Multiple versions in layer
§ Interfaces differ somewhat
§ Components pick which lower-
level protocol to use
v But only one IP layer
§ Unifying protocol
Quiz: What are the benefits of layering?
An Example: No Layering
v No layering: each new application has to be re-implemented for every network technology!
[Figure: applications (ssh, HTTP, Skype) each connected directly to every transmission medium (Wireless, Ethernet, Fiber optic)]
An Example: Benefit of Layering
v Introducing an intermediate layer provides a common
abstraction for various network technologies
[Figure: applications (Skype, ssh, HTTP) sit on an intermediate Transport & Network layer, which sits on the transmission media (Wireless, Ethernet, Fiber optic)]
Is Layering Harmful?
v Layer N may duplicate lower-level functionality
§ E.g., error recovery to retransmit lost data
v Information hiding may hurt performance
§ E.g., packet loss due to corruption vs. congestion
v Headers start to get really big
§ E.g., typically TCP + IP + Ethernet headers add up to
54 bytes
v Layer violations when the gains are too great to resist
§ E.g., TCP-over-wireless
v Layer violations when network doesn’t trust ends
§ E.g., Firewalls
Distributing Layers Across Network
v Layers are simple if only on a single machine
§ Just stack of modules interacting with those
above/below
v But we need to implement layers across machines
§ Hosts
§ Routers
§ Switches
v What gets implemented where?
What Gets Implemented on Host?
v Bits arrive on wire, must make it up to
application
v Therefore, all layers must exist at host!
What Gets Implemented on Router?
v Bits arrive on wire
§ Physical layer necessary
v Packets must be delivered to next-hop
§ datalink layer necessary
v Routers participate in global delivery
§ Network layer necessary
v Routers don’t support reliable delivery
§ Transport layer (and above) not supported
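Internet Layered Architecture
[Figure: host, router, router, host in a chain. Each host runs HTTP, TCP, IP and an Ethernet interface; each router runs IP with an Ethernet interface towards its host and a SONET interface on the router-to-router link]
v HTTP message: exchanged between the two hosts
v TCP segment: exchanged between the two hosts
v IP packet: exchanged on every hop (host to router, router to router, router to host)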
Logical Communication
v Each layer interacts with its peer’s corresponding layer
[Figure: Host A and Host B each have Application, Transport, Network, Datalink and Physical layers; the Router between them has only Network, Datalink and Physical]
Physical Communication
v Communication goes down to physical network
v Then from network peer to peer
v Then up to relevant layer
[Figure: Host A and Host B each have Application, Transport, Network, Datalink and Physical layers; the Router between them has only Network, Datalink and Physical]
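Encapsulation
[Figure: source → switch → router → destination. Source and destination implement all five layers (application, transport, network, link, physical); the router implements network, link and physical; the switch implements link and physical]
v message: M (application)
v segment: Ht M (transport header added)
v datagram: Hn Ht M (network header added)
v frame: Hl Hn Ht M (link header added)
v Each layer at the source prepends its header; the destination strips them in reverse order on the way up
v The switch processes the frame only up to the link layer; the router up to the network layer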
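A minimal Python sketch of the header nesting above. The `Pdu` class and the helper names are made up for illustration, and the headers are only the labels Ht, Hn and Hl, not real protocol headers.

```python
from dataclasses import dataclass


@dataclass
class Pdu:
    """One header plus whatever the layer above handed down."""
    header: str
    payload: object  # either the application message (str) or another Pdu


def encapsulate(message: str) -> Pdu:
    """Wrap a message on its way down the stack at the source."""
    segment = Pdu("Ht", message)     # transport layer adds Ht
    datagram = Pdu("Hn", segment)    # network layer adds Hn
    frame = Pdu("Hl", datagram)      # link layer adds Hl
    return frame


def header_order(unit) -> list:
    """Headers from outermost to innermost, as seen on the wire."""
    seen = []
    while isinstance(unit, Pdu):
        seen.append(unit.header)
        unit = unit.payload
    return seen


def decapsulate(unit) -> str:
    """Strip every header on the way up the stack at the destination."""
    while isinstance(unit, Pdu):
        unit = unit.payload
    return unit
```

For example, `header_order(encapsulate("M"))` lists the link header first, matching the frame layout Hl Hn Ht M.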
1. Introduction: roadmap
1.1 what is the Internet?
1.2 network edge
§ end systems, access networks, links
1.3 network core
§ packet switching, circuit switching, network structure
1.4 delay, loss, throughput in networks
1.5 protocol layers, service models
1.6 networks under attack: security
1.7 history
Self study
Introduction: summary
covered a “ton” of material!
v Internet overview
v what’s a protocol?
v network edge, core, access
network
§ packet-switching versus
circuit-switching
§ Internet structure
v performance: loss, delay,
throughput
v layering, service models
v security
v history
you now have:
v context, overview, “feel”
of networking
v more depth, detail to
follow!
2. Application Layer: outline
2.1 principles of network
applications
2.2 Web and HTTP
2.3 electronic mail
§ SMTP, POP3, IMAP
2.4 DNS
2.5 P2P applications
2.6 video streaming and
content distribution
networks (CDNs)
2.7 socket programming
with UDP and TCP
2. Application layer
our goals:
v conceptual,
implementation aspects
of network application
protocols
§ transport-layer
service models
§ client-server
paradigm
§ peer-to-peer
paradigm
v learn about protocols by
examining popular
application-level
protocols
§ HTTP
§ SMTP / POP3 / IMAP
§ DNS
v creating network
applications
§ socket API
Creating a network app
Write programs that:
v run on (different) end systems
v communicate over network
v e.g., web server software communicates
with browser software
No need to write software for network-core
devices
v network-core devices do not run user
applications
v applications on end systems allow for
rapid app development, propagation
[Figure: three end systems, each running the full stack: application, transport, network, data link, physical]
Interprocess Communication (IPC)
v In order to cooperate, processes need to communicate
v Achieved via interprocess communication (IPC): the ability of a process to communicate with another
v On a single machine:
§ Shared memory
v Across machines:
§ We need other abstractions (message passing)
[Figure: processes P1 and P2, each with its own Text, Data and Stack, plus a Shared Segment between them]
Sockets
v process sends/receives messages to/from its socket
v socket analogous to door
§ sending process shoves message out door
§ sending process relies on transport infrastructure on other
side of door to deliver message to socket at receiving
process
v Application has a few options, OS handles the details
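[Figure: two hosts connected through the Internet. In each host the process (application layer) is controlled by the app developer; the transport, network, link and physical layers are controlled by the OS; the socket sits at the boundary between the process and the transport layer]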
Addressing processes
v to receive messages, process must have identifier
v host device has unique 32-bit IP address
v Q: does IP address of host on which process runs suffice for identifying the process?
§ A: no, many processes can be running on same host
v identifier includes both IP address and port numbers associated with process on host.
v example port numbers:
§ HTTP server: 80
§ mail server: 25
v to send HTTP message to cse.unsw.edu.au web server:
§ IP address: 129.94.242.51
§ port number: 80
Client-server architecture
server:
v Exports well-defined
request/response interface
v long-lived process that waits for
requests
v Upon receiving request, carries
it out
clients:
v Short-lived process that makes
requests
v “User-side” of application
v Initiates the communication
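A small sketch of the server and client roles using Python's standard `socket` module. It assumes everything runs on the loopback address 127.0.0.1, and it lets the OS pick a free port instead of a well-known one; the function names and the upper-casing "service" are illustrative only.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: wait for one request, carry it out, reply."""
    conn, _peer = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def demo() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))      # port 0: the OS chooses a free port
        listener.listen(1)
        host, port = listener.getsockname()  # (IP address, port) identifies the server process
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        # Client side: initiates the communication towards (host, port).
        with socket.create_connection((host, port)) as client:
            client.sendall(b"hello")
            reply = client.recv(1024)
        worker.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

The client only needs the pair (IP address, port number) to reach the right process; everything below the socket is handled by the OS.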
Client versus Server
v Server
§ Always-on host
§ Permanent IP address
(rendezvous location)
§ Static port conventions
(http: 80, email: 25,
ssh:22)
§ Data centres for scaling
§ May communicate with
other servers to respond
v Client
§ May be intermittently
connected
§ May have dynamic IP
addresses
§ Do not communicate
directly with each other
P2P architecture
v no always-on server
§ No permanent rendezvous
involved
v arbitrary end systems
(peers) directly
communicate
v Symmetric responsibility
(unlike client/server)
v Often used for:
§ File sharing (BitTorrent)
§ Games
§ Video distribution, video chat
§ In general: “distributed
systems”
P2P architecture: Pros and Cons
+ peers request service from other peers,
provide service in return to other peers
§ self scalability – new peers bring new
service capacity, as well as new service
demands
+ Speed: parallelism, less contention
+ Reliability: redundancy, fault tolerance
+ Geographic distribution
- Fundamental problems of decentralized
control
§ State uncertainty: no shared memory or
clock
§ Action uncertainty: mutually conflicting
decisions
- Distributed algorithms are complex
App-layer protocol defines
v types of messages
exchanged,
§ e.g., request, response
v message syntax:
§ what fields in messages
& how fields are
delineated
v message semantics
§ meaning of information
in fields
v rules for when and how
processes send & respond
to messages
open protocols:
v defined in RFCs
v allows for interoperability
v e.g., HTTP, SMTP
proprietary protocols:
v e.g., Skype
What transport service does an app need?
data integrity
v some apps (e.g., file transfer,
web transactions) require
100% reliable data transfer
v other apps (e.g., audio) can
tolerate some loss
timing
v some apps (e.g., Internet
telephony, interactive
games) require low delay
to be “effective”
throughput
v some apps (e.g.,
multimedia) require
minimum amount of
throughput to be
“effective”
v other apps (“elastic apps”)
make use of whatever
throughput they get
security
v encryption, data integrity,
…
Transport service requirements: common apps
application           | data loss     | throughput                                | time sensitive
file transfer         | no loss       | elastic                                   | no
e-mail                | no loss       | elastic                                   | no
Web documents         | no loss       | elastic                                   | no
real-time audio/video | loss-tolerant | audio: 50kbps-1Mbps, video: 100kbps-5Mbps | yes, 100’s msec
stored audio/video    | loss-tolerant | same as above                             | yes, few secs
interactive games     | loss-tolerant | few kbps up                               | yes, 100’s msec
Chat/messaging        | no loss       | elastic                                   | yes and no
Internet transport protocols services
TCP service:
v reliable transport between
sending and receiving
process
v flow control: sender won’t
overwhelm receiver
v congestion control: throttle
sender when network
overloaded
v does not provide: timing,
minimum throughput
guarantee, security
v connection-oriented: setup
required between client and
server processes
UDP service:
v unreliable data transfer
between sending and
receiving process
v does not provide:
reliability, flow control,
congestion control,
timing, throughput
guarantee, security,
or connection setup
Q: why bother? Why
is there a UDP?
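To make the "no connection setup" point concrete, here is a hedged sketch with Python's `socket` module, assuming both sockets live on the loopback address. The helper name `udp_roundtrip` is hypothetical. Note that there is no `listen`, `accept` or `connect`: the sender simply addresses each datagram.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # OS picks a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee, so do not wait forever
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(udp_roundtrip(b"ping"))
```

On a real network the datagram could be lost, which is why the receiver sets a timeout; an application that needs reliability over UDP has to add it itself.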
NOTE: More on transport in Weeks 4 and 5
Internet apps: application, transport protocols
application            | application layer protocol                | underlying transport protocol
e-mail                 | SMTP [RFC 2821]                           | TCP
remote terminal access | Telnet [RFC 854]                          | TCP
Web                    | HTTP [RFC 2616]                           | TCP
file transfer          | FTP [RFC 959]                             | TCP
streaming multimedia   | HTTP (e.g., YouTube), RTP [RFC 1889]      | TCP or UDP
Internet telephony     | SIP, RTP, proprietary (e.g., Skype)       | TCP or UDP
2. Application Layer: outline
2.1 principles of network
applications
§ app architectures
§ app requirements
2.2 Web and HTTP
2.3 electronic mail
§ SMTP, POP3, IMAP
2.4 DNS
2.5 P2P applications
2.6 video streaming and
content distribution
networks (CDNs)
2.7 socket programming
with UDP and TCP
The Web – Precursor
v 1967, Ted Nelson, Xanadu:
§ A world-wide publishing network that
would allow information to be stored
not as separate files but as connected
literature
§ Owners of documents would be
automatically paid via electronic
means for the virtual copying of their
documents
v Coined the term “Hypertext”
[Photo: Ted Nelson]
Self study
The Web – History
v World Wide Web (WWW): a
distributed database of “pages” linked
through Hypertext Transfer Protocol
(HTTP)
§ First HTTP implementation – 1990
• Tim Berners-Lee at CERN
§ HTTP/0.9 – 1991
• Simple GET command for the Web
§ HTTP/1.0 – 1992 (published as RFC 1945 in 1996)
• Client/Server information, simple caching
§ HTTP/1.1 – 1996 (first published as RFC 2068 in 1997)
§ HTTP/2 – 2015
Tim Berners-Lee
http://info.cern.ch/hypertext/WWW/TheProject.html
Self study
Web and HTTP
First, a review…
v web page consists of objects
v object can be HTML file, JPEG image, Java applet,
audio file,…
v web page consists of base HTML-file which
includes several referenced objects
v each object is addressable by a URL, e.g.,
www.someschool.edu/someDept/pic.gif
§ host name: www.someschool.edu
§ path name: /someDept/pic.gif
Uniform Resource Locator (URL)
protocol://host-name[:port]/directory-path/resource
v protocol: http, ftp, https, smtp etc.
v hostname: DNS name, IP address
v port: defaults to protocol’s standard port; e.g. http: 80 https: 443
v directory path: hierarchical, reflecting file system
v resource: Identifies the desired resource
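The URL fields above can be pulled apart with `urllib.parse.urlsplit` from the Python standard library. This is a sketch: the `describe` helper and the `DEFAULT_PORTS` table are illustrative names, and only the two default ports mentioned on the slide are filled in.

```python
from urllib.parse import urlsplit

# Default ports from the slide; other protocols are deliberately left out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url: str) -> dict:
    """Split a URL into protocol, host, port, path and query string."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        # Fall back to the protocol's standard port when none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parts.query,
    }


if __name__ == "__main__":
    print(describe("http://www.someschool.edu/someDept/pic.gif"))
```

The query string (everything after `?`) is where "program execution" style URLs carry their parameters.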
Uniform Resource Locator (URL)
protocol://host-name[:port]/directory-path/resource
v Extend the idea of hierarchical hostnames to include anything in a file
system
§ http://www.cse.unsw.edu.au/~salilk/papers/journals/TMC2012.pdf
v Extend to program executions as well…
§ http://us.f413.mail.yahoo.com/ym/ShowLetter?box=%40B%40Bulk&MsgId=2604_1744106_29699_1123_1261_0_28917_3552_1289957100&Search=&Nhead=f&YY=31454&order=down&sort=date&pos=0&view=a&head=b
§ Server side processing can be incorporated in the name
HTTP overview
HTTP: hypertext
transfer protocol
v Web’s application layer
protocol
v client/server model
§ client: browser that
requests, receives,
(using HTTP protocol)
and “displays” Web
objects
§ server: Web server
sends (using HTTP
protocol) objects in
response to requests
[Figure: a PC running Firefox browser and an iPhone running Safari browser each exchange HTTP requests and responses with a server running Apache Web server]
HTTP overview (continued)
uses TCP:
v client initiates TCP
connection (creates
socket) to server, port 80
v server accepts TCP
connection from client
v HTTP messages
(application-layer protocol
messages) exchanged
between browser (HTTP
client) and Web server
(HTTP server)
v TCP connection closed
HTTP is “stateless”
v server maintains no
information about
past client requests
aside: protocols that maintain “state” are complex!
v past history (state) must
be maintained
v if server/client crashes,
their views of “state”
may be inconsistent, must
be reconciled
HTTP request message
v two types of HTTP messages: request, response
v HTTP request message:
§ ASCII (human-readable format)
request line
(GET, POST,
HEAD commands)
header
lines
carriage return,
line feed at start
of line indicates
end of header lines
GET /index.html HTTP/1.1\r\n
Host: www-net.cs.umass.edu\r\n
User-Agent: Firefox/3.6.10\r\n
Accept: text/html,application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
\r\n
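A short sketch of how such a request can be assembled as text. The `build_request` helper is hypothetical; the point is only the layout: a request line, one header per line, each ended by CRLF, then an empty line.

```python
CRLF = "\r\n"


def build_request(method: str, path: str, headers: dict) -> str:
    """Assemble an HTTP/1.1 request with no body."""
    lines = [f"{method} {path} HTTP/1.1"]                             # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]  # header lines
    # Every line ends with CRLF; an extra CRLF marks the end of the headers.
    return CRLF.join(lines) + CRLF + CRLF


if __name__ == "__main__":
    request = build_request(
        "GET", "/index.html",
        {"Host": "www-net.cs.umass.edu", "Connection": "keep-alive"},
    )
    print(repr(request))
```

Encoding the result with `request.encode("ascii")` gives the bytes that would be written to the TCP socket.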
§ \r: carriage return character, \n: line-feed character
HTTP response message
status line
(protocol
status code
status phrase)
header
lines
data, e.g.,
requested
HTML file
HTTP/1.1 200 OK\r\n
Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n
Server: Apache/2.0.52 (CentOS)\r\n
Last-Modified: Tue, 30 Oct 2007 17:00:02 GMT\r\n
ETag: "17dc6-a5c-bf716880"\r\n
Accept-Ranges: bytes\r\n
Content-Length: 2652\r\n
Keep-Alive: timeout=10, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html; charset=ISO-8859-1\r\n
\r\n
data data data data data …
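The response layout (status line, header lines, blank line, data) can be taken apart with plain string handling. The sketch below is a simplified parser under stated assumptions: the whole response is already in memory, headers are not repeated, and `parse_response` is an illustrative name, not a library function.

```python
def parse_response(raw: bytes):
    """Split a complete HTTP response into its parts."""
    head, _, body = raw.partition(b"\r\n\r\n")          # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)   # phrase may contain spaces
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length says how many bytes of data belong to this response.
    length = int(headers.get("content-length", len(body)))
    return version, int(code), phrase, headers, body[:length]


if __name__ == "__main__":
    sample = (b"HTTP/1.1 200 OK\r\n"
              b"Content-Length: 5\r\n"
              b"Content-Type: text/html\r\n"
              b"\r\n"
              b"hello")
    print(parse_response(sample))
```

On a persistent connection, `Content-Length` is what lets the client tell where one response ends and the next begins.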
HTTP response status codes
v status code appears in 1st line in server-to-client response message.
v some sample codes:
200 OK
§ request succeeded, requested object later in this msg
301 Moved Permanently
§ requested object moved, new location specified later in this msg
(Location:)
400 Bad Request
§ request msg not understood by server
404 Not Found
§ requested document not found on this server
505 HTTP Version Not Supported
451 Unavailable for Legal Reasons
429 Too Many Requests
418 I’m a Teapot
HTTP is all text
v Makes the protocol simple
§ Easy to delineate messages (\r\n)
§ (relatively) human-readable
§ No issues about encoding or formatting data
§ Variable length data
v Not the most efficient
§ Many protocols use binary fields
• Sending “12345678” as a string is 8 bytes
• As an integer, 12345678 needs only 4 bytes
§ Headers may come in any order
§ Requires string parsing/processing
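The size comparison above can be checked with Python's `struct` module. This sketch assumes a 32-bit unsigned integer in network byte order, which is one common binary encoding; the variable names are illustrative.

```python
import struct

value = 12345678

as_text = str(value).encode("ascii")   # one byte per decimal digit
as_binary = struct.pack("!I", value)   # 32-bit unsigned integer, network byte order

print(len(as_text), len(as_binary))
```

The binary form is fixed-length and needs no parsing, but it is no longer human-readable and both sides must agree on width and byte order.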
Request Method types (“verbs”)
HTTP/1.0:
v GET
§ Request page
v POST
§ Uploads user response to a
form
v HEAD
§ asks server to leave
requested object out of
response
HTTP/1.1:
v GET, POST, HEAD
v PUT
§ uploads file in entity body
to path specified in URL
field
v DELETE
§ deletes file specified in the
URL field
v TRACE, OPTIONS,
CONNECT, PATCH
§ Diagnostics (TRACE), capability discovery (OPTIONS), tunnelling through a proxy (CONNECT), partial updates (PATCH)
Uploading form input
POST method:
v web page often includes form input
v input is uploaded to server in entity body
Get (in-URL) method:
v uses GET method
v input is uploaded in URL field of request line:
www.somesite.com/animalsearch?monkeys&banana
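A sketch of the two upload styles using `urllib.parse.urlencode`. The field names `animal` and `food`, the host, and both helper functions are assumptions made for illustration; the slide's example URL uses bare values without field names.

```python
from urllib.parse import urlencode


def get_request_line(path: str, fields: dict) -> str:
    """GET: form input travels in the URL of the request line."""
    return f"GET {path}?{urlencode(fields)} HTTP/1.1"


def post_request(path: str, host: str, fields: dict) -> str:
    """POST: form input travels in the entity body."""
    body = urlencode(fields)
    return (f"POST {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
            f"{body}")


if __name__ == "__main__":
    form = {"animal": "monkeys", "food": "banana"}
    print(get_request_line("/animalsearch", form))
    print(post_request("/animalsearch", "www.somesite.com", form))
```

With GET the input is visible in the URL (and in logs and browser history); with POST it is carried after the blank line that ends the headers.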
User-server state: cookies
many Web sites use cookies
four components:
1) cookie header line of
HTTP response
message
2) cookie header line in
next HTTP request
message
3) cookie file kept on
user’s host, managed
by user’s browser
4) back-end database at
Web site
example:
v Susan always accesses Internet
from PC
v visits specific e-commerce
site for first time
v when initial HTTP request
arrives at site, site creates:
§ unique ID
§ entry in backend
database for ID
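The four components can be sketched in a few lines with `http.cookies.SimpleCookie`. Everything here is a toy model under stated assumptions: the `Site` class, the `browse` helper, the cookie name `id` and the dictionary standing in for the browser's cookie file are all hypothetical.

```python
import itertools
from http.cookies import SimpleCookie


class Site:
    """Toy web site: hands out IDs and keeps a back-end 'database'."""

    def __init__(self, first_id: int = 1678):
        self._ids = itertools.count(first_id)
        self.database = {}  # unique ID -> number of requests seen

    def handle(self, request_headers: dict) -> dict:
        jar = SimpleCookie(request_headers.get("Cookie", ""))
        if "id" in jar and jar["id"].value in self.database:
            self.database[jar["id"].value] += 1   # cookie-specific action
            return {}
        user_id = str(next(self._ids))            # first visit: create ID and entry
        self.database[user_id] = 1
        reply = SimpleCookie()
        reply["id"] = user_id
        return {"Set-Cookie": reply["id"].OutputString()}


def browse(site: Site, cookie_file: dict, site_name: str) -> None:
    """Browser side: send the stored cookie, remember any new one."""
    headers = {}
    if site_name in cookie_file:
        headers["Cookie"] = cookie_file[site_name]
    response = site.handle(headers)
    if "Set-Cookie" in response:
        cookie_file[site_name] = response["Set-Cookie"]
```

HTTP itself stays stateless here: the state lives in the browser's cookie file and in the site's database, and the cookie header lines just carry the key between them.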
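Cookies: keeping “state” (cont.)
[Figure: message exchange between client and Amazon server; the client’s cookie file initially holds “ebay 8734”]
v client → server: usual http request msg
§ Amazon server creates ID 1678 for user and creates entry in backend database
v server → client: usual http response + set-cookie: 1678
§ cookie file now holds “ebay 8734” and “amazon 1678”
v client → server: usual http request msg + cookie: 1678
§ server takes cookie-specific action (accesses backend database)
v server → client: usual http response msg
v one week later:
§ client → server: usual http request msg + cookie: 1678
§ server takes cookie-specific action (accesses backend database)
§ server → client: usual http response msg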
The Dark Side of Cookies
v Cookies permit sites to learn a lot about you
v You may supply name and e-mail to sites (and more)
v 3rd party cookies (from ad networks, etc.) can follow you
across multiple sites
§ Ever visit a website, and the next day ALL your ads are from them ?
• Check your browser’s cookie file (cookies.txt, cookies.plist)
• Do you see a website that you have never visited?
v You COULD turn them off
§ But good luck doing anything on the Internet !!
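Third party cookies
[Figure: Website A and Website B both embed banners served by the Doubleclick server]
v Website A: browser fetches Banner 1 url from Doubleclick
§ Doubleclick creates cookie for the browser (doubleclick: 3445)
v Website B: browser fetches Banner 2 url from Doubleclick
§ Request carries Cookie: 3445, so Doubleclick can link the two visits
For more, check the following link and follow
the references:
http://en.wikipedia.org/wiki/HTTP_cookie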
Performance of HTTP
Ø Page Load Time (PLT) as the metric
• From click until user sees page
• Key measure of web performance
Ø Depends on many factors such as
• page content/structure,
• protocols involved and
• Network bandwidth and RTT
Performance Goals
v User
§ fast downloads
§ high availability
v Content provider
§ happy users (hence, above)
§ cost-effective infrastructure
v Network (secondary)
§ avoid overload
Solutions?
v User
§ fast downloads
§ high availability
v Content provider
§ happy users (hence, above)
§ cost-effective delivery infrastructure
v Network (secondary)
§ avoid overload
Solutions:
v Improve HTTP to achieve faster downloads
v Caching and Replication
v Exploit economies of scale (Webhosting, CDNs, datacenters)
How to improve PLT
Ø Reduce content size for transfer
• Smaller images, compression
Ø Change HTTP to make better use of available
bandwidth
• Persistent connections and pipelining
Ø Change HTTP to avoid repeated transfers of
the same content
• Caching and web-proxies
Ø Move content closer to the client
• CDNs
HTTP Performance
v Most Web pages have multiple objects
§ e.g., HTML file and a bunch of embedded images
v How do you retrieve those objects (naively)?
§ One item at a time
v New TCP connection per (small) object!
non-persistent HTTP
v at most one object sent over TCP connection
§ connection then closed
v downloading multiple objects requires multiple
connections
Non-persistent HTTP: response time
RTT (definition): time for a
small packet to travel from
client to server and back
HTTP response time:
v one RTT to initiate TCP
connection
v one RTT for HTTP request
and first few bytes of HTTP
response to return
v file transmission time
v non-persistent HTTP
response time =
2RTT+ file transmission
time
[Figure: client/server timeline: one RTT to initiate TCP connection, one RTT to request the file, then the time to transmit the file until it is received]
HTTP/1.0
Ø Non-Persistent: One TCP
connection to fetch one web
resource
Ø Fairly poor PLT
Ø 2 Scenarios
• Multiple TCP connection
setups to the same server
• Sequential request/responses
even when resources are
located on different servers
Ø Multiple TCP slow-start
phases (more in lecture on
TCP)
Improving HTTP Performance:
Concurrent Requests & Responses
v Use multiple connections in
parallel
v Does not necessarily
maintain order of responses
[Figure: requests R1, R2, R3 issued in parallel and their transfers T1, T2, T3 overlapping in time]
Quiz: Parallel HTTP Connections
v What are potential downsides of parallel HTTP
connections, i.e. can opening too many parallel
connections be harmful and if so in what way?
Persistent HTTP
v server leaves TCP connection open
after sending response
v subsequent HTTP messages
between same client/server are sent
over the same TCP connection
v Allow TCP to learn more accurate
RTT estimate (APPARENT LATER
IN THE COURSE)
v Allow TCP congestion window to
increase (APPARENT LATER)
v i.e., leverage previously discovered
bandwidth (APPARENT LATER)
Persistent without pipelining:
v client issues new request only
when previous response has been
received
v one RTT for each referenced
object
Persistent with pipelining:
v default in HTTP/1.1
v client sends requests as soon as it
encounters a referenced object
v as little as one RTT for all the
referenced objects
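A back-of-the-envelope comparison of the three schemes, as a sketch. It assumes one base page plus `n_objects` referenced objects on the same server, the same transmission time `tx` for every object, serial fetching in the non-persistent case, and no TCP slow start. The function names are illustrative.

```python
def non_persistent(rtt, tx, n_objects):
    """Serial non-persistent HTTP: every object pays 2 RTT + transmission."""
    return (n_objects + 1) * (2 * rtt + tx)


def persistent_no_pipelining(rtt, tx, n_objects):
    """One TCP setup, then one RTT + transmission per object (base page included)."""
    return rtt + (n_objects + 1) * (rtt + tx)


def persistent_pipelined(rtt, tx, n_objects):
    """One TCP setup, one RTT for the base page, one RTT for all referenced objects."""
    return rtt + (rtt + tx) + (rtt + n_objects * tx)


if __name__ == "__main__":
    rtt_ms, tx_ms, objects = 100, 10, 3   # hypothetical numbers, in milliseconds
    for scheme in (non_persistent, persistent_no_pipelining, persistent_pipelined):
        print(scheme.__name__, scheme(rtt_ms, tx_ms, objects), "ms")
```

With these made-up numbers the gap is dominated by the RTT terms, which is why removing round trips matters more than raw bandwidth for small objects.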
HTTP 1.1: response time
[Figure: client/server timeline for a website with one index page and three embedded objects: one RTT to initiate TCP connection, one RTT to request the file, then the time to transmit the file until it is received]
Improving HTTP Performance: Caching
v Why does caching work?
§ Exploits locality of reference
v How well does caching work?
§ Very well, up to a limit
§ Large overlap in content
§ But many unique requests
Web caches (proxy server)
v user sets browser: Web
accesses via cache
v browser sends all HTTP
requests to cache
§ object in cache: cache
returns object
§ else cache requests
object from origin
server, then returns
object to client
goal: satisfy client request without involving origin server
[Figure: clients send HTTP requests to the proxy server, which answers from its cache or forwards the request to the origin servers]
More about Web caching
v cache acts as both
client and server
§ server for original
requesting client
§ client to origin server
v typically cache is
installed by ISP
(university, company,
residential ISP)
why Web caching?
v reduce response time
for client request
v reduce traffic on an
institution’s access link
v Internet dense with
caches: enables “poor”
content providers to
effectively deliver
content
Caching example:
[Figure: institutional network (1 Gbps LAN) connected by a 1.54 Mbps access link to the public Internet and the origin servers]
assumptions:
v avg object size: 100K bits
v avg request rate from
browsers to origin
servers:15/sec
v avg data rate to browsers: 1.50
Mbps
v RTT from institutional router
to any origin server: 2 sec
v access link rate: 1.54 Mbps
consequences:
v LAN utilization: 0.15%
v access link utilization ≈ 97% (1.50/1.54): problem!
v total delay = Internet delay +
access delay + LAN delay
= 2 sec + minutes + usecs
Caching example: fatter access link
[Figure: institutional network (1 Gbps LAN) connected to the public Internet and origin servers; the 1.54 Mbps access link is upgraded to 154 Mbps]
assumptions:
v avg object size: 100K bits
v avg request rate from browsers to origin servers: 15/sec
v avg data rate to browsers: 1.50 Mbps
v RTT from institutional router to any origin server: 2 sec
v access link rate: 154 Mbps (was 1.54 Mbps)
consequences:
v LAN utilization: 0.15%
v access link utilization ≈ 0.97% (1.50/154)
v total delay = Internet delay + access delay + LAN delay
= 2 sec + msecs + usecs
Cost: increased access link speed (not cheap!)
Caching example: install local cache
[Figure: a local web cache is added inside the institutional network (1 Gbps LAN); the 1.54 Mbps access link to the public Internet and origin servers is unchanged]
assumptions:
v avg object size: 100K bits
v avg request rate from browsers to origin servers: 15/sec
v avg data rate to browsers: 1.50 Mbps
v RTT from institutional router to any origin server: 2 sec
v access link rate: 1.54 Mbps
consequences:
v LAN utilization: ?
v access link utilization = ?
v total delay = ?
How to compute link utilization, delay?
Cost: web cache (cheap!)
Caching example: install local cache
Calculating access link
utilization, delay with cache:
v suppose cache hit rate is 0.4
§ 40% requests satisfied at cache,
60% requests satisfied at origin
v access link utilization:
§ 60% of requests use access link
v data rate to browsers over access
link = 0.6*1.50 Mbps = 0.9 Mbps
§ utilization = 0.9/1.54 = 0.58
v total delay
§ = 0.6 * (delay from origin servers) + 0.4
* (delay when satisfied at cache)
§ = 0.6 (2.01) + 0.4 (~msecs)
§ = ~ 1.2 secs
§ less than with 154 Mbps link (and
cheaper too!)
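The arithmetic above as a small sketch. The function names are illustrative, and the 0.01 s used for a cache hit is an assumed stand-in for the slide's "~msecs".

```python
def access_link_utilization(request_rate, object_bits, link_bps, hit_rate=0.0):
    """Fraction of the access link used by requests that miss the cache."""
    return (1 - hit_rate) * request_rate * object_bits / link_bps


def average_delay(hit_rate, origin_delay_s, cache_delay_s):
    """Hit-rate weighted mix of origin-server delay and cache delay."""
    return (1 - hit_rate) * origin_delay_s + hit_rate * cache_delay_s


if __name__ == "__main__":
    # 15 requests/sec of 100K-bit objects over a 1.54 Mbps access link, 40% hit rate.
    util = access_link_utilization(15, 100_000, 1_540_000, hit_rate=0.4)
    # 2.01 s via the origin servers; 0.01 s from the cache is an assumption.
    delay = average_delay(0.4, 2.01, 0.01)
    print(f"utilization = {util:.2f}, average delay = {delay:.2f} s")
```

Setting `hit_rate=0.0` reproduces the no-cache case, which makes it easy to see how quickly utilization falls as the hit rate rises.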
But what is the likelihood of cache hits?
v Distribution of web object requests generally follows a Zipf-like
distribution
v The probability that a document will be referenced k requests after it was
last referenced is roughly proportional to 1/k. That is, web traces exhibit
excellent temporal locality.
Paper – “Web Caching and Zipf-like Distributions: Evidence and Implications”
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.8742&rep=rep1&type=pdf
Video content exhibits similar properties: 10%
of the top popular videos account for nearly
80% of views, while the remaining 90% of
videos account for total 20% of requests.
Paper – http://yongyeol.com/papers/cha-video-2009.pdf
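To get a feel for why a skewed popularity distribution helps caching, the sketch below computes what share of requests the most popular objects receive under an idealised Zipf law. The assumptions are loud ones: popularity of the object at rank r is exactly proportional to 1/r, the catalogue has 10,000 objects, and the cache holds exactly the top 1,000. Real traces are only Zipf-like, and `zipf_share` is a hypothetical helper.

```python
def zipf_share(n_objects: int, cache_size: int, exponent: float = 1.0) -> float:
    """Share of requests that hit a cache holding the cache_size most popular objects."""
    weights = [1 / rank ** exponent for rank in range(1, n_objects + 1)]
    return sum(weights[:cache_size]) / sum(weights)


if __name__ == "__main__":
    share = zipf_share(10_000, 1_000)
    print(f"top 10% of objects receive {share:.0%} of requests")
```

Under these assumptions the top 10% of objects attract roughly three quarters of all requests, which is in the same ballpark as the video statistic quoted above.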
Conditional GET
v Goal: don’t send object if
cache has up-to-date
cached version
§ no object transmission
delay
§ lower link utilization
v cache: specify date of
cached copy in HTTP
request
If-modified-since:
v server: response contains
no object if cached copy
is up-to-date:
HTTP/1.0 304 Not
Modified
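A sketch of the server-side decision, using `email.utils.parsedate_to_datetime` to read the HTTP date. The `respond` function, the placeholder body and the dictionary of request headers are illustrative; a real server would also handle malformed dates and ETags.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond(last_modified: datetime, request_headers: dict):
    """Return (status line, body) for a possibly conditional GET."""
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        # Cached copy is still up to date: send no object.
        return "HTTP/1.0 304 Not Modified", b""
    # No date supplied, or the object changed after it: send the object.
    return "HTTP/1.0 200 OK", b"object"


if __name__ == "__main__":
    modified = datetime(2007, 10, 30, 17, 0, 2, tzinfo=timezone.utc)
    print(respond(modified, {"If-Modified-Since": "Tue, 30 Oct 2007 17:00:02 GMT"}))
    print(respond(modified, {"If-Modified-Since": "Mon, 29 Oct 2007 17:00:02 GMT"}))
```

The 304 reply still costs a round trip, but it saves the transmission time and link capacity of the object itself.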
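[Figure: two client/server exchanges]
v object not modified since the given date:
§ client → server: HTTP request msg with If-modified-since:
§ server → client: HTTP response HTTP/1.0 304 Not Modified (no object)
v object modified after the given date:
§ client → server: HTTP request msg with If-modified-since:
§ server → client: HTTP response HTTP/1.0 200 OK with the object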
Example Cache Check Request
Example Cache Check Response
Improving HTTP Performance: Replication
v Replicate popular Web site across many machines
§ Spreads load on servers
§ Places content closer to clients
§ Helps when content isn’t cacheable
v Problem:
§ Want to direct client to particular replica
• Balance load across server replicas
• Pair clients with nearby servers
§ Expensive
v Common solution:
§ DNS returns different addresses based on client’s geo
location, server load, etc.
Improving HTTP Performance: CDN
v Caching and replication as a service
v Integrate forward and reverse caching functionality
v Large-scale distributed storage infrastructure (usually)
administered by one entity
§ e.g., Akamai has servers in 20,000+ locations
v Combination of (pull) caching and (push) replication
§ Pull: Direct result of clients’ requests
§ Push: Expectation of high access rate
v Also do some processing
§ Handle dynamic web pages
§ Transcoding
§ Maybe do some security function – watermark IP
More on this later
What about HTTPS?
v HTTP is insecure
v HTTP basic authentication: password sent using
base64 encoding (can be readily converted to
plaintext)
v HTTPS: HTTP over a connection encrypted by
Transport Layer Security (TLS)
v Provides:
§ Authentication
§ Bidirectional encryption
v Widely used in place of plain vanilla HTTP
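The basic-authentication point above is easy to demonstrate with the standard `base64` module. The credentials and helper names are made up; the sketch only shows that base64 is an encoding, not encryption.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the header line a client sends with HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def recover_credentials(header: str) -> str:
    """What any eavesdropper on a plain HTTP connection can do."""
    token = header.split(" ")[-1]
    return base64.b64decode(token).decode("utf-8")


if __name__ == "__main__":
    header = basic_auth_header("alice", "secret")   # hypothetical credentials
    print(header)
    print(recover_credentials(header))
```

Over HTTPS the same header is sent, but inside the TLS-encrypted connection, so it is not readable on the wire.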
What’s on the horizon: HTTP/2
v Google SPDY (speedy) -> HTTP/2: (RFC 7540 May 2015)
v Better content structure
v Improvements
§ Servers can push content and thus reduce overhead of an
additional request cycle
§ Fully multiplexed
• Requests and responses are sliced in smaller chunks called frames,
frames are tagged with an ID that connects data to the
request/response
• overcomes Head-of-line blocking in HTTP 1.1
§ Prioritisation of the order in which objects should be sent (e.g.
CSS files may be given higher priority)
§ Data compression of HTTP headers
• Some headers such as cookies can be very long
• Repetitive information
More details: https://http2.github.io/faq/
Demo: https://http2.akamai.com/demo
Summary
v Completed Introduction (Chapter 1)
§ Solve Sample Problem Set
v Application Layer (Chapter 2)
§ Principles of Network Applications
§ HTTP
v Next Week: Application Layer (contd.)
§ E-mail
§ P2P
§ DNS
§ Socket Programming
Reading Exercise
Chapter 2: 2.4 – 2.7