CS计算机代考程序代写 ER AI algorithm 6.Transport_Part2

6.Transport_Part2

COMP 3331/9331:
Computer Networks and

Applications
Week 5

Transport Layer (Continued)
Reading Guide: Chapter 3, Sections: 3.5 – 3.7

2

Transport Layer Outline

3.1 transport-layer
services

3.2 multiplexing and
demultiplexing

3.3 connectionless
transport: UDP

3.4 principles of reliable
data transfer

3.5 connection-oriented
transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management

3.6 principles of congestion
control

3.7 TCP congestion control

Practical Reliability Questions

v How do the sender and receiver keep track of
outstanding pipelined segments?

v How many segments should be pipelined?
v How do we choose sequence numbers?
v What does connection establishment and teardown

look like?
v How should we choose timeout values?

3

TCP: Overview RFCs: 793,1122,1323, 2018, 2581

v full duplex data:
§ bi-directional data flow

in same connection
§ MSS: maximum segment

size
v connection-oriented:

§ handshaking (exchange
of control msgs) inits
sender, receiver state
before data exchange

v flow controlled:
§ sender will not

overwhelm receiver

v point-to-point:
§ one sender, one receiver

v reliable, in-order byte
stream:
§ no “message boundaries”

v pipelined:
§ TCP congestion and flow

control set window size
v send and receive buffers

socket
door

TCP
send buffer

TCP
receive buffer

socket
door

segment

application
writes data

application
reads data

4

TCP segment structure

source port # dest port #

32 bits

application
data

(variable length)

sequence number
acknowledgement number

receive window

Urg data pointerchecksum
FSRPAUheadlen

not
used

options (variable length)

URG: urgent data
(generally not used)

ACK: ACK #
valid

PSH: push data now
(generally not used)

RST, SYN, FIN:
connection estab
(setup, teardown

commands)

# bytes
rcvr willing
to accept

counting
by bytes
of data
(not segments!)

Internet
checksum

(as in UDP)

5

TCP segment structure

6

TCP Segments

source port # dest port #

32 bits

application
data
(variable length)

Urg data pointer

F S R P A U
head
len

not
used

checksum

receive window

sequence number

acknowledgement number

options (variable length)

20 Bytes

(UDP was 8)

Transport Layer Outline

3.1 transport-layer
services

3.2 multiplexing and
demultiplexing

3.3 connectionless
transport: UDP

3.4 principles of reliable
data transfer

3.5 connection-oriented
transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management

3.6 principles of congestion
control

3.7 TCP congestion control

7

Recall: Components of a solution for
reliable transport
v Checksums (for error detection)
v Timers (for loss detection)
v Acknowledgments

§Cumulative
§ Selective

v Sequence numbers (duplicates, windows)
v Sliding Windows (for efficiency)

§Go-Back-N (GBN)
§ Selective Repeat (SR)

8

What does TCP do?

Many of our previous ideas, but some key
differences
v Checksum

9

10

TCP Header

Source port Destination port

Sequence number

Acknowledgment

Receive windowHdrLen Flags0

Checksum Urgent pointer

Options (variable)

Data

Computed
over header
and data
(SAME AS UDP)

What does TCP do?

Many of our previous ideas, but some key
differences
v Checksum
v Sequence numbers are byte offsets

11

TCP “Stream of Bytes” Service ..

B
yte 0

B
yte 1

B
yte 2

B
yte 3

B
yte 0

B
yte 1

B
yte 2

B
yte 3

Application @ Host A

Application @ Host B

B
yte 80

B
yte 80

12

.. Provided Using TCP “Segments”

B
yte 0

B
yte 1

B
yte 2

B
yte 3

B
yte 0

B
yte 1

B
yte 2

B
yte 3

Host A

Host B

B
yte 80

TCP Data

TCP Data

B
yte 80

Segment sent when:
1. Segment full (Max Segment Size),
2. Not full, but instructed by the Application e.g.,

1 byte in Telnet

13

TCP Maximum Segment Size

v IP packet
§ No bigger than Maximum Transmission Unit (MTU)
§ E.g., up to 1500 bytes with Ethernet

v TCP packet
§ IP packet with a TCP header and data inside
§ TCP header ³ 20 bytes long

v TCP segment
§ No more than Maximum Segment Size (MSS) bytes
§ E.g., up to 1460 consecutive bytes from the stream
§ MSS = MTU – 20 (min IP header ) – 20 ( min TCP header )

IP Hdr
IP Data

TCP HdrTCP Data (segment)

14

MTU

Sequence Numbers

Host A

ISN (initial sequence number)

Sequence number
= 1st byte in segment =

ISN + k

k bytes

15

Sequence numbers:
• byte stream “number” of first byte in

segment’s data

Sequence & Ack Numbers

Host B

TCP Data

TCP Data

TCP
HDR

TCP
HDR

ACK sequence number
= next expected byte
= seqno + length(data)

Host A

ISN (initial sequence number)

Sequence number
= 1st byte in segment =

ISN + k

k

16

17

Example

Note: Connection establishment not shown. Alice’s end point selects the initial
sequence number as 0 while Bob’s end point selects the initial sequence number as 10

18

Another Example

Note: Connection establishment not shown. Alice’s end point selects the initial
sequence number as 0 while Bob’s end point selects the initial sequence number as 10

HTTP response split into 3 segments (MSS = 1500 bytes)

Why choose random ISN?

v Avoids ambiguity with back-to-back connections
between same end-points

(a) When ISN=0 (b) When ISN is random
v Potential security issue if the ISN is known

3-19

What does TCP do?

Most of our previous tricks, but a few differences
v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)

20

ACKing and Sequence Numbers
v Sender sends packet

§ Data starts with sequence number X
§ Packet contains B bytes [X, X+1, X+2, ….X+B-1]

v Upon receipt of packet, receiver sends an ACK
§ If all data prior to X already received:

• ACK acknowledges X+B (because that is next expected byte)
§ If highest in-order byte received is Y s.t. (Y+1) < X • ACK acknowledges Y+1 • Even if this has been ACKed before 21 22 TCP seq. numbers, ACKs Host BHost A Seq=100, Data=50 ISN=100 Seq=150, Data=50 Seq=200, Data=50 Seq=250, Data=50 ACK=200, Received 150-199 Seq=???, Data=50 Seq 100 to Seq 149 ACK=150, Received 100-149 ACK=250, Received 200-249 ACK=300, Received 250-299 23 TCP seq. numbers, ACKs Host BHost A Seq=100, Data=50 ISN=100 Seq=150, Data=50 Seq=200, Data=50 Seq=250, Data=50 ACK=150, Received 100-149 ACK=???, Received 200-249 ACK=???, Received 250-299 Both ACKs would be 150, as that is the next expected seq number Normal Pattern v Sender: seqno=X, length=B v Receiver: ACK=X+B v Sender: seqno=X+B, length=B v Receiver: ACK=X+2B v Sender: seqno=X+2B, length=B v Seqno of next packet is same as last ACK field 24 Packet Loss v Sender: seqno=X, length=B v Receiver: ACK=X+B v Sender: seqno=X+B, length=B v Sender: seqno=X+2B, length=B v Receiver: ACK = X+B LOST 25 26 TCP Header Source port Destination port Sequence number Acknowledgment Receive windowHdrLen Flags0 Checksum Urgent pointer Options (variable) Data Acknowledgment gives seqno just beyond highest seqno received in order (“What Byte is Next”) Piggybacking v So far, we’ve assumed distinct “sender” and “receiver” roles v In reality, usually both sides of a connection send some data 27 Piggybacking • So far, we’ve assumed distinct “sender” and “receiver” roles • In reality, usually both sides of a connection send some data – request/response is a common pattern Client Server Without Piggybacking … Client Server With Piggybacking … Quiz Seq = ?, 2 KBytes of data ACK = ? ACK = ? Seq = 1024 , 1 KB yte of data Seq = 101, 2 KBytes of data ACK = 101 + 2048 = 2149 ACK = 1024 + 1024 = 2048 Seq = 2149 28 What does TCP do? Most of our previous tricks, but a few differences v Checksum v Sequence numbers are byte offsets v Receiver sends cumulative acknowledgements (like GBN) v Receivers can buffer out-of-sequence packets (like SR) 29 Loss with cumulative ACKs v Sender sends packets with 100Bytes and sequence numbers: § 100, 200, 300, 400, 500, 600, 700, 800, 900, … v Assume the fifth packet (seq. no. 500) is lost, but no others v Stream of ACKs will be: § 200, 300, 400, 500, 500, 500, 500,… 30 What does TCP do? Most of our previous tricks, but a few differences v Checksum v Sequence numbers are byte offsets v Receiver sends cumulative acknowledgements (like GBN) v Receivers do not drop out-of-sequence packets (like SR) v Sender maintains a single retransmission timer (like GBN) and retransmits on timeout (how much?) 31 32 TCP round trip time, timeout TCP round trip time, timeout Q: how to set TCP timeout value? v longer than RTT § but RTT varies v too short: premature timeout, unnecessary retransmissions v too long: slow reaction to segment loss and connection has lower throughput Q: how to estimate RTT? v SampleRTT: measured time from segment transmission until ACK receipt § ignore retransmissions v SampleRTT will vary, want estimated RTT “smoother” § average several recent measurements, not just current SampleRTT 33 RTT: gaia.cs.umass.edu to fantasia.eurecom.fr 100 150 200 250 300 350 1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106 time (seconnds) RT T (m ill is ec on ds ) SampleRTT Estimated RTT EstimatedRTT = (1- a)*EstimatedRTT + a*SampleRTT v exponential weighted moving average v influence of past sample decreases exponentially fast v typical value: a = 0.125 TCP round trip time, timeout RT T (m ill is ec on ds ) RTT: gaia.cs.umass.edu to fantasia.eurecom.fr sampleRTT EstimatedRTT time (seconds) 34 v timeout interval: EstimatedRTT plus “safety margin” § large variation in EstimatedRTT -> larger safety margin

v estimate SampleRTT deviation from EstimatedRTT:
DevRTT = (1-b)*DevRTT +

b*|SampleRTT-EstimatedRTT|

TCP round trip time, timeout

(typically, b = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT

estimated RTT “safety margin”

35

Practice Problem:
http://wps.pearsoned.com/ecs_kurose_compnetw_6/216/55463/14198700.cw/index.html

36

TCP round trip time, timeout

TimeoutInterval = EstimatedRTT + 4*DevRTT

estimated RTT “safety margin”

(EstimatedRTT+4*DevRTT)

DevRTT

EstimatedRTT
RTT

Figure: Credits Prof David Wetherall UoW

Why exclude retransmissions in RTT
computation?

v How do we differentiate between the real ACK, and ACK of
the retransmitted packet?

ACK

Retransmission

Original Transmission

Sa
m

pl
eR

T
T

Sender Receiver

ACK
Retransmission

Original Transmission

SampleRTT

Sender Receiver

37

TCP sender events:
data rcvd from app:
v create segment with

seq #
v seq # is byte-stream

number of first data
byte in segment

v start timer if not
already running
§ think of timer as for

oldest unacked
segment

§ expiration interval:
TimeOutInterval

timeout:
v retransmit segment

that caused timeout
v restart timer
ack rcvd:
v if ack acknowledges

previously unacked
segments
§ update what is known

to be ACKed
§ start timer if there are

still unacked segments

38

PUTTING IT
TOGETHER

TCP sender (simplified)

wait
for

event

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

L

create segment, seq. #: NextSeqNum
pass segment to IP (i.e., “send”)
NextSeqNum = NextSeqNum + length(data)
if (timer currently not running)

start timer

data received from application above

retransmit not-yet-acked segment
with smallest seq. #

start timer

timeout

if (y > SendBase) {
SendBase = y
/* SendBase–1: last cumulatively ACKed byte */
if (there are currently not-yet-acked segments)

start timer
else stop timer

}

ACK received, with ACK field value y

39

PUTTING IT
TOGETHER

TCP: retransmission scenarios

lost ACK scenario

Host BHost A

Seq=92, 8 bytes of data

ACK=100

Seq=92, 8 bytes of data

Xtim
eo

ut

ACK=100

premature timeout

Host BHost A

Seq=92, 8 bytes of data

ACK=100

Seq=92, 8
bytes of data

tim
eo

ut

ACK=120

Seq=100, 20 bytes of data

ACK=120

SendBase=100

SendBase=120

SendBase=120

SendBase=92

40

TCP: retransmission scenarios

X

cumulative ACK

Host BHost A

Seq=92, 8 bytes of data

ACK=100

Seq=120, 15 bytes of data

tim
eo

ut

Seq=100, 20 bytes of data

ACK=120

41

X

cumulative ACK

Host BHost A

Seq=92, 8 bytes of data

tim
eo

ut

Seq=100, 20 bytes of data

ACK=?

ACK = 92

TCP ACK generation [RFC 1122, RFC 2581]

event at receiver

arrival of in-order segment with
expected seq #. All data up to
expected seq # already ACKed

arrival of in-order segment with
expected seq #. One other
segment has ACK pending

arrival of out-of-order segment
higher-than-expect seq. # .
Gap detected

arrival of segment that
partially or completely fills gap

TCP receiver action

delayed ACK. Wait up to 500ms
for next segment. If no next segment,
send ACK

immediately send single cumulative
ACK, ACKing both in-order segments

immediately send duplicate ACK,
indicating seq. # of next expected byte

immediate send ACK, provided that
segment starts at lower end of gap

42

What does TCP do?

Most of our previous tricks, but a few differences
v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)
v Receivers may not drop out-of-sequence packets (like SR)
v Sender maintains a single retransmission timer (like GBN) and

retransmits on timeout
v Introduces fast retransmit: optimisation that uses duplicate

ACKs to trigger early retransmission

43

X

fast retransmit after sender
receipt of triple duplicate ACK

Host BHost A

Seq=92, 8 bytes of data

ACK=100

ACK=100

ACK=100
ACK=100

TCP fast retransmit

Seq=100, 20 bytes of data

Seq=100, 20 bytes of data

44

tim
eo

ut

TCP fast retransmit
v time-out period often

relatively long:
§ long delay before resending

lost packet
v “Duplicate ACKs” are a

sign of an isolated loss
§ The lack of ACK progress

means that packet hasn’t
been delivered

§ Stream of ACKs means some
packets are being delivered

§ Could trigger resend on
receiving “k” duplicate ACKs
(TCP uses k = 3)

if sender receives 3
duplicate ACKs for
same data
(“triple duplicate ACKs”),
resend unacked
segment with smallest
seq #
§ likely that unacked

segment is lost, so
don’t wait for timeout

TCP fast retransmit

45

What does TCP do?

Most of our previous ideas, but some key
differences
v Checksum
v Sequence numbers are byte offsets
v Receiver sends cumulative acknowledgements (like GBN)
v Receivers do not drop out-of-sequence packets (like SR)
v Sender maintains a single retransmission timer (like GBN) and

retransmits on timeout
v Introduces fast retransmit: optimization that uses duplicate

ACKs to trigger early retransmission

46

Quiz: TCP Sequence Numbers?

A TCP Sender is just about to send a segment of
size 100 bytes with sequence number 1234 and ack
number 436 in the TCP header. What is the highest
sequence number up to (and including) which this
sender has received all bytes from the receiver?
A. 1233
B. 436
C. 435
D. 1334
E. 536

47

www.zeetings.com/salil

ANSWER: C

Quiz: TCP Sequence Numbers?

A TCP Sender is just about to send a segment of
size 100 bytes with sequence number 1234 and ack
number 436 in the TCP header. Is it possible that
the receiver has received byte number 1335?
A. Yes
B. No

48

www.zeetings.com/salil

ANSWER: A

Quiz: TCP Sequence Numbers?

The following statement is true about the TCP
sliding window protocol for implementing reliable
data transfer
A. It exclusively uses the ideas of Go-Back-N
B. It exclusively uses the ideas of Selective Repeat
C. It uses a combination of ideas of Go-Back-N and

Selective-Repeat
D. It uses none of the ideas of Go-Back-N and

Selective-Repeat

49

www.zeetings.com/salil

ANSWER: C

Transport Layer Outline

3.1 transport-layer
services

3.2 multiplexing and
demultiplexing

3.3 connectionless
transport: UDP

3.4 principles of reliable
data transfer

3.5 connection-oriented
transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management

3.6 principles of congestion
control

3.7 TCP congestion control

50

TCP flow control
application

process

TCP socket
receiver buffers

TCP
code

IP
code

application
OS

receiver protocol stack

application may
remove data from

TCP socket buffers ….

… slower than TCP
receiver is delivering
(sender is sending)

from sender

receiver controls sender, so
sender won’t overflow
receiver’s buffer by transmitting
too much, too fast

flow control

51

TCP flow control

buffered data

free buffer spacerwnd

RcvBuffer

TCP segment payloads

to application process
v receiver “advertises” free

buffer space by including
rwnd value in TCP header
of receiver-to-sender
segments
§ RcvBuffer size set via

socket options (typical default
is 4096 bytes)

§ many operating systems
autoadjust RcvBuffer

v sender limits amount of
unacked (“in-flight”) data to
receiver’s rwnd value

v guarantees receive buffer
will not overflow

receiver-side buffering

52
http://media.pearsoncmg.com/aw/aw_kurose_network_4/applets/flow/FlowControl.htm

53

TCP Header

Source port Destination port

Sequence number

Acknowledgment

Receive windowHdrLen Flags0

Checksum Urgent pointer

Options (variable)

Data

TCP flow control

v What if rwnd = 0?
§ Sender would stop sending data
§ Eventually the receive buffer would have space when

the application process reads some bytes
§ But how does the receiver advertise the new rwnd to

the sender?
v Sender keeps sending TCP segments with one

data byte to the receiver
v These segments are dropped but acknowledged

by the receiver with a zero-window size
v Eventually when the buffer empties, non-zero

window is advertised

54

Transport Layer Outline

3.1 transport-layer
services

3.2 multiplexing and
demultiplexing

3.3 connectionless
transport: UDP

3.4 principles of reliable
data transfer

3.5 connection-oriented
transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management

3.6 principles of congestion
control

3.7 TCP congestion control

55

Connection Management
before exchanging data, sender/receiver “handshake”:
v agree to establish connection (each knowing the other willing

to establish connection)
v agree on connection parameters

connection state: ESTAB
connection variables:

seq # client-to-server
server-to-client

rcvBuffer size
at server,client

application

network

connection state: ESTAB
connection Variables:

seq # client-to-server
server-to-client

rcvBuffer size
at server,client

application

network

Socket clientSocket =
newSocket(“hostname”,”port
number”);

Socket connectionSocket =
welcomeSocket.accept();

56

TCP 3-way handshake

SYNbit=1, Seq=x

choose init seq num, x
send TCP SYN msg

ESTAB

SYNbit=1, Seq=y
ACKbit=1; ACKnum=x+1

choose init seq num, y
send TCP SYNACK
msg, acking SYN
Allocates buffer,
initialize variables

ACKbit=1, ACKnum=y+1

received SYNACK(x)
indicates server is live;
send ACK for SYNACK;

this segment may contain
client-to-server data

received ACK(y)
indicates client is live

SYNSENT

ESTAB

SYN RCVD

client state
CLOSED

server state
LISTEN

57

A B
SYN Consumes 1 Sequence No

Step 1: A’s Initial SYN Packet

A’s port B’s port

A’s Initial Sequence Number

(Irrelevant since ACK not set)

Advertised window5 Flags0

Checksum Urgent pointer

Options (variable)

Flags: SYN
ACK
FIN
RST
PSH
URG

A tells B it wants to open a connection…

Step 2: B’s SYN-ACK Packet

B’s port A’s port

B’s Initial Sequence Number

ACK = A’s ISN plus 1

Advertised window5 0

Checksum Urgent pointer

Options (variable)

Flags: SYN
ACK
FIN
RST
PSH
URG

B tells A it accepts, and is ready to hear the next byte…

… upon receiving this packet, A can start sending data

Flags

59

Step 3: A’s ACK of the SYN-ACK

A’s port B’s port

B’s ISN plus 1

Advertised window5 Flags0

Checksum Urgent pointer

Options (variable)

Flags: SYN
ACK
FIN
RST
PSH
URG

A tells B it’s likewise okay to start sending

A’s Initial Sequence Number+1

… upon receiving this packet, B can start sending data
60

TCP 3-way handshake: FSM

closed

L

listen

SYN
rcvd

SYN
sent

ESTAB

clientSocket.connect((serverName,
serverPort))

SYN(seq=x)

connectionSocket, addr =
welcomeSocket.accept();

SYN(x)

SYNACK(seq=y,ACKnum=x+1)

SYNACK(seq=y,ACKnum=x+1)
ACK(ACKnum=y+1)ACK(ACKnum=y+1)

L

61

welcomeSocket.listen(1);

What if the SYN Packet Gets Lost?

v Suppose the SYN packet gets lost
§ Packet is lost inside the network, or:
§ Server discards the packet (e.g., it’s too busy)

v Eventually, no SYN-ACK arrives
§ Sender sets a timer and waits for the SYN-ACK
§ … and retransmits the SYN if needed

v How should the TCP sender set the timer?
§ Sender has no idea how far away the receiver is
§ Hard to guess a reasonable length of time to wait
§ SHOULD (RFCs 1122,2988) use default of 3 second,

RFC 6298 use default of 1 second

62

SYN Loss and Web Downloads
v User clicks on a hypertext link

§ Browser creates a socket and does a “connect”
§ The “connect” triggers the OS to transmit a SYN

v If the SYN is lost…
§ 1-3 seconds of delay: can be very long
§ User may become impatient
§ … and click the hyperlink again, or click “reload”

v User triggers an “abort” of the “connect”
§ Browser creates a new socket and another “connect”
§ Essentially, forces a faster send of a new SYN packet!
§ Sometimes very effective, and the page comes quickly

63

TCP: closing a connection
v client, server each close their side of connection

§ send TCP segment with FIN bit = 1
v respond to received FIN with ACK

§ on receiving FIN, ACK can be combined with own FIN
v simultaneous FIN exchanges can be handled

64

FIN_WAIT_2

CLOSE_WAIT

FINbit=1, seq=y

ACKbit=1; ACKnum=y+1

ACKbit=1; ACKnum=x+1
wait for server

close

can still
send data

can no longer
send data

LAST_ACK

CLOSED

TIMED_WAIT

timed wait
for 2*max

segment lifetime

CLOSED

Normal Termination, One at a Time

FIN_WAIT_1 FINbit=1, seq=xcan no longer
send but can
receive data

clientSocket.close()

client state server state
ESTABESTAB

65
TIMED_WAIT: Can retransmit ACK if last ACK is lost

FIN Consumes 1 Sequence No

CLOSE_WAIT

ACKbit=1; ACKnum=y+1

ACKbit=1; ACKnum=x+1
wait for server

close

FIN + ACK
together
can no longer
send data

LAST_ACK

CLOSED

TIMED_WAIT

timed wait
for 2*max

segment lifetime

CLOSED

FIN_WAIT_1 FINbit=1, seq=xcan no longer
send but can
receive data

clientSocket.close()

client state server state
ESTABESTAB

66

FINbit=1, seq=y

Normal Termination, Both Together

ACKbit=1,
ACKnum=y+1

wait for server
close

Send Ack

can no longer
send datacan no longer

send but can
receive data

clientSocket.close()

FINbit=1, seq=x

client state server state

67

FINbit=1, seq=y

Simultaneous Closure

ACKbit=1,
ACKnum=x+1

CLOSING

TIMED_WAIT

CLOSED

FIN_WAIT_1

ESTAB

CLOSING

CLOSED

FIN_WAIT_1

ESTAB

TIMED_WAIT

Abrupt Termination

v A sends a RESET (RST) to B
§ E.g., because application process on A crashed

v That’s it
§ B does not ack the RST
§ Thus, RST is not delivered reliably
§ And: any data in flight is lost
§ But: if B sends anything more, will elicit another RST

SY
N

SY
N

A
C

K

A
C

K
D

at
a

R
ST

A
C

K

time
A

B

D
ata R

ST

68

69

TCP Finite State Machine

CS144, Stanford University

FSM Example: TCP Connection

6 http://upload.wikimedia.org/wikipedia/commons/thumb/a/a2/Tcp_state_diagram_fixed.svg/

70

TCP SYN Attack (SYN flooding)
v Miscreant creates a fake SYN packet

§ Destination is IP address of victim host (usually some server)
§ Source is some spoofed IP address

v Victim host on receiving creates a TCP connection state i.e allocates
buffers, creates variables, etc and sends SYN ACK to the spoofed
address (half-open connection)

v ACK never comes back
v After a timeout connection state is freed
v However for this duration the connection state is unnecessarily

created
v Further miscreant sends large number of fake SYNs

§ Can easily overwhelm the victim
v Solutions:

§ Increase size of connection queue
§ Decrease timeout wait for the 3-way handshake
§ Firewalls: list of known bad source IP addresses
§ TCP SYN Cookies (explained on next slide)

71

TCP SYN Cookie
v On receipt of SYN, server does not create connection

state
v It creates an initial sequence number (init_seq) that is a

hash of source & dest IP address and port number of SYN
packet (secret key used for hash)
§ Replies back with SYN ACK containing init_seq
§ Server does not need to store this sequence number

v If original SYN is genuine, an ACK will come back
§ Same hash function run on the same header fields to get the initial

sequence number (init_seq)
§ Checks if the ACK is equal to (init_seq+1)
§ Only create connection state if above is true

v If fake SYN, no harm done since no state was created

TCP SYN Cookies – DDoS defence

Quiz: TCP Connection Management?

Roughly how much time does it take for both the
TCP Sender and Receiver to establish connection
state since the connect() call?
A. RTT
B. 1.5RTT
C. 2RTT
D. 3RTT

72

www.zeetings.com/salil

ANSWER: B
Note that the final ACK will be
typically piggybacked with
the first data segment, so
often the TCP connection
setup is approximated to be
1RTT

Quiz: TCP Connection Management?

Assume that one end point of the TCP connection
sends a FIN segment. If it never receives an ACK,
what should it do?
A. Assume that the connection is closed and do

nothing

B. Retransmit the FIN

C. Transmit an ACK

D. Start crying

73
www.zeetings.com/salil

ANSWER: B

74

Transport Layer: Outline

3.1 transport-layer
services

3.2 multiplexing and
demultiplexing

3.3 connectionless
transport: UDP

3.4 principles of reliable
data transfer

3.5 connection-oriented
transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management

3.6 principles of congestion
control

3.7 TCP congestion control

congestion:
v informally: “too many sources sending too much

data too fast for network to handle”
v different from flow control!
v manifestations:

§ lost packets (buffer overflow at routers)
§ long delays (queueing in router buffers)

v a top-10 problem!

Principles of congestion control

75

Congestion

76

Trash

Congestion

Router

Router’s buffer.

Incoming rate is faster than
outgoing link can support.

Ugh. I so
can’t deal

with this right
now!

Congestion Collapse

77

Congestion Collapse

Link A Link B

Congestion Collapse

78

Congestion Collapse

Link A Link B

One sender starts,
but there’s still
capacity at link A.

S1

Congestion Collapse

79

Congestion Collapse

Link A Link B

S1

S2

Another sender starts
up. Link A is showing
slight delay, but still
doing ok.

Congestion Collapse

80

Congestion Collapse

Link A Link B

S1

S2

Unrelated traffic
passes through and
congests link B.

Congestion Collapse

Link A Link B

S1

S2 S2’s traffic is being dropped at Link B, so it starts retransmitting
on top of what it was sending.

(This is very bad. S2 is now sending lots of traffic over link A
that has no hope of crossing link B.)

Congestion Collapse

81

Congestion Collapse

82

Congestion Collapse

Link A Link B

S1

S2

Increased traffic from S2
causes Link A to become
congested. S1 starts
retransmitting.

Congestion Collapse

83

Congestion Collapse

Link A Link B

S1

S2

Congestion
propagates
backwards…

congestion:
v Increases delays

§ If delays > RTO, sender retransmits
v Increases loss rate

§ Dropped packets also retransmitted
v Increases retransmissions, many unnecessary

§ Wastes capacity of traffic that is never delivered
§ Increase in load results in decrease in useful work done

v Increases congestion, cycle continues …

Without congestion control

84

Cost of Congestion

v Knee – point after which
§ Throughput increases slowly
§ Delay increases fast

v Cliff – point after which
§ Throughput starts to drop to zero

(congestion collapse)
§ Delay approaches infinity

Load

Load

T
hr

o
ug

hp
ut

D
el

ay

knee cliff

congestion
collapse

packet
loss

85

This happened to the Internet (then NSFnet) in
1986

v Rate dropped from a blazing 32 Kbps to 40bps
v This happened on and off for two years
v In 1988, Van Jacobson published “Congestion

Avoidance and Control”
v The fix: senders voluntarily limit sending rate

Congestion Collapse

86

Approaches towards congestion control

two broad approaches towards congestion control:

end-end congestion
control:

v no explicit feedback
from network

v congestion inferred
from end-system
observed loss, delay

v approach taken by
TCP

network-assisted
congestion control:

v routers provide
feedback to end systems
§ single bit indicating

congestion (SNA,
DECbit, TCP/IP ECN,
ATM)

§explicit rate for
sender to send at

87

Transport Layer: Outline

3.1 transport-layer
services

3.2 multiplexing and
demultiplexing

3.3 connectionless
transport: UDP

3.4 principles of reliable
data transfer

3.5 connection-oriented
transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management

3.6 principles of congestion
control

3.7 TCP congestion control

88

TCP’s Approach in a Nutshell
v TCP connection maintains a window

§ Controls number of packets in flight

v TCP sending rate:
§ roughly: send cwnd bytes, wait RTT for ACKs, then

send more bytes

v Vary window size to control sending rate

rate ~~
cwnd
RTT

bytes/sec

last byte
ACKed sent, not-

yet ACKed
(“in-
flight”)

last byte
sent

cwnd
sender sequence number space

89

All These Windows…

v Congestion Window: CWND
§ How many bytes can be sent without overflowing routers
§ Computed by the sender using congestion control algorithm

v Flow control window: Advertised / Receive Window (RWND)
§ How many bytes can be sent without overflowing receiver’s buffers
§ Determined by the receiver and reported to the sender

v Sender-side window = minimum{CWND, RWND}
• Assume for this discussion that RWND >> CWND

90

CWND

v This lecture will talk about CWND in units of MSS
§ (Recall MSS: Maximum Segment Size, the amount of payload

data in a TCP packet)
§ This is only for pedagogical purposes

v Keep in mind that real implementations maintain CWND in
bytes

91

Two Basic Questions

v How does the sender detect congestion?

v How does the sender adjust its sending rate?

92

Detection Congestion: Infer Loss

v Duplicate ACKs: isolated loss
§ dup ACKs indicate network capable of delivering

some segments

v Timeout: much more serious
§Not enough dup ACKs
§Must have suffered several losses

v Will adjust rate differently for each case

93

X

fast retransmit after sender
receipt of triple duplicate ACK

Host BHost A

Seq=92, 8 bytes of data

ACK=100

tim
eo

ut ACK=100

ACK=100
ACK=100

RECAP: TCP fast retransmit (dup acks)

Seq=100, 20 bytes of data

Seq=100, 20 bytes of data

94

Rate Adjustment

v Basic structure:
§Upon receipt of ACK (of new data): increase rate
§Upon detection of loss: decrease rate

v How we increase/decrease the rate depends on
the phase of congestion control we’re in:
§Discovering available bottleneck bandwidth vs.
§Adjusting to bandwidth variations

95

TCP Slow Start (Bandwidth discovery)

v when connection begins,
increase rate exponentially
until first loss event:
§ initially cwnd = 1 MSS
§ double cwnd every RTT (full

ACKs)
§ Simpler implementation

achieved by incrementing
cwnd for every ACK
received
§ cwnd += 1 for each ACK

v summary: initial rate is slow
but ramps up exponentially
fast

Host A

one segment

R
TT

Host B

time

two segments

four segments

96

Adjusting to Varying Bandwidth

v Slow start gave an estimate of available bandwidth

v Now, want to track variations in this available
bandwidth, oscillating around its current value
§ Repeated probing (rate increase) and backoff (rate

decrease)
§Known as Congestion Avoidance (CA)

v TCP uses: “Additive Increase Multiplicative
Decrease” (AIMD)

97

AIMD
v approach: sender increases transmission rate (window size), probing

for usable bandwidth, until another congestion event occurs
§ additive increase: increase cwnd by 1 MSS every RTT until loss

detected
• For each successful RTT (all ACKS), cwnd = cwnd +1
• Simple implementation: for each ACK, cwnd = cwnd +
1/cwnd (since there are cwnd/MSS packets in a window)

§ multiplicative decrease: cut cwnd in half after loss
c
w
n
d
:

TC
P

s
en

de
r

co
ng

es
tio

n
w

in
do

w
s

iz
e

AIMD saw tooth
behavior: probing

for bandwidth

additively increase window size …
…. until loss occurs (then cut window in half)

time 98

Leads to the TCP “Sawtooth”

Loss

Exponential
“slow start”

t

Window

99

Loss

Loss
Loss Loss

Slow-Start vs. AIMD

v When does a sender stop Slow-Start and start
Additive Increase?

v Introduce a “slow start threshold” (ssthresh)
§ Initialized to a large value

v Convert to AI when cwnd = ssthresh, sender
switches from slow-start to AIMD-style increase
§On timeout, ssthresh = CWND/2

100

Implementation

v State at sender
§CWND (initialized to a small constant)
§ ssthresh (initialized to a large constant)
§ [Also dupACKcount and timer, as before]

v Events
§ACK (new data)
§ dupACK (duplicate ACK for old data)
§ Timeout

101

Event: ACK (new data)

v If CWND < ssthresh §CWND += 1 • Hence after one RTT (All ACKs with no drops): CWND = 2xCWND 102 Event: ACK (new data) v If CWND < ssthresh §CWND += 1 v Else §CWND = CWND + 1/CWND Slow start phase • Hence after one RTT (All ACKs with no drops): CWND = CWND + 1 “Congestion Avoidance” phase (additive increase) 103 Event: dupACK v dupACKcount ++ v If dupACKcount = 3 /* fast retransmit */ § ssthresh = CWND/2 §CWND = CWND/2 104 Event: TimeOut v On Timeout § ssthresh ß CWND/2 §CWND ß 1 105 Example t Window Slow-start restart: Go back to CWND = 1 MSS, but take advantage of knowing the previous value of CWND Slow start in operation until it reaches half of previous CWND, I.e., SSTHRESH Timeout Fast Retransmission SSThresh Set to Here 106 TCP Flavours v TCP-Tahoe § cwnd =1 on triple dup ACK & timeout v TCP-Reno § cwnd =1 on timeout § cwnd = cwnd/2 on triple dup ACK v TCP-newReno § TCP-Reno + improved fast recovery (SKIPPED) v TCP-SACK (NOT COVERED IN THE COURSE) § incorporates selective acknowledgements 107 Quiz: TCP Congestion Control? In the figure how many congestion avoidance intervals can you identify? A. 0 B. 1 C. 2 D. 3 E. 4 108 www.zeetings.com/salil ANSWER: C Quiz: TCP Congestion Control? In the figure how many slow start intervals can you identify? A. 0 B. 1 C. 2 D. 3 E. 4 109 www.zeetings.com/salil ANSWER: C Quiz: TCP Congestion Control? In the figure after the 16th transmission round, segment loss is detected by _______ ? A. Triple Dup Ack B. Timeout 110 www.zeetings.com/salil ANSWER: A Quiz: TCP Congestion Control? In the figure what is the initial value of sstresh (steady state threshold)? A. 0 B. 28 C. 32 D. 42 E. 64 111 www.zeetings.com/salil ANSWER: C Quiz: TCP Congestion Control? In the figure what is the value of sstresh (steady state threshold) at the 18th round? A. 1 B. 32 C. 42 D. 21 E. 20 112 www.zeetings.com/salil ANSWER: D Transport Layer: Summary v principles behind transport layer services: §multiplexing, demultiplexing § reliable data transfer § flow control § congestion control v instantiation, implementation in the Internet § UDP § TCP next: v leaving the network “edge” (application, transport layers) v into the network “core” 113