
Week 5-tcp

Advanced Networks
Transport layer: TCP

School of Computer Science
Dr. Wei Bao | Lecturer

rdt3.0 in action

[Figure: sender/receiver timelines for rdt3.0.
(c) ACK loss: the receiver gets pkt1 and sends ack1, but ack1 is lost; the sender times out and resends pkt1; the receiver detects the duplicate and re-sends ack1; on receiving ack1 the sender continues with pkt0.
(d) premature timeout / delayed ACK: ack1 is only delayed, so the sender times out and resends pkt1; the receiver detects the duplicate and re-sends ack1; the sender has meanwhile moved on to pkt0 and simply ignores the second ack1 ("do nothing"), so the protocol still behaves correctly.]

Performance of rdt3.0

› rdt3.0 is correct, but performance stinks
› e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:

  Dtrans = L / R = 8000 bits / 10^9 bits/sec = 8 microsecs

§ U sender: utilization – fraction of time sender busy sending

  U sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027

§ if RTT = 30 msec, 1 KB pkt every 30 msec: 33 kB/sec throughput over a 1 Gbps link

v network protocol limits use of physical resources!

rdt3.0: stop-and-wait operation

[Figure: sender/receiver timing diagram. First packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; the first bit reaches the receiver after one propagation delay; when the last packet bit arrives the receiver sends an ACK; the ACK arrives and the sender sends the next packet at t = RTT + L/R.]

  U sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027

Pipelined protocols

pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
– range of sequence numbers must be increased
– buffering at sender and/or receiver

› two generic forms of pipelined protocols: go-back-N, selective repeat

Pipelining: increased utilization

[Figure: sender/receiver timing diagram with 3 packets sent back-to-back. First packet bit transmitted at t = 0; last bit of the first packet transmitted at t = L/R; the receiver ACKs each packet as its last bit arrives; the first ACK arrives and the sender sends the next packet at t = RTT + L/R.]

3-packet pipelining increases utilization by a factor of 3!

  U sender = (3L/R) / (RTT + L/R) = 0.024 / 30.008 ≈ 0.0008
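Both utilization figures follow directly from the formula above; a minimal sketch that reproduces them (the helper name and variable layout are ours, the numbers are the slide's example values):

def utilization(pkts_per_rtt, pkt_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy transmitting."""
    trans = pkt_bits / rate_bps                    # L / R
    return (pkts_per_rtt * trans) / (rtt_s + trans)

R, L_bits, RTT = 1e9, 8000, 0.030                  # 1 Gbps, 8000-bit pkt, 30 ms RTT
print(utilization(1, L_bits, R, RTT))              # stop-and-wait: ~0.00027
print(utilization(3, L_bits, R, RTT))              # 3-packet pipelining: ~0.0008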

Pipelined protocols: overview

Go-back-N:
› sender can have up to N unacked packets in pipeline
› receiver only sends cumulative ack
– does not ack packet if there is a gap
› sender has timer for oldest unacked packet
– when timer expires, retransmit all unacked packets

Selective Repeat:
› sender can have up to N unacked packets in pipeline
› receiver sends individual ack for each packet
› sender maintains timer for each unacked packet
– when timer expires, retransmit only that unacked packet

Go-Back-N: sender

› “window” of up to N consecutive unacked pkts allowed
v ACK(n): ACKs all pkts up to, including seq # n – “cumulative ACK”
§ may receive duplicate ACKs (see receiver)
v timer for oldest in-flight pkt
v timeout(n): retransmit packet n and all higher seq # pkts in window


GBN: sender extended FSM

initially: base = 1, nextseqnum = 1

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt) + 1
  if (base == nextseqnum) stop_timer
  else start_timer

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  (no action)

GBN: receiver extended FSM

ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
– may generate duplicate ACKs
– need only remember expectedseqnum
› out-of-order pkt:
– discard (don't buffer): no receiver buffering!
– re-ACK pkt with highest in-order seq #

initially: expectedseqnum = 1

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt, expectedseqnum):
  extract(rcvpkt, data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum, ACK, chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

GBN in action (sender window N = 4)

[Figure: sender sends pkt0–pkt3 and waits; pkt2 is lost. The receiver gets pkt0 and pkt1 and sends ack0 and ack1; pkt3 arrives out of order, is discarded, and ack1 is re-sent. On ack0 and ack1 the sender slides the window and sends pkt4 and pkt5, which the receiver also discards while re-sending ack1 (the sender ignores these duplicate ACKs). When the pkt2 timer expires, the sender retransmits pkt2, pkt3, pkt4 and pkt5; the receiver now delivers them in order and sends ack2–ack5.]

Selective repeat

› receiver individually acknowledges all correctly received pkts
– buffers pkts, as needed, for eventual in-order delivery to upper layer
› sender only resends pkts for which ACK not received
– sender timer for each unACKed pkt
› sender window and receiver window: each a window of N consecutive seq #s (see figure)

Selective repeat: sender, receiver windows

[Figure: sender and receiver sequence-number spaces, each with a sliding window of size N over the sequence numbers.]

Selective repeat

sender:
data from above:
› if next available seq # in window, send pkt
timeout(n):
› resend pkt n, restart timer
ACK(n) in [sendbase, sendbase+N-1]:
› mark pkt n as received
› if n is smallest unACKed pkt, advance window base to next unACKed seq #

receiver (a minimal code sketch of this logic follows the in-action figure below):
pkt n in [rcvbase, rcvbase+N-1]:
v send ACK(n)
v out-of-order: buffer
v in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N, rcvbase-1]:
v ACK(n)
otherwise:
v ignore

Selective repeat in action (sender window N = 4)

[Figure: sender sends pkt0–pkt3; pkt2 is lost. The receiver gets pkt0 and pkt1 and sends ack0 and ack1; pkt3, pkt4 and pkt5 arrive out of order, are buffered, and are individually ACKed with ack3, ack4 and ack5, which the sender records. Only pkt2's timer expires, so only pkt2 is resent; on receiving pkt2 the receiver delivers pkt2, pkt3, pkt4, pkt5 and sends ack2. Q: what happens when ack2 arrives?]
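The selective-repeat receiver actions above are given only in prose, so here is a minimal Python sketch of that buffering/delivery logic under simplifying assumptions (no corruption handling; send_ack and deliver are placeholders for whatever the lower and upper layers provide):

class SRReceiver:
    """Minimal selective-repeat receiver: per-packet ACKs, out-of-order buffering."""
    def __init__(self, N, send_ack, deliver):
        self.N = N                 # window size
        self.rcvbase = 0           # smallest not-yet-delivered seq #
        self.buffer = {}           # out-of-order packets, keyed by seq #
        self.send_ack = send_ack   # callback: ACK a sequence number
        self.deliver = deliver     # callback: pass data up, in order

    def on_packet(self, seqnum, data):
        if self.rcvbase <= seqnum < self.rcvbase + self.N:
            self.send_ack(seqnum)              # ACK every in-window packet
            self.buffer[seqnum] = data
            while self.rcvbase in self.buffer: # deliver any in-order run
                self.deliver(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1              # advance receiver window
        elif self.rcvbase - self.N <= seqnum < self.rcvbase:
            self.send_ack(seqnum)              # re-ACK: our earlier ACK may have been lost
        # otherwise: ignore

acks, delivered = [], []
r = SRReceiver(4, acks.append, delivered.append)
for s, d in [(0, "a"), (2, "c"), (1, "b")]:
    r.on_packet(s, d)
print(acks, delivered)   # acks in arrival order [0, 2, 1]; data delivered in order ['a', 'b', 'c']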
Connection-oriented Transport: TCP

TCP: Overview   (RFCs: 793, 1122, 1323, 2018, 2581)

› full duplex data:
– bi-directional data flow in same connection
– MSS: maximum segment size
› connection-oriented:
– handshaking (exchange of control msgs) inits sender, receiver state before data exchange
› flow controlled:
– sender will not overwhelm receiver
› point-to-point:
– one sender, one receiver
› reliable, in-order byte stream
› pipelined:
– TCP congestion and flow control set window size

TCP segment structure

[Figure: 32-bit-wide TCP header layout – source port #, dest port #; sequence number and acknowledgement number (counting by bytes of data, not segments!); header length, unused bits, and flag bits U A P R S F (URG: urgent data, generally not used; ACK: ACK # valid; PSH: push data now, generally not used; RST, SYN, FIN: connection establishment/teardown commands); receive window (# bytes receiver willing to accept); checksum (Internet checksum, as in UDP); urgent data pointer; options (variable length); followed by application data (variable length).]

TCP seq. numbers, ACKs

sequence numbers:
– byte-stream “number” of first byte in segment's data
acknowledgements:
– seq # of next byte expected from other side
– cumulative ACK
Q: how does the receiver handle out-of-order segments?
– A: TCP spec doesn't say – up to the implementor
– most implementations will store them, but still use cumulative ACKs

[Figure: sender sequence number space – bytes already sent and ACKed, sent but not-yet ACKed (“in-flight”), usable but not yet sent, and not usable; window size N. Outgoing segments from the sender carry the sequence number, incoming segments to the sender carry the acknowledgement number and rwnd.]

[Figure: simple telnet scenario between Host A and Host B. The user types ‘C’: A sends Seq=4201, ACK=7901, data=‘C’ (bytes 4201–4300). B ACKs receipt of ‘C’ and echoes it back: Seq=7901, ACK=4301, data=‘C’ (bytes 7901–8000). A ACKs receipt of the echoed ‘C’: Seq=4301, ACK=8001, no data.]

TCP round trip time, timeout

Q: how to set TCP timeout value?
› longer than RTT
– but RTT varies
› too short: premature timeout, unnecessary retransmissions
› too long: slow reaction to segment loss

Q: how to estimate RTT?
› SampleRTT: measured time from segment transmission until ACK receipt
– ignore retransmissions (after a retransmission it is ambiguous whether an ACK is for the first attempt or the retransmission, so no SampleRTT is taken)
› SampleRTT will vary; want estimated RTT “smoother”
– weighted average of several recent measurements, not just current SampleRTT

EstimatedRTT = (1-a)*EstimatedRTT + a*SampleRTT

v exponentially weighted moving average
v influence of past sample decreases exponentially fast
v typical value: a = 0.125

[Figure: SampleRTT and EstimatedRTT for RTTs from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT fluctuates between roughly 100 and 350 ms while EstimatedRTT tracks it smoothly.]

› timeout interval: EstimatedRTT plus “safety margin”
– large variation in EstimatedRTT -> larger safety margin
› estimate SampleRTT deviation from EstimatedRTT:

DevRTT = (1-b)*DevRTT + b*|SampleRTT-EstimatedRTT|
(typically, b = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
(estimated RTT + “safety margin”)
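A minimal sketch of this estimator, one update per SampleRTT, with variable names following the slide (the initial values and the sample sequence are made up for illustration):

ALPHA, BETA = 0.125, 0.25       # typical values from the slides

estimated_rtt = 0.1             # seconds; illustrative initial guess
dev_rtt = 0.05

def on_sample(sample_rtt):
    """Update EstimatedRTT, DevRTT and TimeoutInterval from one SampleRTT."""
    global estimated_rtt, dev_rtt
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample_rtt - estimated_rtt)
    return estimated_rtt + 4 * dev_rtt      # TimeoutInterval

for s in (0.12, 0.09, 0.30, 0.11):          # made-up SampleRTT measurements
    print(round(on_sample(s), 4))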

Reliable Data Transfer in TCP


TCP reliable data transfer

› TCP creates rdt service on top of IP's unreliable service
– pipelined segments
– cumulative acks
– single retransmission timer
› retransmissions triggered by:
– timeout events
– duplicate acks

let's initially consider a simplified TCP sender:
– ignore duplicate acks
– ignore flow control, congestion control

TCP sender events

data rcvd from app:
› create segment with seq #
› seq # is byte-stream number of first data byte in segment
› start timer if not already running
– think of timer as for oldest unacked segment
– expiration interval: TimeOutInterval

timeout:
› retransmit segment that caused timeout
› restart timer

ack rcvd:
› if ack acknowledges previously unacked segments
– update what is known to be ACKed
– start timer if there are still unacked segments

TCP sender (simplified)

initially: NextSeqNum = InitialSeqNum, SendBase = InitialSeqNum

data received from application above:
  create segment, seq. #: NextSeqNum
  pass segment to IP (i.e., “send”)
  NextSeqNum = NextSeqNum + length(data)
  if (timer currently not running)
    start timer

timeout:
  retransmit not-yet-acked segment with smallest seq. #
  start timer

ACK received, with ACK field value y:
  if (y > SendBase) {
    SendBase = y
    /* SendBase–1: last cumulatively ACKed byte */
    if (there are currently not-yet-acked segments)
      start timer
    else stop timer
  }

TCP: retransmission scenarios

[Figure: lost ACK scenario. Host A sends Seq=92, 8 bytes of data; Host B's ACK=100 is lost; A times out and retransmits Seq=92, 8 bytes of data; B ACKs again with ACK=100; SendBase advances from 92 to 100.]

[Figure: premature timeout. Host A sends Seq=92 (8 bytes) and Seq=100 (20 bytes); the ACKs (ACK=100, ACK=120) are delayed, so A times out and retransmits Seq=92, 8 bytes of data; B responds with the cumulative ACK=120; SendBase advances 92 → 100 → 120, and the retransmission turns out to have been unnecessary.]

[Figure: cumulative ACK scenario. Host A sends Seq=92 (8 bytes) and Seq=100 (20 bytes); ACK=100 is lost, but the cumulative ACK=120 arrives before the timeout, so A does not retransmit and continues with Seq=120, 15 bytes of data.]

TCP ACK generation [RFC 1122, RFC 2581]

event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
TCP receiver action: delayed ACK – wait up to 500 ms for the next segment; if no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

event at receiver: arrival of segment that partially or completely fills the gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
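A minimal sketch of this ACK-generation policy (the class and callback names are ours; the 500 ms delayed ACK is modelled with threading.Timer, and corruption/window checks are omitted):

import threading

class AckPolicy:
    """Sketch of the RFC 1122/2581 ACK-generation rules for a TCP-like receiver."""
    def __init__(self, send_ack, expected_seq=0, delay=0.5):
        self.send_ack = send_ack          # callback(ack_num)
        self.expected_seq = expected_seq  # next in-order byte we expect
        self.out_of_order = {}            # seq -> length of buffered segments
        self.delay = delay                # delayed-ACK wait (up to 500 ms)
        self.pending = None               # armed delayed-ACK timer, if any

    def _fire_delayed(self):
        self.pending = None
        self.send_ack(self.expected_seq)

    def on_segment(self, seq, length):
        if seq == self.expected_seq:                      # in-order arrival
            had_gap = bool(self.out_of_order)
            self.expected_seq += length
            while self.expected_seq in self.out_of_order: # absorb buffered run
                self.expected_seq += self.out_of_order.pop(self.expected_seq)
            if self.pending:                              # another in-order ACK pending:
                self.pending.cancel(); self.pending = None
                self.send_ack(self.expected_seq)          # one cumulative ACK for both
            elif had_gap:                                 # segment filled (part of) a gap
                self.send_ack(self.expected_seq)          # ACK immediately
            else:                                         # nothing pending, no gap
                self.pending = threading.Timer(self.delay, self._fire_delayed)
                self.pending.start()                      # delayed ACK
        elif seq > self.expected_seq:                     # out-of-order: gap detected
            self.out_of_order[seq] = length
            self.send_ack(self.expected_seq)              # immediate duplicate ACK
        else:                                             # old segment: just re-ACK
            self.send_ack(self.expected_seq)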

TCP ACK

[Figures: Host A / Host B timelines illustrating TCP ACK behaviour – one labelled “Too many ACKs” (ACKing every segment individually), one labelled “cumulative ACK” (a single ACK covering several in-order segments), and one asking “Need to ack this one?” with the answer “Yes”.]

TCP fast retransmit

› time-out period often relatively long:
– long delay before resending lost packet
› detect lost segments via duplicate ACKs
– sender often sends many segments back-to-back
– if a segment is lost, there will likely be many duplicate ACKs

if sender receives 3 duplicate ACKs for the same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
§ likely that the unacked segment was lost, so don't wait for timeout

[Figure: fast retransmit after sender receipt of triple duplicate ACK. Host A sends Seq=92 (8 bytes) and Seq=100 (20 bytes); the Seq=100 segment is lost (X); B replies ACK=100 and keeps repeating ACK=100 for later arrivals; after the third duplicate ACK=100, A retransmits Seq=100, 20 bytes of data without waiting for the timeout.]
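A minimal sender-side sketch of the triple-duplicate-ACK rule (retransmit_from is a placeholder callback for the sender's real retransmission path):

class DupAckDetector:
    """Count duplicate ACKs; trigger fast retransmit on the 3rd duplicate."""
    def __init__(self, retransmit_from):
        self.retransmit_from = retransmit_from  # callback(seq #) to resend
        self.send_base = 0                      # smallest unACKed byte
        self.dup_count = 0

    def on_ack(self, ack):
        if ack > self.send_base:        # new ACK: window advances
            self.send_base = ack
            self.dup_count = 0
        else:                           # duplicate ACK for send_base
            self.dup_count += 1
            if self.dup_count == 3:     # triple duplicate ACK
                self.retransmit_from(self.send_base)   # fast retransmit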

Flow Control in TCP


TCP flow control

flow control: receiver controls sender, so sender won't overflow receiver's buffer by transmitting too much, too fast

[Figure: receiver protocol stack – application process on top of OS code for TCP and IP, with TCP socket receiver buffers in between. Segments arrive from the sender into the TCP socket buffers; the application may remove data from those buffers more slowly than the TCP receiver is delivering it (i.e., than the sender is sending).]

TCP flow control

› receiver “advertises” free buffer space by including the rwnd value in the TCP header of receiver-to-sender segments
– RcvBuffer size set via socket options (typical default is 4096 bytes)
– many operating systems autoadjust RcvBuffer
› sender limits amount of unacked (“in-flight”) data to receiver's rwnd value
› guarantees receive buffer will not overflow

[Figure: receiver-side buffering. Inside RcvBuffer, arriving TCP segment payloads queue as buffered data ahead of the application process; rwnd is the remaining free buffer space.]
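A minimal sender-side sketch of the rwnd limit described above (class and callback names are ours; real TCP also applies the congestion window, which appears later in these slides):

class FlowControlledSender:
    """Sender that never keeps more than rwnd bytes in flight."""
    def __init__(self, send_segment, mss=1460):
        self.send_segment = send_segment   # callback(bytes) to transmit
        self.mss = mss
        self.last_byte_sent = 0
        self.last_byte_acked = 0
        self.rwnd = 0                      # latest advertised receive window
        self.pending = b""                 # app data waiting to be sent

    def on_app_data(self, data):
        self.pending += data
        self._try_send()

    def on_ack(self, ack_num, rwnd):
        self.last_byte_acked = max(self.last_byte_acked, ack_num)
        self.rwnd = rwnd                   # receiver's current free buffer space
        self._try_send()

    def _try_send(self):
        # flow-control constraint: LastByteSent - LastByteAcked <= rwnd
        while self.pending:
            in_flight = self.last_byte_sent - self.last_byte_acked
            room = self.rwnd - in_flight
            if room <= 0:
                break
            n = min(room, self.mss, len(self.pending))
            chunk, self.pending = self.pending[:n], self.pending[n:]
            self.send_segment(chunk)
            self.last_byte_sent += len(chunk)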

Connection Management in TCP


Connection Management

before exchanging data, sender and receiver “handshake”:
› agree to establish connection (each knowing the other is willing to establish the connection)
› agree on connection parameters

[Figure: both endpoints, each with an application and a network layer, end up in connection state ESTAB with connection variables: client-to-server and server-to-client seq #s, and rcvBuffer size at server and client.]

Agreeing to establish a connection

2-way handshake:

[Figure: client chooses x, sends req_conn(x) (“Let's talk”) and enters ESTAB; server replies acc_conn(x) (“OK”) and enters ESTAB.]

Q: will a 2-way handshake always work in the network?
› variable delays
› retransmitted messages (e.g. req_conn(x)) due to message loss
› message reordering
› can't “see” other side

2-way handshake failure scenarios:

[Figure: half open connection. The client's req_conn(x) is retransmitted; the connection is established, the client later terminates and the server forgets x; the retransmitted req_conn(x) then arrives and the server enters ESTAB again – a half open connection with no client!]

[Figure: duplicate data accepted. Both req_conn(x) and data(x+1) are retransmitted; connection x completes, the client terminates and the server forgets x; the retransmitted req_conn(x) and data(x+1) then arrive, and the server accepts data(x+1) for a connection that no longer exists.]

TCP 3-way handshake

[Figure: client state LISTEN → SYNSENT → ESTAB; server state LISTEN → SYN RCVD → ESTAB.
1. Client chooses init seq num x, sends TCP SYN msg (SYNbit=1, Seq=x) and enters SYNSENT.
2. Server chooses init seq num y, sends TCP SYNACK msg acking the SYN (SYNbit=1, Seq=y, ACKbit=1, ACKnum=x+1) and enters SYN RCVD.
3. Client receives SYNACK(x), indicating the server is live; it sends an ACK for the SYNACK (ACKbit=1, ACKnum=y+1), which may contain client-to-server data, and enters ESTAB. On receiving this ACK(y) the server knows the client is live and enters ESTAB.]
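For context, this handshake is exactly what an ordinary socket connect()/accept() pair triggers; a minimal sketch (loopback address and port 5050 are arbitrary choices for illustration):

import socket, threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 5050))
srv.listen(1)                        # server side: LISTEN

def accept_one():
    conn, _ = srv.accept()           # returns once the 3-way handshake completes
    conn.close()

t = threading.Thread(target=accept_one); t.start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", 5050))     # kernel sends SYN, receives SYNACK, sends ACK
cli.close()                          # close() starts the FIN/ACK teardown
t.join(); srv.close()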

TCP: closing a connection

› client, server each close their side of the connection
– send TCP segment with FIN bit = 1
› respond to received FIN with ACK

[Figure: both sides start in ESTAB. clientSocket.close(): the client sends FINbit=1, seq=x and enters FIN_WAIT_1 (it can no longer send, but can still receive data). The server ACKs (ACKbit=1, ACKnum=x+1) and enters CLOSE_WAIT (it can still send data); the client enters FIN_WAIT_2 and waits for the server's close. The server then sends FINbit=1, seq=y (it can no longer send data) and enters LAST_ACK. The client ACKs (ACKbit=1, ACKnum=y+1) and enters TIMED_WAIT, waiting 2 * max segment lifetime before moving to CLOSED; the server moves to CLOSED when that final ACK arrives.]


Principles of Congestion Control


Principles of congestion control

congestion:
› informally: “too many sources sending too much data too fast for network to handle”
› different from flow control!
› manifestations:
– lost packets (buffer overflow at routers)
– long delays (queueing in router buffers)
› a top-10 problem!

Causes/costs of congestion: scenario 1

› two senders, two receivers
› one router, infinite buffers
› output link capacity: R
› no retransmission
› maximum per-connection throughput: R/2

[Figure: Host A and Host B each send original data at rate λin into a shared router with unlimited output-link buffers; per-connection throughput λout approaches R/2 as λin grows, while queueing delay grows without bound.]

v large delays as arrival rate, λin, approaches capacity

Causes/costs of congestion: scenario 2

› one router, finite buffers
› sender retransmission of timed-out packet
– application-layer input = application-layer output: λin = λout (goodput)
– transport-layer input includes retransmissions: λ'in ≥ λin

[Figure: Host A offers original data at λin plus retransmitted data, for a transport-layer rate λ'in, into a router with finite shared output-link buffers; Host B receives at throughput λout.]

Causes/costs of congestion: scenario 2

idealization: perfect knowledge
› sender sends only when router buffers are available

[Figure: a copy is sent only when there is free buffer space at the router, so nothing is dropped; λout rises to R/2 as λ'in approaches R/2.]

idealization: known loss – packets can be lost, dropped at the router due to full buffers
› sender only resends if a packet is known to be lost

[Figure: a copy is sent and dropped when the router has no buffer space; when sending at R/2, some packets are retransmissions but the asymptotic goodput is still R/2.]

realistic: duplicates
v packets can be lost, dropped at the router due to full buffers
v sender times out prematurely, sending two copies, both of which are delivered

[Figure: when sending at R/2, some packets are retransmissions, including duplicates that are delivered, so goodput λout falls below R/2.]

“costs” of congestion:
v more work (retransmissions) for a given “goodput”
v unneeded retransmissions: link carries multiple copies of a pkt
§ decreasing goodput

Causes/costs of congestion: scenario 3

› four senders
› multihop paths
› timeout/retransmit

Q: what happens as λ'in increases?

[Figure: Hosts A, B, C and D send over multihop paths through routers with finite shared output-link buffers; the red and blue flows share the upper router's output queue.]

A: as the red λ'in increases, all arriving blue pkts at the upper queue are dropped, and blue throughput → 0.

another “cost” of congestion:
v when a packet is dropped, any upstream transmission capacity used for that packet was wasted!

Approaches towards congestion control

two broad approaches towards congestion control:

end-end congestion control:
› no explicit feedback from network
› congestion inferred from end-system observed loss, delay
› approach taken by TCP

network-assisted congestion control:
› routers provide feedback to end systems
– single bit indicating congestion
– explicit rate for sender to send at

TCP Congestion Control


TCP congestion control

Additive increase multiplicative decrease (AIMD)
v approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
§ additive increase: increase cwnd by 1 MSS (maximum segment size) every RTT until loss detected
§ multiplicative decrease: cut cwnd in half after loss

[Figure: cwnd (TCP sender congestion window size) over time – the AIMD “saw tooth” behaviour, probing for bandwidth: additively increase window size … until loss occurs (then cut window in half).]

TCP Congestion Control: details

› sender limits transmission: LastByteSent – LastByteAcked ≤ cwnd
› cwnd is dynamic, a function of perceived network congestion

[Figure: sender sequence number space – last byte ACKed, then up to cwnd bytes sent but not-yet ACKed (“in-flight”), ending at the last byte sent.]

TCP sending rate:
› roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes

  rate ≈ cwnd / RTT  bytes/sec

TCP Slow Start

› when connection begins, increase rate exponentially:
– initially cwnd = 1 MSS
– double cwnd every RTT
– done by incrementing cwnd for every ACK received
› summary: initial rate is slow but ramps up exponentially fast
› when should the exponential increase switch to linear (additive increase)?

[Figure: Host A sends one segment, then two, then four – cwnd doubles each RTT as ACKs return from Host B.]

TCP: switching from slow start to CA

Q: when should the exponential increase switch to linear?
A: when cwnd reaches ssthresh

Implementation:
› at the beginning, ssthresh is specified (it differs between versions of TCP)
› (in this example ssthresh = 8 segments)
› on a loss event, ssthresh is set to 1/2 of cwnd just before the loss event

[Figure: cwnd grows to 12 segments when loss occurs, so ssthresh becomes 6; growth is exponential below ssthresh and linear above it.]

TCP: detecting, reacting to loss

› loss indicated by timeout:
– cwnd set to 1 MSS
– window then grows exponentially (as in slow start) to ssthresh, then grows linearly
› loss indicated by 3 duplicate ACKs:
– TCP Tahoe: same as for a timeout – always sets cwnd to 1 (timeout or 3 duplicate ACKs)
– TCP Reno: cwnd is cut in half, window then grows linearly (additive increase) – “fast recovery”

TCP: switching from slow start to CA (cwnd over time)

[Figure: cwnd evolution with phases labelled slow start, additive increase, multiplicative decrease, additive increase, timeout followed by slow start, then additive increase again.]

Summary: TCP Reno Congestion Control

initialization: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0

slow start (exponential increase):
– new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
– duplicate ACK: dupACKcount++
– when cwnd > ssthresh: move to congestion avoidance
– timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; stay in slow start
– dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3; retransmit missing segment; move to fast recovery

congestion avoidance (additive increase):
– new ACK: cwnd = cwnd + MSS*(MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
– duplicate ACK: dupACKcount++
– timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; move to slow start
– dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3; retransmit missing segment; move to fast recovery

fast recovery (multiplicative decrease):
– duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
– new ACK: cwnd = ssthresh; dupACKcount = 0; move to congestion avoidance
– timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; move to slow start
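A minimal Python sketch of the Reno state machine summarised above, in units of MSS rather than bytes (retransmission is a placeholder callback; this is an illustration of the slide's FSM, not a faithful kernel implementation):

class RenoCwnd:
    """Slow start / congestion avoidance / fast recovery, in units of MSS."""
    def __init__(self, retransmit):
        self.retransmit = retransmit     # callback: resend missing segment
        self.cwnd, self.ssthresh = 1.0, 64.0
        self.dup_acks = 0
        self.state = "slow_start"

    def on_new_ack(self):
        self.dup_acks = 0
        if self.state == "fast_recovery":
            self.cwnd, self.state = self.ssthresh, "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += 1.0                          # exponential growth
            if self.cwnd >= self.ssthresh:
                self.state = "congestion_avoidance"
        else:                                         # congestion avoidance
            self.cwnd += 1.0 / self.cwnd              # ~1 MSS per RTT

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += 1.0                          # inflate window per dup ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:                        # triple duplicate ACK
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3
            self.retransmit()                         # fast retransmit
            self.state = "fast_recovery"

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd, self.dup_acks = 1.0, 0
        self.retransmit()
        self.state = "slow_start"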

TCP Reno throughput

› avg. TCP thruput as a function of window size, RTT?
– ignore slow start, assume there is always data to send
› W: window size (measured in bytes) where loss occurs
– avg. window size (# in-flight bytes) is 3/4 W
– avg. thruput is 3/4 W per RTT:

  avg TCP thruput = (3/4) * W / RTT  bytes/sec

[Figure: cwnd oscillating between W/2 and W – the sawtooth whose average is 3/4 W.]

TCP Futures: TCP over “long, fat pipes”

› example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
› requires W = 83,333 in-flight segments
› throughput in terms of segment loss probability, L [Mathis 1997]:

  TCP throughput = 1.22 * MSS / (RTT * sqrt(L))

➜ to achieve 10 Gbps throughput, need a loss rate of L = 2·10^-10
– a very small loss rate!
› new versions of TCP for high-speed: Vegas, Westwood, CUBIC, etc.
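Both numbers on this slide can be checked from the window requirement and the Mathis formula; a quick sketch (MSS converted to bits):

mss_bits = 1500 * 8        # 1500-byte segments
rtt = 0.100                # 100 ms
target = 10e9              # 10 Gbps

# in-flight window needed to keep the pipe full: throughput = W * MSS / RTT
W = target * rtt / mss_bits
print(round(W))            # ~83333 segments

# Mathis: throughput = 1.22 * MSS / (RTT * sqrt(L))  ->  solve for L
L = (1.22 * mss_bits / (rtt * target)) ** 2
print(L)                   # ~2.1e-10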

TCP Fairness

fairness: K TCP sessions share the same bottleneck link of bandwidth R; each has an average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why is TCP fair?

two competing sessions:
› additive increase gives slope of 1, as throughput increases
› multiplicative decrease decreases throughput proportionally

[Figure: phase plot of Session 1 data rate vs. Session 2 data rate, bounded by R on each axis, with the 45° “equal bandwidth share” line. Repeated rounds of congestion avoidance (additive increase) followed by loss (decrease window by a factor of 2) move the operating point toward the equal-share line.]
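The convergence argument in the figure can be illustrated with a tiny simulation of two AIMD flows sharing a link of capacity R: both add 1 unit per round, and both halve when their sum exceeds R (a toy model of synchronized loss, not real TCP dynamics):

R = 100.0
x, y = 80.0, 10.0          # deliberately unequal starting rates

for _ in range(200):
    x += 1.0               # additive increase, slope 1
    y += 1.0
    if x + y > R:          # shared loss event at the bottleneck
        x /= 2.0           # multiplicative decrease
        y /= 2.0

print(round(x, 1), round(y, 1), round(abs(x - y), 1))   # the gap between the rates has shrunk toward 0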

Fairness (more)

Fairness and UDP
› multimedia apps often do not use TCP
– do not want rate throttled by congestion control
› instead use UDP:
– send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections
› application can open multiple parallel connections between two hosts
› e.g., link of rate R:
– App 1 asks for 1 TCP connection, gets 0.1R
– App 2 asks for 9 TCP connections, gets 0.9R

Final word

Window size = min(rwnd, cwnd)
– rwnd: receive window (flow control)
– cwnd: congestion window (congestion control)
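Putting flow control and congestion control together, the sender's allowance at any moment looks like this sketch (our own function name):

def bytes_allowed(last_byte_sent, last_byte_acked, rwnd, cwnd):
    """How much more the sender may put in flight right now."""
    window = min(rwnd, cwnd)                  # flow control AND congestion control
    in_flight = last_byte_sent - last_byte_acked
    return max(0, window - in_flight)

print(bytes_allowed(5000, 2000, rwnd=8192, cwnd=4000))   # -> 1000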

Conclusion

› principles behind transport layer services:
– multiplexing, demultiplexing
– reliable data transfer
– connection setup, teardown
– flow control
– congestion control
› instantiation, implementation in the Internet
– UDP
– TCP