
Connection-oriented Transport TCP

Advanced Networks
Transport layer: TCP
School of Computer Science

Dr. | Lecturer

1

rdt3.0 in action
[Figure (c) ACK loss: sender sends pkt0, receives ack0, sends pkt1; ack1 is lost, so the sender times out and resends pkt1; the receiver detects the duplicate and re-sends ack1; the sender then receives ack1 and sends pkt0.]
[Figure (d) premature timeout / delayed ACK: ack1 is only delayed, but the timeout fires and the sender resends pkt1; the receiver detects the duplicate and re-sends ack1; the sender receives the first ack1 and sends pkt0, then receives the second ack1 and does nothing; the receiver gets pkt0 and sends ack0.]
3-2
Performance of rdt3.0
rdt3.0 is correct, but performance stinks
e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:
    Dtrans = L/R = 8000 bits / (10^9 bits/sec) = 8 microsecs
U sender: utilization - fraction of time sender busy sending
    U sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027
if RTT = 30 msec, 1 KB pkt every 30 msec: 33 kB/sec thruput over 1 Gbps link
network protocol limits use of physical resources!
3

rdt3.0: stop-and-wait operation
[Figure: sender/receiver timeline - first packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; the first and last packet bits arrive at the receiver, which sends an ACK; the ACK arrives and the next packet is sent at t = RTT + L/R.]
4

pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
range of sequence numbers must be increased
buffering at sender and/or receiver
two generic forms of pipelined protocols: go-Back-N, selective repeat

Pipelined protocols
5

Pipelining: increased utilization
[Figure: sender/receiver timeline - first packet bit transmitted at t = 0; last bit of the first packet transmitted at t = L/R; the 2nd and 3rd packets follow back-to-back; each arriving packet triggers an ACK; the first ACK arrives and the next packet is sent at t = RTT + L/R.]
3-packet pipelining increases utilization by a factor of 3!
    U sender = 3(L/R) / (RTT + L/R) = 0.0024 / 30.008 = 0.00081
6
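To make the two utilization figures concrete, here is a small self-contained C sketch (mine, not from the slides) that plugs the slides' numbers (L = 8000 bits, R = 1 Gbps, RTT = 30 ms) into U = n(L/R) / (RTT + L/R), with n = 1 for stop-and-wait and n = 3 for 3-packet pipelining.

#include <stdio.h>

/* Utilization of a sender that pushes n packets per RTT:
 * U = n * (L/R) / (RTT + L/R).  Stop-and-wait is the n = 1 case. */
static double utilization(int n, double L_bits, double R_bps, double rtt_s) {
    double d_trans = L_bits / R_bps;             /* transmission delay of one packet */
    return n * d_trans / (rtt_s + d_trans);
}

int main(void) {
    double L = 8000.0;     /* packet size, bits   */
    double R = 1e9;        /* link rate, bits/sec */
    double RTT = 0.030;    /* 30 ms               */

    printf("stop-and-wait:  U = %.5f\n", utilization(1, L, R, RTT));  /* ~0.00027 */
    printf("3-pkt pipeline: U = %.5f\n", utilization(3, L, R, RTT));  /* ~0.00081 */
    return 0;
}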

Go-back-N:
sender can have up to N unacked packets in pipeline
receiver only sends cumulative ack
does not ack packet if there is a gap
sender has timer for oldest unacked packet
when timer expires, retransmit all unacked packets
Selective Repeat:
sender can have up to N unacked packets in pipeline
receiver sends individual ack for each packet

sender maintains timer for each unacked packet
when timer expires, retransmit only that unacked packet

Pipelined protocols: overview
7

“window” of up to N, consecutive unacked pkts allowed

ACK(n): ACKs all pkts up to, including seq # n – “cumulative ACK”
may receive duplicate ACKs (see receiver)
timer for oldest in-flight pkt
timeout(n): retransmit packet n and all higher seq # pkts in window

Go-Back-N: sender
8
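A minimal C-style sketch (mine, not from the slides) of the send-side check that the extended FSM below formalizes: data is accepted only while fewer than N packets are unacknowledged, and the timer is started when the first in-flight packet is sent. make_pkt, udt_send and start_timer are assumed helpers with the same meaning as in the FSM pseudocode.

#define N 4                                   /* window size */

struct pkt { int seq; char data[512]; int chksum; };   /* simplified packet */

/* assumed helpers, as in the slides' FSM pseudocode */
extern struct pkt make_pkt(int seq, const char *data, int chksum);
extern void udt_send(struct pkt p);
extern void start_timer(void);

static struct pkt sndpkt[1024];   /* copies of sent, unACKed packets (for retransmission) */
static int base = 1;              /* oldest unACKed sequence number */
static int nextseqnum = 1;        /* next sequence number to use    */

/* called when the application hands down data */
int rdt_send(const char *data) {
    if (nextseqnum >= base + N)   /* window full: N packets already outstanding */
        return 0;                 /* refuse_data(data) */
    sndpkt[nextseqnum] = make_pkt(nextseqnum, data, 0);
    udt_send(sndpkt[nextseqnum]);
    if (base == nextseqnum)       /* first in-flight packet: start the timer */
        start_timer();
    nextseqnum++;
    return 1;
}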


GBN: sender extended FSM
initially (Wait state): base = 1, nextseqnum = 1
rdt_send(data):
    if (nextseqnum < base+N) {
        sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
        udt_send(sndpkt[nextseqnum])
        if (base == nextseqnum)
            start_timer
        nextseqnum++
    }
    else refuse_data(data)
timeout:
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    ...
    udt_send(sndpkt[nextseqnum-1])
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum) stop_timer
    else start_timer
rdt_rcv(rcvpkt) && corrupt(rcvpkt): Λ (do nothing)
3-10
Notes: if (nextseqnum < base+N), we can still send; otherwise we have sent N messages that are not yet acknowledged. If base == nextseqnum when sending, nothing is currently outstanding, so start the timer. If the timer expires, resend all messages not yet acknowledged. If base == nextseqnum after receiving an ACK, everything has been acknowledged, so stop the timer.
10

GBN: receiver extended FSM
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
    may generate duplicate ACKs
    need only remember expectedseqnum
out-of-order pkt: discard (don't buffer): no receiver buffering!
    re-ACK pkt with highest in-order seq #
initially: expectedseqnum = 1
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
default: udt_send(sndpkt)
11

GBN in action
[Figure: sender window N = 4. Sender sends pkt0-pkt3; pkt2 is lost. Receiver delivers pkt0 and pkt1 (ack0, ack1), then discards pkt3, pkt4, pkt5 and re-sends ack1 each time. Sender ignores the duplicate ack1s; on the pkt2 timeout it resends pkt2-pkt5, which the receiver now delivers, sending ack2-ack5.]
3-12

Selective repeat
receiver individually acknowledges all correctly received pkts
    buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
    sender timer for each unACKed pkt
sender window and receiver window (see next slide)
13

Selective repeat: sender, receiver windows
[Figure: sender and receiver windows of size N over the sequence number space.]
3-14

Selective repeat
sender:
    data from above: if next available seq # in window, send pkt
    timeout(n): resend pkt n, restart timer
    ACK(n) in [sendbase, sendbase+N-1]: mark pkt n as received; if n is smallest unACKed pkt, advance window base to next unACKed seq #
receiver:
    pkt n in [rcvbase, rcvbase+N-1]: send ACK(n); out-of-order: buffer; in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
    pkt n in [rcvbase-N, rcvbase-1]: ACK(n)
    otherwise: ignore
15
The receiver still sends an ACK upon reception of an old packet that was already delivered.
15
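A hedged C sketch (mine, not from the slides) of the selective-repeat receiver rule just listed: ACK every packet whose seq # falls in [rcvbase, rcvbase+N-1], buffer it if it is out of order, and deliver the in-order run once the gap at rcvbase is filled; packets in [rcvbase-N, rcvbase-1] are re-ACKed and everything else is ignored. Sequence-number wraparound is ignored for simplicity; send_ack and deliver_data are assumed helpers.

#include <stdbool.h>
#include <string.h>

#define N 4                       /* window size */
#define MAXSEQ 1024

/* assumed helpers */
extern void send_ack(int n);
extern void deliver_data(const char *data);

static char buffered[MAXSEQ][512];
static bool received[MAXSEQ];
static int rcvbase = 1;           /* smallest not-yet-delivered seq # */

void sr_rcv(int n, const char *data) {
    if (n >= rcvbase && n <= rcvbase + N - 1) {       /* within receive window */
        send_ack(n);
        if (!received[n]) {                           /* buffer, possibly out of order */
            strncpy(buffered[n], data, sizeof buffered[n] - 1);
            received[n] = true;
        }
        while (received[rcvbase]) {                   /* deliver in-order run, slide window */
            deliver_data(buffered[rcvbase]);
            received[rcvbase] = false;
            rcvbase++;
        }
    } else if (n >= rcvbase - N && n <= rcvbase - 1) {
        send_ack(n);                                  /* old packet: re-ACK so sender can advance */
    }
    /* otherwise: ignore */
}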
Selective repeat in action
[Figure: sender window N = 4. Sender sends pkt0-pkt3; pkt2 is lost. Receiver delivers pkt0 and pkt1 (ack0, ack1), then buffers pkt3, pkt4, pkt5 and sends ack3, ack4, ack5. The sender records that ack3, ack4 and ack5 arrived; on the pkt2 timeout it resends only pkt2. The receiver then delivers pkt2, pkt3, pkt4, pkt5 and sends ack2.]
Q: what happens when ack2 arrives?
A: mark pkt2 as received; since pkt3, pkt4 and pkt5 are already ACKed, the window base advances past them and pkt6 can now be sent.
3-16, 3-17

Connection-oriented Transport TCP
18
20min
18

TCP: Overview   RFCs: 793, 1122, 1323, 2018, 2581
point-to-point: one sender, one receiver
reliable, in-order byte stream
pipelined: TCP congestion and flow control set window size
full duplex data: bi-directional data flow in same connection; MSS: maximum segment size
connection-oriented: handshaking (exchange of control msgs) inits sender, receiver state before data exchange
flow controlled: sender will not overwhelm receiver
19

TCP segment structure
[Figure: 32-bit-wide segment layout - source port #, dest port #; sequence number; acknowledgement number; head len, unused bits, flags U|A|P|R|S|F; receive window; checksum; urgent data pointer; options (variable length); application data (variable length).]
URG: urgent data (generally not used); ACK: ACK # valid; PSH: push data now (generally not used); RST, SYN, FIN: connection estab (setup, teardown commands)
sequence/acknowledgement numbers: counting by bytes of data (not segments!)
receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
20

TCP seq. numbers, ACKs
sequence numbers: byte-stream "number" of first byte in segment's data
acknowledgements: seq # of next byte expected from other side; cumulative ACK
Q: how receiver handles out-of-order segments
A: TCP spec doesn't say, up to implementor; most will store, but still use cumulative ACK
[Figure: sender sequence number space - sent and ACKed | sent, not-yet ACKed ("in-flight") | usable but not yet sent | not usable; window size N. Each outgoing and incoming segment carries source/dest port #, sequence number, acknowledgement number, rwnd, checksum and urg pointer.]
[Figure: simple telnet scenario - the user at Host A types 'C': A sends Seq=4201, ACK=7901, data='C' (bytes 4201-4300); Host B ACKs receipt of 'C' and echoes it back with Seq=7901, ACK=4301, data='C' (bytes 7901-8000); Host A ACKs receipt of the echoed 'C' with Seq=4301, ACK=8001, no data.]
22
Note: assume the client and server byte streams start at sequence numbers 4201 and 7901; telnet sends each typed character to the server, which echoes it back to the client.
22

TCP round trip time, timeout
Q: how to set TCP timeout value?
    longer than RTT, but RTT varies
    too short: premature timeout, unnecessary retransmissions
    too long: slow reaction to segment loss
Q: how to estimate RTT?
    SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions
    SampleRTT will vary, want estimated RTT "smoother": weighted average of several recent measurements, not just current SampleRTT
23
Note: one SampleRTT measurement is recorded at a time, generally upon ACK reception, so roughly one new value is recorded per RTT.
[Figure: Host A / Host B timeline illustrating one SampleRTT measurement.]
24
[Figure: retransmission case - first attempt, retransmission, ACK: which SampleRTT? None; retransmissions are ignored.]
25
TCP round trip time, timeout
EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT
    exponential weighted moving average
    influence of past sample decreases exponentially fast
    typical value: α = 0.125
[Figure: SampleRTT and EstimatedRTT vs. time, RTT from gaia.cs.umass.edu to fantasia.eurecom.fr.]
26
Note: the estimate relies on previous estimates and the newly recorded SampleRTT, as in the equation above.
26

TCP round trip time, timeout
timeout interval: EstimatedRTT plus "safety margin"
    large variation in EstimatedRTT -> larger safety margin
estimate SampleRTT deviation from EstimatedRTT:
    DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|
    (typically, β = 0.25)
TimeoutInterval = EstimatedRTT + 4 * DevRTT
    (estimated RTT plus "safety margin")
27
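The three formulas above map directly onto a few lines of code. A minimal sketch (mine, not from the slides) using the typical weights α = 0.125 and β = 0.25:

#include <math.h>

#define ALPHA 0.125   /* weight of the new SampleRTT in EstimatedRTT  */
#define BETA  0.25    /* weight of the new deviation sample in DevRTT */

static double estimated_rtt = 0.1;   /* seconds; some initial guess */
static double dev_rtt       = 0.05;

/* called once per ACK that yields a valid SampleRTT (retransmissions are ignored) */
double update_timeout(double sample_rtt) {
    dev_rtt       = (1 - BETA)  * dev_rtt + BETA * fabs(sample_rtt - estimated_rtt);
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt;
    return estimated_rtt + 4 * dev_rtt;   /* TimeoutInterval */
}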

Reliable Data Transfer in TCP

28

20min
28

TCP reliable data transfer
TCP creates rdt service on top of IP’s unreliable service
pipelined segments
cumulative acks
single retransmission timer
retransmissions triggered by:
timeout events
duplicate acks

let’s initially consider simplified TCP sender:
ignore duplicate acks
ignore flow control, congestion control
29

TCP sender events
data rcvd from app:
create segment with seq #
seq # is byte-stream number of first data byte in segment
start timer if not already running
think of timer as for oldest unacked segment
expiration interval: TimeOutInterval
timeout:
retransmit segment that caused timeout
restart timer
ack rcvd:
if ack acknowledges previously unacked segments
update what is known to be ACKed
start timer if there are still unacked segments

30

TCP sender (simplified)
initially: NextSeqNum = InitialSeqNum; SendBase = InitialSeqNum
(wait for event)
data received from application above:
    create segment, seq. #: NextSeqNum
    pass segment to IP (i.e., "send")
    NextSeqNum = NextSeqNum + length(data)
    if (timer currently not running)
        start timer
timeout:
    retransmit not-yet-acked segment with smallest seq. #
    start timer
ACK received, with ACK field value y:
    if (y > SendBase) {
        SendBase = y
        /* SendBase-1: last cumulatively ACKed byte */
        if (there are currently not-yet-acked segments)
            start timer
        else stop timer
    }
31

TCP: retransmission scenarios
[Figure: lost ACK scenario - Host A sends Seq=92, 8 bytes of data; Host B's ACK=100 is lost; A times out and retransmits Seq=92, 8 bytes; B re-sends ACK=100 and SendBase advances from 92 to 100.]
[Figure: premature timeout - A sends Seq=92 (8 bytes) and Seq=100 (20 bytes); the ACKs (ACK=100, ACK=120) arrive after the timeout, so A retransmits Seq=92; B replies with the cumulative ACK=120; SendBase moves 92 -> 100 -> 120.]
32

TCP: retransmission scenarios
[Figure: cumulative ACK - A sends Seq=92 (8 bytes) and Seq=100 (20 bytes); ACK=100 is lost, but ACK=120 arrives before the timeout and cumulatively acknowledges both segments, so A does not retransmit and continues with Seq=120, 15 bytes of data.]
33

TCP ACK generation [RFC 1122, RFC 2581]
event at receiver -> TCP receiver action:
arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed -> delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
arrival of in-order segment with expected seq #; one other segment has ACK pending -> immediately send single cumulative ACK, ACKing both in-order segments
arrival of out-of-order segment with higher-than-expected seq. #; gap detected -> immediately send duplicate ACK, indicating seq. # of next expected byte
arrival of segment that partially or completely fills gap -> immediately send ACK, provided that segment starts at lower end of gap
34

A duplicate ACK is an ACK that re-acknowledges a segment already ACKed.
34
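These four rules reduce to a small decision routine. A hedged sketch (mine; simplified: byte counters never wrap, and a gap-filling segment is assumed to make the buffered data contiguous up to new_rcv_nxt); send_ack and schedule_delayed_ack are assumed helpers.

#include <stdbool.h>

/* assumed helpers */
extern void send_ack(unsigned int acknum);                      /* emit ACK now      */
extern void schedule_delayed_ack(unsigned int acknum, int ms);  /* delayed-ACK timer */

static unsigned int rcv_nxt;   /* next expected byte (the cumulative ACK value)       */
static bool ack_pending;       /* an in-order segment is still waiting for its ACK    */

void on_segment(unsigned int seg_seq, unsigned int seg_len,
                bool out_of_order_data_buffered, unsigned int new_rcv_nxt) {
    if (seg_seq == rcv_nxt) {                      /* segment starts at next expected byte */
        if (out_of_order_data_buffered) {          /* rule 4: fills (part of) the gap      */
            rcv_nxt = new_rcv_nxt;                 /* end of the now-contiguous data       */
            send_ack(rcv_nxt);
        } else if (ack_pending) {                  /* rule 2: another in-order ACK pending */
            rcv_nxt = seg_seq + seg_len;
            send_ack(rcv_nxt);                     /* one cumulative ACK covers both       */
            ack_pending = false;
        } else {                                   /* rule 1: delayed ACK, up to 500 ms    */
            rcv_nxt = seg_seq + seg_len;
            schedule_delayed_ack(rcv_nxt, 500);
            ack_pending = true;
        }
    } else if (seg_seq > rcv_nxt) {                /* rule 3: out-of-order, gap detected   */
        send_ack(rcv_nxt);                         /* immediate duplicate ACK              */
    }
}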

Too many ACKs

Host B
Host A

TCP ACK
35

cumulative ACK

Host B
Host A

35

TCP ACK
36

Host B
Host A

Need to ack this one?

Host B
Host A

Yes

36

TCP fast retransmit
time-out period often relatively long:
long delay before resending lost packet
detect lost segments via duplicate ACKs.
sender often sends many segments back-to-back
if segment is lost, there will likely be many duplicate ACKs.

37
if sender receives 3 duplicate ACKs for same data
(“triple duplicate ACKs”), resend unacked segment with smallest seq #
likely that unacked segment lost, so don’t wait for timeout

TCP fast retransmit

37
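A minimal sketch (mine, not the slides' code) of the fast-retransmit trigger: count ACKs that merely re-acknowledge SendBase and retransmit the oldest unACKed segment on the third duplicate, without waiting for the timer. retransmit_segment is an assumed helper.

/* assumed helper: retransmit the segment starting at sequence number `seq` */
extern void retransmit_segment(unsigned int seq);

static unsigned int send_base;   /* oldest unACKed byte */
static int dup_ack_count;

void on_ack(unsigned int y) {
    if (y > send_base) {           /* new data ACKed */
        send_base = y;
        dup_ack_count = 0;
    } else {                       /* duplicate ACK: re-acknowledges already-ACKed data */
        dup_ack_count++;
        if (dup_ack_count == 3)    /* triple duplicate ACK -> fast retransmit */
            retransmit_segment(send_base);
    }
}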

TCP fast retransmit
[Figure: fast retransmit after sender receipt of triple duplicate ACK - Host A sends Seq=92 (8 bytes) and Seq=100 (20 bytes); the Seq=100 segment is lost. Host B answers each subsequent arrival with ACK=100; after the third duplicate ACK=100, A retransmits Seq=100, 20 bytes of data before the timeout expires.]
38

Flow Control in TCP

39

20min
39

TCP flow control
flow control: receiver controls sender, so sender won't overflow receiver's buffer by transmitting too much, too fast
[Figure: receiver protocol stack - application process on top of the OS's TCP code and IP code, with TCP socket receiver buffers in between; segments arrive from the sender. The application may remove data from the TCP socket buffers more slowly than the TCP receiver is delivering it (i.e., than the sender is sending).]
40

TCP flow control
receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
RcvBuffer size set via socket options (typical default is 4096 bytes)
many operating systems autoadjust RcvBuffer
sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
guarantees receive buffer will not overflow
41

[Figure: receiver-side buffering - RcvBuffer holds buffered data plus free buffer space (rwnd); TCP segment payloads arrive from the network, and data is read out to the application process.]
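A small sketch (mine) of the bookkeeping implied by the figure: the receiver advertises the spare room in its buffer, and the sender keeps its in-flight data within that value. The variable names follow the slides (RcvBuffer, rwnd, in-flight data); the function names are assumptions.

#include <stdbool.h>

/* receiver side: advertised window = spare room in the receive buffer */
unsigned int advertised_rwnd(unsigned int rcv_buffer,
                             unsigned int last_byte_rcvd,
                             unsigned int last_byte_read) {
    return rcv_buffer - (last_byte_rcvd - last_byte_read);
}

/* sender side: may we put `len` more bytes in flight? */
bool may_send(unsigned int last_byte_sent, unsigned int last_byte_acked,
              unsigned int rwnd, unsigned int len) {
    /* keep unACKed ("in-flight") data within the receiver's advertised window */
    return (last_byte_sent - last_byte_acked) + len <= rwnd;
}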

Connection Management in TCP

42

20min
42

Connection Management
before exchanging data, sender/receiver “handshake”:
agree to establish connection (each knowing the other willing to establish connection)
agree on connection parameters
43

[Figure: two end systems, each running an application over the network. After the handshake, both are in connection state ESTAB and hold the connection variables: the client-to-server and server-to-client seq #s, and the rcvBuffer size at server and client.]
Buffers and variables are allocated upon reception of the SYN (at the server) or of the SYNACK (at the client).
43

Agreeing to establish a connection
Q: will 2-way handshake always work in network?
variable delays
retransmitted messages (e.g. req_conn(x)) due to message loss
message reordering
can’t “see” other side
44
2-way handshake:
[Figure: client chooses x and sends req_conn(x); server replies acc_conn(x); both sides enter ESTAB ("Let's talk" / "OK").]

2-way handshake failure scenarios:
[Figure: half-open connection - the client's req_conn(x) is retransmitted; the original connection completes, the client terminates and the server forgets x; when the retransmitted req_conn(x) arrives, the server enters ESTAB again, leaving a half-open connection with no client.]
[Figure: duplicate data accepted - both req_conn(x) and data(x+1) are retransmitted; after connection x completes and the server forgets x, the delayed duplicates arrive, a new connection is established, and data(x+1) is wrongly accepted a second time.]
Agreeing to establish a connection
45

TCP 3-way handshake
client state: LISTEN -> SYNSENT -> ESTAB; server state: LISTEN -> SYN RCVD -> ESTAB
1. client chooses init seq num x and sends a TCP SYN msg: SYNbit=1, Seq=x
2. server chooses init seq num y and sends a TCP SYNACK msg, acking the SYN: SYNbit=1, Seq=y, ACKbit=1, ACKnum=x+1
3. client receives SYNACK(x): indicates server is live; sends ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data
4. server receives ACK(y): indicates client is live
46
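Applications never build the SYN/SYNACK/ACK segments themselves: the OS performs the 3-way handshake when a client calls connect() (and the server's accept() returns once the handshake completes). A minimal POSIX C sketch; the server address 203.0.113.10:80 is purely illustrative.

#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* TCP socket */
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in srv;
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_port = htons(80);                            /* example port   */
    inet_pton(AF_INET, "203.0.113.10", &srv.sin_addr);   /* example server */

    /* connect() triggers the kernel's 3-way handshake (SYN, SYNACK, ACK)
       and returns once the connection reaches ESTAB. */
    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }
    printf("connection established\n");
    close(fd);   /* sends FIN: starts the teardown shown on the following slides */
    return 0;
}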

TCP: closing a connection
client, server each closes their side of connection
send TCP segment with FIN bit = 1
respond to received FIN with ACK
47

TCP: closing a connection
[Figure: client and server teardown, both starting in ESTAB.]
client: clientSocket.close() - sends FINbit=1, seq=x; enters FIN_WAIT_1 (can no longer send, but can still receive data)
server: ACKs the FIN (ACKbit=1, ACKnum=x+1); enters CLOSE_WAIT (can still send data); the client moves to FIN_WAIT_2 and waits for the server's close
server: sends its own FIN (FINbit=1, seq=y); can no longer send data; enters LAST_ACK
client: ACKs it (ACKbit=1, ACKnum=y+1); enters TIMED_WAIT (timed wait for 2 * max segment lifetime), then CLOSED
server: on receiving that ACK, enters CLOSED
48

TCP segment structure
[Figure: 32-bit-wide segment layout]
source port #, dest port #
sequence number, acknowledgement number: counting by bytes of data (not segments!)
head len, not used, flag bits:
    URG: urgent data (generally not used)
    ACK: ACK # valid
    PSH: push data now (generally not used)
    RST, SYN, FIN: connection estab (setup, teardown commands)
receive window: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
Urg data pointer
options (variable length)
application data (variable length)
49

Principles of Congestion Control

50

20min
50

Principles of congestion control
congestion:
informally: “too many sources sending too much data too fast for network to handle”
different from flow control!
manifestations:
lost packets (buffer overflow at routers)
long delays (queueing in router buffers)
a top-10 problem!

51

Congestion control is between network and sender, whereas flow control is between rcvr and sender.
51

Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
output link capacity: R
no retransmission

maximum per-connection throughput: R/2
52

[Figure: Host A and a second sender each offer original data at rate λin into one router with unlimited shared output link buffers; Host B and the other receiver see throughput λout.]
[Plots: throughput λout grows with λin but saturates at R/2; queueing delay grows without bound - large delays as arrival rate λin approaches capacity.]

Causes/costs of congestion: scenario 2
one router, finite buffers
sender retransmission of timed-out packet
application-layer input = application-layer output: λin = λout (goodput)
transport-layer input includes retransmissions: λ'in ≥ λin

53

[Figure: Host A offers original data at rate λin; the transport layer sends λ'in (original plus retransmitted data) into a router with finite shared output link buffers; Host B receives throughput λout.]

Causes/costs of congestion: scenario 2
idealization: perfect knowledge
sender sends only when router buffers available

54

[Figure: with perfect knowledge the sender transmits a copy only when there is free buffer space, so nothing is lost or retransmitted; throughput λout tracks the offered load up to R/2.]

Causes/costs of congestion: scenario 2
Idealization: known loss packets can be lost, dropped at router due to full buffers
sender only resends if packet known to be lost

55

[Figure: a copy of the packet is retransmitted only when the router has no buffer space and the original is known to be lost.]

Causes/costs of congestion: scenario 2
Idealization: known loss packets can be lost, dropped at router due to full buffers
sender only resends if packet known to be lost

56

[Figure: goodput λout vs. offered load λin - when sending at R/2, some packets are retransmissions, but asymptotic goodput is still R/2.]

Causes/costs of congestion: scenario 2
Realistic: duplicates
packets can be lost, dropped at router due to full buffers
sender times out prematurely, sending two copies, both of which are delivered
[Figure: λout vs. λin - when sending at R/2, some packets are retransmissions, including duplicates that are delivered!]
57
Goodput is the useful output.
57

Causes/costs of congestion: scenario 2
[Figure: the same λout vs. λin plot - when sending at R/2, some packets are retransmissions, including duplicates that are delivered.]
"costs" of congestion:
    more work (retransmissions) for a given "goodput"
    unneeded retransmissions: link carries multiple copies of a pkt, decreasing goodput
58
Goodput is the useful output.
58

Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit

59

Q: what happens as λ'in increases?
[Figure: four senders (Hosts A-D) over multihop paths with timeout/retransmit, sharing routers with finite output link buffers; red (Host A) and blue (Host B) traffic compete at the upper router queue.]
A: as red λ'in increases, all arriving blue pkts at the upper queue are dropped, and blue throughput goes to 0
another "cost" of congestion:
    when a packet is dropped, any upstream transmission capacity used for that packet was wasted!
[Plot: λout vs. λ'in - throughput collapses as the offered load increases.]
Causes/costs of congestion: scenario 3
60

two broad approaches towards congestion control:

Approaches towards congestion control
end-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP
network-assisted
congestion control:
routers provide feedback to end systems
single bit indicating congestion
explicit rate for sender to send at
61

TCP Congestion Control

62

20min
62

Additive increase multiplicative decrease (AIMD)
approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
additive increase: increase cwnd by 1 MSS (maximum segment size) every RTT until loss detected
multiplicative decrease: cut cwnd in half after loss

[Figure: cwnd (TCP sender congestion window size) vs. time - AIMD "saw tooth" behavior, probing for bandwidth: additively increase window size ... until loss occurs (then cut window in half).]
TCP congestion control
63
(cwnd: congestion window; MSS: maximum segment size)
63
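A toy C simulation (mine, not from the slides) of the AIMD saw tooth just described: once per RTT the window grows by 1 MSS; whenever it exceeds an assumed available capacity of 40 segments, a loss is declared and the window is halved.

#include <stdio.h>

int main(void) {
    double cwnd = 10.0;            /* congestion window, in MSS (segments)     */
    const double capacity = 40.0;  /* assumed window size at which loss occurs */

    for (int rtt = 0; rtt < 60; rtt++) {
        printf("RTT %2d: cwnd = %.1f MSS\n", rtt, cwnd);
        if (cwnd > capacity)
            cwnd /= 2.0;           /* multiplicative decrease after loss */
        else
            cwnd += 1.0;           /* additive increase: +1 MSS per RTT  */
    }
    return 0;
}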

TCP Congestion Control: details
sender limits transmission: LastByteSent - LastByteAcked ≤ cwnd
cwnd is dynamic, function of perceived network congestion
TCP sending rate:
    roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes
    rate ≈ cwnd / RTT bytes/sec
[Figure: sender sequence number space - last byte ACKed | sent, not-yet ACKed ("in-flight") | last byte sent; the in-flight span is limited to cwnd.]
64

TCP Slow Start
when connection begins, increase rate exponentially:
    initially cwnd = 1 MSS
    double cwnd every RTT
    done by incrementing cwnd for every ACK received
summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A sends one segment, then two, then four, doubling each RTT.]
Q: when should the exponential increase switch to linear (additive increase)?
65

TCP: switching from slow start to CA
Q: when should the exponential increase switch to linear?
A: when cwnd reaches ssthresh
Implementation:
    at the beginning, ssthresh is specified (it differs between TCP versions; in this example ssthresh = 8 segments)
    on a loss event, ssthresh is set to 1/2 of cwnd just before the loss event
[Figure: loss occurs at cwnd = 12 segments, so the new ssthresh = 6.]
(Problem 40, p. 322.)
66

TCP: detecting, reacting to loss
loss indicated by timeout:
    cwnd set to 1 MSS; window then grows exponentially (as in slow start) to ssthresh, then grows linearly
loss indicated by 3 duplicate ACKs:
    TCP Tahoe: same as a timeout - always sets cwnd to 1 MSS (timeout or 3 duplicate ACKs)
    TCP Reno: cwnd is cut in half, window then grows linearly (additive increase) - "fast recovery"
67

TCP: switching from slow start to CA
[Figure: cwnd vs. transmission round - slow start, then additive increase; on loss, multiplicative decrease, followed by slow start or additive increase again.]
(Problem 40, p. 322.)
68

Summary: TCP Reno Congestion Control
initialization: cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0
slow start / exponential increase:
    new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
    duplicate ACK: dupACKcount++
    cwnd > ssthresh: Λ (move to congestion avoidance)
    timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment (stay in slow start)
    dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3; retransmit missing segment (move to fast recovery)
congestion avoidance / additive increase:
    new ACK: cwnd = cwnd + MSS * (MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
    duplicate ACK: dupACKcount++
    timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment (back to slow start)
    dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3; retransmit missing segment (move to fast recovery)
fast recovery / multiplicative decrease:
    duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
    new ACK: cwnd = ssthresh; dupACKcount = 0 (move to congestion avoidance)
    timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment (back to slow start)
69
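The FSM above reduces to a compact event handler. A hedged C sketch (mine; simplified: cwnd and ssthresh in bytes, fixed MSS, no window inflation/deflation subtleties); retransmit_missing_segment is an assumed helper.

enum state { SLOW_START, CONG_AVOID, FAST_RECOVERY };

#define MSS 1460u

static enum state st = SLOW_START;
static unsigned int cwnd = MSS;             /* cwnd = 1 MSS     */
static unsigned int ssthresh = 64 * 1024;   /* ssthresh = 64 KB */
static int dup_ack_count = 0;

extern void retransmit_missing_segment(void);   /* assumed helper */

void on_new_ack(void) {
    if (st == SLOW_START) {
        cwnd += MSS;                        /* doubles cwnd every RTT        */
        if (cwnd > ssthresh) st = CONG_AVOID;
    } else if (st == CONG_AVOID) {
        cwnd += MSS * MSS / cwnd;           /* roughly +1 MSS per RTT        */
    } else {                                /* FAST_RECOVERY ends on new ACK */
        cwnd = ssthresh;
        st = CONG_AVOID;
    }
    dup_ack_count = 0;
}

void on_dup_ack(void) {
    if (st == FAST_RECOVERY) {
        cwnd += MSS;                        /* keep the pipe full            */
        return;
    }
    if (++dup_ack_count == 3) {             /* triple duplicate ACK          */
        ssthresh = cwnd / 2;
        cwnd = ssthresh + 3 * MSS;
        retransmit_missing_segment();
        st = FAST_RECOVERY;
    }
}

void on_timeout(void) {
    ssthresh = cwnd / 2;
    cwnd = MSS;
    dup_ack_count = 0;
    retransmit_missing_segment();
    st = SLOW_START;
}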

TCP Reno throughput
avg. TCP thruput as function of window size, RTT?
ignore slow start, assume always data to send
W: window size (measured in bytes) where loss occurs
avg. window size (# in-flight bytes) is ¾ W
avg. thruput is 3/4W per RTT
70

[Figure: cwnd oscillates between W/2 and W.]
avg TCP thruput = (3/4) * W / RTT  bytes/sec

TCP Futures: TCP over “long, fat pipes”
example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
requires W = 83,333 in-flight segments
throughput in terms of segment loss probability, L [Mathis 1997]:
    TCP throughput = 1.22 * MSS / (RTT * √L)
➜ to achieve 10 Gbps throughput, need a loss rate of L = 2·10^-10 - a very small loss rate!
new versions of TCP for high-speed:
    Vegas, Westwood, CUBIC, etc.
71

Using the previous equation, the average window must be about 83,333 segments - a lot of in-flight segments, which makes it very likely that at least one is lost.
Conclusion: the probability of loss has to be very low to achieve 10 Gbps, motivating research on TCP for high-speed networks (RFC 3649).
71
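A quick check of those numbers (a sketch, not from the slides): the in-flight window needed for 10 Gbps at RTT = 100 ms with 1500-byte segments, and the Mathis formula solved for the required loss rate L.

#include <stdio.h>

int main(void) {
    double mss_bits = 1500.0 * 8;   /* 1500-byte segments         */
    double rtt      = 0.100;        /* 100 ms                     */
    double target   = 10e9;         /* desired throughput, bits/s */

    /* window needed: target * RTT worth of data, expressed in segments */
    double w = target * rtt / mss_bits;                  /* ~83,333 segments */

    /* throughput = 1.22*MSS/(RTT*sqrt(L))  =>  L = (1.22*MSS/(RTT*throughput))^2 */
    double x = 1.22 * mss_bits / (rtt * target);
    double loss = x * x;                                 /* ~2e-10 */

    printf("in-flight window ~= %.0f segments\n", w);
    printf("required loss rate L ~= %.2e\n", loss);
    return 0;
}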

TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
72

TCP connection 1
bottleneck
router
capacity R

TCP connection 2

Why is TCP fair?
two competing sessions:
additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally
73

[Figure: phase plot of Session 1 data rate vs. Session 2 data rate, with the full-utilization line R and the "equal bandwidth share" line at 45°. Congestion avoidance (additive increase) moves the operating point parallel to the 45° line; a loss (decrease window by factor of 2) halves both rates, pulling the point toward the origin, so repeated cycles converge to the equal-share line.]
(Problem 50, p. 324.)
73
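A tiny C simulation (mine) of the argument in the figure: two AIMD flows share a link of capacity R, both add 1 unit per round, and both halve their rate when the sum exceeds R. The gap between the two rates is halved at every loss event, so the flows converge to an equal share regardless of where they start.

#include <stdio.h>

int main(void) {
    const double R = 100.0;          /* bottleneck capacity    */
    double x1 = 10.0, x2 = 70.0;     /* unequal starting rates */

    for (int round = 0; round < 200; round++) {
        if (x1 + x2 > R) {           /* loss: multiplicative decrease for both */
            x1 /= 2.0;
            x2 /= 2.0;
        } else {                     /* additive increase for both */
            x1 += 1.0;
            x2 += 1.0;
        }
    }
    printf("after 200 rounds: x1 = %.1f, x2 = %.1f (gap %.2f, down from 60)\n",
           x1, x2, x2 - x1);
    return 0;
}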

Fairness (more)
Fairness and UDP
multimedia apps often do not use TCP
do not want rate throttled by congestion control
instead use UDP:
send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections
application can open multiple parallel connections between two hosts
e.g., link of rate R
App 1 asks for 1 TCP, gets 0.1R
App 2 asks for 9 TCPs, gets 0.9R

74

74

Final word
Window size = min (rwnd, cwnd)

75
receive window

congestion window

flow control
congestion control

75

Conclusion
principles behind transport layer services:
multiplexing, demultiplexing
reliable data transfer
connection setup, teardown
flow control
congestion control
instantiation, implementation in the Internet
UDP
TCP
76


