COMP30023 – Computer Systems
TCP Congestion Control
Dr Lachlan Andrew 2021
© University of Melbourne
Acknowledgement
• The slides are based on slide prepared by Chris Culnane based on material developed previously by: Michael Kirley, Zoltan Somogyi, Rao Kotagiri, James Bailey and Chris Leckie.
• Some of the images included in the notes were supplied as part of the teaching resources accompanying the text books listed in lecture 1.
© 2021 University of Melbourne 2
Recap
• Socket Programming – SocketInterface
– SocketsinC
• TCP flow control
© 2021 University of Melbourne 3
TCP Sliding Window – Segment Loss
71
81
91
101
111
121
131
141
1
11
21
31
41
51
61
1
11
© 2021 University of Melbourne 4
TCP Sliding Window – Segment Loss
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
1
11
31
© 2021 University of Melbourne 5
TCP Sliding Window – Segment Loss
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:21, Window:50
1
11
31
© 2021 University of Melbourne 6
TCP Sliding Window – Segment Loss
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:21, Window:50
1
11
31
41
© 2021 University of Melbourne 7
TCP Sliding Window – Segment Loss
ACK + 3 DupACKs = Fast Retransmission
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:21, Window:50
1
11
31
41
51
© 2021 University of Melbourne 8
Original flow + loss control
• Flow+loss control had existed on single point-to-point links for a long time
• TCP originally used the experience from the link layer
• Caused bad design decision.
© 2021 University of Melbourne 9
Two types of flow control
• “Go-back-N” (3.4.3 of Kurose and Ross)
– When a packet is lost, go back to the point where it was
transmitted, and (re)transmit everything from that point onward
– Advantage:Receiverdoesn’tneedtostore/reorderpackets
• “Selective Repeat” (3.4.4 of Kurose and Ross)
– Onlyretransmitthelostpacket
– Packets arrive out of order. Receiver must store out-of-order packets to send them in-order to the application
• Selective repeat: more complex, only helps if loss is common
– Link layer fixes errors. Errors will be rare – right?
© 2021 University of Melbourne 10
Congestion collapse
• In the late 1980s, the internet had “congestion collapse” – It took tens of minutes to send a packet to the next building
• Van Jacobson diagnosed and solved the problem
• Router buffers were overflowing, causing high loss
• Senders were doing go-back-N, so that every packet loss caused N more packets to enter the system
• Solution: Selective repeat (“fast retransmit”) – “Packetconservation”principle
© 2021 University of Melbourne 11
TCP Sliding Window – Segment Loss
ACK + 3 DupACKs = Fast Retransmission
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:21, Window:50
1
11
31
41
51
© 2021 University of Melbourne 12
TCP Sliding Window – Fast retransmit
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
1
11
31
41
51
© 2021 University of Melbourne 13
TCP Sliding Window – Fast retransmit
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:21, Window:50
1
11
31
41
51
61
© 2021 University of Melbourne 14
TCP Sliding Window – Fast retransmit
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:71, Window:0
1
11
21
31
41
51
61
© 2021 University of Melbourne 15
TCP Sliding Window
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
Potential for deadlock
• Sender won’t send any more data (window size is 0)
• Receiver won’t receive anything, so won’t send any ACKs to increase window size
1
11
21
31
41
51
61
© 2021 University of Melbourne 16
TCP Sliding Window – Avoid Deadlock
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
1
11
21
31
Data read by application
41
51
61
© 2021 University of Melbourne 17
TCP Sliding Window – Window Update
Receiver can initiate an update by sending a WindowUpdate
WIndowUpdate, ACK:71, Window:50
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
1
11
21
31
41
51
61
© 2021 University of Melbourne 18
TCP Sliding Window – Window Update
121
131
141
1
11
21
31
41
51
61
71
81
91
101
111
1
11
21
31
41
51
61
© 2021 University of Melbourne 19
TCP Sliding Window – Persist Timer
Sender can monitor and initiate an update by sending a ZeroWindowProbe
On receipt of 0 size window, sender:
• Starts Persist Timer
• Persist Timer will periodically send
ZeroWindowProbe segments
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ACK:71, Window:0
1
11
21
31
41
51
61
© 2021 University of Melbourne 20
TCP Sliding Window
Sender can initiate an update by sending a ZeroWindowProbe
ZeroWindowProbe
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
1
11
21
31
41
51
61
© 2021 University of Melbourne 21
TCP Sliding Window
1
11
21
31
41
51
61
71
81
91
101
111
121
131
141
ZeroWindowProbeACK, ACK:71, Window:50
1
11
21
31
41
51
61
© 2021 University of Melbourne 22
TCP Sliding Window
121
131
141
1
11
21
31
41
51
61
71
81
91
101
111
1
11
21
31
41
51
61
© 2021 University of Melbourne 23
TCP Congestion Control
• When networks are overloaded, congestion occurs, potentially affecting all layers
• Although lower layers (link and network) attempt to ameliorate congestion, in reality TCP affects congestion most significantly because TCP offers methods to transparently reduce the data rate, and hence reduce congestion itself
– Real-time control protocol also helps by changing video compression • Original TCP (before Jacobson)
– Initially, the receiver chooses a window based on its buffer size
– if the sender is constrained to this size, then congestion problems will not occur due to buffer overflow at the receiver itself, but may still occur due to congestion within the network
© 2021 University of Melbourne 24
Congestion Control Window
• Jacobson introduce CWND, the congestion window, in the same software update that introduce selective repeat.
• Additional window that is dynamically adjusted based on network performance to aid efficient transfer
• The congestion window size is maintained by the sender, unlike the sliding window that is controlled by the receiver
– Alsoonlyusedatthesender.
– Nochangestopacketformatstosendadditionalfield
• Was a big consideration. Now a huge one.
© 2021 University of Melbourne 25
Incremental congestion control
• At first, CWND2*maximum segment size – Sendertransmits2segments
• If this segment is acknowledged, CWND++ – Sendertransmits3segments
• As each new segment is acknowledged, the congestion window is increased by one maximum segment size
• Each full window of acknowledgements doubles the congestion window – which grows until
– a timeout
– It reaches a threshold, SSthresh
© 2021 University of Melbourne 26
Incremental congestion control
• Slow-start algorithm – Initialrateisslow
– Rategrowsexponentiallyfromthere
© 2021 University of Melbourne 27
Incremental congestion control
• Eventually too many segments would be placed onto the network causing congestion and timeouts
• Slow-start maintains a threshold: Slow Start Threshold ssthresh
• Slow start keeps increasing the size of the window until a timeout
occurs or the threshold is reached
• If segment loss occurs the ssthresh is set to half the current congestion window size, and the process start again
• Once the threshold is reached the growth is slowed to linear, by adding 1 MSS to the congestion window for each successful ACK, known as additive increase
• Can also react to known lost segments via fast retransmission
• Known as TCP Tahoe (1988) after BSD release name
© 2021 University of Melbourne 28
Incremental congestion control – an example
© 2021 University of Melbourne 29
Congestion control optimisation
• Further optimisations have been made
– Fastrecoveryusingduplicateacknowledgecounts
– Startsfromnewssthreshinsteadoforiginalstartingvalue, effectively avoiding the slow start phase and going straight to additive increase
© 2021 University of Melbourne 30
Further Optimisations
• SACK (Selective Acknowledgements) – provides greater ability to track segments in-flight, by allowing up to 3 ranges of bytes received to be specified
• ECN Explicit Congestion Notification (ECE & CWR bits set during SYN)
– Allows IP Layer to indicate congestion is occurring without dropping the
segment by setting an ECN flag
– Receiver indicates this to sender via ECE (ECN Echo) flag
– Sender acknowledges this by setting the Congestion Window Reduced Flag (CWR), reacts as if a segment has been lost, without actually having lost it
© 2021 University of Melbourne 31
Macroscopic model
• These packet-level rules affect many things we care about
– Fairnessbetweenflows
– Responsetolonground-triptimes(RTTs) – Responsetorandompacketloss
• It is useful to have an algebraic expression relating these quantities
© 2021 University of Melbourne 32
Window size
• W increases once per window
– Approximation:EachpacketthatarrivesincreasesWby1/W
• When a loss occurs, W is halved – With probability p, WW/2
(1−𝑝𝑝) 𝑊𝑊 •−𝑝𝑝
• The average increase in window size is
𝑊𝑊2
• To be in equilibrium (balance), the average increase must be 0 • 𝑊𝑊≈ 2/𝑝𝑝
© 2021 University of Melbourne 33
Rate, fairness
• The window is sent ≤ once per round trip time (RTT), T • Rateis𝑊𝑊≈1 2/𝑝𝑝
1. For a given packet loss rate, longer RTTs get less rate
𝑇𝑇𝑇𝑇
2.
• •
– “RTTunfairness”
If RTT is small, TCP forces the packet loss rate to be high
This formula is very approximate
– WindowonlyrespondstoonepacketlossperRTT
– Packet losses are clustered
However, it gives two important insights
© 2021 University of Melbourne 34
And finally…
• The first “worm” malware released on the internet was the Morris Worm in 1988
• It was estimated to have done $100,000 to $10,000,000 damage, despite being intended to harmlessly demonstrate vulnerabilities, because it spread faster than intended
• Its inventor, Robert Morris, went on to do a PhD in TCP congestion control…
© 2021 University of Melbourne 35