Networking CS 161 Textbook Computer Security
An online version is available at https://textbook.cs161.org. Textbook by , , ,
Additional contributions by , , and Shomil Jain Last update: August 26, 2021
Contact for corrections:
25 Introduction to Networking
To discuss network security, first we need to know how the network is designed. This section provides a (simplified) overview of the various Internet layers and how they interact. A video version of this section is available: see Lecture 11, Summer 2020.
25.1 Local Area Networks
The primary goal of the Internet is to move data from one location to another. A good analogy for the Internet is the postal system, which we’ll refer to throughout this section.
The first building block we need is something that moves data across space, such as bits on a wire, radio waves, carrier pigeons, etc. Using our first building block, we can connect a group of local machines in a local area network (LAN).
Figure 1: Computers connected in a local area network (LAN).
Note that in a LAN, all machines are connected to all other machines. This allows any machine on the LAN to send and receive messages from any other machine on the same LAN. You can think of a LAN as an apartment complex, a local group of nearby apartments that are all connected. However, it would be infeasible to connect every machine in the world to every other machine in the world, so we introduce a router to connect multiple LANs.
CS 161 Notes 1 of 75
Figure 2: Two LANs connected through a router.
A router is a machine that is connected to two or more LANs. If a machine wants to send a message to a machine on a different LAN, it sends the message to the router, which forwards the message to the second LAN. You can think of a router as a post office: to send a message somewhere outside of your local apartment complex, you’d take it to the post office, and they would forward your message to the other apartment complex.
With enough routers and LANs, we can connect the entire world in a wide area network, which forms the basis of the Internet.
25.2 Internet layering
You may have noticed that this design uses layers of abstraction to build the Internet. The lowest layer (layer 1, also called the physical layer) moves bits across space. Then, layer 2 (the link layer) uses layer 1 as a building block to connect local machines in a LAN. Finally, layer 3 (the internetwork layer) connects many layer 2 LANs. Each layer relies on services from a lower layer and provides services to a higher layer. Higher layers contain richer information, while lower layers provide the support necessary to send the richer information at the higher layers.
This design provides a clean abstraction barrier for implementation. For example, a network can choose to use wired or wireless communication at Layer 1, and the Layer 1 implementation does not affect any protocols at the other layers.
In total, there are 7 layers of the Internet, as defined by the OSI 7-layer model. However, this model is a little outdated, so some layers are obsolete, and additional layers for security have been added since then. We will see these higher layers later.
Layer  Name
7      Application
6.5    Secure Transport
6      (obsolete)
5      (obsolete)
4      Transport
3      (Inter)Network
2      Link
1      Physical
Figure 3: The OSI 7-layer model.
25.3 Protocols and Headers
Each layer has its own set of protocols, a set of agreements on how to communicate. Each protocol specifies how communication is structured (e.g. message format), how machines should behave while communicating (e.g. what actions are needed to send and receive messages), and how errors should be handled (e.g. a message timing out).
To support protocols, messages are sent with a header, which is placed at the beginning of the message and contains some metadata such as the sender and recipient’s identities, the length of the message, identification numbers, etc. You can think of headers as the envelope of a letter: it contains the information needed to deliver the letter, and appears before the actual letter.
Figure 4: Multiple headers on a single packet.
Because multiple protocols across different layers are needed to send a message, we need multiple headers on each packet. Each message begins as regular human-readable text (the highest layer). As the message is being prepared to get sent, it is passed down the protocol stack to lower layers (similar to how C programs are passed to lower layers to translate C code to RISC-V to machine-readable bits). Each layer adds its own header to the top of the message provided from the layer directly above. When the message reaches the lowest layer,
it now has multiple headers, starting with the header for the lowest layer first.
Once the message reaches its destination, the recipient must unpack the message and decode it back into human-readable text. Starting at the lowest layer, the message moves up the protocol stack to higher layers. Each layer removes its header and provides the remaining content to the layer directly above. When the message reaches the highest layer, all headers have been processed, and the recipient sees the regular human-readable text from before.
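The wrapping and unwrapping of headers described above can be sketched as a toy model. The layer names and the header strings here are illustrative placeholders chosen for this example, not real protocol formats.

```python
# Toy model of header encapsulation. The layer list and the "<layer>-header"
# strings are hypothetical placeholders, not real header formats.
LAYERS_TOP_DOWN = ["transport", "network", "link"]

def encapsulate(message: str) -> list:
    """Pass a message down the stack; each layer prepends its own header."""
    packet = [message]
    for layer in LAYERS_TOP_DOWN:
        packet.insert(0, f"{layer}-header")
    return packet

def decapsulate(packet: list) -> str:
    """Pass a packet up the stack; each layer strips the header it expects."""
    remaining = list(packet)
    for layer in reversed(LAYERS_TOP_DOWN):
        expected = f"{layer}-header"
        if remaining[0] != expected:
            raise ValueError(f"expected {expected}, got {remaining[0]}")
        remaining.pop(0)
    return remaining[0]

packet = encapsulate("hello")
recovered = decapsulate(packet)
```

In this sketch the header of the lowest layer ends up first in the packet, and the recipient removes headers in the opposite order from the one in which the sender added them.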
25.4 Addressing: MAC, IP, Ports
Depending on the layer, a machine can be referred to by several different addresses.
Layer 2 (link layer) uses 48-bit (6-byte) MAC addresses to uniquely identify each machine on the LAN. This is not to be confused with MACs (message authentication codes) from the crypto section. Usually it is clear from context which type of MAC we are referring to, although sometimes cryptographic MACs are called MICs (message integrity codes) when discussing networking. MAC addresses are usually written as 6 pairs of hex numbers, such as ca:fe:f0:0d:be:ef. There is also a special MAC address, the broadcast address of ff:ff:ff:ff:ff:ff, that says “send this message to everyone on the local network.” You can think of MAC addresses as apartment numbers: they are used to uniquely identify people within one apartment complex, but are useless for uniquely identifying one person in the world. (Imagine sending a letter addressed to “Apartment 5.” This might work if you’re delivering letters within your own apartment complex, but how many Apartment 5s exist in the entire world?)
Layer 3 (IP layer) uses 32-bit (4-byte) IP addresses to uniquely identify each machine globally. IP addresses are usually written as 4 integers between 0 and 255, such as 128.32.131.10. Because the Internet has grown so quickly, the most recent version of the layer 3 protocol, IPv6, uses 128-bit IP addresses, which are written as 8 2-byte hex values separated by colons, such as cafe:f00d:d00d:1401:2414:1248:1281:8712. However, for this class, you only need to know about IPv4, which uses 32-bit IP addresses.
Higher layers are designed to allow each machine to have multiple processes communicating across the network. For example, your computer only has one IP address, but it may have multiple browser tabs and applications open that all want to communicate over the network. To distinguish each process, higher layers assign each process on a machine a unique 16-bit port number. You can think of port numbers as room numbers: they are used to uniquely identify one person in a building.
The source and destination addresses are contained in the header of a message. For example, the Layer 2 header contains MAC addresses, the Layer 3 header contains IP addresses, and higher layer headers will contain port numbers.
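As a quick check of the address sizes above, here is a small sketch that parses the example addresses from this section. The helper parse_mac is a hypothetical name written for this example; the IP parsing uses Python's standard ipaddress module.

```python
import ipaddress

def parse_mac(text: str) -> bytes:
    """Parse a colon-separated MAC address into its 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("a MAC address has exactly 6 hex pairs")
    return bytes(int(part, 16) for part in parts)

BROADCAST_MAC = parse_mac("ff:ff:ff:ff:ff:ff")

mac = parse_mac("ca:fe:f0:0d:be:ef")
ipv4 = ipaddress.ip_address("128.32.131.10")
ipv6 = ipaddress.ip_address("cafe:f00d:d00d:1401:2414:1248:1281:8712")

mac_bits = len(mac) * 8          # size of a layer 2 address in bits
ipv4_bits = ipv4.max_prefixlen   # size of an IPv4 address in bits
ipv6_bits = ipv6.max_prefixlen   # size of an IPv6 address in bits
max_port = 2**16 - 1             # largest value of a 16-bit port number
```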
25.5 Packets vs. Connections
Notice that in the postal system example, the post office has no idea if you and your pen pal are having a conversation through letters. The Internet is the same: at the physical, link,
and internetwork layers, there is no concept of a connection. A router only needs to consider each individual packet and send it toward its destination (or, in the case of a long-distance message, forward it to another router somewhere closer to the destination). At the lower layers, we call individual messages packets. Packets are usually limited to a fixed length.
In order to actually create a two-way connection, we rely on higher layers, which maintain a connection by breaking up longer messages into individual packets and sending them through the lower layer protocols. Higher-layer connections can also implement cryptographic protocols for additional security, as we’ll see in the TLS section.
Note that so far, the Internet design has not guaranteed any correctness or security. Packets can be corrupted in transit or even fail to send entirely. The IP (Internet Protocol) at layer 3 only guarantees best-effort delivery, and does not handle any errors. Instead, we rely on higher layers for correctness and security.
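To make the idea of breaking a longer message into packets concrete, here is a minimal sketch. The sequence numbers and the 8-byte payload limit are assumptions made for this example; real higher-layer protocols are covered in later sections.

```python
import random

def split_into_packets(message: bytes, max_payload: int) -> list:
    """Break a message into (sequence number, payload) packets of bounded size."""
    return [
        (seq, message[start:start + max_payload])
        for seq, start in enumerate(range(0, len(message), max_payload))
    ]

def reassemble(packets: list) -> bytes:
    """Sort packets by sequence number and join the payloads back together."""
    return b"".join(payload for _, payload in sorted(packets))

message = b"a long message that does not fit in one packet"
packets = split_into_packets(message, max_payload=8)

# Delivery is best-effort, so this sketch lets packets arrive in any order.
random.Random(0).shuffle(packets)
restored = reassemble(packets)
```

The sketch only handles reordering. Detecting lost or corrupted packets would need extra machinery that the lower layers do not provide.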
25.6 Network Adversaries
Network adversaries can be sorted into 3 general categories. They are, from weakest to strongest:
• Off-path Adversaries: cannot read or modify any packets sent over the connection.
• On-path Adversaries: can read, but not modify packets.
• In-path Adversaries: can read, modify, and block packets. Also known as a man-in-the-middle.
Note that all adversaries can send packets of their own, including faking or spoofing the packet headers to appear like the message is coming from somebody else. This is often as simple as setting the “source” field on the packet header to somebody else’s address.
26 Wired Local Networks: ARP
26.1 Cheat sheet
• Layer: Link (2)
• Purpose: Translate IP addresses to MAC addresses
• Vulnerability: On-path attackers can see requests and send spoofed malicious responses
• Defense: Switches, arpwatch
26.2 Networking background: Ethernet
Recall that on a LAN (local-area network), all machines are connected to all other machines. Ethernet is one particular LAN implementation that uses wires to connect all machines.
Ethernet started as a broadcast-only network. Each node on the network could see messages sent by all other nodes, either by being on a common wire or a network hub, a simple repeater that took every packet it received and rebroadcast it to all the outputs. A receiver is simply supposed to ignore all packets not sent to either the receiver’s MAC or the broadcast address. But this is only enforced in software, and most Ethernet devices can enter promiscuous mode, in which they receive all packets. Capturing traffic this way is also called sniffing packets.
For versions of Ethernet that are inherently broadcast, such as a hub, an adversary in the local network can see all network traffic and can also introduce any traffic they desire by simply sending packets with a spoofed MAC address. Sanity check: what type of adversary does this make someone on the same LAN as a victim? [1]
26.3 Protocol: ARP
ARP, the Address Resolution Protocol, translates Layer 3 IP addresses into Layer 2 MAC addresses.
Say Alice wants to send a message to Bob, and Alice knows that Bob’s IP address is 1.1.1.1. The ARP protocol would follow three steps:
1. Alice would broadcast to everyone else on the LAN: “What is the MAC address of 1.1.1.1?”
2. Bob responds by sending a message only to Alice: “My IP is 1.1.1.1 and my MAC address is ca:fe:f0:0d:be:ef.” Everyone else does nothing.
3. Alice caches the IP address to MAC address mapping for Bob.
If Bob is outside of the LAN, then the router would respond in step 2 with its MAC address.
Any received ARP replies are always cached, even if no broadcast request (step 1) was ever made.
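The three steps above can be sketched as a small simulation. The Host class, the function names, and the addresses for Alice and Carol are hypothetical and exist only for this example; Bob's addresses are the ones used in the text.

```python
class Host:
    def __init__(self, name, ip, mac):
        self.name, self.ip, self.mac = name, ip, mac
        self.arp_cache = {}  # IP address -> MAC address

    def on_request(self, target_ip):
        """Step 2: reply only if the request asks about our own IP."""
        if target_ip == self.ip:
            return (self.ip, self.mac)
        return None  # everyone else does nothing

    def on_reply(self, reply):
        """Step 3: cache the mapping, whether or not we asked for it."""
        ip, mac = reply
        self.arp_cache[ip] = mac

def arp_resolve(sender, lan, target_ip):
    """Step 1: broadcast the request to every other host on the LAN."""
    for host in lan:
        if host is sender:
            continue
        reply = host.on_request(target_ip)
        if reply is not None:
            sender.on_reply(reply)
    return sender.arp_cache.get(target_ip)

alice = Host("Alice", "1.1.1.2", "aa:aa:aa:aa:aa:aa")
bob = Host("Bob", "1.1.1.1", "ca:fe:f0:0d:be:ef")
carol = Host("Carol", "1.1.1.3", "cc:cc:cc:cc:cc:cc")
lan = [alice, bob, carol]

bob_mac = arp_resolve(alice, lan, "1.1.1.1")

# An unsolicited reply is cached too, even though Alice never asked.
alice.on_reply((carol.ip, carol.mac))
```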
[1] A: On-path
26.4 Attack: ARP Spoofing
Because there is no way to verify that the reply in step 2 is actually from Bob, it is easy to attack this protocol. If Mallory is able to create a spoofed reply and send it to Alice before Bob can send his legitimate reply, then she can convince Alice that a different MAC address (such as Mallory’s) corresponds to Bob’s IP address. Now, when Alice wants to send a local message to Bob, she will use the malicious cached IP address to MAC address mapping, which might map Bob’s IP address to Mallory’s MAC address. This will cause messages intended for Bob to be sent to Mallory. Sanity check: what type of adversary is Mallory after she executes an ARP spoof attack? [2]
ARP spoofing is our first example of a race condition, where the attacker’s response must arrive faster than the legitimate response to fool the victim. This is a common pattern for on-path attackers, who cannot block the legitimate response and thus must race to send their response first.
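The race can be illustrated with a toy model in which Alice uses whichever reply arrives first. The arrival times and Mallory's MAC address are made up for this example, and the first-reply-wins rule is a simplifying assumption about the victim's behavior.

```python
def first_reply_wins(replies):
    """Return the (ip, mac) mapping from the reply with the earliest arrival."""
    arrival_ms, ip, mac = min(replies)
    return ip, mac

BOB_IP = "1.1.1.1"
BOB_MAC = "ca:fe:f0:0d:be:ef"
MALLORY_MAC = "66:66:66:66:66:66"  # hypothetical attacker MAC

# Each reply is (arrival time in ms, claimed IP, claimed MAC).
replies_mallory_faster = [(2.0, BOB_IP, BOB_MAC), (0.5, BOB_IP, MALLORY_MAC)]
replies_bob_faster = [(0.3, BOB_IP, BOB_MAC), (0.5, BOB_IP, MALLORY_MAC)]

poisoned = first_reply_wins(replies_mallory_faster)
safe = first_reply_wins(replies_bob_faster)
```

In the first case Alice maps Bob's IP address to Mallory's MAC address; in the second, Bob's legitimate reply wins the race.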
26.5 Defenses: Switches
A simple defense against ARP spoofing is to use a tool like arpwatch, which tracks the IP address to MAC address pairings across the LAN and makes sure nothing suspicious happens.
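The idea of tracking pairings can be sketched as follows. This illustrates the general approach only; it is not arpwatch's actual code, and the class name and alert format are hypothetical.

```python
class PairingMonitor:
    """Remember the first MAC seen for each IP and flag any later change."""

    def __init__(self):
        self.known = {}   # IP address -> first MAC address seen for it
        self.alerts = []  # (ip, previously seen MAC, newly claimed MAC)

    def observe(self, ip, mac):
        previous = self.known.setdefault(ip, mac)
        if previous != mac:
            self.alerts.append((ip, previous, mac))

monitor = PairingMonitor()
monitor.observe("1.1.1.1", "ca:fe:f0:0d:be:ef")  # Bob's legitimate reply
monitor.observe("1.1.1.1", "ca:fe:f0:0d:be:ef")  # same pairing, no alert
monitor.observe("1.1.1.1", "66:66:66:66:66:66")  # a different MAC claims Bob's IP
```

A changed pairing is not always an attack (a machine might legitimately get a new network card), so a monitor like this can only raise an alert for a human to review.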
Modern wired Ethernet networks defend against ARP spoofing by using switches rather than hubs. Switches have a MAC cache, which keeps track of which MAC address is reachable through which physical port. If a packet’s destination MAC address is in the cache, the switch sends the packet only out of that port. Otherwise, it broadcasts the packet to everyone. Smarter switches can filter requests so that not every request is broadcast to everyone.
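Here is a minimal sketch of a switch's cache, assuming the cache records which port each MAC address was last seen on and is filled in by watching the source address of incoming packets. The class, the port numbering, and the short MAC strings are hypothetical.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class Switch:
    def __init__(self, num_ports):
        self.ports = list(range(num_ports))
        self.mac_cache = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the packet is sent out of."""
        self.mac_cache[src_mac] = in_port  # learn where the sender is
        out_port = self.mac_cache.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown destination: send to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        return [out_port]

switch = Switch(num_ports=3)
first = switch.forward(0, "aa:aa", "bb:bb")   # bb:bb not yet known
second = switch.forward(1, "bb:bb", "aa:aa")  # aa:aa was learned on port 0
third = switch.forward(0, "aa:aa", "bb:bb")   # bb:bb was learned on port 1
```

Once both machines have sent a packet, traffic between them no longer reaches the third port, so a machine there cannot simply sniff it.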
Higher-quality switches include VLANs (Virtual Local Area Networks), which implement isolation by breaking the network into separate virtual networks.
[2] A: Man-in-the-middle. She can receive messages from Alice, modify them, then send them to Bob.
27 Wireless Local Networks: WPA2
27.1 Cheat sheet
• Layer: Link (2)
• Purpose: Communicate securely in a wireless local network
• Vulnerability: On-path attackers can learn the encryption keys from the handshake and decrypt messages (includes brute-forcing the password if they don’t know it already)
• Defense: WPA2-Enterprise
27.2 Networking background: WiFi
Another implementation of the link layer is WiFi, which wirelessly connects machines in a LAN. Because it uses wireless connections rather than wires, WiFi has some differences from wired Ethernet, but these are out of scope for this class. For the purposes of this class, WiFi behaves mostly like Ethernet, with the same packet format and similar protocols like ARP for address translation.
To join a WiFi network, your computer establishes a connection to the network’s AP (Access Point). Generally the AP is continuously broadcasting beacon packets saying “I am here” and announcing the name of the network, also called the SSID (Service Set Identifier). When you choose to connect to a WiFi network (or if your computer is configured to automatically join a WiFi network), it will broadcast a request to join the network.
If the network is configured without a password, your computer immediately joins the network, and all data is transmitted without encryption. This means that anybody else on the same network can see your traffic and inject packets, like in ARP spoofing.
27.3 Protocol
WPA2-PSK (WiFi Protected Access: Pre-Shared Key) is a protocol that enables secure communications over a WiFi network by encrypting messages with cryptography.
In WPA2-PSK, a network has one password for all users (this is the WiFi password you ask your friends for). The access point derives a PSK (Pre-Shared Key) by applying a password-based key derivation function (PBKDF2-SHA1) on the SSID and the password. Recall from the cryptography unit that password-based key derivation functions are designed to be slower by a large constant factor to make brute-force attacks more difficult. Sanity check: Why might we choose to include the SSID as input to the key derivation function? [3]
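The PSK derivation can be sketched with Python's standard hashlib module. The iteration count (4096) and output length (32 bytes) are the parameters commonly cited for WPA2; this section does not state them, so treat them as assumptions. The password and network names are made up.

```python
import hashlib

def derive_psk(password: str, ssid: str) -> bytes:
    """PBKDF2-SHA1 with the password as the secret and the SSID as the salt."""
    # 4096 iterations and a 32-byte key are assumed WPA2 parameters.
    return hashlib.pbkdf2_hmac("sha1", password.encode(), ssid.encode(), 4096, 32)

psk_home = derive_psk("correct horse battery", "HomeNet")
psk_cafe = derive_psk("correct horse battery", "CafeNet")
psk_home_again = derive_psk("correct horse battery", "HomeNet")
```

Because the SSID is part of the input, the two networks end up with different PSKs even though they share a password, while the same network always derives the same PSK.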
When a computer (client) wants to connect to a network protected with WPA2-PSK, the user must first type in the WiFi password. Then, the client uses the same key derivation function to generate the PSK. Sanity check: Why can’t we be done here and use the PSK to encrypt all further communications? [4]
[3] By including the SSID, two different networks with the same password will still have different PSKs.
To give each user a unique encryption key, after both the client and the access point independently derive the PSK, they participate in a handshake to generate shared encryption keys.
Figure 5: The WPA2 handshake.
1. The client and the access point exchange random nonces, the ANonce and the SNonce. The nonces ensure that different keys will be generated during each handshake. The nonces are sent without any encryption.
2. The client and access point independently derive the PTK (Pairwise Transient Key) as a function of the two nonces, the PSK, and the MAC addresses of both the access point and the client.
3. The client and the access point exchange MICs (recall that these are MACs from the crypto unit) to check that no one tampered with the nonces, and that both sides correctly derived the PTK.
4. The access point encrypts the GTK (Group Temporal Key) and sends it to the client.
5. The client sends an ACK (acknowledgement message) to indicate that it successfully received the GTK.
[4] Because everyone on the network would use the same PSK, others on the same network could still decrypt your traffic.
Once the handshake is complete, all further communication between the client and the access point is encrypted with the PTK.
The GTK is used for messages broadcast to the entire network (i.e. sent to the broadcast MAC address, ff:ff:ff:ff:ff:ff). The GTK is the same for everyone on the network, so everyone can encrypt/send and decrypt/receive broadcast messages.
In practice, the handshake is optimized into a 4-way handshake, requiring only 4 messages to be exchanged between the client and the access point.
Figure 6: The optimized, 4-way WPA2 handshake.
1. The access point sends the ANonce, as before.
2. Once the client receives the ANonce, it has all the information needed to derive the PTK, so it derives the PTK first. Then it sends the SNonce and the MIC to the access point.
3. Once the access point receives the SNonce, it can derive the PTK as well. Then it sends the encrypted GTK and the MIC to the client.
4. The client sends an ACK to indicate that it successfully received the GTK, as before.
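The four messages above can be walked through in a simplified sketch. It does not reproduce the real WPA2 key-expansion function, MIC algorithm, or GTK encryption: derive_ptk, compute_mic, and toy_xor are stand-ins built from HMAC-SHA256 for illustration, and the password, SSID, and client MAC address are made up.

```python
import hashlib
import hmac
import os

def derive_ptk(psk, anonce, snonce, ap_mac, client_mac):
    # Stand-in for the real derivation: one HMAC over nonces and MAC addresses.
    material = anonce + snonce + ap_mac + client_mac
    return hmac.new(psk, material, hashlib.sha256).digest()

def compute_mic(ptk, data):
    return hmac.new(ptk, data, hashlib.sha256).digest()

def toy_xor(ptk, data):
    # Toy XOR cipher for illustration only; not secure, not what WPA2 uses.
    keystream = hmac.new(ptk, b"toy-keystream", hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(data, keystream))

# Both sides already derived the same PSK from the password and SSID.
psk = hashlib.pbkdf2_hmac("sha1", b"hypothetical-password", b"ExampleSSID", 4096, 32)
ap_mac = bytes.fromhex("cafef00dbeef")
client_mac = bytes.fromhex("0123456789ab")
gtk = os.urandom(16)  # group key held by the access point

# Message 1: the access point sends the ANonce, unencrypted.
anonce = os.urandom(32)

# Message 2: the client derives the PTK, then sends the SNonce and a MIC.
snonce = os.urandom(32)
client_ptk = derive_ptk(psk, anonce, snonce, ap_mac, client_mac)
msg2 = (snonce, compute_mic(client_ptk, snonce))

# Message 3: the access point derives the PTK, checks the client's MIC,
# then sends the encrypted GTK and its own MIC.
ap_ptk = derive_ptk(psk, anonce, msg2[0], ap_mac, client_mac)
ap_accepts = hmac.compare_digest(compute_mic(ap_ptk, msg2[0]), msg2[1])
wrapped_gtk = toy_xor(ap_ptk, gtk)
msg3 = (wrapped_gtk, compute_mic(ap_ptk, wrapped_gtk))

# Message 4: the client checks the MIC, recovers the GTK, and acknowledges.
client_accepts = hmac.compare_digest(compute_mic(client_ptk, msg3[0]), msg3[1])
client_gtk = toy_xor(client_ptk, msg3[0])
ack = client_accepts
```

Note that the PTK itself never crosses the network in this sketch: each side computes it locally from values it already has or has just received.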
27.4 Attacks
In the WPA2 handshake, everything except the GTK is sent unencrypted. Recall that the PTK is derived with the two nonces, the PSK, and the MAC addresses of both the access point and the client. This means that an on-path attacker who eavesdrops on the entire handshake can learn the nonces and the MAC addresses. If the attacker is part of the WiFi network (i.e. they know the WiFi password and generated the PSK), then they know everything necessary to derive the PTK. This attacker can decrypt all messages and eavesdrop on communications, and encrypt and inject messages.
Even if the attacker isn’t on the WiFi network (doesn’t know the WiFi password and cannot generate the PSK), they can try to brute-force the WiFi password. For each guessed password, the attacker derives the PSK from that password, uses the PSK (and the other unencrypted information from the handshake) to derive the PTK, and checks if that PTK is consistent with the MICs. If the WiFi password is low-entropy, an attacker with enough compute power can brute-force the password and learn the PTK.
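The guessing loop described above can be sketched as follows. As a simplification, derive_ptk and compute_mic are HMAC-SHA256 stand-ins rather than the real WPA2 functions, the PBKDF2 parameters (4096 iterations, 32 bytes) are assumed, and the wordlist, SSID, nonces, and client MAC address are made up.

```python
import hashlib
import hmac

def derive_psk(password, ssid):
    return hashlib.pbkdf2_hmac("sha1", password.encode(), ssid.encode(), 4096, 32)

def derive_ptk(psk, anonce, snonce, ap_mac, client_mac):
    # Stand-in for the real PTK derivation.
    return hmac.new(psk, anonce + snonce + ap_mac + client_mac, hashlib.sha256).digest()

def compute_mic(ptk, data):
    return hmac.new(ptk, data, hashlib.sha256).digest()

def brute_force(wordlist, ssid, anonce, snonce, ap_mac, client_mac, observed_mic):
    """Try each guess and keep the one whose PTK reproduces the sniffed MIC."""
    for guess in wordlist:
        ptk = derive_ptk(derive_psk(guess, ssid), anonce, snonce, ap_mac, client_mac)
        if hmac.compare_digest(compute_mic(ptk, snonce), observed_mic):
            return guess, ptk
    return None

# Values an eavesdropper would see in the clear during the handshake.
ssid = "ExampleSSID"
anonce, snonce = b"\x01" * 32, b"\x02" * 32
ap_mac, client_mac = bytes.fromhex("cafef00dbeef"), bytes.fromhex("0123456789ab")

# The victim's side, using a low-entropy password the attacker does not know.
real_ptk = derive_ptk(derive_psk("sunshine", ssid), anonce, snonce, ap_mac, client_mac)
observed_mic = compute_mic(real_ptk, snonce)

wordlist = ["123456", "password", "sunshine", "qwerty"]
result = brute_force(wordlist, ssid, anonce, snonce, ap_mac, client_mac, observed_mic)
```

Each guess costs one full run of the slow key derivation function, which is why a high-entropy password makes this attack impractical.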
27.5 Defenses: WPA2-Enterprise
The main problem leading to the attacks in the previous section is that every user on the network uses the same source of secrecy (the WiFi password) to derive private keys. To solve this, each user needs a different, unique source of secrecy. This modified protocol is called WPA2-Enterprise. AirBears2 is an example of WPA2-Enterprise that you might be familiar with.
Instead of using one WiFi password for all users, WPA2-Enterprise gives each authorized user a unique username and password. In WPA2-Enterprise, before the handshake occurs, the client con- ne