Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Network Programming: Part II
15-213/18-213/14-513/15-513/18-613: Introduction to Computer Systems
23rd Lecture, November 17, 2020
Review: Echo Server + Client Structure
1. Start server: open_listenfd, then accept (await connection request from client)
2. Start client: open_clientfd (sends connection request)
3. Exchange data (client / server session):
▪ Client: terminal read, socket write
▪ Server: socket read, socket write
▪ Client: socket read, terminal write
4. Disconnect client: client calls close; the server's socket read sees EOF
5. Drop client: server calls close
Today
The Sockets Interface (CSAPP 11.4)
Web Servers (CSAPP 11.5.1-11.5.3)
The Tiny Web Server (CSAPP 11.6)
Serving Dynamic Content (CSAPP 11.5.4)
Sockets Interface
[Figure: Sockets Interface overview.
Client: getaddrinfo, socket, connect (together: open_clientfd); then rio_writen, rio_readlineb, close.
Server: getaddrinfo, socket, bind, listen (together: open_listenfd); then accept, rio_readlineb, rio_writen, rio_readlineb (returns EOF), close; then await connection request from next client.]
Review: Generic Socket Address
Generic socket address:
▪ For address arguments to connect, bind, and accept
struct sockaddr {
    uint16_t sa_family;   /* Protocol family */
    char     sa_data[14]; /* Address data. */
};
[Figure: struct sockaddr layout: sa_family followed by the family-specific bytes]
Review: Socket Address Structures
Internet (IPv4) specific socket address:
▪ Must cast (struct sockaddr_in *) to (struct sockaddr *) for functions that take socket address arguments.
struct sockaddr_in {
    uint16_t       sin_family;  /* Protocol family (always AF_INET) */
    uint16_t       sin_port;    /* Port num in network byte order */
    struct in_addr sin_addr;    /* IP addr in network byte order */
    unsigned char  sin_zero[8]; /* Pad to sizeof(struct sockaddr) */
};
[Figure: struct sockaddr_in layout: sin_family (AF_INET), sin_port, sin_addr, then 8 zero bytes of padding. sin_family overlays sa_family; the remaining fields overlay the family-specific bytes of struct sockaddr.]
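The byte-order and cast requirements above can be sketched in a few lines. This is a minimal illustration, not code from csapp.c: the helper names make_ipv4_addr and as_generic are made up for this example, and a POSIX system is assumed.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill an IPv4 socket address from a dotted-decimal
 * string and a host-order port. Returns 0 on success, -1 on a bad address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);
    memset(out, 0, sizeof(*out));   /* also zeroes the sin_zero padding */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);    /* host to network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;                  /* not a valid IPv4 address string */
    return 0;
}

/* The cast described above: socket functions take the generic type. */
const struct sockaddr *as_generic(const struct sockaddr_in *in)
{
    return (const struct sockaddr *)in;
}
```

The generic pointer returned by as_generic is what connect, bind, and accept expect; sa_family still reads AF_INET through it because both structs start with the family field.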
Review: getaddrinfo
getaddrinfo converts string representations of hostnames, host addresses, ports, and service names to socket address structures
[Figure: result points to a linked list of addrinfo structs (the SA list). Each addrinfo struct has ai_canonname, ai_addr (pointing to a socket address struct), and ai_next; the last ai_next is NULL.]
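As a rough sketch of how the list is traversed, the function below counts the entries getaddrinfo returns. The name count_addresses is an assumption for this example, and AI_NUMERICHOST is used here only so the sketch never touches DNS; the course helpers do not set that flag.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose getaddrinfo under strict -std=c99/c11 */
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: count the addrinfo entries for a numeric host and
 * numeric port. Returns the count, or -1 if getaddrinfo fails. */
int count_addresses(const char *host, const char *port)
{
    struct addrinfo hints, *listp, *p;
    int n = 0;

    assert(host != NULL && port != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;                  /* connections only */
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV; /* no name lookups */
    if (getaddrinfo(host, port, &hints, &listp) != 0)
        return -1;
    for (p = listp; p != NULL; p = p->ai_next)        /* walk via ai_next */
        n++;
    freeaddrinfo(listp);                              /* caller frees the list */
    return n;
}
```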
[Figure: Sockets Interface overview, step 1: getaddrinfo produces a socket address (SA) list on both the client and the server.]
Sockets Interface: socket
Clients and servers use the socket function to create a socket descriptor:
int socket(int domain, int type, int protocol);
Example:
int clientfd = socket(AF_INET, SOCK_STREAM, 0);
▪ AF_INET indicates that we are using 32-bit IPv4 addresses
▪ SOCK_STREAM indicates that the socket will be the end point of a reliable (TCP) connection
Protocol specific! Best practice is to use getaddrinfo to generate the parameters automatically, so that code is protocol independent.
[Figure: Sockets Interface overview, step 2: socket returns clientfd on the client and listenfd on the server.]
Sockets Interface: bind
A server uses bind to ask the kernel to associate the server’s socket address with a socket descriptor:
int bind(int sockfd, SA *addr, socklen_t addrlen);
Our convention: typedef struct sockaddr SA;
Process can read bytes that arrive on the connection whose endpoint is addr by reading from descriptor sockfd.
Similarly, writes to sockfd are transferred along the connection whose endpoint is addr.
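A minimal sketch of bind in isolation, assuming IPv4 loopback and a kernel-chosen port. The helper name bind_loopback is made up for this example, and asking for port 0 followed by getsockname is an illustration choice, not something the lecture code does.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

typedef struct sockaddr SA; /* same convention as the slide */

/* Hypothetical helper: bind a new TCP socket to 127.0.0.1 on a port chosen
 * by the kernel. Returns the descriptor (or -1) and stores the port. */
int bind_loopback(uint16_t *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port_out != NULL);
    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                  /* 0: let the kernel choose */
    if (bind(fd, (SA *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (SA *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);          /* back to host byte order */
    return fd;
}
```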
[Figure: Sockets Interface overview, step 3: bind associates listenfd with the server's socket address (listenfd <-> SA).]
Sockets Interface: listen
Kernel assumes that a descriptor from the socket function is an active socket that will be on the client end of a connection.
A server calls the listen function to tell the kernel that a descriptor will be used by a server rather than a client:
int listen(int sockfd, int backlog);
Converts sockfd from an active socket to a listening socket that can accept connection requests from clients.
backlog is a hint about the number of outstanding connection requests that the kernel should queue up before starting to refuse requests (128-ish by default)
[Figure: Sockets Interface overview, step 4: listen turns listenfd into a listening descriptor.]
Sockets Interface: accept
Servers wait for connection requests from clients by calling accept:
int accept(int listenfd, SA *addr, socklen_t *addrlen);
Waits for a connection request to arrive on the listening descriptor listenfd, then fills in the client’s socket address in addr and the size of the socket address in *addrlen.
Returns a connected descriptor connfd that can be used to communicate with the client via Unix I/O routines.
[Figure: Sockets Interface overview, step 5: the client's connect sends a connection request that the server's accept receives.]
Sockets Interface: connect
A client establishes a connection with a server by calling connect:
int connect(int clientfd, SA *addr, socklen_t addrlen);
Attempts to establish a connection with server at socket address addr
▪ If successful, then clientfd is now ready for reading and writing.
▪ Resulting connection is characterized by socket pair
(x:y, addr.sin_addr:addr.sin_port)
▪ x is client address
▪ y is ephemeral port that uniquely identifies client process on client host
Best practice is to use getaddrinfo to supply the arguments addr and addrlen.
connect/accept Illustrated
1. Server blocks in accept, waiting for connection request on listening descriptor listenfd
2. Client makes connection request by calling and blocking in connect
3. Server returns connfd from accept. Client returns from connect. Connection is now established between clientfd and connfd
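The three steps can be exercised in a single process over the loopback interface. This is only a sketch under stated assumptions: the function name loopback_roundtrip is made up, IPv4 loopback is assumed to be available, and the ordering differs from the slide in that connect is called before accept. That works because the kernel completes the handshake and queues the connection on the listening socket.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: send msg from clientfd and receive it on connfd.
 * out receives the NUL-terminated bytes. Returns 0 on success, -1 on error. */
int loopback_roundtrip(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listenfd = -1, clientfd = -1, connfd = -1, rc = -1;
    size_t want, got = 0;
    ssize_t n;

    assert(msg != NULL && out != NULL && outlen > 0);
    want = strlen(msg);
    if (want == 0 || want >= outlen)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                     /* kernel picks the port */

    /* Server side: create the listening descriptor */
    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) goto done;
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;
    if (listen(listenfd, 1) < 0) goto done;
    if (getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0) goto done;

    /* Client side: connection request */
    if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) goto done;
    if (connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Server side: accept returns a new connected descriptor */
    if ((connfd = accept(listenfd, NULL, NULL)) < 0) goto done;

    /* Data now flows between clientfd and connfd, not listenfd */
    if (write(clientfd, msg, want) != (ssize_t)want) goto done;
    while (got < want) {                          /* tolerate short counts */
        if ((n = read(connfd, out + got, want - got)) <= 0) goto done;
        got += (size_t)n;
    }
    out[got] = '\0';
    rc = 0;
done:
    if (connfd >= 0) close(connfd);
    if (clientfd >= 0) close(clientfd);
    if (listenfd >= 0) close(listenfd);
    return rc;
}
```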
Connected vs. Listening Descriptors
Listening descriptor
▪ End point for client connection requests
▪ Created once and exists for lifetime of the server
Connected descriptor
▪ End point of the connection between client and server
▪ A new descriptor is created each time the server accepts a connection request from a client
▪ Exists only as long as it takes to service client
Why the distinction?
▪ Allows for concurrent servers that can communicate over many client connections simultaneously
▪ E.g., each time we receive a new request, we fork a child to handle the request
[Figure: Sockets Interface overview, step 6: after connect and accept return, the client holds clientfd (connected to the server's SA) and the server holds both the listening descriptor listenfd and the connected descriptor connfd.]
Sockets Helper: open_clientfd
Establish a connection with a server
int open_clientfd(char *hostname, char *port) {
    int clientfd;
    struct addrinfo hints, *listp, *p;

    /* Get a list of potential server addresses */
    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_socktype = SOCK_STREAM;  /* Open a connection */
    hints.ai_flags = AI_NUMERICSERV;  /* …using numeric port arg. */
    hints.ai_flags |= AI_ADDRCONFIG;  /* Recommended for connections */
    Getaddrinfo(hostname, port, &hints, &listp);
csapp.c
AI_ADDRCONFIG: return only the address types configured on your system.
At least one IPv4 interface? IPv4 addresses are returned. At least one IPv6 interface? IPv6 addresses are returned.
getaddrinfo
[Figure: result points to the linked list of addrinfo structs; each ai_addr points to a socket address struct and each ai_next points to the next entry, ending in NULL.]
Clients: walk this list, trying each socket address in turn, until the calls to socket and connect succeed.
Servers: walk the list until calls to socket and bind succeed.
Sockets Helper: open_clientfd (cont)
    /* Walk the list for one that we can successfully connect to */
    for (p = listp; p; p = p->ai_next) {
        /* Create a socket descriptor */
        if ((clientfd = socket(p->ai_family, p->ai_socktype,
                               p->ai_protocol)) < 0)
            continue; /* Socket failed, try the next */

        /* Connect to the server */
        if (connect(clientfd, p->ai_addr, p->ai_addrlen) != -1)
            break; /* Success */
        Close(clientfd); /* Connect failed, try another */
    }

    /* Clean up */
    Freeaddrinfo(listp);
    if (!p) /* All connects failed */
        return -1;
    else    /* The last connect succeeded */
        return clientfd;
}
csapp.c
Sockets Helper: open_listenfd
Create a listening descriptor that can be used to accept connection requests from clients.
int open_listenfd(char *port)
{
    struct addrinfo hints, *listp, *p;
    int listenfd, optval=1;

    /* Get a list of potential server addresses */
    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_socktype = SOCK_STREAM;             /* Accept connect. */
    hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG; /* …on any IP addr */
    hints.ai_flags |= AI_NUMERICSERV;            /* …using port no. */
    Getaddrinfo(NULL, port, &hints, &listp);
csapp.c
Sockets Helper: open_listenfd (cont)
    /* Walk the list for one that we can bind to */
    for (p = listp; p; p = p->ai_next) {
        /* Create a socket descriptor */
        if ((listenfd = socket(p->ai_family, p->ai_socktype,
                               p->ai_protocol)) < 0)
            continue; /* Socket failed, try the next */

        /* Eliminates "Address already in use" error from bind */
        Setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                   (const void *)&optval, sizeof(int));

        /* Bind the descriptor to the address */
        if (bind(listenfd, p->ai_addr, p->ai_addrlen) == 0)
            break; /* Success */
        Close(listenfd); /* Bind failed, try the next */
    }
csapp.c
Sockets Helper: open_listenfd (cont)
    /* Clean up */
    Freeaddrinfo(listp);
    if (!p) /* No address worked */
        return -1;

    /* Make it a listening socket ready to accept conn. requests */
    if (listen(listenfd, LISTENQ) < 0) {
        Close(listenfd);
        return -1;
    }
    return listenfd;
}
csapp.c
Key point: open_clientfd and open_listenfd are both independent of any particular version of IP.
Testing Servers Using telnet
The telnet program is invaluable for testing servers that transmit ASCII strings over Internet connections
▪ Our simple echo server
▪ Web servers
▪ Mail servers
Usage:
▪ linux> telnet <host> <portnumber>
▪ Creates a connection with a server running on <host> and listening on port <portnumber>
Testing the Echo Server With telnet
Server terminal:
whaleshark> ./echoserveri 15213
Connected to (MAKOSHARK.ICS.CS.CMU.EDU, 50280)
server received 11 bytes
server received 8 bytes

Client terminal:
makoshark> telnet whaleshark.ics.cs.cmu.edu 15213
Trying 128.2.210.175…
Connected to whaleshark.ics.cs.cmu.edu (128.2.210.175).
Escape character is ‘^]’.
Hi there!
Hi there!
Howdy!
Howdy!
^]
telnet> quit
Connection closed.
makoshark>
Today
The Sockets Interface
Web Servers
The Tiny Web Server
Serving Dynamic Content
Web Server Basics
Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
▪ Client and server establish TCP connection
▪ Client requests content
▪ Server responds with requested content
▪ Client and server close connection (eventually)
Current version is HTTP/1.1
▪ RFC 2616, June, 1999.
▪ http://www.w3.org/Protocols/rfc2616/rfc2616.html
[Figure: the Web client (browser) sends an HTTP request to the Web server, which returns an HTTP response (content). Protocol stack: HTTP carries Web content, TCP carries streams, IP carries datagrams.]
Web Content
Web servers return content to clients
▪ content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
Example MIME types
▪ text/html     HTML document
▪ text/plain    Unformatted text
▪ image/gif     Binary image encoded in GIF format
▪ image/png     Binary image encoded in PNG format
▪ image/jpeg    Binary image encoded in JPEG format
You can find the complete list of MIME types at:
http://www.iana.org/assignments/media-types/media-types.xhtml
Static and Dynamic Content
The content returned in HTTP responses can be either static or dynamic
▪ Static content: content stored in files and retrieved in response to an HTTP request
▪ Examples: HTML files, images, audio clips, JavaScript programs
▪ Request identifies which content file
▪ Dynamic content: content produced on-the-fly in response to an HTTP request
▪ Example: content produced by a program executed by the server on behalf of the client
▪ Request identifies file containing executable code
Bottom line: Web content is associated with a file that is managed by the server
URLs and how clients and servers use them
Unique name for a file: URL (Uniform Resource Locator)
Example URL: http://www.cmu.edu:80/index.html
Clients use prefix (http://www.cmu.edu:80) to infer:
▪ What kind (protocol) of server to contact (HTTP)
▪ Where the server is (www.cmu.edu)
▪ What port it is listening on (80)
Servers use suffix (/index.html) to:
▪ Determine if request is for static or dynamic content.
▪ No hard and fast rules for this
▪ One convention: executables reside in cgi-bin directory
▪ Find file on file system
▪ Initial “/” in suffix denotes home directory for requested content.
▪ Minimal suffix is “/”, which server expands to configured default filename (usually, index.html)
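The prefix/suffix split can be sketched as a small parser. This is an illustration only: the name split_url and the URL_PART_MAX limit are assumptions, only http:// URLs are handled, and a missing port defaults to 80 while a missing suffix defaults to “/”.

```c
#include <assert.h>
#include <string.h>

#define URL_PART_MAX 256 /* assumed buffer size for each output field */

/* Hypothetical helper: split an http:// URL into host, port, and suffix.
 * Each output buffer must hold URL_PART_MAX bytes.
 * Returns 0 on success, -1 if the URL is malformed or a part is too long. */
int split_url(const char *url, char *host, char *port, char *suffix)
{
    const char *scheme = "http://";
    const char *p, *slash, *colon;
    size_t hostlen, portlen;

    assert(url && host && port && suffix);
    if (strncmp(url, scheme, strlen(scheme)) != 0)
        return -1;
    p = url + strlen(scheme);

    slash = strchr(p, '/');              /* start of the suffix, if any */
    if (slash == NULL)
        slash = p + strlen(p);

    colon = memchr(p, ':', (size_t)(slash - p)); /* port separator, if any */
    hostlen = (size_t)((colon ? colon : slash) - p);
    if (hostlen == 0 || hostlen >= URL_PART_MAX)
        return -1;
    memcpy(host, p, hostlen);
    host[hostlen] = '\0';

    if (colon != NULL) {
        portlen = (size_t)(slash - colon - 1);
        if (portlen == 0 || portlen >= URL_PART_MAX)
            return -1;
        memcpy(port, colon + 1, portlen);
        port[portlen] = '\0';
    } else {
        strcpy(port, "80");              /* default HTTP port */
    }

    if (*slash == '\0') {
        strcpy(suffix, "/");             /* minimal suffix */
    } else {
        if (strlen(slash) >= URL_PART_MAX)
            return -1;
        strcpy(suffix, slash);
    }
    return 0;
}
```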
HTTP Requests
HTTP request is a request line, followed by zero or more request headers
Request line: <method> <uri> <version>
▪ <method> is one of GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE
▪ <uri> is typically URL for proxies, URL suffix for servers
▪ A URL is a type of URI (Uniform Resource Identifier)
▪ See http://www.ietf.org/rfc/rfc2396.txt
▪ <version> is HTTP version of request (HTTP/1.0 or HTTP/1.1)
Request headers: <header name>: <header data>
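An HTTP request line consists of three whitespace-separated fields: method, URI, and version. A minimal sketch of splitting it with sscanf follows; the name parse_request_line and the FIELD_MAX limit are assumptions for this example, and the only validation is that the version starts with “HTTP/”.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_MAX 256 /* assumed size of each output buffer */

/* Hypothetical helper: split a request line such as
 * "GET /index.html HTTP/1.1\r\n" into its three fields.
 * Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *line, char *method, char *uri, char *version)
{
    assert(line && method && uri && version);
    /* %255s stores at most FIELD_MAX - 1 characters plus the NUL;
     * the trailing "\r\n" is whitespace, so it is not stored. */
    if (sscanf(line, "%255s %255s %255s", method, uri, version) != 3)
        return -1;
    if (strncmp(version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}
```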
HTTP Responses
HTTP response is a response line followed by zero or more response headers, possibly followed by content, with blank line (“\r\n”) separating headers from content.
Response line: <version> <status code> <status msg>
▪ <version> is HTTP version of the response
▪ <status code> is numeric status
▪ <status msg> is corresponding English text
▪ 200 OK           Request was handled without error
▪ 301 Moved        Provide alternate URL
▪ 404 Not found    Server couldn’t find the file
Response headers: <header name>: <header data>
▪ Content-Type: MIME type of content in response body
▪ Content-Length: Length of content in response body
Example HTTP Transaction
whaleshark> telnet www.cmu.edu 80              Client: open connection to server
Trying 128.2.42.52…                            Telnet prints 3 lines to terminal
Connected to WWW-CMU-PROD-VIP.ANDREW.cmu.edu.
Escape character is ‘^]’.
GET / HTTP/1.1                                 Client: request line
Host: www.cmu.edu                              Client: required HTTP/1.1 header
                                               Client: blank line terminates headers
HTTP/1.1 301 Moved Permanently                 Server: response line
Date: Wed, 05 Nov 2014 17:05:11 GMT            Server: followed by 5 response headers
Server: Apache/1.3.42 (Unix)                   Server: this is an Apache server
Location: http://www.cmu.edu/index.shtml       Server: page has moved here
Transfer-Encoding: chunked                     Server: response body will be chunked
Content-Type: text/html; charset=…             Server: expect HTML in response body
                                               Server: empty line terminates headers
15c                                            Server: first line in response body
…                                              Server: HTML content, from start to end
0                                              Server: last line in response body
Connection closed by foreign host.             Server: closes connection

HTTP standard requires that each text line end with “\r\n”
Blank line (“\r\n”) terminates request and response headers
Example HTTP Transaction, Take 2
whaleshark> telnet www.cmu.edu 80              Client: open connection to server
Trying 128.2.42.52…                            Telnet prints 3 lines to terminal
Connected to WWW-CMU-PROD-VIP.ANDREW.cmu.edu.
Escape character is ‘^]’.
GET /index.shtml HTTP/1.1                      Client: request line
Host: www.cmu.edu                              Client: required HTTP/1.1 header
                                               Client: blank line terminates headers
HTTP/1.1 200 OK                                Server: response line
Date: Wed, 05 Nov 2014 17:37:26 GMT            Server: followed by 4 response headers
Server: Apache/1.3.42 (Unix)
Transfer-Encoding: chunked
Content-Type: text/html; charset=…
                                               Server: empty line terminates headers
1000                                           Server: begin response body
…                                              Server: first line of HTML content
0                                              Server: end response body
Connection closed by foreign host.             Server: close connection
Example HTTP(S) Transaction, Take 3
whaleshark> openssl s_client www.cs.cmu.edu:443
CONNECTED(00000005)
…
Certificate chain
…
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIGDjCCBPagAwIBAgIRAMiF7LBPDoySilnNoU+mp+gwDQYJKoZIhvcNAQELBQAw
djELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAk1JMRIwEAYDVQQHEwlBbm4gQXJib3Ix
EjAQBgNVBAoTCUludGVybmV0MjERMA8GA1UECxMISW5Db21tb24xHzAdBgNVBAMT
wkWkvDVBBCwKXrShVxQNsj6J
…
-----END CERTIFICATE-----
subject=/C=US/postalCode=15213/ST=PA/L=Pittsburgh/street=5000 Forbes Ave/O=Carnegie Mellon University/OU=School of Computer Science/CN=www.cs.cmu.edu
issuer=/C=US/ST=MI/L=Ann Arbor/O=Internet2/OU=InCommon/CN=InCommon RSA Server CA
SSL handshake has read 6274 bytes and written 483 bytes
…
>GET / HTTP/1.0
HTTP/1.1 200 OK
Date: Tue, 12 Nov 2019 04:22:15 GMT
Server: Apache/2.4.10 (Ubuntu)
Set-Cookie: SHIBLOCATION=scsweb; path=/; domain=.cs.cmu.edu
… HTML Content Continues Below …
Quiz Time!
Check out:
https://canvas.cmu.edu/courses/17808
Today
The Sockets Interface
Web Servers
The Tiny Web Server
Serving Dynamic Content
Tiny Web Server
Tiny Web server described in text
▪ Tiny is a sequential Web server
▪ Serves static and dynamic content to real browsers
▪ text files, HTML files, GIF, PNG, and JPEG images
▪ 239 lines of commented C code
▪ Not as complete or robust as a real Web server
▪ You can break it with poorly-formed HTTP requests (e.g., terminate lines with “\n” instead of “\r\n”)
Tiny Operation
Accept connection from client
Read request from client (via connected socket)
Split into <method> <uri> <version>
If URI contains “cgi-bin” then serve dynamic content
▪ (Would do wrong thing if had file “abcgi-bingo.html”)
▪ Fork process to execute program
Otherwise serve static content
▪ Copy file to output
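The “abcgi-bingo.html” caveat can be seen by comparing a plain substring test with a stricter check. Both functions below are sketches with made-up names. The first mirrors the substring rule described above. The second is one possible alternative that assumes executables live directly under a top-level /cgi-bin/ directory; it is not what Tiny does.

```c
#include <assert.h>
#include <string.h>

/* Substring rule: any URI containing "cgi-bin" counts as dynamic. */
int is_dynamic_substring(const char *uri)
{
    assert(uri != NULL);
    return strstr(uri, "cgi-bin") != NULL;
}

/* Hypothetical stricter rule: the URI must begin with the path
 * component "/cgi-bin/". */
int is_dynamic_prefix(const char *uri)
{
    assert(uri != NULL);
    return strncmp(uri, "/cgi-bin/", strlen("/cgi-bin/")) == 0;
}
```

With the substring rule, a static file named /abcgi-bingo.html is misclassified as dynamic; the prefix rule treats it as static while still matching /cgi-bin/adder?15213&18213.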
Tiny Serving Static Content
void serve_static(int fd, char *filename, int filesize) {
    int srcfd;
    char *srcp, filetype[MAXLINE], buf[MAXBUF];

    /* Send response headers to client */
    get_filetype(filename, filetype);
    sprintf(buf, "HTTP/1.0 200 OK\r\n");
    sprintf(buf, "%sServer: Tiny Web Server\r\n", buf);
    sprintf(buf, "%sConnection: close\r\n", buf);
    sprintf(buf, "%sContent-length: %d\r\n", buf, filesize);
    sprintf(buf, "%sContent-type: %s\r\n\r\n", buf, filetype);
    Rio_writen(fd, buf, strlen(buf));

    /* Send response body to client */
    srcfd = Open(filename, O_RDONLY, 0);
    srcp = Mmap(0, filesize, PROT_READ, MAP_PRIVATE, srcfd, 0);
    Close(srcfd);
    Rio_writen(fd, srcp, filesize);
    Munmap(srcp, filesize);
}
tiny.c
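One caution about the header-building step: passing buf to sprintf as both destination and “%s” source makes the source and destination overlap, which the C standard leaves undefined, even though it often appears to work. A possible alternative, sketched here with a made-up name (build_static_headers), formats all headers in a single bounded snprintf call. This is not the code in tiny.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of the header step: write the same five header
 * lines into buf. Returns the number of bytes written (excluding the
 * NUL), or -1 if buf is too small. */
int build_static_headers(char *buf, size_t buflen,
                         const char *filetype, long filesize)
{
    int n;

    assert(buf != NULL && filetype != NULL);
    n = snprintf(buf, buflen,
                 "HTTP/1.0 200 OK\r\n"
                 "Server: Tiny Web Server\r\n"
                 "Connection: close\r\n"
                 "Content-length: %ld\r\n"
                 "Content-type: %s\r\n\r\n", /* blank line ends headers */
                 filesize, filetype);
    if (n < 0 || (size_t)n >= buflen)
        return -1;                           /* output was truncated */
    return n;
}
```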
Today
The Sockets Interface
Web Servers
The Tiny Web Server
Serving Dynamic Content
Serving Dynamic Content
Client sends request to server
If request URI contains the string “/cgi-bin”, the Tiny server assumes that the request is for dynamic content
[Figure: Client sends “GET /cgi-bin/env.pl HTTP/1.1” to Server]
Serving Dynamic Content (cont)
The server creates a child process and runs the program identified by the URI in that process
[Figure: Server uses fork/exec to run env.pl]
Serving Dynamic Content (cont)
The child runs and generates the dynamic content
The server captures the content of the child and forwards it without modification to the client
[Figure: Content flows from env.pl to Server, and from Server to Client]
Issues in Serving Dynamic Content
How does the client pass program arguments to the server?
How does the server pass these arguments to the child?
How does the server pass other info relevant to the request to the child?
How does the server capture the content produced by the child?
These issues are addressed by the Common Gateway Interface (CGI) specification.
[Figure: Client sends Request to Server; Server creates env.pl; Content flows from env.pl through Server back to Client]
CGI
Because the children are written according to the CGI spec, they are often called CGI programs.
However, CGI really defines a simple standard for transferring information between the client (browser), the server, and the child process.
CGI is the original standard for generating dynamic content.
Has been largely replaced by other, faster techniques:
▪ E.g., FastCGI, Apache modules, Java servlets, Rails controllers
▪ Avoid having to create process on the fly (expensive and slow).
The add.com Experience
[Figure: the add.com URL annotated with host, port, CGI program, and arguments, together with the resulting output page]
Serving Dynamic Content With GET
Question: How does the client pass arguments to the server?
Answer: The arguments are appended to the URI
Can be encoded directly in a URL typed to a browser or a URL in an HTML link
▪ http://add.com/cgi-bin/adder?15213&18213
▪ adder is the CGI program on the server that will do the addition.
▪ argument list starts with “?”
▪ arguments separated by “&”
▪ spaces represented by “+” or “%20”
Serving Dynamic Content With GET
URL suffix:
▪ cgi-bin/adder?15213&18213
Result displayed on browser:
Welcome to add.com: THE Internet addition portal.
The answer is: 15213 + 18213 = 33426
Thanks for visiting!
Serving Dynamic Content With GET
Question: How does the server pass these arguments to the child?
Answer: In environment variable QUERY_STRING
▪ A single string containing everything after the “?”
▪ For add: QUERY_STRING = “15213&18213”
/* Extract the two arguments */
if ((buf = getenv("QUERY_STRING")) != NULL) {
    p = strchr(buf, '&');
    *p = '\0';
    strcpy(arg1, buf);
    strcpy(arg2, p+1);
    n1 = atoi(arg1);
    n2 = atoi(arg2);
}
adder.c
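The extraction code above dereferences the result of strchr without checking it, so a query string with no “&” would crash the CGI program. A more defensive sketch is shown below; the name parse_two_args and the 256-byte limit are assumptions for this example, and it works on a private copy rather than modifying the environment string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "n1&n2" (for example, the value of
 * QUERY_STRING) into two ints. Returns 0 on success, -1 if the string
 * is missing, too long, or has no '&' separator. */
int parse_two_args(const char *query, int *n1, int *n2)
{
    char buf[256];
    char *p;

    assert(n1 != NULL && n2 != NULL);
    if (query == NULL || strlen(query) >= sizeof(buf))
        return -1;
    strcpy(buf, query);              /* fits: length was checked above */
    if ((p = strchr(buf, '&')) == NULL)
        return -1;                   /* only one argument supplied */
    *p = '\0';                       /* split into two C strings */
    *n1 = atoi(buf);
    *n2 = atoi(p + 1);
    return 0;
}
```

A CGI program could call it as parse_two_args(getenv("QUERY_STRING"), &n1, &n2), since getenv returns NULL when the variable is unset.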
Serving Dynamic Content with GET
Question: How does the server capture the content produced by the child?
Answer: The child generates its output on stdout. Server uses dup2 to redirect stdout to its connected socket.
void serve_dynamic(int fd, char *filename, char *cgiargs) {
    char buf[MAXLINE], *emptylist[] = { NULL };

    /* Return first part of HTTP response */
    sprintf(buf, "HTTP/1.0 200 OK\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Server: Tiny Web Server\r\n");
    Rio_writen(fd, buf, strlen(buf));

    if (Fork() == 0) { /* Child */
        /* Real server would set all CGI vars here */
        setenv("QUERY_STRING", cgiargs, 1);
        Dup2(fd, STDOUT_FILENO);              /* Redirect stdout to client */
        Execve(filename, emptylist, environ); /* Run CGI program */
    }
    Wait(NULL); /* Parent waits for and reaps child */
}
tiny.c
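The dup2 redirection can be tried without a network connection. In the sketch below a pipe stands in for the connected socket, and the child writes directly instead of calling execve; both are simplifications for illustration, and the name capture_child_stdout is made up. An exec'ed CGI program would inherit the redirected stdout in the same way.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child redirects stdout to a pipe and writes msg;
 * the parent reads what the child produced into out (NUL-terminated).
 * Returns the number of bytes captured, or -1 on error. */
ssize_t capture_child_stdout(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    pid_t pid;
    ssize_t n;
    size_t got = 0;

    assert(msg != NULL && out != NULL && outlen > 0);
    if (pipe(fds) < 0)
        return -1;
    if ((pid = fork()) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                               /* Child */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) < 0)      /* stdout now feeds the pipe */
            _exit(1);
        close(fds[1]);
        if (write(STDOUT_FILENO, msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }
    close(fds[1]);       /* parent keeps only the read end, so EOF arrives */
    while (got < outlen - 1 &&
           (n = read(fds[0], out + got, outlen - 1 - got)) > 0)
        got += (size_t)n;
    out[got] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);                        /* reap the child */
    return (ssize_t)got;
}
```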
Serving Dynamic Content with GET
Notice that only the CGI child process knows the content type and length, so it must generate those headers.
/* Make the response body */
sprintf(content, "Welcome to add.com: ");
sprintf(content, "%sTHE Internet addition portal.\r\n<p>", content);
sprintf(content, "%sThe answer is: %d + %d = %d\r\n<p>",
        content, n1, n2, n1 + n2);
sprintf(content, "%sThanks for visiting!\r\n", content);

/* Generate the HTTP response */
printf("Content-length: %d\r\n", (int)strlen(content));
printf("Content-type: text/html\r\n\r\n");
printf("%s", content);
fflush(stdout);
exit(0);
adder.c
Serving Dynamic Content With GET
bash:makoshark> telnet whaleshark.ics.cs.cmu.edu 15213
Trying 128.2.210.175…
Connected to whaleshark.ics.cs.cmu.edu (128.2.210.175).
Escape character is ‘^]’.
GET /cgi-bin/adder?15213&18213 HTTP/1.0            HTTP request sent by client

HTTP/1.0 200 OK                                    HTTP response generated by the server
Server: Tiny Web Server
Connection: close                                  HTTP response generated by the CGI program
Content-length: 117
Content-type: text/html

Welcome to add.com: THE Internet addition portal.
The answer is: 15213 + 18213 = 33426
Thanks for visiting!
Connection closed by foreign host.
bash:makoshark>
For More Information
W. Richard Stevens et al., “Unix Network Programming: The Sockets Networking API”, Volume 1, Third Edition, Prentice Hall, 2003
▪ THE network programming bible.
Michael Kerrisk, “The Linux Programming Interface”, No Starch Press, 2010
▪ THE Linux programming bible.
Complete versions of all code in this lecture are available from the 213 schedule page.
▪ http://www.cs.cmu.edu/~213/schedule.html
▪ csapp.{c,h}, hostinfo.c, echoclient.c, echoserveri.c, tiny.c, adder.c
▪ You can use any of this code in your assignments.
Additional slides
Web History
1989:
▪ Tim Berners-Lee (CERN) writes internal proposal to develop a distributed hypertext system
▪ Connects “a web of notes with links”
▪ Intended to help CERN physicists in large projects share and manage information
1990:
▪ Tim BL writes a graphical browser for NeXT machines
Web History (cont)
1992
▪ NCSA server released
▪ 26 WWW servers worldwide
1993
▪ Marc Andreessen releases first version of NCSA Mosaic browser
▪ Mosaic version released for (Windows, Mac, Unix)
▪ Web (port 80) traffic at 1% of NSFNET backbone traffic
▪ Over 200 WWW servers worldwide
1994
▪ Andreessen and colleagues leave NCSA to form “Mosaic Communications Corp” (predecessor to Netscape)
HTTP Versions
Major differences between HTTP/1.1 and HTTP/1.0
▪ HTTP/1.0 uses a new connection for each transaction
▪ HTTP/1.1 also supports persistent connections
▪ multiple transactions over the same connection
▪ Connection: Keep-Alive
▪ HTTP/1.1 requires Host header
▪ Host: www.cmu.edu
▪ Makes it possible to host multiple websites at single Internet host
▪ HTTP/1.1 supports chunked encoding
▪ Transfer-Encoding: chunked
▪ HTTP/1.1 adds additional support for caching
GET Request to Apache Server From Firefox Browser
URI is just the suffix, not the entire URL
GET /~bryant/test.html HTTP/1.1
Host: www.cs.cmu.edu
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
CRLF (\r\n)
GET Response From Apache Server
HTTP/1.1 200 OK
Date: Fri, 29 Oct 2010 19:48:32 GMT
Server: Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.7m mod_pubcookie/3.3.2b PHP/5.3.1
Accept-Ranges: bytes
Content-Length: 479
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
Some Tests
…
Data Transfer Mechanisms
Standard
▪ Specify total length with content-length
▪ Requires that program buffer entire message
Chunked
▪ Break into blocks
▪ Prefix each block with number of bytes (Hex coded)
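The chunked format can be sketched as follows: each block is sent as its length in hexadecimal, “\r\n”, the data, and another “\r\n”, and a zero-length chunk marks the end of the body. The helper names format_chunk and parse_chunk_size are assumptions for this example, and chunk extensions and trailers are ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write one chunk ("<hex len>\r\n<data>\r\n") into out.
 * Returns the number of bytes written (excluding the NUL), or -1 if out
 * is too small. */
int format_chunk(char *out, size_t outlen, const char *data, size_t len)
{
    int n;

    assert(out != NULL && data != NULL);
    n = snprintf(out, outlen, "%zx\r\n", len);   /* size line, hex coded */
    if (n < 0 || (size_t)n + len + 2 >= outlen)
        return -1;
    memcpy(out + n, data, len);
    memcpy(out + n + len, "\r\n", 3);            /* also copies the NUL */
    return n + (int)len + 2;
}

/* Hypothetical helper: parse a chunk size line, e.g. "15c\r\n" is 348. */
unsigned long parse_chunk_size(const char *line)
{
    assert(line != NULL);
    return strtoul(line, NULL, 16);              /* stops at the '\r' */
}
```

For example, the size line “15c” seen in the earlier www.cmu.edu transcript announces a 348-byte chunk, and the final “0” line is the zero-length chunk that ends the body.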
Chunked Encoding Example
HTTP/1.1 200 OK\n
Date: Sun, 31 Oct 2010 20:47:48 GMT\n
Server: Apache/1.3.41 (Unix)\n
Keep-Alive: timeout=15, max=100\n
Connection: Keep-Alive\n
Transfer-Encoding: chunked\n
Content-Type: text/html\n
\r\n
d75\r\n
.