Proxy Recitation
Recitation 13: November 23, 2015
¢ Getting content on the web: Telnet/cURL Demo
§ How the web really works
¢ Networking Basics
¢ Proxy
§ Due Tuesday, December 8th
§ Grace days allowed
¢ String Manipulation in C
The Web in a Textbook
¢ Client requests a page, the server provides it, and the transaction is done.
Web client (browser)
Web server
¢ A sequential server can handle this. We just need to serve one page at a time.
¢ This works great for simple text pages with embedded styles.
Telnet/cURL Demo
¢ Telnet
§ Interactive remote shell, like ssh without security
§ Must build HTTP request manually
§ This can be useful if you want to test the response to malformed headers
~]% telnet www.cmu.edu 80
Trying 128.2.42.52...
Connected to WWW-CMU-PROD-VIP.ANDREW.cmu.edu (128.2.42.52).
Escape character is '^]'.
GET http://www.cmu.edu/ HTTP/1.0
HTTP/1.1 301 Moved Permanently
Date: Sat, 11 Apr 2015 06:54:39 GMT
Server: Apache/1.3.42 (Unix) mod_gzip/1.3.26.1a mod_pubcookie/3.3.4a mod_ssl/2.8.31 OpenSSL/0.9.8e-fips-rhel5
Location: http://www.cmu.edu/index.shtml
Connection: close
Content-Type: text/html; charset=iso-8859-1
Moved Permanently
The document has moved here.
Connection closed by foreign host.
Telnet/cURL Demo
¢ cURL
§ “URL transfer library” with a command line program
§ Builds valid HTTP requests for you!
~]% curl http://www.cmu.edu/
Moved Permanently
The document has moved here.
§ Can also be used to generate HTTP proxy requests:
~]% curl --proxy lemonshark.ics.cs.cmu.edu:3092 http://www.cmu.edu/
Moved Permanently
The document has moved here.
How the Web Really Works
¢ In reality, a single HTML page today may depend on 10s or 100s of support files (images, stylesheets, scripts, etc.)
¢ Builds a good argument for concurrent servers
§ Just to load a single modern webpage, the client would have to wait for 10s of back-to-back requests
§ I/O is likely slower than processing, so back
¢ Caching is simpler if done in pieces rather than for the whole page
§ If only part of the page changes, no need to fetch old parts again
§ Each object (image, stylesheet, script) already has a unique URL that can be used as a key
How the Web Really Works
Excerpt from www.cmu.edu/index.html:
…
Sequential Proxy
[Figure: browser request timeline for the sequential proxy]
¢ Note the sloped shape of when requests finish
§ Although many requests are made at once, the proxy does not accept a new job until it finishes the current one
§ Requests are made in batches. This results from how HTML is structured as files that reference other files.
¢ Compared to the concurrent example (next), this page takes a long time to load with just static content
Concurrent Proxy
[Figure: browser request timeline for the concurrent proxy]
¢ Now, we see much less purple (waiting), and less time spent overall.
¢ Notice how multiple green (receiving) blocks overlap in time
§ Our proxy has multiple connections open to the browser to handle several tasks at once
How the Web Really Works
¢ A note on AJAX (and XMLHttpRequests)
§ Normally, a browser will make the initial page request, then request any supporting files
§ An XMLHttpRequest is simply a request from the page once it has been loaded and the scripts are running
§ The distinction does not matter on the server side: everything is an HTTP request
Networking Basics
¢ What is a socket?
§ To an application, a socket is a file descriptor that lets the application read from and write to the network
§ (All Unix I/O devices, including networks, are modeled as files)
¢ Clients and servers communicate with each other by reading from and writing to socket descriptors
¢ The main difference between regular file I/O and socket I/O is how the application “opens” the socket descriptors
Overview of the Sockets Interface
[Figure: client and server call sequences. Client: open_clientfd (getaddrinfo), rio_writen, rio_readlineb. Server: open_listenfd (getaddrinfo), rio_readlineb, rio_writen, rio_readlineb. Labels: connection request; client/server session; await connection request from next client.]
Host and Service Conversion: getaddrinfo
¢ getaddrinfo is the modern way to convert string representations of hosts, ports, and service names to socket address structures.
§ Replaces obsolete gethostbyname, which is unsafe because it returns a pointer to a static variable
¢ Advantages:
§ Reentrant (can be safely used by threaded programs).
§ Allows us to write portable protocol-independent code (IPv4 and IPv6)
¢ Given host and service, getaddrinfo returns result, which points to a linked list of addrinfo structs, each pointing to a socket address struct, which contains arguments for the sockets APIs.
¢ getnameinfo is the inverse of getaddrinfo, converting a socket address to the corresponding host and service.
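The getaddrinfo/getnameinfo pairing described above can be sketched roughly as follows. This is a minimal illustration, not the book's open_clientfd; the helper name resolve_first is hypothetical, and it only reports the first address in the list as a numeric string.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve host/service and write the first numeric
 * address into out. Returns 0 on success, -1 on failure. */
int resolve_first(const char *host, const char *service,
                  char *out, size_t outlen)
{
    struct addrinfo hints, *result, *p;
    int rc = -1;

    assert(host != NULL && service != NULL && out != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP connections only */

    if (getaddrinfo(host, service, &hints, &result) != 0)
        return -1;

    /* Walk the linked list of addrinfo structs. */
    for (p = result; p != NULL; p = p->ai_next) {
        /* getnameinfo is the inverse: socket address -> host string. */
        if (getnameinfo(p->ai_addr, p->ai_addrlen, out, (socklen_t)outlen,
                        NULL, 0, NI_NUMERICHOST) == 0) {
            rc = 0;
            break;
        }
    }
    freeaddrinfo(result); /* the list was allocated by getaddrinfo */
    return rc;
}
```

A real client would typically try socket and connect on each list entry in turn until one succeeds, rather than just printing the first address.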
Sockets API
¢ int socket(int domain, int type, int protocol);
§ Create a file descriptor for network communication
§ used by both clients and servers
§ int sock_fd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
§ One socket can be used for two-way communication
¢ int bind(int socket, const struct sockaddr *address, socklen_t address_len);
§ Associate a socket with an IP address and port number
§ used by servers
§ struct sockaddr_in sockaddr – family, address, port
Sockets API
¢ int listen(int socket, int backlog);
§ socket: socket to listen on
§ used by servers
§ backlog: maximum number of waiting connections
§ err = listen(sock_fd, MAX_WAITING_CONNECTIONS);
¢ int accept(int socket, struct sockaddr *address, socklen_t *address_len);
§ used by servers
§ socket: socket to listen on
§ address: pointer to sockaddr struct to hold client information after accept returns
§ return: file descriptor
Sockets API
¢ int connect(int socket, struct sockaddr *address, socklen_t address_len);
§ attempt to connect to the specified IP address and port described in address
§ used by clients
¢ int close(int fd);
§ used by both clients and servers
§ (also used for file I/O)
§ fd: socket fd to close
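To see how the calls above fit together, here is a rough sketch that runs both roles on the loopback interface. The helper names open_loopback_listener and connect_loopback are hypothetical, the backlog of 8 is arbitrary, and binding to port 0 (letting the kernel pick a free port) is a convenience for experimenting, not what the lab asks for.

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket -> bind -> listen on 127.0.0.1.
 * Port 0 asks the kernel for any free port; the chosen port is
 * reported through *port_out. Returns the listening fd or -1. */
int open_loopback_listener(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port_out != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Client side: socket -> connect to 127.0.0.1:port.
 * Returns the connected fd or -1. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After connect_loopback succeeds, the server calls accept on the listening descriptor to obtain a second descriptor dedicated to that client; both ends then use read, write, and close as with any file.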
Sockets API
¢ ssize_t read(int fd, void *buf, size_t nbyte);
§ used by both clients and servers
§ (also used for file I/O)
§ fd: (socket) fd to read from
§ buf: buffer to read into
§ nbyte: buf length
¢ ssize_t write(int fd, const void *buf, size_t nbyte);
§ used by both clients and servers
§ (also used for file I/O)
§ fd: (socket) fd to write to
§ buf: buffer to write
§ nbyte: buf length
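One caveat worth illustrating: on sockets, read and write may transfer fewer bytes than requested (a short count). The book's rio_writen handles this for you; the sketch below shows the same idea under a hypothetical name, write_all, and is not the book's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write until all n bytes are sent.
 * Returns n on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    assert(buf != NULL);
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error; errno is set by write */
        }
        p += w;             /* skip past the bytes already written */
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

The reading side needs the same care: a single read may return only part of a response, so loop until you have what you need or read returns 0 (EOF).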
Byte Ordering Reminder
¢ So, how are the bytes within a multi-byte word ordered in memory?
¢ Conventions
§ Big Endian: Sun, PPC Mac, Internet
– Least significant byte has highest address
§ Little Endian: x86, ARM processors running Android, iOS, and Windows
– Least significant byte has lowest address
¢ Make sure to use correct endianness
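In socket code, "correct endianness" usually means converting ports and addresses with htons/htonl before storing them in a socket address struct, and ntohs/ntohl when reading them back. A small sketch (the helper names are made up for illustration):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert a host-order port to network (big-endian) order and return
 * the byte stored at the lowest address. */
unsigned char first_byte_on_wire(uint16_t host_port)
{
    uint16_t net = htons(host_port);
    unsigned char bytes[2];

    assert(sizeof(bytes) == sizeof(net));
    memcpy(bytes, &net, sizeof(net));
    return bytes[0]; /* most significant byte, whatever the host order */
}

/* Converting to network order and back yields the original value. */
uint16_t round_trip(uint16_t host_port)
{
    return ntohs(htons(host_port));
}
```

For port 15213 (0x3B6D), first_byte_on_wire returns 0x3B on both little-endian and big-endian hosts, because htons is a no-op on big-endian machines and a byte swap on little-endian ones.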
Proxy – Functionality
¢ Should work on vast majority of sites
§ Twitch, CNN, NY Times, etc.
§ Some features of sites which require the POST operation (sending data to the website) will not work
– Logging in to websites, sending Facebook message
§ HTTPS is not expected to work
§ Google, YouTube (and some other popular websites) now try to push users to HTTPS by default; watch out for that
¢ Cache previous requests
§ Use LRU eviction policy
§ Must allow for concurrent reads while maintaining consistency
§ Details in write up
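As a rough picture of LRU eviction, the sketch below keeps a tiny fixed number of slots keyed by URL and stamps each slot with a logical clock on every use. Everything here is an assumption made for illustration: the names, the two-slot size, and counting slots rather than bytes. The write-up defines the real limits, and this version has no locking and does not handle inserting the same URL twice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 2   /* tiny on purpose, for illustration only */
#define MAX_URL 256

struct cache_entry {
    char url[MAX_URL];
    char *data;               /* heap copy of the object; NULL if unused */
    size_t len;
    unsigned long last_used;  /* logical timestamp of the latest access */
};

static struct cache_entry cache[CACHE_SLOTS];
static unsigned long clock_tick;

/* Return the entry for url, or NULL on a miss. A hit refreshes recency. */
struct cache_entry *cache_lookup(const char *url)
{
    assert(url != NULL);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].data != NULL && strcmp(cache[i].url, url) == 0) {
            cache[i].last_used = ++clock_tick;
            return &cache[i];
        }
    }
    return NULL;
}

/* Store a copy of data under url, evicting the least recently used
 * slot when no slot is free. Returns 0 on success, -1 on failure. */
int cache_insert(const char *url, const char *data, size_t len)
{
    int victim = 0;
    char *copy;

    assert(url != NULL && data != NULL);
    if (strlen(url) >= MAX_URL)
        return -1;
    copy = malloc(len ? len : 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, data, len);

    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].data == NULL) {       /* free slot: use it */
            victim = i;
            break;
        }
        if (cache[i].last_used < cache[victim].last_used)
            victim = i;                    /* older than current pick */
    }
    free(cache[victim].data);              /* free(NULL) is a no-op */
    strcpy(cache[victim].url, url);
    cache[victim].data = copy;
    cache[victim].len = len;
    cache[victim].last_used = ++clock_tick;
    return 0;
}
```

With two slots, inserting "a" and "b", looking up "a", then inserting "c" evicts "b", because "a" was touched more recently.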
Proxy – Functionality
¢ Why a multi-threaded cache?
§ Sequential cache would bottleneck parallel proxy
§ Multiple threads can read cached content safely
– Search cache for the right data and return it
– Two threads can read from the same cache block
§ But what about writing content?
– Overwrite block while another thread reading?
– Two threads writing to same cache block?
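One possible answer to the questions above is a readers-writer lock: any number of readers may hold it at once, but a writer holds it alone. The sketch below uses POSIX pthread_rwlock_t on a single stand-in block; the names block_read and block_write are hypothetical, and whether this primitive is permitted for the lab is something to check in the write-up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define OBJ_MAX 128

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
static char shared_obj[OBJ_MAX];   /* stands in for one cache block */

/* Readers take the lock in shared mode, so several threads can copy
 * the block out at the same time. */
void block_read(char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    pthread_rwlock_rdlock(&lock);
    strncpy(out, shared_obj, outlen - 1);
    out[outlen - 1] = '\0';
    pthread_rwlock_unlock(&lock);
}

/* A writer takes the lock exclusively, so no reader can observe a
 * half-written block and no two writers can interleave. */
void block_write(const char *data)
{
    assert(data != NULL);
    pthread_rwlock_wrlock(&lock);
    strncpy(shared_obj, data, OBJ_MAX - 1);
    shared_obj[OBJ_MAX - 1] = '\0';
    pthread_rwlock_unlock(&lock);
}
```

Keep in mind that under LRU a cache hit also updates recency information, which is itself a write to shared state and needs its own protection.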
Proxy – How
¢ Proxies are a bit special: they are a server and a client at the same time.
¢ They take a request from one computer (acting as the server), and make it on their behalf (as the client).
¢ Ultimately, the control flow of your program will look like a server, but will have to act as a client to complete the request
¢ Start small
§ Grab yourself a copy of the echo server (pg. 946) and client (pg. 947) in the book
§ Also review the tiny.c basic web server code to see how to deal with HTTP headers
§ Note that tiny.c ignores these; you may not
Proxy – How
¢ What you end up with will resemble:
[Figure: client, proxy, and server (port 80) with their socket addresses]
Client socket address: 128.2.194.242:51213
Proxy server socket address: 128.2.194.34:15213
Proxy client socket address: 128.2.194.34:52943
Server socket address: 208.216.181.15:80
¢ Step 1: Sequential Proxy
§ Works great for simple text pages with embedded styles
¢ Step 2: Concurrent Proxy
§ multi-threading
¢ Step 3: Cache Web Objects
§ Cache individual objects, not the whole page
§ Use an LRU eviction policy
§ Your caching system must allow for concurrent reads while maintaining consistency. Concurrency? Shared Resource?
Proxy – Testing & Grading
¢ New: Autograder
§ ./driver.sh will run the same tests as Autolab:
– Ability to pull basic web pages from a server
– Handle a (concurrent) request while another request is still pending
– Fetch a web page again from your cache after the server has been stopped
§ This should help answer the question “is this what my proxy is supposed to do?”
§ Please don’t use this grader to definitively test your proxy; there are many things not tested here
Proxy – Testing & Grading
¢ Test your proxy liberally
§ The web is full of special cases that want to break your proxy
§ Generate a port for yourself with ./port-for-user.pl [andrewid]
§ Generate more ports for web servers and such with ./free-port.sh
§ Consider using your andrew web space (~/www) to host test files
– You have to visit https://www.andrew.cmu.edu/server/publish.html to publish your folder to the public server
¢ Create a handin file with make handin
§ Will create a tar file for you with the contents of your proxylab-handin folder
String manipulation in C
¢ sscanf: Read input in specific format
int sscanf(const char *str, const char *format, ...);
buf = "213 is awesome"
// Read an integer and two strings separated by white space from buffer 'buf'
// into the passed variables
ret = sscanf(buf, "%d %s %s", &course, str1, str2);
This results in:
course = 213, str1 = is, str2 = awesome, ret = 3
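Applied to the proxy, the same sscanf pattern can split an HTTP request line into its three fields. The sketch below is one way to do it; the function name and buffer sizes are assumptions chosen for illustration. Note the field widths in the format string, which keep %s from overflowing the destination buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define METHOD_MAX 16
#define URI_MAX 2048
#define VERSION_MAX 16

/* Split "METHOD URI VERSION" into caller-provided buffers of sizes
 * METHOD_MAX, URI_MAX, and VERSION_MAX.
 * Returns 0 if all three fields were found, -1 otherwise. */
int parse_request_line(const char *line, char *method, char *uri,
                       char *version)
{
    assert(line && method && uri && version);
    /* Each width is one less than its buffer size, leaving room
     * for the terminating '\0'. */
    if (sscanf(line, "%15s %2047s %15s", method, uri, version) != 3)
        return -1;
    return 0;
}
```

Because %s stops at white space, the trailing "\r\n" of a request line is not copied into the version field. Always check the return value: a malformed line yields fewer than three conversions.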
String manipulation (cont)
¢ sprintf: Write input into buffer in specific format
int sprintf(char *str, const char *format, ...);
str = "213 is awesome"
// Build the string in double quotes ("") using the passed arguments
// and write to buffer 'buf' (strlen returns size_t, hence %zu)
sprintf(buf, "String (%s) is of length %zu", str, strlen(str));
This results in:
buf = String (213 is awesome) is of length 14
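sprintf does not know how large the destination buffer is, so for request building the bounded variant snprintf is usually the safer choice. A minimal sketch, assuming a hypothetical helper build_request and a bare-bones HTTP/1.0 request with only a Host header (the write-up may require more headers):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a minimal HTTP/1.0 request into buf.
 * Returns the request length, or -1 if it would not fit in buflen. */
int build_request(char *buf, size_t buflen, const char *host,
                  const char *path)
{
    int n;

    assert(buf && host && path);
    n = snprintf(buf, buflen,
                 "GET %s HTTP/1.0\r\n"
                 "Host: %s\r\n"
                 "\r\n",
                 path, host);
    if (n < 0 || (size_t)n >= buflen)
        return -1;  /* encoding error, or output was truncated */
    return n;
}
```

snprintf returns the length the full string would have had, so comparing that value against the buffer size is how truncation is detected.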
String manipulation (cont)
Other useful string manipulation functions:
¢ strcmp, strncmp, strncasecmp
¢ strcpy, strncpy
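HTTP header names are case-insensitive, which is where strncasecmp earns its place. The sketch below checks whether a header line carries a given name; header_is is a hypothetical helper, not something from the book.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* strncasecmp is declared here on POSIX systems */

/* Return 1 if line starts with "name:" (ignoring case), else 0. */
int header_is(const char *line, const char *name)
{
    size_t n;

    assert(line && name);
    n = strlen(name);
    /* Compare only the first n characters, then require the colon so
     * that "Hostname:" does not match "Host". */
    return strncasecmp(line, name, n) == 0 && line[n] == ':';
}
```

For example, header_is("host: www.cmu.edu\r\n", "Host") is 1, while header_is("Hostname: x\r\n", "Host") is 0. When copying header values, remember that strncpy does not add a terminating '\0' if the source is at least as long as the limit.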
Aside: Setting up Firefox to use a proxy
¢ You may use any browser, but we’ll be grading with Firefox
¢ Preferences > Advanced > Network > Settings… (under Connection)
¢ Check “Use this proxy for all protocols” or your proxy will appear to work for HTTPS traffic (those requests would bypass it).
Acknowledgements
¢ Slides derived from recitation slides of last 2 years by
§ Shiva
§ Hartaj Singh Dugal