EECS 485 Lecture 19
Scaling Dynamic Pages
John Kloosterman
Learning Objectives
• Describe how load balancing, including sharding and replication, allows dynamic websites to scale
• Identify the major parts of physical server and data center design
• List the advantages and disadvantages of cloud computing
• Differentiate process isolation, virtualization, and containerization and identify the tradeoffs associated with each
Load Balancing
Load balancing
• Website grows too big for one server to support
Round robin DNS
• Multiple IP addresses for one domain name
• DNS server responds to a DNS request with a list of IP addresses
$ host google.com
google.com has address 192.122.185.23
google.com has address 192.122.185.34
google.com has address 192.122.185.53
google.com has address 192.122.185.59
…
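A minimal sketch of the round-robin idea, assuming the DNS server rotates its address list by one position per query so that clients taking the first entry spread across servers. The helper `round_robin_answers` is hypothetical and only uses the four addresses shown above; real DNS servers and resolvers may order or cache answers differently.

```python
# Addresses copied from the `host google.com` output above.
ADDRESSES = [
    "192.122.185.23",
    "192.122.185.34",
    "192.122.185.53",
    "192.122.185.59",
]


def round_robin_answers(addresses):
    """Yield the address list rotated by one position per query.

    Hypothetical helper: models a DNS server that returns the same
    set of addresses each time but changes which one comes first.
    """
    n = len(addresses)
    start = 0
    while True:
        yield addresses[start:] + addresses[:start]
        start = (start + 1) % n


answers = round_robin_answers(ADDRESSES)

# Assume each client simply connects to the first address it receives.
first_choices = [next(answers)[0] for _ in range(6)]
print(first_choices)
```

Six consecutive queries land on the four servers in turn and then wrap around, which is the load-spreading effect the slide describes.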
Synchronizing front end servers
• Problem: users might contact a different front end server on every request
• How do all the front ends (FEs) and databases (DBs) stay in sync?
• FE: only dependent on database state
• Database: hard, hard problem
Sharding by content
• Different users or tables in different DBs
• Downside: hard to keep database consistent
• foreign keys
• need to know which DB to talk to for which thing
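One common way to "know which DB to talk to" is to hash a key such as the username and use the result to pick a shard. The sketch below assumes three shards with made-up names (`db0`, `db1`, `db2`) and a hypothetical `shard_for` function; it is one possible routing scheme, not necessarily the one used in any particular system.

```python
import hashlib

# Hypothetical shard names; a real deployment would hold connection info here.
SHARDS = ["db0", "db1", "db2"]


def shard_for(username: str) -> str:
    """Map a username to one shard, the same shard every time.

    Uses SHA-256 rather than Python's built-in hash(), because hash()
    on strings is randomized per process and would route the same user
    to different shards on different front end servers.
    """
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


# Every front end computes the same answer for the same user.
for user in ["alice", "bob", "carol"]:
    print(user, "->", shard_for(user))
```

The downside from the slide still applies: a row in one shard cannot use an ordinary foreign key to reference a row stored in a different shard.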
Database replication
• Have multiple copies of the entire database
• Downside: all copies need to maintain same state
• Write to one needs to propagate to all the others
• Moments of inconsistency: post shows up for some people and not others
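The moment of inconsistency can be seen in a toy model. This sketch assumes one copy accepts the write first and the others catch up later; the class `ReplicatedDB` and its methods are hypothetical and stand in for a real replication protocol.

```python
class ReplicatedDB:
    """Toy model of several full copies of a database of posts."""

    def __init__(self, num_replicas):
        # Each replica is just a list of posts.
        self.replicas = [[] for _ in range(num_replicas)]
        # Writes that have not yet reached a given replica.
        self.pending = []

    def write(self, post):
        # Replica 0 applies the write immediately; the rest are queued.
        self.replicas[0].append(post)
        for index in range(1, len(self.replicas)):
            self.pending.append((index, post))

    def read(self, replica_index):
        return list(self.replicas[replica_index])

    def propagate(self):
        # Deliver every queued write to its replica.
        for index, post in self.pending:
            self.replicas[index].append(post)
        self.pending.clear()


db = ReplicatedDB(num_replicas=3)
db.write("hello")

# Before propagation: only users who read from replica 0 see the post.
before = [db.read(i) for i in range(3)]

db.propagate()

# After propagation: every replica returns the same state.
after = [db.read(i) for i in range(3)]
print(before, after)
```

Between `write` and `propagate`, a user routed to replica 1 or 2 does not see the post, which matches the "shows up for some people and not others" behavior.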
Say hi to partner
Question 1
• What kind of websites might need:
• more front-end servers than database servers?
• more database servers than front-end servers?
• Some ideas to get you started:
• Insta485
• Flight search like flights.google.com
• Wolverine Access for course registration
Question 2
• Which of sharding or replication would be better for scaling databases that store the following data?
• Edits to Wikipedia articles
• Tweets on Twitter made by different users
• Web link graph for PageRank
Data Centers
Data centers
• Buildings full of computers
• Glass windows in BBB: used to be able to look into research data centers
Racks and racks…
Inside a server
Server Rack
Data center design
Inefficiencies in data centers
• Any money that isn’t directly spent on computers and the electricity to run them
• Power conversion
• Air conditioning
• Utilization
• Servers account for barely half of power
• 1W of cooling per 1.5W of IT load
• Managing energy consumption means, to a large extent, managing heat
Total data center costs
“The Datacenter as a computer”, pg. 97 – partially utilized model
Data center efficiency
• Power Usage Effectiveness (PUE)
• Total Facility Power / IT Equipment Power
• For each watt the facility draws, how much goes to computing?
• 1.0 means no overhead at all: every watt goes to IT equipment
• Facebook reports 1.07 for new Fort Worth data center
• Old Michigan data center is ~2.0, but others on campus are 1.1
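The PUE formula above can be worked through with the figures from these slides. The function names `pue` and `overhead_fraction` are illustrative, and the cooling example assumes cooling is the only overhead, which understates a real facility's losses.

```python
def pue(total_facility_power_w, it_equipment_power_w):
    """Power Usage Effectiveness = Total Facility Power / IT Equipment Power."""
    if it_equipment_power_w <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_w / it_equipment_power_w


def overhead_fraction(pue_value):
    """Fraction of facility power that does NOT reach IT equipment."""
    return 1 - 1 / pue_value


# 1 W of cooling per 1.5 W of IT load, assuming cooling is the only overhead:
# total = 1.5 + 1.0 = 2.5 W, so PUE = 2.5 / 1.5, roughly 1.67.
cooling_only_pue = pue(total_facility_power_w=2.5, it_equipment_power_w=1.5)

# A PUE of 2.0 means half of the facility's power is overhead.
old_overhead = overhead_fraction(2.0)

# A PUE of 1.07 means about 6.5% of the facility's power is overhead.
new_overhead = overhead_fraction(1.07)

print(round(cooling_only_pue, 2), old_overhead, round(new_overhead, 3))
```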
Cloud Computing
Why cloud computing?
• Large tech companies are very good at running data centers
• Google: they feel it is their competitive edge
• Why have your own servers when you can use a large tech company’s?
• Your own servers: you have to buy an entire server, and during low utilization it has no other work to do
• Someone else’s servers: rent 1 today, 100 tomorrow, 1 the day after that
• You are a software expert, not a data center expert
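The "1 today, 100 tomorrow, 1 the day after" pattern can be put into numbers. This sketch counts server-days only and deliberately leaves out prices, which the slides do not give; the function names are illustrative.

```python
# Servers needed on each of three days, taken from the bullet above.
demand = [1, 100, 1]


def owned_server_days(demand):
    """Owning means provisioning for the peak on every day."""
    return max(demand) * len(demand)


def rented_server_days(demand):
    """Renting means paying only for what each day needs."""
    return sum(demand)


owned = owned_server_days(demand)      # 100 servers * 3 days = 300
rented = rented_server_days(demand)    # 1 + 100 + 1 = 102
utilization = rented / owned           # share of owned capacity doing work

print(owned, rented, utilization)
```

With owned hardware sized for the peak, only 34% of the capacity is used over these three days. Whether renting is actually cheaper depends on the per-server price difference, which this sketch does not model.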
Cloud computing
• Rent compute, storage, network from Amazon/Google/Microsoft/others
• Pay based on usage
• # CPUs, GBs of RAM, GB of bandwidth/storage
• Expensive compared to buying servers, but makes servers not your problem
What IS the cloud?
• Three rough “levels”
• Infrastructure-as-a-service, designed for admins
• Machines, storage, network capacity; most of AWS
• Platform-as-a-service, designed for developers
• Heroku, Twilio (GO BLUE)
• Software-as-a-service, designed for users
• Gmail, Google Docs, Salesforce
• In 485 we’re mainly concerned with IaaS, though you will likely encounter PaaS in your future lives
Virtualization
• An AWS EC2 instance, like the ones used for P2 and P3, is not an entire physical server
• Most servers do nothing most of the time
• Insight: use virtual machines to have multiple customers on same HW
Virtualization
Servers designed for virtualization
• 56-core CPUs
Containerization: Docker
Virtualization vs. containerization
Virtualization
• Run multiple OSs on a single physical machine
• Abstraction of hardware
• Replication
• Isolation
• High mem overhead
• Slow startup
• Stateful
Containerization
• Run multiple apps on a single virtual or physical machine
• Abstraction of OS
• Replication
• Isolation
• Low mem overhead
• Fast startup
• Stateless
• Which of these companies would likely benefit from using a cloud computing provider? Which would likely not? Why?
• Zoom video calling service
• A YouTube competitor that requires lots of CPU usage over a long period of time to encode videos
• The University of Michigan hospital to store sensitive patient records
Learning Objectives
• Describe how load balancing, including sharding and replication, allows dynamic websites to scale
• Identify the major parts of physical server and data center design
• List the advantages and disadvantages of cloud computing
• Differentiate process isolation, virtualization, and containerization and identify the tradeoffs associated with each