HW 5: Map Reduce
MapReduce is a programming model for scalable, highly parallel data processing. It abstracts away the complexities of developing a fault-tolerant distributed system by exposing a simple API in which the user specifies two functions:
• map: produces a set of key/value pairs from the input data
• reduce: combines values corresponding to the same key
With these functions, tasks can be automatically parallelized and executed on a cluster. This paradigm is particularly powerful because it allows programmers with little background in distributed systems to write parallelizable code for a wide variety of real-world tasks.
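To make the two functions concrete, here is a minimal word-count sketch that runs in a single process. The names `KeyValue`, `map`, `reduce`, and `run` are assumptions made for illustration and are not the starter code's API. The `run` driver only stands in for the cluster: it groups the map output by key and calls `reduce` once per key.

```rust
use std::collections::BTreeMap;

/// A key/value pair emitted by `map` (hypothetical type for this sketch).
type KeyValue = (String, u64);

/// map: emit (word, 1) for every whitespace-separated word in the input.
fn map(input: &str) -> Vec<KeyValue> {
    input
        .split_whitespace()
        .map(|w| (w.to_lowercase(), 1))
        .collect()
}

/// reduce: combine all values that share a key, here by summing them.
fn reduce(_key: &str, values: &[u64]) -> u64 {
    values.iter().sum()
}

/// Single-process stand-in for the cluster: group map output by key,
/// then call reduce once per key.
fn run(inputs: &[&str]) -> BTreeMap<String, u64> {
    let mut groups: BTreeMap<String, Vec<u64>> = BTreeMap::new();
    for input in inputs {
        for (k, v) in map(input) {
            groups.entry(k).or_default().push(v);
        }
    }
    groups
        .into_iter()
        .map(|(k, vs)| {
            let total = reduce(&k, &vs);
            (k, total)
        })
        .collect()
}

fn main() {
    let counts = run(&["the quick fox", "the lazy dog"]);
    for (word, n) in &counts {
        println!("{word}: {n}");
    }
}
```

In a real deployment, each input would be handled by a separate map task, and the grouping step would happen across machines. The user-supplied `map` and `reduce` would stay the same.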
In this assignment, which is loosely based on a lab from MIT, you will implement your own fault-tolerant MapReduce system in Rust. Specifically, you will implement worker processes as well as a coordinator process that distributes tasks to the workers. You will also handle worker failure by implementing heartbeats and task redistribution. The system you will build is similar in design to the one outlined in the MapReduce paper.
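As a rough illustration of heartbeats and task redistribution, the sketch below tracks the last heartbeat time for each worker and puts a silent worker's tasks back on the pending queue. Everything here (`Coordinator`, `heartbeat`, `assign`, `reap`, the ID types, and the timeout value) is hypothetical and is not the design the assignment requires. The current time is passed in explicitly so that the logic is easy to test.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

// Hypothetical identifier types for this sketch.
type WorkerId = u32;
type TaskId = u32;

struct Coordinator {
    timeout: Duration,
    last_heartbeat: HashMap<WorkerId, Instant>,
    assigned: HashMap<WorkerId, Vec<TaskId>>,
    pending: VecDeque<TaskId>,
}

impl Coordinator {
    fn new(timeout: Duration, tasks: impl IntoIterator<Item = TaskId>) -> Self {
        Coordinator {
            timeout,
            last_heartbeat: HashMap::new(),
            assigned: HashMap::new(),
            pending: tasks.into_iter().collect(),
        }
    }

    /// Record that `worker` was alive at time `now`.
    fn heartbeat(&mut self, worker: WorkerId, now: Instant) {
        self.last_heartbeat.insert(worker, now);
    }

    /// Hand the next pending task (if any) to `worker`.
    fn assign(&mut self, worker: WorkerId) -> Option<TaskId> {
        let task = self.pending.pop_front()?;
        self.assigned.entry(worker).or_default().push(task);
        Some(task)
    }

    /// Drop workers whose last heartbeat is older than the timeout and
    /// return their tasks to the pending queue. Returns the failed workers.
    fn reap(&mut self, now: Instant) -> Vec<WorkerId> {
        let mut dead = Vec::new();
        for (&id, &seen) in &self.last_heartbeat {
            if now.saturating_duration_since(seen) > self.timeout {
                dead.push(id);
            }
        }
        dead.sort();
        for id in &dead {
            self.last_heartbeat.remove(id);
            if let Some(tasks) = self.assigned.remove(id) {
                self.pending.extend(tasks);
            }
        }
        dead
    }
}

fn main() {
    let t0 = Instant::now();
    let mut c = Coordinator::new(Duration::from_secs(5), [1, 2]);
    c.heartbeat(7, t0);
    c.heartbeat(8, t0);
    let _ = c.assign(7);
    let _ = c.assign(8);
    // Only worker 8 keeps sending heartbeats.
    c.heartbeat(8, t0 + Duration::from_secs(4));
    let dead = c.reap(t0 + Duration::from_secs(6));
    println!("failed workers: {dead:?}, pending tasks: {:?}", c.pending);
}
```

A real coordinator would receive heartbeats over the network and run the reaping step periodically. The bookkeeping idea, a timestamp per worker plus a queue of tasks to hand out again, carries over.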
Note: In this assignment, we use the term “coordinator” rather than “master”.
Components
This assignment is split up into three parts with distinct deadlines to help you space out the workload. Deadlines and grading details can be found on Piazza.
Design brief (extra credit)
If you complete a short write-up about your data structures by the deadline posted on Piazza, we will give you 5% extra credit on this homework. Please fill out this template and submit it to Gradescope. You will get the full extra credit as long as your design demonstrates effort and does not have obvious issues.
Checkpoint
You are expected to complete the first part of the assignment (up to and including the Tasks section) by an earlier deadline. This checkpoint is worth 25% of your grade on the assignment, so if you miss the checkpoint, you can still earn up to 75% on the assignment by passing all the tests by the final deadline.
The final component consists of the rest of the tasks (through Fault tolerance).
Getting started
We strongly recommend doing this assignment locally, but you can also do it from your VM. Begin by pulling the starter code:
cd ~/code/personal
git pull staff master
cd hw-map-reduce
Regardless of whether you are doing the lab locally or on your VM, you will need to install CMake by following the directions here. Make sure you install a relatively new version of CMake (at least 3.20.0) to ensure that you don’t run into any issues. To install CMake on your VM, run the following commands:
CMAKE_VERSION=3.23.2
wget https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/cmake-${CMAKE_VERSION}-Linux-x86_64 \
  -q -O /tmp/cmake-install.sh
chmod u+x /tmp/cmake-install.sh
sudo mkdir /opt/cmake
sudo /tmp/cmake-install.sh --skip-license --prefix=/opt/cmake
rm /tmp/cmake-install.sh
echo "PATH=/opt/cmake/bin:$PATH" >> ~/.cs162.bashrc
source ~/.cs162.bashrc
To check that CMake is installed correctly, run cmake --version and confirm that it outputs the version you expect.
If Rust is not installed, install it according to the directions here.
Copyright © 2022 CS 162 staff.