CS计算机代考程序代写 algorithm python Load Balancing a Consistent Hash Distribution

Load Balancing a Consistent Hash Distribution
Guidelines
This is an open-ended problem, however please do not spend more than 4 hours on a working solution. The code should be easy to understand, and include comments where appropriate. The solution should adhere to the spirit of the algorithm, and you should include a paragraph to explain the solution.
Consistent Hashing
Consistent Hashing is a commonly used technique to distribute data in many scalable systems, both proprietary and open source. The main goal is to avoid large movement of data when a machine or device fails, or new ones are added.
Each storage device has several hash codes associated with it, and all data objects have a key; for example, the hash of its name. To find the correct storage device for an object, you find the next closest device hash. There is a lot of useful information on the internet, e.g. ​http://en.wikipedia.org/wiki/Consistent_hashing​. The reason for having many hashes per device is to spread out rebalancing when a new device is added, or when a failed device removed.
Problem
When storing large objects with normally distributed sizes, it can be the case that some nodes have a disproportionate utilisation relative to others.
The attached example program ignores reliability, but models the use of an Amazon EC2 “hs1.8xlarge” instance being used as a storage server. A static mapping for files (by MD5 hash of the file name) to 20 devices has been generated and used to locate many files. Given a typical normal distribution of file sizes and the initial mapping, some devices contain more data than others. In this example, although the total file size is only 80% of the theoretical capacity, several devices are approaching 95% full.
The task is to modify the mapping to redistribute the data such that all devices are less than 90% full. The amount of data moved should be less than double the simplistic theoretical minimum (just considering bytes, not files). The final usage and total bytes moved should be printed​. Please note that the file names or sizes cannot be changed and your solution must work before you send it over to us. Feel free to make use of the test included in the attachment.
The Example Program
Running the supplied python program for the first time will create and save the initial mapping object and example file data. Rerunning will load the data. This is not intended to be an example of production quality code, and it should be straightforward even though lacking comments.