This assignment will reinforce your understanding of the MapReduce
algorithm by requiring you to trace the working of the MapReduce
algorithm on a specific example.
We have a set of 5 documents containing HTML links to other documents. For example, a document Doc1 may contain the word ‘manufacturer’ as a hyperlink, www.apple.com, to the landing page of Apple, Inc. (the company). In this case we regard Doc1 as the source, the word ‘manufacturer’ as the anchor in Doc1, and the landing page (which we will call Apple Landing Page) as the target. So, we can represent this information as >. The general format for representing this information is >. Of course, Doc1 may contain another hyperlink to Apple Landing Page on the anchor ‘Apple’. In this case, we can represent the information about these two links in Doc1 as >. And if we had a second document Doc2 which contains a hyperlink to Apple Landing Page on the anchor ‘Apple’, we can represent this information as >. Thus, the format for representing this information is: >
Suppose you are given 5 documents which may contain hyperlinks to some targets. The MapReduce algorithm is used to compile a list of >. Suppose you have 5 processing nodes (n1, n2, n3, n4, n5) on which to execute MapReduce for this task. Nodes n1, n2, n3 are used for Map and nodes n4, n5 are used for Reduce. Node n1 processes Doc1 and Doc2; node n2 processes Doc3 and Doc4; and node n3 processes Doc5. You decide which keys are assigned to nodes n4 and n5 (during the intermediate stage between the Map stage and the Reduce stage).
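The Map and Reduce steps described above can be sketched in a few lines of Python, using only the Apple example from the text. This is a minimal illustration, not the required answer format: the record shape (a target mapped to a list of (source, anchor) pairs) is an assumption, and the names `map_links`, `reduce_links`, and `docs` are hypothetical.

```python
from collections import defaultdict

def map_links(source, links):
    """Map: emit one (target, (source, anchor)) pair per hyperlink in a document."""
    return [(target, (source, anchor)) for anchor, target in links]

def reduce_links(target, values):
    """Reduce: collect every (source, anchor) pair that points at one target."""
    return (target, sorted(values))

# Example links taken from the text: each document lists (anchor, target) pairs.
docs = {
    "Doc1": [("manufacturer", "Apple Landing Page"), ("Apple", "Apple Landing Page")],
    "Doc2": [("Apple", "Apple Landing Page")],
}

# Intermediate stage: group the mapped pairs by key (the target).
grouped = defaultdict(list)
for source, links in docs.items():
    for target, value in map_links(source, links):
        grouped[target].append(value)

# Assumed output shape: {target: [(source, anchor), ...]}
index = dict(reduce_links(target, values) for target, values in grouped.items())
```

Here the key is the target, so every link to Apple Landing Page ends up in a single record regardless of which document it came from.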
Show the following:
• a.) Information stored on nodes n1, n2, n3 after Map has been executed
• b.) Information stored on nodes n4, n5 after Reduce has been executed
• c.) The final output
Also provide a diagram showing what information is sent to which node from which node.
Submit a text file or a Word file containing the above information. Clearly label each node and the information it contains.
You may assume that the 5 documents contain these and only these links on these anchors:
Doc1: Named Doc1, contains anchor ‘Columbia University’ which points to Columbia University Landing Page, anchor ‘SPS’ which points to the SPS Landing Page, anchor ‘NYU’ which points to NYU Landing Page, and anchor ‘Columbia’ which points to Columbia University Landing Page.
Doc2: Named Doc2, contains ‘Ivy League school’ which points to Columbia University Landing Page, and ‘Apple’ which points to Apple Landing Page.
Doc3: SPS Landing Page (hence, named ‘SPS Landing Page’), contains the anchor ‘the university’ which points to Columbia University Landing Page, and the anchor ‘APAN’ which points to the Applied Analytics Program page.
Doc4: Named Doc4, contains ‘the university’ which points to NYU Landing Page.
Doc5: Named Doc5, contains ‘iOS’ which points to Apple Landing Page, and ‘windows’ which points to Microsoft Landing Page.
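One way to check a hand trace is to simulate the five nodes in Python. The sketch below is an illustration under stated assumptions, not the expected submission. The Map-side assignment (n1, n2, n3) follows the text. The Reduce-side assignment is a hypothetical rule chosen here (targets sorted alphabetically and dealt alternately to n4 and n5), since that choice is left to you. The output shape, a target mapped to a list of (source, anchor) pairs, is also assumed, and all variable names are hypothetical.

```python
from collections import defaultdict

# Map-side assignment from the text: node -> {source: [(anchor, target), ...]}
MAP_INPUT = {
    "n1": {
        "Doc1": [
            ("Columbia University", "Columbia University Landing Page"),
            ("SPS", "SPS Landing Page"),
            ("NYU", "NYU Landing Page"),
            ("Columbia", "Columbia University Landing Page"),
        ],
        "Doc2": [
            ("Ivy League school", "Columbia University Landing Page"),
            ("Apple", "Apple Landing Page"),
        ],
    },
    "n2": {
        "SPS Landing Page": [  # Doc3 is named 'SPS Landing Page'
            ("the university", "Columbia University Landing Page"),
            ("APAN", "Applied Analytics Program page"),
        ],
        "Doc4": [("the university", "NYU Landing Page")],
    },
    "n3": {
        "Doc5": [
            ("iOS", "Apple Landing Page"),
            ("windows", "Microsoft Landing Page"),
        ],
    },
}

def run_map(node_docs):
    """Emit (target, (source, anchor)) for every link held by one Map node."""
    return [
        (target, (source, anchor))
        for source, links in node_docs.items()
        for anchor, target in links
    ]

# (a) Information on n1, n2, n3 after Map.
map_output = {node: run_map(node_docs) for node, node_docs in MAP_INPUT.items()}

# Hypothetical partitioning rule: alternate the sorted targets between n4 and n5.
targets = sorted({target for pairs in map_output.values() for target, _ in pairs})
reducer_of = {target: ("n4", "n5")[i % 2] for i, target in enumerate(targets)}

# Intermediate stage: route each pair to its reducer and record who sends to whom.
shuffle = defaultdict(lambda: defaultdict(list))
transfers = set()
for node, pairs in map_output.items():
    for target, value in pairs:
        reducer = reducer_of[target]
        shuffle[reducer][target].append(value)
        transfers.add((node, reducer))

# (b) Information on n4, n5 after Reduce.
reduce_output = {
    reducer: {target: sorted(values) for target, values in groups.items()}
    for reducer, groups in shuffle.items()
}

# (c) Final output: the union of the reducers' records.
final_output = {
    target: values
    for reducer in sorted(reduce_output)
    for target, values in reduce_output[reducer].items()
}
```

The `transfers` set lists the (Map node, Reduce node) edges that a diagram would need to show. A different partitioning rule would change which node holds each target, but not the final output.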
