This assignment will reinforce your understanding of the MapReduce
algorithm by requiring you to trace its operation on a specific example.
We have a set of 5 documents containing HTML links to other documents. For example, a document, Doc1, may contain the word ‘manufacturer’ as a hyperlink, www.apple.com, to the landing page of Apple, Inc. (the company). In this case we regard Doc1 as the source, the word ‘manufacturer’ as the anchor in Doc1, and the landing page (which we will call Apple Landing Page) as the target. So, we can represent this information as
Suppose you are given 5 documents which may contain hyperlinks to some targets. The MapReduce algorithm is used to compile a list of
Show the following:
• a.) Information stored on nodes n1, n2, n3 after Map has been executed
• b.) Information stored on nodes n4, n5 after Reduce has been executed
• c.) The final output
Also provide a diagram showing which information each node sends to which other node.
Submit a text file or a Word file containing the above information. Clearly label each node and the information it contains.
You may assume that the 5 documents contain these and only these links on these anchors:
Doc1: Named Doc1, contains anchor ‘Columbia University’ which points to Columbia University Landing Page, anchor ‘SPS’ which points to the SPS Landing Page, anchor ‘NYU’ which points to NYU Landing Page, and ‘Columbia’ which points to Columbia University Landing Page
Doc2: Named Doc2, contains ‘Ivy League school’ which points to Columbia University Landing Page, ‘Apple’ which points to Apple Landing Page
Doc3: SPS Landing Page (hence, named ‘SPS Landing Page’), contains the anchor ‘the university’ which points to Columbia University Landing Page, and the anchor ‘APAN’ which points to the Applied Analytics Program page
Doc4: Named Doc4, contains ‘the university’ which points to NYU Landing Page
Doc5: Named Doc5, contains ‘iOS’ which points to Apple Landing Page, and ‘windows’ which points to Microsoft Landing Page.
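
As a starting point for the trace, the links listed above can be written as (source, anchor, target) triples and passed through a toy, single-process MapReduce. The sketch below is illustrative only. It assumes the job collects, for each target, the anchors that point to it; that is a guess, since the handout's description of the list to compile is incomplete. The names map_link, shuffle, reduce_anchors and run are hypothetical, and the sketch does not model how work is split across nodes n1 to n5.

```python
from collections import defaultdict

# (source, anchor, target) triples transcribed from the document list above.
LINKS = [
    ("Doc1", "Columbia University", "Columbia University Landing Page"),
    ("Doc1", "SPS", "SPS Landing Page"),
    ("Doc1", "NYU", "NYU Landing Page"),
    ("Doc1", "Columbia", "Columbia University Landing Page"),
    ("Doc2", "Ivy League school", "Columbia University Landing Page"),
    ("Doc2", "Apple", "Apple Landing Page"),
    ("SPS Landing Page", "the university", "Columbia University Landing Page"),
    ("SPS Landing Page", "APAN", "Applied Analytics Program page"),
    ("Doc4", "the university", "NYU Landing Page"),
    ("Doc5", "iOS", "Apple Landing Page"),
    ("Doc5", "windows", "Microsoft Landing Page"),
]


def map_link(source, anchor, target):
    """Map step (assumed): emit one (key, value) pair keyed by the target."""
    yield target, anchor


def shuffle(pairs):
    """Group mapped values by key, as the framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_anchors(target, anchors):
    """Reduce step (assumed): deduplicate and sort the anchors for one target."""
    return target, sorted(set(anchors))


def run(links):
    """Run Map, shuffle, and Reduce in sequence within a single process."""
    pairs = [pair for link in links for pair in map_link(*link)]
    return dict(reduce_anchors(key, values) for key, values in shuffle(pairs).items())


result = run(LINKS)
for target, anchors in sorted(result.items()):
    print(target, "<-", anchors)
```

To trace the assignment by hand, you would additionally decide which documents each Map node reads and which keys each Reduce node receives; the sketch deliberately leaves that partitioning out.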