MIS772 – Predictive Analytics – Trimester 1 2022 Assignment 2 – Individual Report (Analytical): Create advanced predictive models for a business
Description
The purpose of this assignment is to develop your ability to (i) analyse patterns in a business dataset utilising descriptive data mining concepts, and (ii) develop advanced predictive models to address questions relevant to a particular business.
The business context for this assignment is the international tourism sector, focusing on providers of tourist accommodation. Organisations such as AirBnB provide a digital platform that tourists can use to rent properties in particular locations around the world. The properties are owned by private individuals (property hosts), and AirBnB takes a commission on bookings made via its digital platform.
AirBnB has approached you again to develop one or more RapidMiner processes capable of analysing and predicting customer feedback about stays at Melbourne AirBnB rental properties. AirBnB has provided a sample dataset of approximately 1,000 rental listings and 100,000 associated customer reviews, which can be downloaded from the unit website.
The provided dataset (A2-AirBNB-Melbourne-dataset.zip) has been partially cleaned up and includes a variety of numerical, nominal and text attributes, and descriptions of these attributes.
AirBnB is also providing you with lists of commonly used positive and negative words to be used in your analysis. These lists can also be downloaded from the unit website.
AirBnB would like you to use RapidMiner to address the following tasks:
Task A: Develop a process model to determine if a significant correlation exists in the dataset between:
• the raw sentiment score (calculated as total positive words – total negative words) in all customer review comments of a property, and
• each property’s review score rating.
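The calculation behind Task A can be illustrated outside RapidMiner. The following is a minimal Python sketch intended only to show the logic (the assignment itself must be completed in RapidMiner). The word lists, review texts, ratings and the function names `raw_sentiment` and `pearson` are all made-up placeholders, not part of the provided dataset.

```python
import math
import re

# Hypothetical miniature word lists; the real lists come from the unit website.
POSITIVE = {"clean", "great", "lovely", "friendly"}
NEGATIVE = {"dirty", "noisy", "rude", "broken"}

def raw_sentiment(comments):
    """Total positive words minus total negative words across all comments."""
    score = 0
    for comment in comments:
        for word in re.findall(r"[a-z']+", comment.lower()):
            if word in POSITIVE:
                score += 1
            elif word in NEGATIVE:
                score -= 1
    return score

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up example: reviews grouped per property, with a made-up rating each.
reviews_by_property = {
    "p1": ["Great place, very clean", "Lovely and friendly host"],
    "p2": ["Noisy street but clean room"],
    "p3": ["Dirty bathroom, rude host, broken heater"],
}
ratings = {"p1": 98, "p2": 85, "p3": 60}

ids = sorted(reviews_by_property)
scores = [raw_sentiment(reviews_by_property[i]) for i in ids]
r = pearson(scores, [ratings[i] for i in ids])
print(scores, round(r, 3))
```

The sketch computes only the correlation coefficient. Judging whether a correlation is significant additionally requires a significance test (for example, a p-value), which is not shown here.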
Task B: Develop a predictive model to estimate the review score ratings of properties located in the Melbourne Central Business District (CBD), using relevant predictor attributes in the data set. Use the following ranges of longitude and latitude to identify CBD properties:
• Longitude > 144.9 and < 145.06
• Latitude > -37.95 and < -37.75
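The CBD filter above amounts to a strict bounding-box test on each listing. As a rough illustration of that logic only (the filtering itself must be done in RapidMiner), here is a small Python sketch. The attribute names `longitude` and `latitude`, the helper `in_cbd` and the sample rows are assumptions for the example.

```python
def in_cbd(longitude, latitude):
    """True when a property lies strictly inside the CBD bounding box."""
    return 144.9 < longitude < 145.06 and -37.95 < latitude < -37.75

# Made-up listings used only to exercise the filter.
listings = [
    {"id": 1, "longitude": 144.96, "latitude": -37.81},  # inside the box
    {"id": 2, "longitude": 145.20, "latitude": -37.80},  # too far east
    {"id": 3, "longitude": 144.95, "latitude": -38.00},  # too far south
    {"id": 4, "longitude": 144.90, "latitude": -37.80},  # on the boundary, so excluded
]

cbd = [row for row in listings if in_cbd(row["longitude"], row["latitude"])]
cbd_ids = [row["id"] for row in cbd]
print(cbd_ids)
```

Because the ranges are given with `>` and `<`, a property sitting exactly on a boundary value is treated as outside the CBD in this sketch.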
Task C: Identify the most appropriate number of clusters of CBD properties (using an appropriate performance metric) for the following subset of review score attributes: accuracy, checkin, cleanliness, communication, location and value. You must consider this subset together (i.e., not one attribute at a time).
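One way to think about Task C is to cluster the six review scores jointly for several candidate values of k and compare a cluster-quality metric across them. The Python sketch below illustrates that idea only; it is not the required RapidMiner process. It assumes a plain k-means with a seeded first centroid and farthest-point initialisation, and it uses the mean silhouette coefficient as the metric, which is just one possible choice and is not prescribed by the assignment. The data points are invented.

```python
import math
import random

def kmeans(points, k, seed=2022, iters=20):
    """Lloyd's k-means with a seeded first centroid and farthest-point init."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    total = 0.0
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            continue  # singleton cluster (or a single cluster overall) scores 0
        a = sum(own) / len(own)                            # mean distance within own cluster
        b = min(sum(d) / len(d) for d in dists.values())   # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)

# Invented scores in the order: accuracy, checkin, cleanliness,
# communication, location, value. All six are clustered together.
points = [
    (10, 10, 10, 10, 10, 10), (9, 10, 10, 10, 10, 10), (10, 10, 10, 10, 10, 9),
    (6, 6, 6, 10, 10, 10), (7, 6, 6, 10, 10, 10), (6, 6, 6, 10, 10, 9),
    (10, 10, 10, 6, 6, 6), (10, 10, 10, 7, 6, 6), (9, 10, 10, 6, 6, 6),
]

scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

The fixed seed of 2022 mirrors the assignment's requirement for a local random seed, so repeated runs of the sketch pick the same starting centroid.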
The dataset, word lists and associated notes for this assignment are available on CloudDeakin.
Specific Requirements
You must set and use a local random seed of 2022 in your processes.
You must use the submission template for the assignment provided on Cloud Deakin for your report. Your final report must adhere to the page limits, as only pages within the limits will be marked. It is essential that the executive summary section of your report is targeted at a non-technical reader (e.g., a senior manager at AirBnB) and that the remaining parts of the report target a data/business analyst.
Your final deliverables must include:
i) the final report according to the submission template (as a PDF file)
ii) all RapidMiner files (in the RMP format) combined as a single ZIP file.
All partial and final submissions will need to be lodged via the Cloud Deakin dropbox before the deadline.
Please also read the accompanying notes for this assignment on Cloud Deakin in terms of submission format expectations.
You must use RapidMiner for this assignment. The use of Excel or any other analytics tool is therefore not permitted.
The consistency of your RapidMiner file(s) will be checked against the results in your report. You must not modify the data file provided for this assignment before importing it into RapidMiner.
This is an individual assignment. Each student must work on their own assignment individually and submit their work individually. Do NOT collaborate or discuss your work with other students. Please refer to the Deakin policy on Academic Integrity in this regard.
Marking and feedback
The marking rubric for this assignment is available on the Cloud Deakin unit site - in the Assessment folder (under Assessment Resources).
It is always a useful exercise to familiarise yourself with the criteria before completing any assessment task. Criteria act as a boundary around the task and help identify what assessors are looking for specifically in your submission. The criteria are drawn from the unit’s learning outcomes ensuring they align with appropriate graduate attribute/s.
Identifying the standard you aim to achieve is also a useful strategy for success and to that end, familiarising yourself with the descriptor for that standard is highly recommended.
Students who submit their work by the due date will receive their marks and feedback on Cloud Deakin 15 working days after the submission date.
Extensions
There will be no extensions granted unless there are exceptional and most unusual circumstances outside the student’s control.
Students who require a time extension should submit a written request to the Unit Chair, supported with documentation (for example, medical certificate). Such requests should be e-mailed to the Unit Chair.
Requests for extensions will not be considered if they are made at short notice or without evidence that partial submissions were lodged on time.
Late submission
The following marking penalties will apply if you submit an assessment task after the due date without an approved extension: 5% will be deducted from available marks for each day up to five days, and work that is submitted more than five days after the due date will not be marked and will receive 0% for the task.
'Day' means working day for paper submissions and calendar day for electronic submissions. The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the task after the due date.
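The late penalty rule can be expressed as a small calculation. The Python sketch below is one reading of the rule, assuming the 5% per day is taken from the total available marks and subtracted from the mark the work would otherwise have earned. The function name `late_mark` and the example figures are hypothetical.

```python
def late_mark(raw_mark, available_marks, days_late):
    """Mark after the late penalty, under the assumptions stated above."""
    if days_late <= 0:
        return raw_mark  # submitted on time: no penalty
    if days_late > 5:
        return 0.0       # more than five days late: not marked
    penalty = available_marks * 5 * days_late / 100
    return max(0.0, raw_mark - penalty)

# Hypothetical example: a task worth 40 marks, awarded 32, submitted 2 days late.
print(late_mark(32, 40, 2))
```

For electronic submissions, `days_late` would be counted in calendar days, as defined above.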
The Division of Student Life (see link below) provides all students with editing assistance. Students who wish to take advantage of this service must plan ahead and contact the Division of Student Life to schedule a booking well in advance of the due date of this assignment. http://www.deakin.edu.au/about-deakin/administrative-divisions/student-life
Referencing
Any material used in this assignment that is not your original work must be acknowledged as such and appropriately referenced. You can find information about plagiarism and other study support resources at the following website: https://www.deakin.edu.au/students/studying
Student Integrity
For information about academic integrity refer to:
https://www.deakin.edu.au/students/studying/academic-integrity