This assessment is an individually assessed assignment. You are going to analyse
the customer churn problem of telephone service companies. Customer churn is
the phenomenon where customers of a business no longer purchase or interact
with the business. A high churn rate therefore means that a large number of
customers no longer want to purchase goods and services from the business.
Telephone service companies often have customer service branches which
attempt to win back defecting clients, because recovered long-term customers
can be worth much more to a company than newly recruited clients.
The data related to this assessment can be downloaded from Moodle. You need to write a report that discusses
how you completed the tasks (see the Report Guidance below) and goes into sufficient depth to
demonstrate knowledge and critical understanding of the relevant processes involved. 100% of the available
marks are awarded for the written report, with clear and separate marking criteria for each
required report section. Notably, a distinct and significant report section that discusses and critiques the
analysis and implementation processes you carried out for your data solution is required.
Report Guidance
Your report must conform to the structure below and include the required content as described.
Information on the specific marking criteria for each section is available in the accompanying Criterion Reference
Grid (CRG) document. You must supply a written report containing four distinct sections that provide a
full and reflective account of the processes undertaken.
Section I: Data Loading, Pre-Processing and Summary (15%)
As a first step, you need to download the data from Moodle. There are two datasets, Dataset_01.csv and
Dataset_02.csv, which together form the data you are going to use to train and test your prediction model for customer churn.
The variables in both datasets are briefly explained as follows:
Dataset_01.csv & Dataset_02.csv
Variable                 Description
User_ID                  Customer ID
User_Gender              Female or male
Is_Senior                Whether the customer is a senior citizen (Yes, No)
Has_Partner              Whether the customer has a partner or not (Yes, No)
Has_Children             Whether the customer has children or not (Yes, No)
Usage_Length             Number of months the customer has stayed with the company
Has_Phone_Service        Whether the customer has a phone service or not (Yes, No)
Multiple_Lines           Whether the customer has multiple lines or not (Yes, No, No phone service)
Intnet_Provider          Customer’s Internet service provider (DSL, Fiber optic, No)
Has_Security_Service     Whether the customer has online security or not (Yes, No, No internet service)
Has_Online_Backup        Whether the customer has online backup or not (Yes, No, No internet service)
Has_Device_Protection    Whether the customer has device protection or not (Yes, No, No internet service)
Has_Tech_Support         Whether the customer has tech support or not (Yes, No, No internet service)
Has_Steam_TV             Whether the customer has streaming TV or not (Yes, No, No internet service)
Has_Steam_Movies         Whether the customer has streaming movies or not (Yes, No, No internet service)
Contract_Type            The contract term of the customer (Month-to-month, One year, Two year)
Has_Paperless_Billing    Whether the customer has paperless billing or not (Yes, No)
Payment_Method           The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))
Monthly_Fee              The amount charged to the customer monthly
Total_Fee                The total amount charged to the customer
Attrition                Whether the customer churned or not (Yes or No)
You need to upload Dataset_01.csv and Dataset_02.csv to Microsoft Azure Machine Learning (ML) Studio
and merge them into a single file, Data.csv. You need to provide a screenshot of this step (5%).
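As a way to sanity-check the merged file outside Azure ML Studio, the merge can be sketched in a few lines of Python. This is only an illustration under two assumptions that the brief does not state explicitly: that "merge" means appending the rows of one file to the other, and that both files have identical column headers. The function name merge_csv_text and the sample rows are hypothetical and are not taken from the real datasets. The marked work must still be carried out and screenshotted in Azure ML Studio.

```python
import csv
import io


def merge_csv_text(first_csv: str, second_csv: str) -> str:
    """Append the rows of two CSV texts that share the same header row."""
    first = list(csv.reader(io.StringIO(first_csv)))
    second = list(csv.reader(io.StringIO(second_csv)))
    # Assumption: a row-wise append only makes sense if the columns match.
    if first[0] != second[0]:
        raise ValueError("the two datasets do not have the same columns")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(first)       # header plus rows of the first dataset
    writer.writerows(second[1:])  # rows of the second dataset, header skipped
    return out.getvalue()


# Made-up rows for illustration only; not taken from the real datasets.
part_1 = "User_ID,Usage_Length,Attrition\nA001,5,Yes\nA002,40,No\n"
part_2 = "User_ID,Usage_Length,Attrition\nA003,12,No\n"
merged = merge_csv_text(part_1, part_2)
print(merged)
```

A quick check after any merge is that the row count of Data.csv equals the sum of the row counts of the two source files.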
Is any feature column in the data useless for your analysis? Explain why you think it is
useless and use Azure ML Studio to remove this feature from Data.csv. Please provide screenshots of this
step (5%).
How many numeric features exist in Data.csv and what are they? Provide a description table for these
numeric features that contains the following values (5%):
Variable Name    Mean    Median    Min    Max    Standard Deviation    Number of Unique Values
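The columns of the description table can be cross-checked with a short Python sketch. It assumes the values of one numeric feature have already been read into a list, and it uses the sample standard deviation (dividing by n - 1); a tool that reports the population standard deviation will give a slightly different figure. The helper name describe and the four fee values are hypothetical and purely illustrative.

```python
import statistics


def describe(name, values):
    """Return one row of the description table for a numeric feature."""
    return {
        "Variable Name": name,
        "Mean": statistics.mean(values),
        "Median": statistics.median(values),
        "Min": min(values),
        "Max": max(values),
        # Sample standard deviation (n - 1 in the denominator).
        "Standard Deviation": statistics.stdev(values),
        "Number of Unique Values": len(set(values)),
    }


# Made-up values for illustration only; not taken from Data.csv.
monthly_fee = [20.0, 50.0, 50.0, 80.0]
row = describe("Monthly_Fee", monthly_fee)
for column, value in row.items():
    print(f"{column}: {value}")
```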
Section II: Preliminary Analysis (15%)
Use a Microsoft Excel PivotTable to verify whether each of the following statements is correct:
1. Female users with children tend to have a lower monthly fee than female users without children (3%).
2. New users (i.e. with a usage length of less than 12 months) are more likely to churn than longer-standing users (3%).
3. Senior users are less likely to use the Internet service than other users (3%).
4. Senior users tend to have a lower monthly fee than other users (3%).
5. Users who pay the fee via credit card have a higher average monthly fee (3%).
You should give clear evidence from your data analysis to support each answer.
Example
Statement: Users with partners tend to have a higher monthly fee.
Example answer: Yes, as the table below shows, users who have a partner have a higher average monthly
fee compared to users without partners.
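A PivotTable that averages one column for each value of another is a group-and-average calculation, which can be sketched in Python as a cross-check. The sketch below mirrors the example statement about partners; the function name average_by_group and the four rows are hypothetical, so the output says nothing about what the real data shows.

```python
from collections import defaultdict


def average_by_group(rows, group_key, value_key):
    """Average value_key for each distinct value of group_key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += float(row[value_key])
        counts[row[group_key]] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Made-up rows for illustration only; not taken from Data.csv.
rows = [
    {"Has_Partner": "Yes", "Monthly_Fee": "70"},
    {"Has_Partner": "Yes", "Monthly_Fee": "90"},
    {"Has_Partner": "No", "Monthly_Fee": "40"},
    {"Has_Partner": "No", "Monthly_Fee": "60"},
]
averages = average_by_group(rows, "Has_Partner", "Monthly_Fee")
print(averages)
```

In Excel the equivalent layout would place the grouping variable in Rows and the averaged variable in Values with the summary set to Average.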
Section III: Data Visualisation (10%)
You need to use either Microsoft Excel or Azure ML Studio to visualise Data.csv. You need to put the resulting
graphs in your report.
1. Use a pie chart to show the composition of customer contract types (i.e. the percentage of
users who choose each contract type) (2%).
2. Use a histogram to display the distribution of customer usage length (3%).
3. Use a bar chart or area chart to show the rates of churned and non-churned users across different usage lengths (5%).
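The numbers behind the third chart can be prepared by grouping customers into usage-length bands and computing the share that churned in each band. The sketch below is one possible way to do this and rests on assumptions: the 12-month band width is an arbitrary choice rather than a requirement of the brief, the function name churn_rate_by_band is hypothetical, and the rows are made up for illustration.

```python
from collections import Counter


def churn_rate_by_band(rows, band_width=12):
    """Share of churned customers per usage-length band (band start in months)."""
    totals = Counter()
    churned = Counter()
    for row in rows:
        # Integer division maps e.g. 0-11 months to band 0, 12-23 to band 12.
        band_start = (int(row["Usage_Length"]) // band_width) * band_width
        totals[band_start] += 1
        if row["Attrition"] == "Yes":
            churned[band_start] += 1
    return {band: churned[band] / totals[band] for band in sorted(totals)}


# Made-up rows for illustration only; not taken from Data.csv.
rows = [
    {"Usage_Length": "3", "Attrition": "Yes"},
    {"Usage_Length": "10", "Attrition": "No"},
    {"Usage_Length": "15", "Attrition": "No"},
    {"Usage_Length": "30", "Attrition": "Yes"},
    {"Usage_Length": "35", "Attrition": "No"},
]
rates = churn_rate_by_band(rows)
print(rates)
```

The non-churn rate for each band is one minus the churn rate, so both series for the chart come from the same table.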
Section IV: Customer Churn Prediction (60%)
You need to use Azure ML Studio to develop (i.e., train and test) two data-driven models to predict
customer churn. The first model is a logistic regression and the second model is a neural network. The target
variable is Attrition and the evaluation metric is accuracy. You need to answer the following questions:
1. Can you describe and explain both models (i.e., methodology) in a short paragraph (5% for each model)?
2. Can you describe and explain the key stages of the process you followed to build both models (5%
for each model)?
3. Can you discuss in detail and summarise the marketing insights provided by both models (40%)? For
example, which variables/features are important in prediction? What types of customers are more likely
to churn? What do you think might be the reasons behind the findings? Your analysis needs to be
reasonable, and you can include marketing theories or evidence to justify your statements along
with the empirical findings from the provided data.
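To make the evaluation metric concrete, the sketch below shows how a logistic regression turns a weighted sum of features into a churn probability and how accuracy is then computed as the share of correct predictions. It is a minimal illustration, not the Azure ML Studio implementation: the weights, bias, 0.5 threshold, choice of two features, and the four customers are all hypothetical and were not fitted to the provided data.

```python
import math


def logistic(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(weights, bias, features, threshold=0.5):
    """Predict 'Yes' (churn) when the modelled probability reaches the threshold."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return "Yes" if logistic(z) >= threshold else "No"


def accuracy(actual, predicted):
    """Share of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical weights for [Usage_Length, Monthly_Fee]; not fitted to real data.
weights = [-0.05, 0.02]
bias = 0.0
# Made-up customers and labels for illustration only.
customers = [[2, 90], [60, 30], [10, 70], [48, 100]]
actual = ["Yes", "No", "Yes", "Yes"]
predicted = [predict_label(weights, bias, c) for c in customers]
score = accuracy(actual, predicted)
print(predicted, score)
```

In this toy setup a negative weight on Usage_Length lowers the predicted churn probability for long-standing customers. The signs and sizes of the fitted weights in your own model are what support the discussion of feature importance.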