ETF3500/ETF5500 High Dimensional Data Analysis
Question A
This question replicates part E of the exam.
This question uses simulated data on coffee purchases. By now, you should have access to your data set, which is generated according to your student ID. The data set consists of six variables on coffee purchase transactions.
Three variables are attributes related to the customers:
age: Age of the customer (years).
gender: Gender of the customer.
description: Customer’s description of the coffee that was purchased.
Three variables are attributes related to the type of coffee purchased:
price: Price of the coffee at the time of purchase (dollars).
price_group: Indicates if the coffee was in the expensive, standard or cheap group.
origin: Country where the coffee beans were produced.
Based on this information, answer the questions below. For each question, provide both the answer and the code used to produce it.
1. Create a scatter plot of age versus price and colour the points according to origin.
2. Using only the purchases of Colombian coffee, create a scatter plot of age versus price, using the coffee description as labels.
3. Construct a contingency table between the variables price_group and description.
4. Apply correspondence analysis to this contingency table to assess the dependence between price_group and description. Discuss the results.
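As an illustration of items 3 and 4, the following is a minimal sketch in base R, not a model answer. The real data set is generated per student ID and is not reproduced here, so the data frame `coffee`, the description values ("bitter", "fruity", "smooth") and the sampling probabilities are all hypothetical. The sketch computes the correspondence analysis directly from a singular value decomposition of the standardized residuals rather than through a dedicated package.

```r
# Hypothetical stand-in data; the real data set is generated per student ID.
set.seed(1)
n <- 300
groups <- c("cheap", "standard", "expensive")   # groups named in the question
desc_levels <- c("bitter", "fruity", "smooth")  # assumed description values
probs <- list(cheap     = c(0.6, 0.2, 0.2),     # assumed dependence structure
              standard  = c(0.3, 0.3, 0.4),
              expensive = c(0.1, 0.5, 0.4))
price_group <- sample(groups, n, replace = TRUE)
description <- vapply(price_group,
                      function(g) sample(desc_levels, 1, prob = probs[[g]]),
                      character(1), USE.NAMES = FALSE)
coffee <- data.frame(price_group, description)

# Item 3: contingency table of counts (rows: price_group, columns: description)
tab <- table(coffee$price_group, coffee$description)
N <- matrix(as.numeric(tab), nrow = nrow(tab), dimnames = dimnames(tab))
print(N)

# Item 4: correspondence analysis via the SVD of the standardized residuals
P <- N / sum(N)                 # correspondence matrix
row_mass <- rowSums(P)
col_mass <- colSums(P)
S <- diag(1 / sqrt(row_mass)) %*%
  (P - outer(row_mass, col_mass)) %*%
  diag(1 / sqrt(col_mass))
sv <- svd(S)

inertia <- sv$d^2               # principal inertias
total_inertia <- sum(inertia)   # equals the chi-square statistic divided by n
pct_inertia <- 100 * inertia / total_inertia

# Principal coordinates of the row and column categories
row_coords <- diag(1 / sqrt(row_mass)) %*% sv$u %*% diag(sv$d)
col_coords <- diag(1 / sqrt(col_mass)) %*% sv$v %*% diag(sv$d)
rownames(row_coords) <- rownames(N)
rownames(col_coords) <- colnames(N)

# Chi-square test of independence for comparison
chi <- chisq.test(N)
print(chi)
print(round(pct_inertia, 2))
print(round(row_coords[, 1:2], 3))
print(round(col_coords[, 1:2], 3))
```

With a 3 x 3 table there are at most two non-trivial dimensions, so the first two columns of `row_coords` and `col_coords` carry all of the inertia. Categories that lie close together in these coordinates tend to co-occur more often than independence would predict.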
Question B
The FRED-QD is a quarterly U.S. database for macroeconomic research. We extracted from this data 152 quarterly observations for the following five variables:
GDP_Growth: Real gross domestic product growth.
HOUST: New privately owned housing units started.
SP500: Financial returns.
Inflation: Inflation constructed from the GDP deflator index (change in overall prices).
IndustrialProd: Growth of industrial production.
The five variables were saved into the file macrodata.csv. The following analysis was conducted on this data:
library(tibble)  # provides as_tibble()
data = as_tibble(read.csv('macrodata.csv', header = TRUE))
factanal(data, factors = 2)  # two-factor model
From which we obtained the following output:
[Figure: plot from the analysis output; axis ticks run from −0.25 to 1.00; recoverable point labels: IndustrialProd, GDP_Growth.]
## # A tibble: 5 x 4
##   variable        uniqueness    fl1     fl2
## 1 GDP_Growth          0.0572  0.958  0.158
## 2 HOUST               0.823   0.305  0.290
## 3 SP500               0.005   0.159  0.985
## 4 Inflation           0.952   0.203  0.0823
## 5 IndustrialProd      0.412   0.752  0.151
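To make the columns of this table concrete, here is a hedged sketch of how a table with the same layout might be assembled from a factanal fit. The exam's own post-processing code is not shown, and macrodata.csv is not available here, so the data are simulated: the matrix `L_true` and everything derived from it are hypothetical, and the numbers printed will not match the table above.

```r
# Simulated stand-in for macrodata.csv (152 observations, 5 variables).
set.seed(42)
n <- 152
vars <- c("GDP_Growth", "HOUST", "SP500", "Inflation", "IndustrialProd")

# Hypothetical "true" loadings, used only to generate data with a
# two-factor structure; they are not taken from the exam.
L_true <- matrix(c(0.9, 0.1,
                   0.3, 0.6,
                   0.1, 0.8,
                   0.2, 0.5,
                   0.8, 0.2), ncol = 2, byrow = TRUE)
psi_true <- 1 - rowSums(L_true^2)   # unique variances on the unit scale

factors <- matrix(rnorm(n * 2), n, 2)
noise <- matrix(rnorm(n * 5), n, 5) %*% diag(sqrt(psi_true))
x <- factors %*% t(L_true) + noise
colnames(x) <- vars

# Two-factor maximum likelihood factor analysis (varimax rotation by default)
fit <- factanal(x, factors = 2)

# Collect uniquenesses and loadings in one table, one row per variable
L <- unclass(fit$loadings)
out <- data.frame(variable   = vars,
                  uniqueness = unname(fit$uniquenesses),
                  fl1        = unname(L[, 1]),
                  fl2        = unname(L[, 2]))
print(out)
```

In this layout, `uniqueness` is the share of each standardized variable's variance not explained by the common factors, while `fl1` and `fl2` are the loadings on the first and second factor.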
Use the information above to answer the following questions.
1. Discuss and interpret the results obtained from the analysis.
2. Provide an estimate for the covariance matrix of the five-dimensional vector y = (GDP_Growth, HOUST, SP500, Inflation, IndustrialProd)′. Explain your answer. You might need to use R to compute the required matrix operations.
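The matrix operations mentioned in item 2 can be sketched as follows. This is one possible route rather than a model answer. It assumes the reported loadings and uniquenesses refer to the standardized variables, which is how R's factanal fits the model, so that the loadings matrix times its transpose plus the diagonal matrix of uniquenesses estimates the correlation matrix. The names `Lambda`, `Psi` and `R_hat` are introduced here for illustration only.

```r
vars <- c("GDP_Growth", "HOUST", "SP500", "Inflation", "IndustrialProd")

# Loadings (fl1, fl2) copied from the output table
Lambda <- matrix(c(0.958, 0.158,
                   0.305, 0.290,
                   0.159, 0.985,
                   0.203, 0.0823,
                   0.752, 0.151),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(vars, c("fl1", "fl2")))

# Uniquenesses copied from the output table, placed on a diagonal
Psi <- diag(c(0.0572, 0.823, 0.005, 0.952, 0.412))

# Model-implied matrix: common part plus unique part
R_hat <- Lambda %*% t(Lambda) + Psi
print(round(R_hat, 3))

# Assumption: because the fit is on the correlation scale, R_hat estimates
# the correlation matrix. Moving to the covariance scale needs the sample
# standard deviations, which are not reported in the exam text. With the
# data loaded as `data`, one could rescale like this:
# s <- sapply(data, sd)
# Sigma_hat <- diag(s) %*% R_hat %*% diag(s)
```

The diagonal of `R_hat` comes out close to one for every variable, which is consistent with the assumption that the fit is on the correlation scale.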
END OF EXAMINATION