
Sample exam solution
Question 1. What are the advantages of using a sample drawn from a big data set rather than the full big data set? (The answer should not exceed 10 sentences.)
Fewer resources are needed in the analysis. The analysis is faster.


More advanced methodology can be used.
Agenda for Questions 2-3.
Suppose you are a real estate analyst. You obtained the Melbourne real estate data for the 2016-2018 period. The data set includes the following variables:
Price: Price in Australian dollars
(name not given): S = property sold at auction; SP = property sold prior to auction
Year: Year sold
Distance: Distance from CBD in kilometres
Suburb: (description not given)
Propertycount: Number of properties that exist in the suburb
Rooms: Number of rooms
Bathroom: Number of bathrooms
Car: Number of car spots
(name not given): Land size in sq. metres
BuildingArea: Building size in sq. metres
Council: Council
You analysed the data using SAS. The questions below contain SAS output related to this data set.
Question 2. OLS regression.
The table below shows the regression output where the dependent variable is ‘Price’ and the independent variable is ‘Distance’:

a) What is the impact of ‘Distance’ on ‘Price’? Is the impact statistically significant?
b) How does ‘Price’ change if ‘Distance’ decreases by 2 km? Provide all calculations.
c) Discuss the residual plot. Based on the residual plot, would you recommend making any changes to the regression model? If yes, what changes and why?
d) What is the meaning (or interpretation) of the intercept in the regression model above?
e) Compute the predicted value of ‘Price’ for a property located 10 km away from the CBD.
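a) Negative and statistically significant at the 1% level: each additional kilometre from the CBD is associated with a price that is lower by 37,060 Australian dollars.
b) ‘Price’ is expected to increase by 37,060 * 2 = 74,120 Australian dollars.
c) Residuals are not randomly distributed. Transform both variables using natural logarithms and then regress ln(‘Price’) on ln(‘Distance’).
d) The intercept (1,633,400) is the predicted price of a property located in the CBD (‘Distance’ = 0).
e) 1,633,400 - 10 * 37,060 = 1,262,800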
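The arithmetic in parts b) and e) can be checked with a short sketch. It is written in Python rather than SAS purely for illustration, and it assumes the intercept (1,633,400) and slope (-37,060) quoted in the answers above; the function names are hypothetical.

```python
# Coefficients taken from the answers above (assumed to match the SAS output table).
INTERCEPT = 1_633_400   # predicted price at Distance = 0 km (the CBD)
SLOPE = -37_060         # change in price per additional km from the CBD


def predict_price(distance_km):
    """Predicted price from the fitted line Price = INTERCEPT + SLOPE * Distance."""
    return INTERCEPT + SLOPE * distance_km


def price_change(delta_km):
    """Expected change in price when Distance changes by delta_km kilometres."""
    return SLOPE * delta_km


print(price_change(-2))    # part b): property 2 km closer to the CBD
print(predict_price(10))   # part e): property 10 km from the CBD
```

A negative delta_km (moving closer to the CBD) combined with the negative slope gives a positive price change, which is why the answer to part b) is an increase.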
Question 3.
Most real estate transactions were recorded for the following councils: Boroondara City Council, Darebin City Council, and Council.
You decided to model the probabilities that a property is sold in each of these councils, using a multinomial logit regression. The following variables are used as the independent variables: the natural logarithm of ‘Price’, ‘Rooms’, ‘Bathroom’, and ‘Car’. The results are provided below.
The LOGISTIC Procedure

Model Information
Data Set                     WORK.HOUSES
Response Variable            CouncilArea
Number of Response Levels    3
Model                        generalized logit
Optimization Technique       Newton-Raphson

Number of Observations Read  987
Number of Observations Used  987

Response Profile
Ordered Value   CouncilArea               Total Frequency
1               Boroondara City Council   309
2               Darebin City Council      386
3               Council                   292

Logits modeled use CouncilArea=’ Council’ as the reference category.

Model Fit Statistics
Criterion   Intercept Only   Intercept and Covariates
AIC         2157.743         762.725
SC          2167.532         811.672
-2 Log L    2153.743         742.725

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   1411.0181    8    <.0001
Score              859.4111     8    <.0001
Wald               328.4687     8    <.0001

Analysis of Maximum Likelihood Estimates
Parameter   CouncilArea               DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   Boroondara City Council   1    -250.8     14.5711          296.1921          <.0001
Intercept   Darebin City Council      1    -142.9     11.8369          145.8268          <.0001
ln_price    Boroondara City Council   1    18.7642    1.0949           293.7257          <.0001
ln_price    Darebin City Council      1    11.0258    0.9031           149.0455          <.0001
Rooms       Boroondara City Council   1    -1.8106    0.3448           27.5780           <.0001
Rooms       Darebin City Council      1    -1.0068    0.2744           13.4648           0.0002
Bathroom    Boroondara City Council   1    -0.8175    0.3759           4.7292            0.0297
Bathroom    Darebin City Council      1    -1.0419    0.2996           12.0922           0.0005
Car         Boroondara City Council   1    -0.4602    0.2152           4.5746            0.0324
Car         Darebin City Council      1    -0.4806    0.1671           8.2734            0.0040

a) Which properties are more likely to be sold in Darebin City Council compared to Council? In the discussion, consider those independent variables that are statistically significant at the 10% level.
b) Which properties are more likely to be sold in Boroondara City Council compared to Council? In the discussion, consider those independent variables that are statistically significant at the 10% level.
c) Compute the predicted probabilities that the following property is sold in each council:
Price = 2100000
Rooms = 4
Bathroom = 2
Car = 2
d) Based on the results in part c), in which council is this property most likely to be sold?
a) Properties with higher prices, fewer rooms, fewer bathrooms and fewer car spots (the coefficient on ln_price is positive, the other coefficients are negative, and all are significant at the 10% level).
b) Properties with higher prices, fewer rooms, fewer bathrooms and fewer car spots.
c) Pr(BCC) = 0.890, Pr(DCC) = 0.110, Pr(HCC) = 0. Each probability is Pr(j) = exp(z_j) / (1 + exp(z_BCC) + exp(z_DCC)), where z_j is the linear predictor for council j evaluated at ln_price = ln(2100000) and z = 0 for the reference council.
d) Boroondara City Council

Question 4. You would like to forecast a future bond yield using time series models. The variable of interest is the yield on corporate bonds (‘WSPCA’). You may also consider its first difference (‘d_WSPCA’). You used the SAS procedure ARIMA to check the stationarity of the variables:
proc arima data=work.yield;
identify var=WSPCA stationarity=(DICKEY);
run; quit;
data work.yield;
set work.yield;
d_WSPCA=WSPCA-lag(WSPCA);
proc arima data=work.yield;
identify var=d_WSPCA stationarity=(DICKEY);
run; quit;
SAS output for ‘WSPCA’ is as follows:
Augmented Dickey-Fuller Unit Root Tests
Type   Lags   Rho   Pr ...

t Value   Pr > |t|   Lag
-0.59     0.5527     0
-1.16     0.2466     1
-2.02     0.0441     2
-1.02     0.3082     1

b) What are the ARMA orders in the model above?
c) Which coefficient estimates are statistically significant (at the 5% level) in the table above?
d) Find the predicted values (FORECAST) and residuals (RESIDUAL) for the first 3 observations:
Observation   d_WSPCA   FORECAST   RESIDUAL
1             0.056
2             0.08
3             0.01
a) The model recommended by SCAN is ARMA(2,2). The model recommended by ESACF is ARMA(2,1).
b) ARMA(1,2)
c) MA(2)
d)
Observation   d_WSPCA   FORECAST   RESIDUAL
1             0.056     -0.0046    0.0606
2             0.08      -0.0009    0.0809
3             0.01      0.0055     0.0045
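The residuals in part d) follow from the identity RESIDUAL = d_WSPCA - FORECAST. The sketch below, in Python for illustration only, checks that identity using the observed and forecast values listed in the answer. It does not recompute the forecasts themselves, since the coefficient estimates of the fitted ARMA model are not reproduced here.

```python
# Observed first differences and the forecasts quoted in the answer above.
observed = [0.056, 0.08, 0.01]
forecast = [-0.0046, -0.0009, 0.0055]

# Residual = observed value minus forecast, rounded to 4 decimals as in the table.
residuals = [round(y - f, 4) for y, f in zip(observed, forecast)]

for obs_number, (y, f, r) in enumerate(zip(observed, forecast, residuals), start=1):
    print(obs_number, y, f, r)
```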
Question 6. What is the purpose of Monte Carlo simulation? Describe using your own words how to implement Monte Carlo simulation. (the answer should not exceed 20 sentences).
The purpose of Monte Carlo simulation is to:
• compute the expected, worst, and best outcomes
• calculate the probability of a particular outcome
• calculate the probability that the outcome is greater or lower than a particular value
• find the lowest 1st and 5th percentiles of the outcome (for value at risk (VaR))
when the analytical solution is not possible.

Steps to implement Monte Carlo simulation:
1. identify factors that affect a variable of interest
2. determine the distribution of the factors as well as the correlation coefficients between the factors
3. generate random values of these factors (e.g., 1000 or 1000000 times)
4. for each scenario, calculate the value of the variable of interest
5. sort the obtained values in ascending order.
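The five steps above can be sketched as follows. This is a minimal illustration in Python, not the course's SAS implementation. The two factors, their means, volatilities, correlation, and the portfolio weights are all hypothetical values chosen for the example, and the helper names simulate_outcomes and percentile are made up here.

```python
import math
import random


def simulate_outcomes(n_scenarios=10_000, rho=0.3, seed=42):
    """Simulate a hypothetical portfolio return driven by two correlated factors."""
    rng = random.Random(seed)
    # Steps 1-2 (assumed): two normally distributed factor returns with
    # hypothetical means and volatilities, and correlation rho between them.
    mu1, sigma1 = 0.05, 0.20
    mu2, sigma2 = 0.03, 0.10
    w1, w2 = 0.6, 0.4  # hypothetical portfolio weights
    outcomes = []
    for _ in range(n_scenarios):
        # Step 3: draw two standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        r1 = mu1 + sigma1 * z1
        r2 = mu2 + sigma2 * z2
        # Step 4: value of the variable of interest in this scenario.
        outcomes.append(w1 * r1 + w2 * r2)
    # Step 5: sort the simulated values in ascending order.
    outcomes.sort()
    return outcomes


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for 0 < p <= 100."""
    index = max(0, math.ceil(p * len(sorted_values) / 100) - 1)
    return sorted_values[index]


outcomes = simulate_outcomes()
expected = sum(outcomes) / len(outcomes)
worst, best = outcomes[0], outcomes[-1]
prob_loss = sum(1 for x in outcomes if x < 0) / len(outcomes)
p1, p5 = percentile(outcomes, 1), percentile(outcomes, 5)
print(expected, worst, best, prob_loss, p1, p5)
```

Once the outcomes are sorted, the quantities listed under the purpose of the simulation (expected, worst and best outcomes, the probability of falling below a threshold, and the 1st and 5th percentiles used for VaR) can be read off directly from the list.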
