STAT 4110/7110 Final Take-home Exam Spring 2018

Lee Due: Friday, May 4, 2018, 11:59 p.m. (CST)

Name ____________________________ Student ID _______________________

Instructions:
I. The final exam is open book/note.

50 points

  1. Variable names are written in Courier New font in the problems.
  2. The level of significance of testing procedure is 0.05.
  3. Answer the following questions based on your output and comment if necessary. You do not need to print out and submit the output.
  4. When you have finished programming, save the SAS program (first_last.sas) and the R program (first_last.R) separately and submit the program files in the Exams link in Canvas.
  5. To get full credit, you should submit both programs and the written exam.
  6. Good luck!

1. (SAS: 20 points total) The following is the list of the variables in the dataset prdsal2 in the SASHELP library.

Alphabetic List of Variables and Attributes

 #  Variable  Type  Len  Format      Informat  Label
 4  ACTUAL    Num     8  DOLLAR12.2            Actual Sales
 1  COUNTRY   Char   10  $CHAR10.              Country
 3  COUNTY    Char   20  $CHAR20.              County
10  MONTH     Num     8  MONNAME3.             Month
11  MONYR     Num     8  MONYY.      MONYY.    Month/Year
 5  PREDICT   Num     8  DOLLAR12.2            Predicted Sales
 6  PRODTYPE  Char   10  $CHAR10.              Product Type
 7  PRODUCT   Char   10  $CHAR10.              Product
 9  QUARTER   Num     8  8.                    Quarter
 2  STATE     Char   22  $CHAR22.              State/Province
 8  YEAR      Num     8  4.                    Year

(1) (2 pts.) Determine the mean and the standard deviation of the actual sales amount (=ACTUAL) in each country.

Country    Mean    Standard deviation
U.S.A.
Mexico
Canada

(2) (5 pts.) Create a new dataset that includes only the observations with an actual sales amount greater than $700, and then compute the mean and the standard deviation of the actual sales amount for Canada.

Mean =
Standard deviation =

(3) (3 pts.) Test if there is any interaction between country (=COUNTRY) and product type (=PRODTYPE) in actual sales amount (=ACTUAL) in the dataset prdsal2.

Hypotheses H0:________________________ vs. HA:________________________ P-value =

(Circle one): Do not reject H0 Reject H0

(4) (5 pts.) Test if there are any differences among the countries in actual sales amount. Then, if there are any, determine which countries differ significantly from the others in the dataset prdsal2.

Hypotheses H0:________________________ vs. HA:________________________ P-value =

(Circle one): Do not reject H0 Reject H0

(Circle one for each pair):

U.S.A. vs. Mexico: Different Not different
U.S.A. vs. Canada: Different Not different
Mexico vs. Canada: Different Not different

(5) (5 pts.) Fit a regression model for the predicted sales amount (=PREDICT) using the actual sales amount (=ACTUAL) as the predictor. Discuss the normality assumption of the model in the dataset prdsal2.

2. (Both SAS & R: 5 points each.) Follow the steps below to complete the problem. Create a program using both SAS and R.

(1) Write a program that would generate s=100 samples of size m=10 from a Binomial random variable having number of trials equal to n=15 and probability of success p=0.4.
(2) Create a new variable that contains the average of each of the 100 samples.
(3) Plot it in a histogram.
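
One possible way to approach the three steps above in R is sketched below. It uses only base R; the seed, the object names (samples, sample_means), and the plot labels are assumptions made for illustration and are not prescribed by the exam.

```r
# Sketch: s = 100 samples, each of size m = 10, from Binomial(n = 15, p = 0.4).
set.seed(4110)  # arbitrary seed (an assumption), used only for reproducibility

s <- 100   # number of samples
m <- 10    # observations per sample
n <- 15    # number of binomial trials
p <- 0.4   # probability of success

# Draw s * m independent values; each row of the matrix is one sample of size m.
samples <- matrix(rbinom(s * m, size = n, prob = p), nrow = s, ncol = m)

# New variable: the average of each of the 100 samples.
sample_means <- rowMeans(samples)

# Histogram of the 100 sample means (title and axis label are illustrative).
hist(sample_means,
     main = "Means of 100 binomial samples",
     xlab = "Sample mean")
```

Because each sample mean averages 10 draws with expected value n*p = 6, the histogram should be centered near 6.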

3. (R: 20 points.) The alaska.pipeline data frame in the UsingR package consists of in-field ultrasonic measurements of the depths of defects in the Alaska pipeline. The depths of the defects were then re-measured in the laboratory. These measurements were performed in six different batches. The data frame has 107 observations on the following 3 variables.

  •   field.defect: Depth of defect as measured in field
  •   lab.defect: Depth of defect as measured in lab
  •   batch: One of 6 batches

(1) (5 pts.) Test if there is any difference among the batches (=batch) in the depth of defect as measured in field (=field.defect). Assume equal variances of the depth of defect in field among the batches.

Hypotheses H0:________________________ vs. HA:________________________ P-value =

(Circle one): Do not reject H0 Reject H0

(2) (5 pts.) Create a boxplot of the depth of defect in field (=field.defect) by batch (=batch). Add the main title “Comparison of Depth of defect in field by Batch” at the top of the plot and label the x- and y-axes “Batch” and “Depth of defect in field”, respectively.
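
As a rough sketch of how parts (1) and (2) might be set up in R, the helpers below run a one-way ANOVA (which assumes equal variances) and draw the labeled boxplot. The function names batch_anova_p and plot_by_batch are hypothetical, and the sketch assumes the data frame has the columns field.defect and batch as described above. The last block runs only if the UsingR package is installed.

```r
# Hypothetical helper: p-value of a one-way ANOVA of field.defect across batches.
# aov() uses a pooled variance, i.e. it assumes equal variances among batches.
batch_anova_p <- function(df) {
  fit <- aov(field.defect ~ factor(batch), data = df)
  summary(fit)[[1]][["Pr(>F)"]][1]
}

# Hypothetical helper: boxplot with the title and axis labels requested in (2).
plot_by_batch <- function(df) {
  boxplot(field.defect ~ factor(batch), data = df,
          main = "Comparison of Depth of defect in field by Batch",
          xlab = "Batch",
          ylab = "Depth of defect in field")
}

# Run on the exam data only when the UsingR package is available.
if (requireNamespace("UsingR", quietly = TRUE)) {
  data("alaska.pipeline", package = "UsingR")
  print(batch_anova_p(alaska.pipeline))
  plot_by_batch(alaska.pipeline)
}
```

The returned p-value would then be compared with the 0.05 significance level stated in the instructions.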

(3) (5 pts.) Fit a linear model for the lab-defect size (=lab.defect) as modeled by the field-defect size (=field.defect) and discuss the appropriateness of the model.

Fitted model: _________________________________________________

(4) (5 pts.) A log-transformation of each variable from Problem (3) seems to provide a better linear model for the data. Fit the model log(lab.defect) = β0 + β1 log(field.defect) + ε and determine whether there are any violations of the assumptions.

Fitted model: _________________________________________________
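
The two fits in parts (3) and (4) could be set up along the following lines. This is a minimal sketch, not a prescribed solution: the helper names fit_defect_models and check_assumptions are hypothetical, the choice of diagnostics (residuals vs. fitted values, a normal Q-Q plot, and a Shapiro-Wilk test) is one reasonable option among several, and the final block runs only if the UsingR package is installed.

```r
# Hypothetical helper: fit the raw-scale and log-log linear models.
fit_defect_models <- function(df) {
  list(
    raw = lm(lab.defect ~ field.defect, data = df),
    log = lm(log(lab.defect) ~ log(field.defect), data = df)
  )
}

# Hypothetical helper: basic residual diagnostics for a fitted lm object.
check_assumptions <- function(fit) {
  op <- par(mfrow = c(1, 2))
  on.exit(par(op))
  # Residuals vs. fitted values: look for curvature or a funnel shape.
  plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
  # Normal Q-Q plot of the residuals.
  qqnorm(resid(fit))
  qqline(resid(fit))
  # Shapiro-Wilk test of normality of the residuals.
  shapiro.test(resid(fit))
}

# Run on the exam data only when the UsingR package is available.
if (requireNamespace("UsingR", quietly = TRUE)) {
  data("alaska.pipeline", package = "UsingR")
  fits <- fit_defect_models(alaska.pipeline)
  print(coef(fits$raw))
  print(coef(fits$log))
  print(check_assumptions(fits$log))
}
```

Comparing the residual plots of the two fits side by side is one way to judge whether the log transformation improves the constant-variance and normality assumptions.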
