BIST 515: Introduction to Statistical Software Homework 6
Due date: Wednesday, November 28
Complete the problems below. Within each part, include your SAS program code, all corresponding output, and any additional information needed to explain your answer.
-
(25 total points) This problem is a continuation of problem #1 on Homework #5.
(a) (2 points) Sort the data by position. Print the first five observations to partially prove that you sorted the data. Use this sorted data for the remainder of this problem.
(b) (4 points) Construct side-by-side box plots of the 40-yard dash times for each position. Use a yellow fill color for the boxes.
(c) (4 points) Construct side-by-side dot plots of the 40-yard dash times for each position. Use red open circles for the plotting symbols.
(d) (4 points) Using your plots in (b) and (c), compare the 40-yard dash times for the positions. Make sure to include comments about location, centrality, variability, and skewness.
(e) (8 points) Construct a scatter plot of the 40-yard dash times vs. the bench press weight. In your plot, include the following:
(i) Vary the plotting symbols and their colors by position, using the following color names: DB = black, LB = red, OL = blue, RB = darkgreen, S = purple, TE = orange, and WO = gray.
(ii) Gridlines
(iii) Y and X-axis labels of “40-yard dash (seconds)” and “Bench press repetitions”, respectively.
(iv) The name of the player with the largest bench press value next to its corresponding plotting point (see http://blogs.sas.com/content/iml/2011/11/11/label-only-certain-observations-with-proc-sgplot.html).
(f) (3 points) Using your plot in (e), describe the relationship between the 40-yard dash times and bench press relative to the position.
-
(12 total points) This problem is a continuation of problem #2 on Homework #5.
(a) (8 points) In part (d), you constructed a plot with proc reg that can be used to estimate the shelf life. Construct this same plot using proc sgplot. To match this exactly, I found it difficult to determine the fill color for the confidence interval band. You may use color=CXB9CFE7 which is a hexadecimal RGB specification of the color.
(b) (4 points) Continuing with the plot in (a), use the dropline and/or refline statements of proc sgplot to draw lines which illustrate the process for how the shelf life is determined. These lines should be red and dotted.
-
(10 points) An old-fashioned way to find quantiles from a t-distribution is to obtain them from a table of values. These tables typically are arranged so that the rows are labeled by degrees of freedom, the columns are labeled by probabilities, and the cells within the table are quantiles corresponding to a degrees of freedom and probability combination. Construct your own table by using data steps. This table should provide quantiles for degrees of freedom between 1 and 30 and probabilities to the left of the quantile of 0.9, 0.95, 0.975, and 0.99 (thus, 30 rows and 4 columns). Print the entire table using proc print.
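For reference, each cell of such a table can be described by the relation below. The symbols $T_\nu$ (a t-distributed random variable with $\nu$ degrees of freedom) and $t_{p,\nu}$ (the tabulated quantile) are notation introduced here for illustration and do not appear elsewhere in this assignment.

```latex
% Notation (assumed for this note only):
%   T_\nu     : random variable following a t-distribution with \nu degrees of freedom
%   t_{p,\nu} : table entry in row \nu and column p
\[
  P\left(T_\nu \le t_{p,\nu}\right) = p,
  \qquad \nu = 1, 2, \dots, 30,
  \quad p \in \{0.90,\ 0.95,\ 0.975,\ 0.99\}
\]
```

As a check on your table, the entry for 10 degrees of freedom and probability 0.975 should be approximately 2.228.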