MATH 208 Final Exam December 8th-11th, 2020

Question 1 [50 points]
The data for this question comes from the STAR dataset from the AER library. Below is a summary and five sample
rows of a modified version of that dataset containing information from a study examining the effect of reducing class
size on student performance in primary school.
str(STAR_data)

'data.frame': 3114 obs. of 6 variables:
 $ student_ID: int 1 2 3 4 5 6 7 8 9 10 ...
 $ stark     : Factor w/ 3 levels "regular","small",..: 2 2 1 2 1 1 2 2 1 3 ...
 $ star1     : Factor w/ 3 levels "regular","small",..: 2 2 1 2 1 1 2 2 1 3 ...
 $ readk     : int 447 450 448 447 431 451 478 455 430 437 ...
 $ read1     : int 507 579 651 533 558 548 514 530 490 503 ...
 $ read2     : int 568 588 614 608 608 596 569 608 622 552 ...

STAR_data %>% slice(sample(1:n(), 5))

  student_ID        stark        star1 readk read1 read2
1       1127      regular regular+aide   455   571   669
2       1556 regular+aide      regular   456   483   560
3        856      regular regular+aide   450   512   571
4        611      regular      regular   416   553   618
5       2296 regular+aide regular+aide   451   629   643

Besides the Student ID, we will focus on four other measures from the data: stark and star1, which indicate the
type of class in kindergarten and grade 1, respectively (“regular”, “small”, or “regular+aide”); and readk, read1,
and read2 which are reading scores from kindergarten, grade 1 and grade 2 respectively.

(a) [5 pts] Write a line of code that will generate the following tibble (or data.frame) with the total number of
students who were in each type of class in kindergarten:

# A tibble: 3 x 2
# Groups: stark [3]

1 regular 1067
2 small 987
3 regular+aide 1060

(b) [5 pts] Write a line of code that will generate the following tibble (or data.frame) with the total number of
students who were in each combination of type of class in kindergarten and grade 1, as below:

count_table

# A tibble: 9 x 3
# Groups: stark, star1 [9]
  stark        star1            n
1 regular      regular        518
2 regular      small           85
3 regular      regular+aide   464
4 small        regular         29
5 small        small          924
6 small        regular+aide    34
7 regular+aide regular        491
8 regular+aide small           85
9 regular+aide regular+aide   484

(c) [5 pts] Assume the tibble from part (b) is called count_table as above. Now write a line of code that
produces a tibble which gives, for each class type in kindergarten, the proportion of students in each class type
in grade 1:

Here is some code which creates an object STAR_what.
STAR_what <- STAR_data %>%
  pivot_longer(cols=readk:read2,names_to="Test",values_to="Score") %>%
  select(-student_ID)

(d) [5 pts] What class of object is STAR_what?

In class we used xtabs to create contingency tables of counts of combinations of qualitative variables, as in this
example:
STAR_who_denom <- xtabs(~star1+Test+stark,data=STAR_what)
STAR_who_denom

, , stark = regular

star1          read1 read2 readk
  regular        518   518   518
  small           85    85    85
  regular+aide   464   464   464

, , stark = small

star1          read1 read2 readk
  regular         29    29    29
  small          924   924   924
  regular+aide    34    34    34

, , stark = regular+aide

star1          read1 read2 readk
  regular        491   491   491
  small           85    85    85
  regular+aide   484   484   484

(e) [5 pts] What will the code STAR_who_num[1,3,2] return as output?

xtabs can also be used to sum up values of another variable for different combinations of star1, Test and stark by
putting the variable name in front of the ~. For example, we can find the total of all scores by using
STAR_who_num <- xtabs(Score~star1+Test+stark,data=STAR_what)
STAR_who_num

, , stark = regular

star1           read1  read2  readk
  regular      273728 306238 228798
  small         45797  50785  37660
  regular+aide 249580 276710 205622

, , stark = small

star1           read1  read2  readk
  regular       15396  17009  12617
  small        500773 552478 413608
  regular+aide  18338  20488  14927

, , stark = regular+aide

star1           read1  read2  readk
  regular      261220 290488 218272
  small         44596  49270  37070
  regular+aide 258514 286343 212980

(f) [5 pts] Using STAR_who_num and STAR_who_denom, write a single line of code that assigns the average score for
each star1 by Test by stark combination to an object called STAR_avg as seen below:

, , stark = regular

star1             read1    read2    readk
  regular      528.4324 591.1931 441.6950
  small        538.7882 597.4706 443.0588
  regular+aide 537.8879 596.3578 443.1509

, , stark = small

star1             read1    read2    readk
  regular      530.8966 586.5172 435.0690
  small        541.9621 597.9199 447.6277
  regular+aide 539.3529 602.5882 439.0294

, , stark = regular+aide

star1             read1    read2    readk
  regular      532.0163 591.6253 444.5458
  small        524.6588 579.6471 436.1176
  regular+aide 534.1198 591.6178 440.0413

(g) [10 pts] Write a line of code that creates an array that contains the difference between the average read2 and
readk scores for each stark by star1 combination using STAR_avg above.

star1           regular    small regular+aide
  regular      149.4981 151.4483     147.0794
  small        154.4118 150.2922     143.5294
  regular+aide 153.2069 163.5588     151.5764

(h) [10 pts] Write code (possibly multiple lines) using the original STAR_what to produce a tibble containing the
same rows and columns as the object in part (g).

# A tibble: 3 x 4
# Groups: star1 [3]
  star1        regular small `regular+aide`
1 regular         149.  151.           147.
2 small           154.  150.           144.
3 regular+aide    153.  164.           152.
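
As a side illustration of the two xtabs forms described before parts (e) and (f), here is a minimal sketch on a small
made-up data frame. The object and column names (toy, group, kind, value) are invented for this example and are not
part of the STAR data; the sketch only shows the difference between an empty left-hand side (counts) and a variable
on the left-hand side (sums), and how cells are indexed by position.

```r
# Toy data, invented for illustration only (not the STAR data).
toy <- data.frame(
  group = factor(c("a", "a", "b", "b", "b")),
  kind  = factor(c("x", "y", "x", "x", "y")),
  value = c(10, 20, 30, 40, 50)
)

# Empty left-hand side: number of rows in each group/kind combination.
toy_counts <- xtabs(~ group + kind, data = toy)

# Variable on the left-hand side: sum of `value` in each combination.
toy_sums <- xtabs(value ~ group + kind, data = toy)

print(toy_counts)
print(toy_sums)

# Cells are indexed by position, in the order the variables appear in
# the formula: here row 2 is group "b" and column 1 is kind "x".
toy_sums[2, 1]
```

Because both tables are built from the same formula on the same data, they have identical dimensions and dimension
names, so they line up cell by cell.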

END OF QUESTION 1

Question 2 [50 points]
We will re-use the same data that was used in Question 1. The description is repeated below for your convenience.
The data for this question comes from the STAR dataset from the AER library. Below is a summary and five sample
rows of a modified version of that dataset containing information from a study examining the effect of reducing class
size on student performance in primary school.
str(STAR_data)

STAR_data %>% slice(sample(1:n(), 5))

  student_ID        stark        star1 readk read1 read2
1       2159      regular      regular   465   564   622
2       2171      regular regular+aide   410   494   586
3        187      regular regular+aide   436   521   566
4       1320        small        small   443   558   659
5       1946 regular+aide      regular   545   519   584

Besides the Student ID, we will focus on four other measures from the data: stark and star1, which indicate the
type of class in kindergarten and grade 1, respectively (“regular”, “small”, or “regular+aide”); and readk, read1,
and read2 which are reading scores from kindergarten, grade 1 and grade 2 respectively.

(a) [6 pts] Below are partially obscured code and two plots of the values of class types for kindergarten and grade 1.

p1<-ggplot(STAR_data,aes(x=star1,fill=stark)) + geom_YYYYYYY() +
  scale_fill_viridis_d() + ggtitle("Plot 1") + theme_bw()
p2<-ggplot(STAR_data) + geom_XXXXXXX(aes(x=product(stark,star1),fill=stark))+
  scale_fill_viridis_d() + ggtitle("Plot 2")+ theme_bw()
grid.arrange(grobs=list(p1,p2),nrow=2,ncol=1)

[Figure: Plot 1 and Plot 2, with class types regular, small and regular+aide]

Identify these two plots by name:
Plot 1:
Plot 2:

(b) [8 pts] Using these plots, describe the association between stark and star1. In particular, what does knowing
the grade 1 class type tell us about the possible kindergarten class type for the students in this sample?

(c) [6 pts] Although these plots look similar, they are in fact different. There are two important differences in
how these plots were constructed, one of which is more obvious than the other. Explain what those two differences are.

(d) [6 pts] Write a line of code to create new factor variables in STAR_data for stark and star1 named stark_mod
and star1_mod which combine the “regular” and “regular+aide” levels into a single level “not small”.

Below is a figure along with the code (partially obscured) which generated it.

[Figure: Plot e, with levels “not small” and “small”]

ggplot(STAR_data,aes(x=_______,fill=________,y=read2)) +
  geom_______() + ggtitle("Plot e") + theme_bw()

(e) [4 pts] What are the missing geometry and aesthetics that generated the figure above (that is, what are the
words that are missing in the code above for Plot e)?

(f) [5 pts] Based on these plots, do you think there is evidence of an association between the modified type of
class variables and the grade 1 reading test score? Explain your answer in 3 sentences or fewer.

Below is a plot of the reading test scores for kindergarten and grade 1 for the STAR_data by levels of the modified
kindergarten class type.

[Figure: Panels g1 and g2, with reading score axes running from 350 to 600]

(g) [4 pts] Identify the two kinds of plots in Panel g1 and g2 by name (note that there are two of the same kind of
plot in each panel).
• Panel g1:
• Panel g2:

(h) [6 pts] From Panels g1 and g2, would you conclude that there is an association between readk and read1 in
either group? Does the association between the two reading test variables seem to vary by levels of the modified
kindergarten class type variable? Explain your answers in 4 sentences or fewer.

(i) [5 pts] Which of the following plots could also be used to assess the association between reading scores in
kindergarten and grade 1 (assuming that neither variable is transformed)? Circle all that apply.
A. Line chart
B. 2-d density plot
C. Treemap
D. 2-d histogram

END OF QUESTION

Question 3 [50 points]
The goal of this task is to write functions to identify certain repeated patterns of characters in long character
vectors, a basic form of a more complicated task that is often used in gene sequencing. For every part of this
question, you will assume that the user gives you a vector where each element of the vector contains a single
lower-case letter. For example, the user may specify:
c("b", "c", "b", "d", "c", "a", "b", "b", "d", "c")

(a) [15 pts] Write a function below using a for loop (and possibly other control statements) which takes a
character vector as an argument and returns the length of the longest sequence of repeated letter “b” for an
arbitrary vector. In the example vector above, the length of the longest sequence of repeated “b” values is 2.
It does not matter if the longest sequence length occurs multiple times; you only need to report it once.

(b) [15 pts] Now assume that if the user inputs a vector that includes a certain stopping character, then you
should immediately stop analyzing the sequence and return a value of NA. If the input vector does not include the
stopping character, then it proceeds as in part (a) to return the length of the longest sequence of repeated letter
“b” values. For example, if the stopping character is “a”, then in the example above, your function should return
NA. But if the stopping character is “f”, then in the example above your function should return 2 as before. Modify
your function from part (a) to complete this task. Your function should take two arguments: the input character
vector and a stopping character whose default value is “f”.

(c) [10 pts] Now assume that you want to write code to create a data frame or tibble that contains the longest run
in the vector for each letter of the alphabet, except for the single special stopping character specified by the
user. If a non-stopping letter does NOT appear in the vector, it should not appear in the table. In other words, if
the stopping character is “f”, then applying your code to the example vector above would return:

# A tibble: 4 x 2
letter longest

But if the stopping character is “a”, then your function should return NA for all letters, i.e.

# A tibble: 4 x 2
letter longest

Write code below that uses your function from part (b) to produce the desired result. You do not need to write a
separate function for this part, but you can if you think it is helpful.

(d) [10 pts] Finally, use your code from part (c) to obtain a list with 26 elements, where each element is the
tibble from part (c) for one of the 26 possible stopping characters. You do not need to write a
separate function for this part, but you can if you think it is helpful.

END OF QUESTION

Question 4 [30 points]
In this question, you will write code to simulate a board game based on the fable, “The Tortoise and the Hare”.

The idea of the game is as follows:

(a) There are 100 spaces on the board and each piece must travel in order through the board.

(b) Both characters start on space 0.

(c) The Hare always gets to move first. The Hare randomly moves forward 5 spaces (when running) or moves
forward 0 spaces (when sleeping), with equal probability.

(d) Then the Tortoise moves forward either 2 spaces or 4 spaces, with equal probability.

(e) The game ends when one of the characters reaches a total of 100 spaces or greater.

[10 points] Write a function below, one_turn, which simulates a single turn in the game, i.e. steps (c) and (d)
above. The function should take two arguments, the current space of the Hare and the current space of the Tortoise.
The function should return the updated space of the Hare and the updated space of the Tortoise after one turn.
Hint: You can use the sample function in R to choose the number of spaces each player moves forward.
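
As a reminder of how sample behaves, here is a minimal sketch on a made-up example that is unrelated to the game
rules: a hypothetical spinner with two equally likely outcomes, 1 and 6. The names spin and spins and the seed value
are assumptions chosen only for this illustration.

```r
# Hypothetical example, unrelated to the game rules: a spinner whose
# two outcomes, 1 and 6, are equally likely.
set.seed(208)  # fixing the seed makes the draws reproducible

# Passing a vector of values avoids the sample(n) shortcut, where a
# single number n is expanded to 1:n.
spin <- sample(c(1, 6), size = 1)

# Several independent spins need replace = TRUE.
spins <- sample(c(1, 6), size = 10, replace = TRUE)

print(spin)
print(spins)
```

Calling set.seed again with the same value before the same sequence of sample calls reproduces the same draws.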

[20 points] Write a new function which uses your one_turn function above to simulate one entire game, from steps (a)
to (e) above. Your function should take in one argument: a random seed so that you can replicate the results of the
game. Your function should return a list containing two elements: the name of the winner of the game (i.e. “Hare”
or “Tortoise”) and a tibble containing the history of all spaces travelled by both players.
