---
title: "STAT340 HW2: Monte Carlo & Testing"
author: Name
date: Date
output: html_document
---

```{r setup, include=FALSE}
# check packages installed
if(!require(pacman)) install.packages("pacman")
pacman::p_load(ggplot2)

knitr::opts_chunk$set(tidy=FALSE,strip.white=FALSE,fig.align="center",comment=" #")
options(width=100)
```

[link to source](hw02.Rmd)

## Instructions

Complete the exercises, update the “author” and “date” fields in the header, knit it, and submit **both the HTML and RMD** files to Canvas. Due date: **Oct 13, 2021 at 11:59pm**.

## Exercise 1 (20 points): Generalized [birthday problem](https://en.wikipedia.org/wiki/Birthday_problem)

The birthday problem asks for the probability that in a group of $n$ people, at least 2 people will share the same birthday. This is easy to solve, and the solution is easily found online.

We can generalize this to a more difficult problem and solve it using a Monte Carlo approach: in $n$ people, what is the probability that at least $k$ people have the same birthday?

Write a function `birthday(n,k,i)` that returns a probability estimate given 3 arguments:

- $n$ is the number of people in your sample
    - for example, if `n=50` is used, we are asking "in 50 people, what is the probability that..."
- $k$ is the minimum number of people that must share a birthday
    - for example, if `k=4` is used, we are asking "...what is the probability that at least 4 people share the same birthday?"
- $i$ is the number of iterations to run (default 1000)
    - for example, if `i=1000` is used, your function should run 1000 simulations

**Notes**:

- You may assume there are 365 possible dates (no leap years)
- You may assume birthdays are uniformly distributed across the calendar
    - this is actually not true, see [this](https://www.panix.com/~murphy/bday.html), or [this](https://fivethirtyeight.com/features/lots-of-parents-dont-want-their-kids-to-be-born-on-leap-day/)
- You may assume the people are sampled [i.i.d.](https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables)

**Hints**:

1. There’s no need to use actual dates in the simulation process. Numbers can represent dates and are easier to generate and manipulate in `R`. In particular, we recommend using the `sample()` function with the `x`, `size`, and `replace` arguments set appropriately. See the help page `?sample` for details.
2. Given a vector of numbers, you can easily find duplicates by using the `table()` function. This will produce a named vector showing how many of each value there are. For example, running `table(c(1,3,5,5,7,9,9,9))` will show you there is one 1, one 3, two 5s, one 7, and three 9s.
3. In your function, you will need to use a `for` loop to repeat the simulation `i` times. You will also need a variable outside your `for` loop to keep track of how many simulations have at least $k$ people sharing the same birthday.
4. If your function is running correctly, then `birthday(n=23, k=2)`, `birthday(n=87, k=3)` and `birthday(n=188, k=4)` should all be approximately $50\%$.
5. If your function is very slow, consider using the [`Table` function](https://rdrr.io/cran/Rfast/man/Table.html) from the Rfast package, which is 4-5 times faster than the normal `table()` function.
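
As a rough illustration of hints 1 and 2 only, the sketch below simulates a single group and inspects it with `table()`. It is not the full `birthday()` function: there is no loop and no probability estimate. The names `n_people`, `days`, `counts`, and `largest_group`, the group size of 30, and the seed are all hypothetical choices made for this example.

```r
# Illustrative sketch of hints 1 and 2: ONE simulated group, not the full function.
set.seed(340)                      # arbitrary seed, only for reproducibility

n_people = 30                      # hypothetical group size
# represent birthdays as integers 1..365, drawn with replacement
days = sample(x = 1:365, size = n_people, replace = TRUE)

counts = table(days)               # how many people fall on each observed day
largest_group = max(counts)        # size of the largest shared-birthday group

# TRUE if at least 2 people in this one simulated group share a birthday
largest_group >= 2
```

Repeating a check like this many times and recording how often it is `TRUE` is the idea behind hint 3.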

```{r}
# complete the function
# note i=1000 sets the default value of i to be 1000
birthday = function(n,k,i=1000){
  # code goes here
}
```

This class currently has 162 enrolled students. What is the approximate probability that at least $4$ students share the same birthday?

> **ANSWER HERE**

## Exercise 2 (15 points): Simulate RV

$X$ is a random variable defined between 0 and 1 with the probability density function $f(x)=2x$. Note this means that, for $0 \le x \le 1$, the cumulative distribution function is $$F(x)=\int_0^x f(t)\,dt=x^2$$ Write a function `rx(n)` to sample from this random variable, where `n` is the size of the sample to be drawn. Then, use your function to draw a sample of 500 and plot a histogram of the output.
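
The relationship between the density and the cumulative distribution function stated above can be checked numerically. The sketch below is only a sanity check, not a sampler; it uses base R's `integrate()`, and the name `f` and the test points in `q` are hypothetical choices for illustration.

```r
# Sanity check of the stated pdf/cdf relationship (this is NOT a sampler).
f = function(x) 2 * x              # pdf of X on (0, 1), as given in the exercise

# the density should integrate to 1 over its support
total_mass = integrate(f, lower = 0, upper = 1)$value

# the integral of f from 0 to q should match q^2 at a few arbitrary test points
q = c(0.25, 0.5, 0.9)
cdf_numeric = sapply(q, function(u) integrate(f, lower = 0, upper = u)$value)

all.equal(total_mass, 1)
all.equal(cdf_numeric, q^2)
```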

```{r,fig.width=4,fig.height=3}
# defining pdf of X
pdf_x = Vectorize(function(x){
  if(x>0 & x<1){2*x} else 0
})

# showing pdf on plot
ggplot() + geom_function(fun=pdf_x,n=10001) +
  theme_minimal() + xlim(c(-1,2)) + ylim(-1,3) + labs(x='x',y='f(x)')
```

```{r}
# complete the function
rx = function(n){
  # code goes here
}

# uncomment the following line of code and check it looks correct
# hist(rx(500))
```

---

## Exercise 3 (15 points): Testing coin flips

In the six sequences below, **only one** of them is actually randomly generated from a fair coin. Use a combination of everything you know (common sense, Monte Carlo, hypothesis testing, etc.) to identify which is actually random and explain your reasoning.

```{r}
flips1 = "HTHTHTHTHTHTHTHTHTHTHTHTHTHTHTHTHHTHTHTHTHTHTHTTHTHTHTHTHTHTHHTHTHTHTHTHTHTHTHTHTHTHTHTHTHHTTHTHTHTHTHTHTHTHTHTHTHTHTHHTHTHTHTHTHTHTHTHTHTHTHTTHTHTHTHTHTHTHTHTHTHTHTHTHHTHTHTHTHTHTHTHTHTHTHTHHTHTHTHTH"

flips2 = "HHHTHTTTHHTHHTHHHTTTTHTHTHHTTHTHHHTHHTHTTTHTHHHTHTTTHTHTHHTHTHTTHTHHTHTHTTTHTHHHTHTHTTHTHTHHTHTHTHHHTHTTTHTHHTHTHTHHTTTHTHHTHHTTTTHTHTHHHTHTTHTHHTHTHTTHTHHTHTHHHTHHHTHTTTHTTHTTTHTHHHTHTHTTHTHHTHHTHTTT"

flips3 = "HHTHTHTTTHTHHHTHHTTTHTHHTHTTTHTHTHHTHTHTTHTHHHHHHTTTHTHTHHTHTTTHTHHTHTHTTTHTHHHTTHTTTHTHTHHHHTHTTHHTTTTTHTHHHTHTHTTTTTHHHTHHTHHTHHHTTTTHTHTHHHTHHTTTTTHTHHHTHTHTHTTTHTHHHTHTHTHTTHTHHTHTHTHTTTTHTHHHTHTH"

flips4 = "HTHHHHHHHTHTTHHTTHHHTHTHTTTHHTHHHTHHTTHTTTTTTTTTHTHHTTTTTHTHTHTHHTTHTTHTTTTTHHHTHTTTHTHTHHHTHTTTTHTHTHHTTHTHTTHHTHTHHHHTHTTHHTTHTTHTTHTHHHHHHTTTTTTHHHTTHTHHHHTTTHTTHHHTTHTHHTTTHHTHHTTTHTHHTHHHTHHTTHHH"

flips5 = "HHHHHHHHHHHTTTTTTTTTTTHHHHHHHHHHHHTTTTTTTTTTTHHHHHHHHHHHHHTTTTTTTTTTHHHHHHHHHHTTTTTTTTHHHHHHHHTTTTTTTHHHHHHHHHTTTTTTTTTHHHHHHHHTTTHHHHHHHHHHHTTTTTTTTTTTHHHHHHHHHHHHTTTTTTTTTTTHHHHHHHHHHHHHTTTTTTTTTTHH"

flips6 = "TTHTTTHTTTTTTTHTHTHTHTTHTTHTHHTHHTTTHHTHTTTHTHHTHHHTHTTHHTHHTTHTHTTTTHTHTTTHHTTTTTTTTHTHHTTHTTTTTTHTHTHTHTTTHTTHHTTHTTTHHTTTHTTHTTTTHTTTTHHTTTHTHTHHHTTTTTTHTHHTTTTTTTTTTTTHHHTTTHHHTTTHTTTHTHTTHTTTTTHT"

# you can use the function below to split the above sequences into vectors of flips
split = function(str) strsplit(str, split="")[[1]]
split(flips1)
```

Response goes here: