2B03 Assignment 2
Probability Theory (Chapters 4 & 5)
Your Name and Student ID
Due Wednesday, October 2, 2018 (in class, prior to the start of the lecture)
Instructions: You are to use R Markdown for generating your assignment (see the item Assignments and R Markdown on the course website for helpful tips and pointers).
1. Define the following terms in a sentence (or short paragraph) and state a formula if appropriate (this question is worth 5 marks).
i. Clustered Random Sample
ii. Statistical Inference
iii. Census
iv. Nonprobability Sample
v. Complementary Events
2. When the American League and the National League champions are evenly matched, the probabilities that a World Series will end in 4, 5, 6, or 7 games are, respectively, 1/8, 1/4, 5/16, and 5/16. What is the expected length of a World Series when the two teams are evenly matched? (this question is worth 2 marks).
3. Suppose that the probability of a success on a Bernoulli trial is π = 0.25 and is independent from trial to trial. Find the probability of getting X = 2 successes in n = 100 trials (this question is worth 2 marks).
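For reference, a binomial probability p(X = x) can be computed in R with the built-in dbinom() function. The sketch below uses illustrative values (n = 10, π = 0.5, x = 3) that are assumptions for the example and are not the values in this question; it also checks the built-in result against the formula written out by hand.

```r
# Illustrative values only (assumed for this sketch, not the values in Question 3)
n <- 10        # number of Bernoulli trials
prob <- 0.5    # probability of success on each trial
x <- 3         # number of successes of interest

# Built-in binomial probability mass function
p_builtin <- dbinom(x, size = n, prob = prob)

# The same probability from the formula C(n, x) * prob^x * (1 - prob)^(n - x)
p_manual <- choose(n, x) * prob^x * (1 - prob)^(n - x)

print(c(builtin = p_builtin, manual = p_manual))
```

The two numbers should agree; the hand-written version is included only to show what dbinom() is computing.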
4. A market research firm goes to 36 stores and determines how much (in cents) each charges for an identical tube of travel-sized toothpaste. The resulting sample of prices is shown below (ignore the index markers such as ## [1] at the beginning of each line; this question is worth 4 marks).
## [1] 105 102 103 103 104 102 107 103 106 101 101 107 113 104 102 105 104
## [18] 101 104 103 112 109 102 107 103 104 104 101 106 105 102 104 103 113
## [35] 108 105
i. Using R (or otherwise), calculate the sample mean, median, and mode of the toothpaste prices.
ii. Using R, calculate the sample variance, standard deviation, and interquartile range of the toothpaste prices.
iii. Using the moments package in R (you must install this package first, then load it via library(moments) before you call the function skewness() etc.), does the coefficient of skewness indicate that the distribution of prices is skewed to the left or to the right?
iv. Using the moments package in R, does the coefficient of kurtosis indicate that the distribution of prices is more or less heavy tailed than the normal distribution?
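One possible way to set up the data for this question in R is sketched below. The variable name prices and the helper sample_mode() are hypothetical names chosen for this example. Base R has no function for the statistical mode (its mode() function reports an object's storage mode), so a small helper is one option. The moments calls are guarded so that the chunk still runs if the package has not been installed yet.

```r
# Toothpaste prices (in cents) copied from the question
prices <- c(105, 102, 103, 103, 104, 102, 107, 103, 106, 101, 101, 107,
            113, 104, 102, 105, 104, 101, 104, 103, 112, 109, 102, 107,
            103, 104, 104, 101, 106, 105, 102, 104, 103, 113, 108, 105)
stopifnot(length(prices) == 36)  # sanity check: one price per store

# Hypothetical helper (not part of base R): the most frequent value(s) in x.
# All tied values are returned if there is more than one mode.
sample_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}
print(sample_mode(c(1, 2, 2, 3)))  # toy input, unrelated to the question

# skewness() and kurtosis() come from the moments package; this guard only
# calls them when the package is already installed.
if (requireNamespace("moments", quietly = TRUE)) {
  print(moments::skewness(prices))
  print(moments::kurtosis(prices))  # not excess kurtosis: a normal distribution gives 3
}
```

Note that kurtosis() in the moments package reports Pearson's kurtosis, so the comparison value for the normal distribution is 3 rather than 0.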
5. It costs $60.00 to test a certain component of a machine. If a defective component is installed, it costs $1,200.00 to repair the resulting damage to the machine (this question is worth 4 marks).
i. What is the formula for the mean (expected value) of this discrete random variable that takes on two outcomes, $0 or $1,200 with probabilities p(no repair) and p(repair), respectively?
ii. Is it more profitable to install the component without testing it if it is known that
a. 3% of all components are defective?
b. 5% of all components are defective?
c. 8% of all components are defective?
6. Consider the binomial distribution with n = 5 and π = 0.3, hence X ∈ {0, 1, ..., 5} (this question is worth 4 marks).
i. Calculate the mean and variance of the distribution using the E(X) = Σᵢ xᵢ p(X = xᵢ) type formulas for each.
ii. Repeat part i above using the short-cut formulas (i.e., the formulas depending only on π and n).
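As a reminder of how the sum-type formulas in part i translate into R, the sketch below computes the mean and variance of a discrete random variable from its outcomes and probabilities. The distribution used here (a fair six-sided die) is an assumption chosen purely for illustration and is not the binomial distribution in this question.

```r
# Illustrative distribution (assumed for this sketch): a fair six-sided die
x <- 1:6
p <- rep(1 / 6, 6)
stopifnot(isTRUE(all.equal(sum(p), 1)))  # probabilities must sum to one

# Mean: each outcome weighted by its probability, then summed
mu <- sum(x * p)

# Variance: probability-weighted squared deviations from the mean
sigma2 <- sum((x - mu)^2 * p)

print(c(mean = mu, variance = sigma2))
```

The same two lines for mu and sigma2 apply to any discrete distribution once x and p are replaced with the appropriate outcomes and probabilities.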
7. (Bonus) Suppose that the subjective probabilities for finding oil are p(oil) = 0.5 and p(no oil) = 0.5. Your company drills to 500 ft and finds no oil, but it samples the soil at 500 ft. From experience the company knows that the conditional probability of finding this type of soil given that oil is present is 0.2, while the probability of finding this type of soil given that no oil is present is 0.8. What is the posterior probability of finding oil given this soil sample (this question is worth 3 bonus marks)?
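A posterior probability of this kind follows from Bayes' theorem with two competing hypotheses. The sketch below shows the general calculation in R; the function name posterior() and its argument names are hypothetical, and the numbers passed to it are illustrative assumptions, not the values in this question.

```r
# Hypothetical helper: posterior P(A | B) when the only alternatives are A and not-A
posterior <- function(prior_a, lik_b_given_a, lik_b_given_not_a) {
  joint_a <- prior_a * lik_b_given_a                 # P(A and B)
  joint_not_a <- (1 - prior_a) * lik_b_given_not_a   # P(not-A and B)
  joint_a / (joint_a + joint_not_a)                  # divide by P(B)
}

# Illustrative inputs only (assumed for this sketch, not the values in Question 7)
post <- posterior(prior_a = 0.3, lik_b_given_a = 0.9, lik_b_given_not_a = 0.2)
print(post)
```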