Project 3 survival guide
This survival guide (composed largely from emails I have sent to past students) is here to give some practical advice to help you complete project 3.
Analysis 1
Analysis 1 is asking you to conduct t-tests for differences in means.
Begin by watching the video "Two-Sample t-tests using the Data Analysis Toolpack" from the week 7 computer lab, which shows the procedure for running these tests with the Data Analysis ToolPak. Use the "t-Test: Two-Sample Assuming Unequal Variances" option.
Sort all the data using Experience. It is very important to select ALL the data before you sort, otherwise you will scramble the data.
• test the null hypothesis that there is no difference in mean Value between Experienced and Inexperienced skippers
• test the null hypothesis that there is no difference in mean Time between Experienced and Inexperienced skippers
Now sort all the data using Age of Boat.
• test the null hypothesis that there is no difference in mean Value between Old and New boats
• test the null hypothesis that there is no difference in mean Time between Old and New boats
Then run the same two tests (Value and Time) after sorting by Search Equipment, and again after sorting by General Equipment.
That is EIGHT (8) t-tests in total. Make sure you label everything as you go.
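If you want to sanity-check one of the Excel results, the unequal-variances t-test can be sketched in a few lines of Python. This is only a rough cross-check, not part of the project: the function name and the sample numbers are made up for illustration, and it returns just the t statistic and the Welch degrees of freedom (the p-value still comes from Excel).

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (t statistic, Welch df) for H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance is the sample variance (divides by n - 1),
    # which matches the Variance row in the Excel output.
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    std_error = math.sqrt(var1 / n1 + var2 / n2)
    t_stat = (mean1 - mean2) / std_error
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var1 / n1 + var2 / n2) ** 2 / (
        (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)
    )
    return t_stat, df


# Made-up numbers purely for illustration, not project data.
experienced = [12.0, 15.0, 11.0, 14.0, 13.0]
inexperienced = [9.0, 10.0, 8.0, 11.0, 7.0]
t_stat, df = welch_t(experienced, inexperienced)
print(round(t_stat, 3), round(df, 2))
```

The t statistic here should agree with the "t Statistic" row of the Excel table for the same two columns of data.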
Analysis 2
Next, look within each experience level. Focusing just on the experienced skippers, test the null hypothesis that there is no difference in mean Value between those with sophisticated search equipment and those with adequate search equipment, and then the same null hypothesis for mean Time. Then run the same two tests for the inexperienced skippers.
(You could either copy just the data for one experience group to a new sheet and then sort that by Search, or you could do a 2-level sort that sorts first on Experience and then on Search.)
With the data sorted like this, run two t-tests on Value (one for experienced and one for inexperienced skippers), and then two more on Time.
Now repeat for General Equipment and Age of Boat. You should have TWELVE (12) t-tests in total.
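The 2-level sort amounts to splitting the data into subgroups by Experience and then by the second variable. A minimal sketch of that idea is below. The helper name, the dictionary layout and the trip values are all hypothetical, and the "yes"/"no" and "sophisticated"/"adequate" labels are assumed to match how your spreadsheet codes them.

```python
from collections import defaultdict


def split_by(records, outer_key, inner_key, value_key):
    """Group the values of value_key by the pair (outer level, inner level)."""
    groups = defaultdict(list)
    for row in records:
        groups[(row[outer_key], row[inner_key])].append(row[value_key])
    return dict(groups)


# Hypothetical rows, not project data.
trips = [
    {"Experience": "yes", "Search": "sophisticated", "Value": 10.0},
    {"Experience": "yes", "Search": "adequate", "Value": 7.0},
    {"Experience": "no", "Search": "sophisticated", "Value": 6.0},
    {"Experience": "yes", "Search": "sophisticated", "Value": 12.0},
    {"Experience": "no", "Search": "adequate", "Value": 4.0},
]
groups = split_by(trips, "Experience", "Search", "Value")

# The two samples for one Analysis 2 t-test: experienced skippers only.
sample1 = groups[("yes", "sophisticated")]
sample2 = groups[("yes", "adequate")]
print(sample1, sample2)
```

Each t-test in Analysis 2 compares two of these subgroups that share the same Experience level.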
Analysis 3
You are asked to construct a confidence interval for a difference in means for each of the t-tests you did in analysis 1 or 2.
A key thing to note is that all the ingredients you need for the 95% CI formula (see the week 9 computer lab) are in the output of the t-tests you ran in analyses 1 and 2.
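95% Confidence Interval for a difference in means $\mu_1 - \mu_2$ (for $n_1, n_2 > 20$):
$$(\text{p.e.} - 2 \times \text{st. error},\ \text{p.e.} + 2 \times \text{st. error})$$
where the point estimate (p.e.) is $\bar{x}_1 - \bar{x}_2$.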
If you set up a formula that calls on the appropriate parts of the t-test output then you should only have to do this once below one of the t-test tables in Excel and then be able to cut and paste the formula to the same relative position below all the other t-tests (save typing time).
Rows of the Excel t-test output:
Mean
Variance = (st.dev)²
Observations
Hypothesized Mean Difference
df
t Statistic
P(T<=t) one-tail
t Critical one-tail
P(T<=t) two-tail
t Critical two-tail
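To make the arithmetic concrete, here is a small sketch of the interval calculation: point estimate plus or minus two standard errors, where the standard error is the square root of (variance 1 / n1 + variance 2 / n2). The function name and the input numbers are invented for illustration. In Excel you would point the formula at the Mean, Variance and Observations cells instead.

```python
import math


def ci_diff_means(mean1, var1, n1, mean2, var2, n2):
    """Approximate 95% CI for mu1 - mu2: point estimate +/- 2 standard errors."""
    point_estimate = mean1 - mean2
    std_error = math.sqrt(var1 / n1 + var2 / n2)
    return point_estimate - 2 * std_error, point_estimate + 2 * std_error


# Hypothetical values standing in for the Mean, Variance and Observations rows.
low, high = ci_diff_means(30.0, 16.0, 64, 25.0, 9.0, 36)
print(round(low, 2), round(high, 2))
```

Note that the inputs are the variances straight from the table, so there is no need to take a square root or square anything before plugging them in.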
Analysis 4
Analysis 4 is quite a lot like Task 4 from the CI computer lab.
The first thing you need to do is work out what the median value is.
Then create a new variable that takes the value “high” if catch is above the median and “low” if it is below the median (see the week 9 computer lab for how to go about this using the Excel IF statement).
Next create a pivot table of counts for Experience and the new variable.
To get the proportion of experienced skippers with high catch trips, divide the count of experienced skippers with high catch trips by the total number of experienced skippers. This gives you a p (the proportion of experienced skippers with high catch trips) and an n (the total number of experienced skippers) that you can plug into the CI for a proportion formula.
Then repeat the above step for inexperienced skippers. Then you are done!
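The steps above can be sketched as follows. Two things here are assumptions rather than anything stated in this guide: trips exactly equal to the median are labelled "low", and the CI for a proportion is taken to be p plus or minus two standard errors with standard error sqrt(p(1 - p)/n). Check the lab for the exact formula you are expected to use. The function names and the counts are made up.

```python
import math
import statistics


def label_high_low(values):
    """Label each value 'high' if above the median, otherwise 'low'."""
    med = statistics.median(values)
    # Assumption: values equal to the median are labelled 'low'.
    return ["high" if v > med else "low" for v in values]


def proportion_ci(successes, n):
    """Approximate 95% CI for a proportion: p +/- 2 * sqrt(p * (1 - p) / n)."""
    p = successes / n
    std_error = math.sqrt(p * (1 - p) / n)
    return p - 2 * std_error, p + 2 * std_error


# Made-up catch values and counts, not project data.
labels = label_high_low([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
low, high = proportion_ci(60, 100)  # e.g. 60 "high" trips out of 100 skippers
print(labels, round(low, 3), round(high, 3))
```

In the project, `successes` and `n` would come from the pivot table row for experienced (or inexperienced) skippers.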
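Key to the t-test output used in Analysis 3:
Null hypothesis: $H_0: \mu_1 - \mu_2 = 0$
Point estimate: $\bar{x}_1 - \bar{x}_2$
Standard error: $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Mean (Variable 1, Variable 2): $\bar{x}_1$, $\bar{x}_2$
Variance (Variable 1, Variable 2): $s_1^2$, $s_2^2$
Observations (Variable 1, Variable 2): $n_1$, $n_2$
df: Degrees of freedom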

Analysis 5
Analysis 5 requires you to use chi-square tests. The best place to start is to take a look at the demo videos in the week 10 computer lab (and the first online lecture from week 10).
You should begin by getting a table of counts (the easiest way is to make a pivot table), e.g. for Experience and Age of Boat it is:
Count of Boat   Age of boat
Experienced     new   old   ALL
no               25    21    46
yes              56    49   105
ALL              81    70   151
Then you want to test the null hypothesis that the variables Experience and Age of Boat are independent (following the same procedure as in lab 10, i.e. make a table of expected counts, compute the chi-square statistic and use CHIDIST to get a p-value).
You then do the same for Experience vs Search, Experience vs General Equipment, Age vs Search, Age vs General Equipment and General Equipment vs Search. So SIX (6) chi-square tests in total. Once you have the formulas set up for the first one it should be easy to copy and paste them for the other five tests.
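As a cross-check on the Excel working, here is a rough sketch of the chi-square calculation applied to the Experience by Age of Boat counts (25, 21, 56, 49). The function names are invented for this example. The p-value shortcut uses the complementary error function and is only valid for 1 degree of freedom, i.e. a 2x2 table. In Excel, CHIDIST does this step for you.

```python
import math


def chi_square_independence(table):
    """Return (expected counts, chi-square statistic, df) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df


def p_value_df1(chi2):
    """Upper-tail chi-square probability, valid ONLY for 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: Experienced no / yes. Columns: Age of boat new / old.
observed = [[25, 21], [56, 49]]
expected, chi2, df = chi_square_independence(observed)
p_value = p_value_df1(chi2)
print(round(chi2, 4), df, round(p_value, 3))
```

For this table the statistic is tiny and the p-value is large, so there is no evidence against independence of Experience and Age of Boat.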
Pulling it all together
Findings:
Prioritize results that could not be attributed to chance variation (i.e. tests for differences in means that were statistically significant, or associations between variables that were statistically significant). Each finding in your report should be backed up with a table in your appendix, e.g. if one of your findings is
“The mean catch for experienced skippers is 5.9 to 9.2 tonnes higher than the mean catch for inexperienced skippers (Table X)”
then Table X should be somewhere in your appendix and contain information on the t-test (i.e. the means, variances and sample sizes for the two groups, the value of the t-statistic and the corresponding p-value) and the confidence interval. The caption to Table X should make it clear which null hypothesis the t-test relates to, and whether the p-value is one-sided or two-sided.
Just like in project 1, you should pay attention to formatting: don’t just paste the table straight from Excel without thinking about it. How many decimal places do you need? Are all the rows relevant, or should you get rid of some?
The Appendix
Have a methods section that explains what features of Excel you used and any formulas you used (e.g. the one for calculating a CI for a difference in means).
Have tables that very briefly summarize all the tests you did, e.g. for analyses 1 and 2 this might just show the differences in means and the p-values.
Have tables that give more detailed support for your specific findings.
Have clear captions for these tables, e.g. whenever you report a p-value it should be clear what null hypothesis it is testing.