Coursework 2 – Discrete Data Analysis

MATH20811 Practical Statistics: Coursework 2

(November 2021)

The marks awarded for this coursework constitute 30% of the total assessment for the module.

Your solution to the coursework should be fairly concise (maximum of about 10 pages) and it
should take, on average, about 15 hours to complete.

Please read all the instructions and advice given below carefully.

The submission deadline is 10:00 am on Monday 13 December 2021.

Late Submission of Work: Any student’s work that is submitted after the given deadline will
be classed as late, unless an extension has already been agreed via mitigating circumstances or a
DASS extension.

The following rules for the application of penalties for late submission are quoted from the
University guidance on late submission document, version 1.3 (dated July 2019):

“Any work submitted at any time within the first 24 hours following the published submission
deadline will receive a penalty of 10% of the maximum amount of marks available. Any work
submitted at any time between 24 hours and up to 48 hours late will receive a deduction of 20%
of the marks available, and so on, at the rate of an additional 10% of available marks deducted
per 24 hours, until the assignment is submitted or no marks remain.”

Your submitted solutions should all be in one document, which must be prepared
using LaTeX. A 5-mark penalty will be imposed if this is not adhered to. For each
part of the project you should explain how you completed what is required,
show your working and comment on computational results, where applicable.

When you include a plot, be sure to give it a title and label the axes correctly.

When you have written or used R code to answer any of the parts, then you should list this R code
after the particular written answer to which it applies. This may be the R code for a function you
have written and/or code you have used to produce numerical results, plots and tables. R code
should also be clearly annotated.

Do not use screenshots of R code/output in your report. Instead, to include R code
use the verbatim environment and summarise R output in tables using the table environment, as
demonstrated in the solution of Example Sheet 2.

Your file should be submitted through the Turnitin assessment called “PS CW2
2021” in the folder “MATH20811 CW2” under Assessment & Feedback on Blackboard
by the above time and date. Work will be marked anonymously on Blackboard, so please
ensure that your filename is clear but does not contain your name or student ID number.
Similarly, do not include your name or ID number in the document itself.

There is a basic LaTeX template file on Blackboard which you may choose to use for typing up
your solutions. The file is called CW2_submitted_work.tex.

Turnitin will generate a similarity report for your submitted document and indicate matches to
other sources, including billions of internet documents (both live and archived), a subscription
repository of periodicals, journals and publications, as well as submissions from other students.
Please ensure that the document you upload represents your own work and is written in your own
words. The Turnitin report will be available for you to see shortly after the due date.

This coursework should help to reinforce some of the methodology you have been studying,
as well as the skills in R you have been developing in the module. Correct interpretation and
meaningful discussion of the results (i.e. an attempt to put the results into context) are important
in order to achieve a high mark for the coursework.

The following table gives the numbers of road casualties in Greater London during 2013,
categorised as being either “fatal”, “serious” or “slight” and grouped by five modes of transport.

                             Casualty Severity
Mode of Transport      Fatal   Serious   Slight     Sum

Pedestrian                65       773     4343    5181
Pedal Cycle               14       475     4134    4623
Powered 2 Wheeler         22       488     3992    4502
Car                       25       310     9850   10185
Other Vehicles             6       146     2556    2708
Sum                      132      2192    24875   27199

The question of interest is whether the five modes of transport differ in their respective
probabilities of different casualty severity. You should regard the row sums as being fixed
quantities here.

1. Given the description of the data, write down a suitable probability model for this matrix
of counts.

[2 marks]

2. Read the data as a matrix into R and label the two dimensions appropriately. Print out
your resulting data matrix.

Then calculate appropriate proportions and comment informally on the question of interest
given above.

[5 marks]
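
As a starting point for part 2, the following is a minimal sketch of one way to enter the table in R and compute row proportions. The object names `counts` and `row_props` are illustrative choices, not names prescribed by the coursework, and `prop.table` with `margin = 1` is just one of several ways to divide each row by its total.

```r
# Counts copied from the table above, one row per mode of transport.
counts <- matrix(
  c(65, 773, 4343,
    14, 475, 4134,
    22, 488, 3992,
    25, 310, 9850,
     6, 146, 2556),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    Mode = c("Pedestrian", "Pedal Cycle", "Powered 2 Wheeler",
             "Car", "Other Vehicles"),
    Severity = c("Fatal", "Serious", "Slight")
  )
)
print(counts)

# Row proportions: each count divided by its (fixed) row total,
# so every row of row_props sums to 1.
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 4))
```

Conditioning on the row totals in this way matches the instruction to treat the row sums as fixed.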

3. Present the proportions data graphically and comment on the resulting plot.

[5 marks]

4. State the relevant statistical hypotheses and then:

– explain how the expected frequencies are calculated under the assumption that your
H0 is true and obtain their values for these data;

– test H0 vs H1 using a significance level α = 0.05 and a critical value from the asymptotic
null distribution of your test statistic. (You should clearly state what this distribution
is.) State your conclusions.

[3 marks]
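
For orientation, the quantities in part 4 can be written out as below. This is a sketch that assumes the usual Pearson chi-squared test of homogeneity; the coursework does not name a particular statistic, and the notation ($n_{ij}$ for the count in row $i$ and column $j$, $n_{i+}$ and $n_{+j}$ for row and column totals, $n_{++}$ for the grand total, $\pi_{ij}$ for the probability of severity $j$ within mode $i$) is assumed here rather than taken from the module notes.

```latex
% Hypotheses: the severity distribution is the same for every mode of transport.
H_0:\ \pi_{ij} = \pi_j \ \text{for all } i = 1,\dots,5,\ j = 1,2,3
\qquad \text{vs} \qquad
H_1:\ \pi_{ij} \neq \pi_j \ \text{for at least one pair } (i,j).

% Expected frequencies under H_0, using the pooled estimate \hat{\pi}_j = n_{+j}/n_{++}.
E_{ij} = n_{i+}\,\hat{\pi}_j = \frac{n_{i+}\, n_{+j}}{n_{++}}

% Pearson statistic and its asymptotic null distribution.
X^2 = \sum_{i=1}^{5}\sum_{j=1}^{3} \frac{(n_{ij} - E_{ij})^2}{E_{ij}}
\ \overset{\text{approx.}}{\sim}\ \chi^2_{(5-1)(3-1)} = \chi^2_{8}
\quad \text{under } H_0 .
```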

5. Print out some appropriate sets of residuals and comment on their values in the light of the
conclusions you made in part 4.

[3 marks]

6. Write a function in R to obtain B = 5000 values of the test statistic, each one calculated
using a set of data simulated under the assumption that the null hypothesis is true. You
should aim to efficiently make use of for loops in doing this.

Produce a histogram of these simulated values, superimpose the plot of the asymptotic null
distribution and comment informally on the goodness-of-fit.

[6 marks]
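
The simulation in part 6 could be organised along the following lines. This is a hedged sketch, not a model answer: it assumes the Pearson chi-squared statistic, assumes that data under the null hypothesis are generated by drawing each row from a multinomial distribution with the observed row total and the pooled column proportions, and uses the hypothetical function names `pearson_stat` and `simulate_null`. The seed is arbitrary.

```r
set.seed(2021)  # arbitrary seed, used only to make the run reproducible

# Observed counts (rows: mode of transport, columns: severity).
counts <- matrix(
  c(65, 773, 4343, 14, 475, 4134, 22, 488, 3992,
    25, 310, 9850, 6, 146, 2556),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    Mode = c("Pedestrian", "Pedal Cycle", "Powered 2 Wheeler",
             "Car", "Other Vehicles"),
    Severity = c("Fatal", "Serious", "Slight")
  )
)

# Pearson chi-squared statistic for a two-way table of counts.
pearson_stat <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

# Simulate B values of the statistic under H0 (common severity
# probabilities for every row), keeping the row totals fixed.
simulate_null <- function(tab, B = 5000) {
  n_row <- rowSums(tab)               # fixed row totals
  p_hat <- colSums(tab) / sum(tab)    # pooled estimate of the common probabilities
  stats <- numeric(B)                 # pre-allocate the result vector
  sim <- matrix(0, nrow(tab), ncol(tab))
  for (b in seq_len(B)) {
    for (i in seq_len(nrow(tab))) {
      sim[i, ] <- rmultinom(1, size = n_row[i], prob = p_hat)
    }
    stats[b] <- pearson_stat(sim)
  }
  stats
}

sim_stats <- simulate_null(counts, B = 5000)
dof <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Density-scaled histogram with the chi-squared density superimposed.
hist(sim_stats, freq = FALSE, breaks = 40,
     main = "Simulated null distribution of the test statistic",
     xlab = "Simulated statistic")
curve(dchisq(x, df = dof), add = TRUE, lwd = 2)
```

Pre-allocating `stats` and reusing the `sim` matrix avoids growing objects inside the loops, which is one reading of the request to use for loops efficiently.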

7. Construct approximate 95% confidence intervals for (a) the difference between the
probability that a pedestrian is seriously injured and the probability that a car driver is
seriously injured and (b) for cyclists only, the difference between the probabilities of a
serious injury and a slight injury.

[6 marks]
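
The two intervals in part 7 differ in structure: (a) compares proportions from two independent rows, while (b) compares two proportions within the same row, which are negatively correlated. The sketch below assumes Wald-type intervals based on the normal approximation; the subscripted notation ($\hat{p}$ for a sample proportion, $n$ for a row total, with subscripts naming the row and the severity) is introduced here for illustration only.

```latex
% (a) Independent rows: pedestrians (ped) and cars (car), serious (S) injuries.
\hat{p}_{\text{ped},S} - \hat{p}_{\text{car},S}
\ \pm\ z_{0.975}
\sqrt{\frac{\hat{p}_{\text{ped},S}\,(1-\hat{p}_{\text{ped},S})}{n_{\text{ped}}}
    + \frac{\hat{p}_{\text{car},S}\,(1-\hat{p}_{\text{car},S})}{n_{\text{car}}}}

% (b) Same row: cyclists (cyc), serious (S) versus slight (L) injuries.
% Within one multinomial row, Cov(\hat{p}_S, \hat{p}_L) = -p_S p_L / n, so
% Var(\hat{p}_S - \hat{p}_L) = \{ p_S + p_L - (p_S - p_L)^2 \} / n.
\hat{p}_{\text{cyc},S} - \hat{p}_{\text{cyc},L}
\ \pm\ z_{0.975}
\sqrt{\frac{\hat{p}_{\text{cyc},S} + \hat{p}_{\text{cyc},L}
      - (\hat{p}_{\text{cyc},S} - \hat{p}_{\text{cyc},L})^2}{n_{\text{cyc}}}}

% with z_{0.975} \approx 1.96.
```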

[Total marks = 30]