Department of Mathematics and Statistics STAT2402 Analysis of Observations

Final Examination
Due date: 6 pm, Monday 25 October 2021

Any extension to the due date requires special approval from your
student office.

1 Brief

Deb and Trivedi (1997) analysed data on 4406 individuals, aged 66 and over, who are
covered by Medicare, a public insurance program. The data are available through the
following R commands.

install.packages("MixAll")

install.packages("rtkore")

library(MixAll)

library(rtkore)

data(DebTrivedi)

db <- DebTrivedi

summary(db)

##       ofp              ofnp              opp
##  Min.   : 0.000   Min.   :  0.000   Min.   :  0.0000
##  1st Qu.: 1.000   1st Qu.:  0.000   1st Qu.:  0.0000
##  Median : 4.000   Median :  0.000   Median :  0.0000
##  Mean   : 5.774   Mean   :  1.618   Mean   :  0.7508
##  3rd Qu.: 8.000   3rd Qu.:  1.000   3rd Qu.:  0.0000
##  Max.   :89.000   Max.   :104.000   Max.   :141.0000
##       opnp               emer              hosp
##  Min.   :  0.0000   Min.   : 0.0000   Min.   :0.000
##  1st Qu.:  0.0000   1st Qu.: 0.0000   1st Qu.:0.000
##  Median :  0.0000   Median : 0.0000   Median :0.000
##  Mean   :  0.5361   Mean   : 0.2635   Mean   :0.296
##  3rd Qu.:  0.0000   3rd Qu.: 0.0000   3rd Qu.:0.000
##  Max.   :155.0000   Max.   :12.0000   Max.   :8.000
##        health        numchron     adldiff        region
##  poor     : 554   Min.   :0.000   no :3507   midwest:1157
##  average  :3509   1st Qu.:1.000   yes: 899   noreast: 837
##  excellent: 343   Median :1.000              other  :1614
##                   Mean   :1.542              west   : 798
##                   3rd Qu.:2.000
##                   Max.   :8.000
##       age          black        gender     married
##  Min.   : 6.600   no :3890   female:2628   no :2000
##  1st Qu.: 6.900   yes: 516   male  :1778   yes:2406
##  Median : 7.300
##  Mean   : 7.402
##  3rd Qu.: 7.800
##  Max.   :10.900
##      school          faminc        employed   privins
##  Min.   : 0.00   Min.   :-1.0125   no :3951   no : 985
##  1st Qu.: 8.00   1st Qu.: 0.9122   yes: 455   yes:3421
##  Median :11.00   Median : 1.6982
##  Mean   :10.29   Mean   : 2.5271
##  3rd Qu.:12.00   3rd Qu.: 3.1728
##  Max.   :18.00   Max.   :54.8351
##  medicaid
##  no :4004
##  yes: 402

Semester 2, 2021. Due date: 25/10/2021, 6 pm

You should obtain the description of the variables from Deb and Trivedi [1]. Note that
the variables have been coded differently from the description in the paper. You should
be able to work out the coding yourself from the descriptions in the paper and the data
summary.

Aims of Analysis

The following are the response variables of interest.
(a) ofp: number of physician office visits
(b) ofnp: number of non-physician office visits
(c) opp: number of physician hospital visits
(d) opnp: number of non-physician hospital visits
(e) emer: number of emergency room visits
(f) hosp: number of hospital stays

Analyse the data as follows. We consider a categorical variable Type that indicates the
type of visit. This variable will have levels ofp, ofnp, opp, opnp, emer, hosp. Another
variable Visits will contain the number of visits of the corresponding type.

Aim of the analysis: To determine the dependence of the number of each type of visit on
private health insurance and access to Medicaid.

Notes

(a) Consider the data DebTrivedi to be in wide format. Then it is a simple matter of
turning this into long format, creating the new variables Type and Visits as defined above.
(b) You should begin with the Poisson model and check the dispersion. If the model
assumptions are satisfied, then you do not need to do any more. Otherwise also fit the
quasi-Poisson model, and the negative binomial model. Select the best model. You will
need to examine plots of residuals as well.
(c) Note that several models could be equally good. In that case the simpler model is
preferred.
(d) You should investigate some meaningful interaction between categorical variables. Do
not over-complicate your model.

2 Examination submission

You are required to write a journal style article presenting your analysis. This is a take
home examination. You can use any resource you wish, except consulting with anyone
else. The submission should be entirely your own work. Any suspected breach of this will
be reported to the faculty misconduct officer for further action. Submit your article to
the LMS link under Final Exam 2021.

Your paper will include the following sections.

(a) Title and author information.
(b) Abstract of no more than 500 words.
(c) Introduction of no more than two pages. Here you can support your paper with other
information from online searches relating to the use of health services. For example, you
may refer to findings in the literature on the usage of health services in the USA and
Australia, or in the world, and the variables associated with use of health services. As
a final paragraph, give an outline of the paper, saying what each of the following
sections contains.
(d) Methodology of no more than a page. Here you can describe the data collection
process, as described in the paper by Deb and Trivedi [1]. Also describe the statistical
modelling without giving mathematical details. You can assume the reader is familiar
with standard statistical techniques.
(e) Results, describing the findings of the modelling, in no more than two pages.
(f) Discussion of no more than a page. Discuss your findings with reference to your
introduction.
(g) References. As many as you have used. Use any consistent format for references. You
may be able to simply copy and paste these.
(h) An appendix that details ALL the R commands and output used in the analysis.
Include only the code that is relevant to your analysis and your report. The code should
be clearly annotated so that it is easy to follow and understand, and should indicate
each stage of the analysis.

A sample journal article has been included for your guidance. (Note that the sample
paper does not contain the appendices that your paper requires.)

3 Marking Scheme

The marks will be allocated as follows.

(a) The Paper: 75 marks.
  (1) Abstract = 10 marks
  (2) Introduction = 20 marks. Particular attention will be given to your relevant
  literature search.
  (3) Methodology = 10 marks. Clarity of expression is important.
  (4) Results = 15 marks. Clearly describe your findings.
  (5) Discussion = 20 marks.
  Discuss the findings and any implications.
(b) Appendix A for R code: 25 marks. Clearly describe the steps in your modelling.
Clearly state what your preferred model is, with justification based on verifying model
assumptions.

References

[1] Deb P. and Trivedi P. K. Demand for medical care by the elderly: A finite mixture
approach. Journal of Applied Econometrics, 12:313–336, 1997.
[2] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation
for Statistical Computing, Vienna, Austria, 2021.
[3] Zeileis A., Kleiber C. and Jackman S. Regression models for count data in R. Journal
of Statistical Software, 27:1–25, 2008.
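
The reshaping step in Note (a) and the dispersion check in Note (b) of the Brief can be sketched roughly as follows. This is only an illustration under stated assumptions, not a prescribed solution: it uses a small simulated data frame (wide) in place of DebTrivedi, only three of the six visit types, and one possible model formula. The object names wide, long, types, fit_pois, fit_quasi and fit_nb are hypothetical.

```r
# Simulated wide-format data: a stand-in for DebTrivedi, NOT the real data.
set.seed(1)
n <- 200
wide <- data.frame(
  ofp      = rnbinom(n, size = 1, mu = 5),
  emer     = rnbinom(n, size = 1, mu = 0.3),
  hosp     = rnbinom(n, size = 1, mu = 0.3),
  privins  = factor(sample(c("no", "yes"), n, replace = TRUE)),
  medicaid = factor(sample(c("no", "yes"), n, replace = TRUE))
)

# Wide -> long: stack the count columns into Visits and label them by Type.
# unlist() concatenates the columns in the order given by `types`, which
# matches rep(types, each = n).
types <- c("ofp", "emer", "hosp")
long <- data.frame(
  id       = rep(seq_len(n), times = length(types)),
  Type     = factor(rep(types, each = n), levels = types),
  Visits   = unlist(wide[types], use.names = FALSE),
  privins  = rep(wide$privins, times = length(types)),
  medicaid = rep(wide$medicaid, times = length(types))
)

# Poisson model with Type-by-insurance interactions (one possible choice).
fit_pois <- glm(Visits ~ Type * privins + Type * medicaid,
                family = poisson, data = long)

# Dispersion estimate: Pearson chi-square divided by residual df.
# Values well above 1 point to overdispersion.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)
print(dispersion)

# Quasi-Poisson: same coefficients, standard errors scaled by the dispersion.
fit_quasi <- glm(Visits ~ Type * privins + Type * medicaid,
                 family = quasipoisson, data = long)
print(summary(fit_quasi)$dispersion)

# Negative binomial via MASS::glm.nb, if the MASS package is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  fit_nb <- MASS::glm.nb(Visits ~ Type * privins + Type * medicaid,
                         data = long)
  print(AIC(fit_pois, fit_nb))
}
```

With the real data, the same pattern would apply to db with all six visit types. Residual plots, for example plot(fit_pois), would still need to be examined before settling on a model.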