Stata programming assignment help

The dataset attached was generated from online data. The scenario below is fictional, but intended to test your statistical knowledge.

At Toronto General Hospital, we wanted to test the effect of two safe drugs on patients to see how they affected red cell distribution width (RCD). Patients’ RCD was measured at baseline, again at follow-up 3 weeks later, and again 2 weeks after that. The research question is whether either or both of the drugs affected red cell distribution width. We hypothesized that either or both drugs would increase RCD. To determine whether there was an actual effect, a control group was enrolled, comprising patients in Clinic A who did not receive either drug. The drugs were administered to patients in Clinic B, and every patient’s gender, age and ethnicity were recorded so that we could account for any expected differences across these categories.

Your task is to perform the statistical analysis to answer the research question above. As a hint, I expect that the statistical models you select will be tested to ensure they meet all statistical assumptions. I would like to see how you performed these tests and drew conclusions from them. If you indicated on your CV that you are knowledgeable in STATA, please perform the task using this software. If you do not have access to STATA, you can use R, but preference is given to solutions submitted in STATA, as the selected candidate will be working primarily with STATA in this position. You will be required to submit two documents.

  • Your process file (STATA .do file or R script)
  • One word/pdf document which comprises screenshots of the output of any statistical models, tests and visualisations of the raw data you generated, and anything else which you think would help explain your analysis.

    I strongly believe that visualizations of the raw data are important in supporting your conclusions, so I am interested in seeing these. A major piece of this document will be an easy-to-understand, non-statistical explanation of the conclusions you made from the output. Your explanation must be understandable by someone who has zero statistical knowledge and must directly answer the research question at hand. This should be a simple Yes/No answer to the question, accompanied by your reasoning and the evidence you find from your analysis.
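    As an illustration of the kind of raw-data visualization meant here, the following is a minimal Stata sketch, not an expected answer. It runs on simulated placeholder data, and every variable name (id, group, visit, rcd) and the group coding are assumptions, since the attached dataset's layout is not described in this brief.

```stata
* Sketch only: simulated placeholder data with hypothetical variable names.
* Replace this block with the attached dataset and its real variable names.
clear
set seed 2024
set obs 90
generate id = _n
generate group = mod(_n, 3)           // assumed coding: 0 = control, 1 and 2 = drug arms
expand 3                              // three visits per patient
bysort id: generate visit = _n - 1    // 0 = baseline, 1 = 3 weeks, 2 = 2 weeks later
generate rcd = 13 + 0.3*visit*(group > 0) + rnormal(0, 1)

* Distribution of the raw RCD values at each visit, one panel per group
graph box rcd, over(visit) by(group, rows(1))

* Mean RCD trajectory for each group across the three visits
collapse (mean) mean_rcd = rcd, by(group visit)
twoway (connected mean_rcd visit if group == 0) ///
       (connected mean_rcd visit if group == 1) ///
       (connected mean_rcd visit if group == 2), ///
       legend(order(1 "Control" 2 "Drug 1" 3 "Drug 2")) ///
       xtitle("Visit") ytitle("Mean RCD")
```

    Box plots show spread and outliers in the raw values, while the trajectory plot shows whether the groups diverge over time. Note that collapse replaces the data in memory with the group means.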

    As a further hint, here are a few questions to consider:

  • Is the red cell distribution width the same at baseline across the groups? If not, how does this impact your analysis?

  • As this is a medical field, it is quite possible that we were unable to obtain follow-up data. Patients may have passed away over time or may have been unable to come into the hospital for their scheduled visit due to unforeseeable events. What will you do about the missing data? Please provide evidence for why the methodology you selected is the most appropriate way to handle the missingness.
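    The two hint questions above could be approached in many ways. The following Stata sketch shows one of them under stated assumptions: the data are simulated placeholders, the variable names (id, group, age, rcd0, rcd1, rcd2) and group coding are hypothetical, and the linear mixed model is only one defensible choice among several.

```stata
* Sketch only: simulated placeholder data and hypothetical variable names.
clear
set seed 12345
set obs 150
generate id = _n
generate group = mod(_n, 3)                                // assumed: 0 = control, 1 and 2 = drug arms
generate age = round(runiform(30, 80))
generate rcd0 = rnormal(13, 1)                             // baseline
generate rcd1 = rcd0 + 0.3*(group > 0) + rnormal(0, 0.5)   // 3-week follow-up
generate rcd2 = rcd1 + 0.2*(group > 0) + rnormal(0, 0.5)   // 2 weeks later
replace rcd2 = . if runiform() < 0.15                      // mimic loss to follow-up

* Hint 1: compare baseline RCD across groups
tabstat rcd0, by(group) statistics(n mean sd)
oneway rcd0 group, tabulate

* Hint 2: describe the missingness and check whether it differs by group
misstable summarize rcd0 rcd1 rcd2
generate byte miss2 = missing(rcd2)
tabulate group miss2, row chi2

* One option: a linear mixed model on long-format data. It uses every
* observed visit and relies on the data being missing at random given
* the variables in the model.
reshape long rcd, i(id) j(visit)
mixed rcd i.group##i.visit age || id:

* Do the RCD trajectories differ between groups over time?
testparm i.group#i.visit

* Basic assumption checks on the residuals
predict double resid, residuals
predict double fitted, fitted
qnorm resid
scatter resid fitted, yline(0)
```

    Including the baseline visit as an outcome lets the group-by-visit interaction capture change from baseline even when the groups start at different levels. Multiple imputation or a sensitivity analysis would be reasonable alternatives if the missing-at-random assumption looks doubtful.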

    This task is not about getting the right answer, as there are several correct answers depending on how you approach it. I am interested in seeing your thought processes and how and why you selected the statistical models used. As such, I expect comments in your code, so I can follow your thinking.

    If anything is unclear or there are any questions, please do not hesitate to contact me.

    Thanks,

    Nathaniel Edwards

    Data Scientist

    __________________________________________

    Kidney Health Education and Research Group

    nathaniel.edwards@uhnresearch.ca

    LinkedIn

Your attempt on this assignment will give us a better idea of your interest, programming abilities and statistical knowledge for this position.