Project 3
Find a scenario of your own with a time-to-event data set, a clearly described background, and, most importantly, interesting problems that require analyzing the data. The analysis of your time-to-event data is somewhat open-ended. At a minimum, you should estimate the survival curves, carry out some meaningful hypothesis tests, fit a Cox proportional hazards model or some parametric models, and do some simple model diagnostics using graphical and other tools.
How to Find a Time-to-event Data Set:
• Find a real time-to-event data set. A data set from any area is fine. Your data set should have the following characteristics:
  – Time to event should be the (possibly censored) response. It would be ideal to have some censoring or even truncation, but this is not a strict requirement.
  – There should be at least one explanatory variable. Two or more explanatory variables are better.
  – If your data set has no explanatory variables but its data structure is complicated enough, that is also OK.
• Students who are not statistics majors are encouraged to use time-to-event data sets from their own fields.
• Resources:
  – If you have access to real data from your consulting projects, internship, research, etc., they may be used for the project.
  – Journal articles provide some interesting data sets.
  – Some websites archive data sets.
  – Google!
• Some example data sets:
  – The National Health and Aging Trends Study (NHATS). Website link: http://www.nhats.org/
  – Predicting Geyser Eruption Time. Data set link: http://www.geyserstudy.org/ofvclogs.aspx. Some existing sites already provide predictions: http://www.rcn.montana.edu/resources/Predictions.aspx
  – NASA Data Repository. Website link: http://ti.arc.nasa.gov/tech/dash/pcoe/prognostic-data-repository/ One interesting data set is the “Turbofan engine degradation simulation data set.”
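To make the minimum analysis described at the top of this handout more concrete, here is a minimal sketch of two of its pieces: a Kaplan-Meier estimate of the survival curve and a two-sample log-rank test. It is written in plain Python for illustration only. The function names `kaplan_meier` and `logrank_statistic` are hypothetical, and the two small groups of times are made up; they do not come from any of the data sets listed above. For the project itself you would normally use an established survival analysis package, which also covers Cox models and diagnostics that this sketch does not.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve, surv = [], 1.0
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


def logrank_statistic(times_a, events_a, times_b, events_b):
    """Two-sample log-rank chi-square statistic (1 degree of freedom)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        # Observed minus expected events in group A at time t
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance; undefined when only one subject is at risk
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance


# Made-up illustration data: 1 = event observed, 0 = censored
times_a, events_a = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
times_b, events_b = [3, 5, 8, 9, 12, 20], [1, 1, 1, 0, 1, 1]

for t, s in kaplan_meier(times_a, events_a):
    print(f"group A: S({t}) = {s:.3f}")
print("log-rank chi-square:", logrank_statistic(times_a, events_a, times_b, events_b))
```

The log-rank statistic is compared with a chi-square distribution with one degree of freedom. Plotting the Kaplan-Meier curves of the groups side by side is a natural first graphical check before fitting any model.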