Assignment
Final Project
MAS 640 – Time Series Analysis and Forecasting
Using data from Zillow – https://www.zillow.com/research/data/ – you will model median sale price (in $) in a city or county assigned to you. You will present your results in the form of a paper created using R Markdown.
Your paper should include an introduction to the problem and data, your final model and how well it fits the observed data, how and why you selected that model, and any assumptions you are making and how reasonable those assumptions are. Finally, with your selected model, forecast median sale price for the next three months and discuss your findings.
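As a rough illustration of the final forecasting step only, the sketch below fits a model and produces a three-month forecast with approximate 95% intervals using base R. The series y is simulated (made-up values, not Zillow data), and the ARIMA(0,1,1) order is an arbitrary placeholder, not a recommended model; choosing and justifying your own model is the point of the project.

```r
# Simulated monthly series standing in for a real price series.
# All values are made up for illustration; this is not Zillow data.
set.seed(640)
y <- ts(200000 + cumsum(rnorm(120, mean = 300, sd = 1500)),
        start = c(2008, 3), frequency = 12)

# Placeholder model: the order c(0, 1, 1) is an assumption for this sketch,
# not a suggested final model.
fit <- arima(y, order = c(0, 1, 1))

# Forecast the next three months and build approximate 95% intervals
# from the forecast standard errors.
fc <- predict(fit, n.ahead = 3)
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se

cbind(forecast = fc$pred, lower, upper)
```

The intervals rely on approximately normal forecast errors, which is one of the assumptions your paper should examine.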
Your paper should include any relevant tables and figures, but no R code in the body of the paper. Your paper may include an appendix for additional tables, figures, or other material as needed (not required). You can include your R code in the appendix (with the eval=F, echo=T options added to the code chunk), or you can submit the R code separately in a .R file.
I am less interested in whether or not you found some “correct” final model (those do not exist anyway...), and more interested in 1) your ability to explain how you arrived at that model, 2) your ability to communicate results, and 3) your ability to put together a well-formatted paper.
This is an individual project and you are not to work together. If you are found cheating or using another student's R code, both individuals will receive an F for the course.
Submissions
Please submit the following before midnight on Friday, March 2nd –
• A PDF of your paper
• The Rmd used to create that paper
• Your R Code (if it’s in the Rmd, that is enough)
Reading in the data and defining the time series
Please see the following sample for help reading in and defining your data as a time series.
fl <- read.csv('https://raw.githubusercontent.com/lehmannd/classData/master/flData.csv')
head(fl, n=3)
    State Year Month  Price
1 Florida 2008     3 206300
2 Florida 2008     4 197900
3 Florida 2008     5 191300
Take note of the first year and first month in the data, which we need for the next line –
fl <- ts(fl$Price, start=c(2008, 03), frequency=12)
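To confirm the series was defined with the intended start date and frequency, it can help to inspect it right after calling ts(). The sketch below uses a small made-up data frame, fl_df, with the same column layout as the sample file, so it runs without a download. Only the column names and the March 2008 start come from the sample above; the prices are invented.

```r
# Hypothetical stand-in for the downloaded data: same columns as the sample,
# but the Price values are made up for illustration.
fl_df <- data.frame(
  State = "Florida",
  Year  = c(rep(2008, 10), rep(2009, 2)),
  Month = c(3:12, 1:2),
  Price = 200000 + 1000 * seq_len(12)
)

# Take the start date from the first row instead of typing it by hand.
first_year  <- fl_df$Year[1]
first_month <- fl_df$Month[1]

fl_ts <- ts(fl_df$Price, start = c(first_year, first_month), frequency = 12)

# Quick checks that the time index matches the data.
start(fl_ts)      # first (year, month) of the series
end(fl_ts)        # last (year, month) of the series
frequency(fl_ts)  # observations per year
```

Reading the start year and month from the data rather than hard-coding them reduces the chance of a mismatch when you switch to your assigned city or county.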