Project overview
Stat 360 Project Rubric
The project is to create an R package that implements the Multivariate Adaptive Regression Splines (MARS) algorithm described in Friedman (1991):
Friedman, J. H. (1991). Multivariate Adaptive Regression Splines. The Annals of Statistics, Vol. 19, No. 1 (Mar., 1991), pp. 1-67.
The main function, mars(), takes a formula, data frame, and control object as input, fits the model by the forward and backward stepwise algorithms (Algorithms 2 and 3), and returns a mars object that contains the final least squares fit, along with a description of the basis functions from the final fit. You will also write predict(), plot(), summary(), and print() methods for mars objects. The mars() function and these four methods will be documented using roxygen2 to produce R documentation files. Your package will include a test dataset that I will provide, along with unit tests of mars(), of the fwd_stepwise() and bwd_stepwise() functions that implement the forward and backward algorithms, and of the predict.mars() method. You will also include a package vignette that shows users how to use mars() and all four mars methods. In the vignette, include an analysis of one dataset that you find interesting (and that is not the test dataset). The dataset could be from another R package (easiest) or could be one that you have downloaded from another source and included as a package dataset.
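For orientation, the basis functions in Friedman (1991) are built as products of hinge functions of the form max(0, s(x - t)), where s is +1 or -1 and t is a knot. A minimal base R sketch follows; the function name hinge and its argument names are illustrative assumptions, not part of the required interface.

```r
# Hinge function used to build MARS basis functions:
# returns max(0, s * (x - knot)) elementwise.
# `hinge`, `s` and `knot` are illustrative names (assumptions), not a required API.
hinge <- function(x, s = 1, knot = 0) {
  pmax(0, s * (x - knot))
}

x <- c(1, 2, 3)
hinge(x, s = +1, knot = 2)  # nonzero only to the right of the knot
hinge(x, s = -1, knot = 2)  # nonzero only to the left of the knot
```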
The package will be made available in a mars directory within your group’s project folder in the SFUStat360Projects GitHub repository. For example, if your group’s project folder is MyGroup, your R package will be at https://github.com/SFUStatgen/SFUStat360Projects/tree/main/Projects/MyGroup/mars/. You will also upload (i) a PDF of your vignette and (ii) a PDF of the package reference manual built with devtools::build_manual() to a Crowdmark assessment where they will be marked. The code on GitHub and documentation on Crowdmark are due at noon on Tuesday April 12.
Grading Scheme
Code on GitHub (25 marks)
At the project due date, all projects will be pulled from the SFUStat360Projects GitHub repository. The markers will copy the standard testthat suite from the Exercises/ProjectTestfiles folder of the SFUStat360 GitHub repository to your package’s tests directory and will run both devtools::test() and devtools::check() on the package. The grading scheme for your code is as follows.
1. README (1 mark): A README.md file in your project’s main folder (outside the mars package folder) should include your group members’ names and student numbers.
2. Working R package (7 marks):
• (4 marks) The package should pass the unit tests of mars(), fwd_stepwise(), bwd_stepwise() and predict.mars().
• (1 mark) The package should include DESCRIPTION, LICENCE and NAMESPACE files, and data, data-raw, man, R, tests and vignettes directories
• (2 marks) In addition to passing the unit tests and having the required structure, the package should pass devtools::check().
3. mars.R (9 marks): The main mars.R file should include the mars() function and any others, such as fwd_stepwise(), that are called by mars(). Arrange your functions in a “top-down” manner, with higher-level functions appearing first, followed by successive levels of lower-level functions. You are graded on the following criteria:
• Data structures (2): The input data structures should be a formula, data and mars.control object. The output data structure is an S3 object of class mars that inherits from class lm.
• Correctness (5): The code should work correctly on the test suite. That is, it should pass the unit tests of mars(), fwd_stepwise() and bwd_stepwise() with no errors.
• Readability (1): The steps and logic of your implementation should be clearly laid out. It should be easy for someone else in the class to read your code and understand what is going on.
• Efficiency (1): Take steps to avoid computational inefficiencies, such as excessive copying of large R objects.
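The data-structure and top-down layout criteria above can be sketched as follows. This is a minimal outline under stated assumptions, not a reference solution: the two stepwise functions are stubs that simply pass the predictors through, and the Mmax control field and the fields stored in the returned list are hypothetical.

```r
# Sketch only: a possible top-down layout for mars.R, using base R.
# The two stepwise functions are stubs, and `Mmax` and the list fields
# are assumptions for illustration.

mars <- function(formula, data, control = mars.control()) {
  cc  <- match.call()
  mf  <- model.frame(formula, data)
  y   <- model.response(mf)
  x   <- model.matrix(attr(mf, "terms"), mf)[, -1, drop = FALSE]  # drop intercept column
  fwd <- fwd_stepwise(y, x, control)
  bwd <- bwd_stepwise(fwd, control)
  # Final least squares fit on the selected basis (B already holds the intercept).
  fit <- lm(y ~ . - 1, data = data.frame(y = y, bwd$B))
  out <- c(list(call = cc, formula = formula, B = bwd$B), fit)
  class(out) <- c("mars", class(fit))  # S3 class mars, inheriting from lm
  out
}

# Stub: a real version would implement Algorithm 2 of Friedman (1991).
fwd_stepwise <- function(y, x, control) {
  list(y = y, B = cbind(B0 = 1, x))
}

# Stub: a real version would implement Algorithm 3 of Friedman (1991).
bwd_stepwise <- function(fwd, control) {
  fwd
}

# Constructor for the control object; `Mmax` is an assumed field name.
mars.control <- function(Mmax = 2) {
  structure(list(Mmax = Mmax), class = "mars.control")
}

# Usage with a built-in dataset (not the course test dataset).
fit <- mars(mpg ~ wt + hp, data = mtcars)
class(fit)
```

Because the returned object keeps the lm components and lists "lm" after "mars" in its class vector, generics without a mars method, such as coef(), fall back to the lm behaviour.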
4. Methods (8 marks): Include one file for each of the plot(), predict(), print() and summary() methods. For each method you are marked on:
• Correctness (1 mark): The method should work correctly on the test dataset.
• Familiarity (1 mark): The method should look familiar to someone who has used the analogous method for lm() and glm().
Documentation (25 marks)
By the project due date you will need to submit a PDF file containing the package vignette and a PDF file containing the package reference manual.
• To obtain the PDF of your vignette, go to the vignettes folder, open the .Rmd file, knit it, open the resulting .html file in a browser, and use your browser’s print feature to save the document as a PDF.
• To obtain the PDF reference manual, use devtools::build_manual() to build a PDF that will be saved to the parent directory of your mars package directory.
1. Vignette (11 marks): The vignette should include an analysis of one dataset that you find interesting (and that is not the test dataset). There is some overlap between the vignette and the documentation of the mars() function. Think of the vignette as long-form documentation that teaches a user how to use all of the features of your package, rather than the terse documentation of the individual functions that is a reference for users who already know something about the package and just want a refresher or a quick-start on specific functions. The vignette should:
• (2 marks) be logically organized for the goal of introducing your package and its features to a user.
• (2 marks) give a description of the MARS algorithm
• (1 mark) show the user how to prepare the inputs and call mars()
• (4 marks) show the user how to use all four methods on a mars object
• (2 marks) use the most interesting data you can find. Trivial examples will get no marks.
2. Documentation for mars() (10 marks): The documentation for your mars() function must be generated from roxygen2 comments in your mars.R source file. Marks are allocated as follows:
a. (1 mark) Description (brief) – a one- or two-line description of what the function does
b. (1 mark) Usage – how to call the function
c. (1 mark) Arguments – a list of arguments and their meaning
d. (2 marks) Details – a precise and detailed description of what the function does
e. (1 mark) Value – a description of the function’s return value
f. (1 mark) Author(s) – your name(s)
g. (1 mark) References – a reference to the Friedman paper and any other sources you think are necessary
h. (1 mark) See Also – a brief description of the methods written for MARS objects
i. (1 mark) Example – An example of how to use your function using the test dataset.
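The sections listed in items a to i map onto roxygen2 tags roughly as in the skeleton below. This is a hedged sketch: the wording of each section is placeholder text, the dataset name testdata is a hypothetical stand-in for the provided test dataset, and the function body is omitted. The Usage section is generated by roxygen2 from the function definition itself, so it needs no tag.

```r
#' Multivariate Adaptive Regression Splines (MARS)
#'
#' Fit Friedman's MARS model by forward and backward stepwise selection.
#'
#' @param formula an R formula
#' @param data a data frame containing the variables in \code{formula}
#' @param control an object of class \code{mars.control}
#'
#' @details
#' Placeholder: describe the forward (Algorithm 2) and backward
#' (Algorithm 3) stepwise procedures here.
#'
#' @return an object of class \code{mars} that inherits from \code{lm}
#'
#' @author Your Name(s)
#'
#' @references
#' Friedman, J. H. (1991). Multivariate Adaptive Regression Splines.
#' The Annals of Statistics, 19(1), 1-67.
#'
#' @seealso \code{\link{predict.mars}}, \code{\link{plot.mars}},
#'   \code{\link{summary.mars}}, \code{\link{print.mars}}
#'
#' @examples
#' # `testdata` is a placeholder name for the provided test dataset (assumption)
#' # fit <- mars(y ~ ., data = testdata)
#'
#' @export
mars <- function(formula, data, control) {
  # Body omitted in this documentation sketch.
  invisible(NULL)
}
```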
3. (4 marks) Documentation for the predict(), plot(), summary(), and print() methods will also be prepared as roxygen2 comments in each of the respective source files.
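As one illustration of a documented method, here is a minimal sketch of a print method with its roxygen2 comments. The layout loosely imitates what print() shows for lm objects; the exact output format is an assumption, and the object used at the end is only a stand-in (an lm fit relabelled with class "mars"), not the output of a real mars() call.

```r
#' Print a mars object
#'
#' @param x an object of class \code{mars}
#' @param ... further arguments (ignored in this sketch)
#' @return \code{x}, invisibly
#' @export
print.mars <- function(x, ...) {
  cat("Call:\n")
  print(x$call)
  cat("\nCoefficients:\n")
  print(coef(x))
  invisible(x)  # return the object invisibly, as print methods conventionally do
}

# Stand-in object for illustration only: an lm fit relabelled as class "mars".
fit <- lm(mpg ~ wt, data = mtcars)
class(fit) <- c("mars", class(fit))
print(fit)
```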