In this homework, you are going to work on the bag of little bootstraps algorithm.
In this package, I have implemented the bag of little bootstraps for the linear regression model.
Your job is to improve my package in various ways. For example:
1. The current implementation uses only one CPU. Make it possible to use more than one CPU. Note that you should let users decide whether they want to use parallelization.
2. Allow users to specify file names to run the model on, rather than loading the whole data set in the main process.
3. The functions are written in pure R. It is possible, for example, to convert the function lm1 to C++ code. You might need to look at RcppArmadillo's fastLm.R and fastLm.cpp. (Spoiler: it is not easy, but if you insist, here are some slides about it: https://scholar.princeton.edu/sites/default/files/q-aps/files/slides_day4_am.pdf)
4. Write tests and documentation.
5. More models? Logistic regression? GLM?
6. You should also write a few pages of R Markdown documentation to explain your work. One recommended way is to put the documentation in a vignette. (If you want to use tidyverse in the vignettes, run usethis::use_package("tidyverse", type = "suggest") to add tidyverse to the Suggests field of DESCRIPTION.)
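As a rough illustration of item 1, the sketch below shows one way to let the user opt in to parallelism: the caller passes a cluster object, and when none is given everything runs in the main process. This is not the code in the package. The names fit_subsample, blb_lm_sketch and the cl argument are hypothetical, and the sketch assumes the base parallel package and a formula that names its predictors explicitly (no "." shorthand).

```r
# Hypothetical helper: run B little bootstraps on one subsample and return a
# B x p matrix of coefficients. It only uses stats functions and its own
# arguments, so it can be shipped to a worker process unchanged.
fit_subsample <- function(sub, formula, n, B) {
  one_boot <- function() {
    # Multinomial weights that sum to the full sample size n.
    sub$.freq <- as.vector(stats::rmultinom(1, n, rep(1, nrow(sub))))
    stats::coef(stats::lm(formula, data = sub, weights = .freq))
  }
  do.call(rbind, lapply(seq_len(B), function(i) one_boot()))
}

# Hypothetical front end: cl = NULL means sequential; passing a cluster
# created with parallel::makeCluster() spreads the subsamples over workers.
blb_lm_sketch <- function(formula, data, m = 3, B = 100, cl = NULL) {
  n <- nrow(data)
  # Randomly assign each row to one of m subsamples.
  parts <- split(data, sample(rep_len(seq_len(m), n)))
  if (is.null(cl)) {
    boots <- lapply(parts, fit_subsample, formula = formula, n = n, B = B)
  } else {
    boots <- parallel::parLapply(cl, parts, fit_subsample,
                                 formula = formula, n = n, B = B)
  }
  # Average within each subsample, then across subsamples.
  Reduce(`+`, lapply(boots, colMeans)) / length(boots)
}

# Sequential use (the default).
set.seed(1)
est <- blb_lm_sketch(mpg ~ wt * hp, data = mtcars, m = 3, B = 50)

# Opting in to two workers would look like this:
# cl <- parallel::makeCluster(2)
# parallel::clusterSetRNGStream(cl, 1)  # reproducible streams on the workers
# est <- blb_lm_sketch(mpg ~ wt * hp, data = mtcars, m = 3, B = 50, cl = cl)
# parallel::stopCluster(cl)
```

Leaving cluster creation to the caller keeps the decision, and the number of workers, in the user's hands. Other designs, such as a logical flag or a future-based backend, would serve the same purpose.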
How to start?
The easiest way to start the project is to fork my package and then use RStudio to clone it from your personal repo.
However, you could also start a new package from scratch.
Grading
Your grade will be determined by the amount of work you have done and how well it is implemented.
(60%) the code:
o both correctness and efficiency
o code style: You want your code to be clean and well documented. Just imagine that someone else will take over the maintenance of your code. (Hint: make use of styler)
(40%) miscellaneous:
o tests
o documentation
o pass devtools::check() etc.
o the vignette
Examples
library(blblm)
fit <- blblm(mpg ~ wt * hp, data = mtcars, m = 3, B = 100)
coef(fit)
#> (Intercept)          wt          hp       wt:hp
#> 48.88428523 -7.88702986 -0.11576659  0.02600976
confint(fit, c("wt", "hp"))
#>           2.5%       97.5%
#> wt -10.7902240 -5.61586271
#> hp  -0.1960903 -0.07049867
sigma(fit)
#> [1] 1.838911
sigma(fit, confidence = TRUE)
#>    sigma      lwr      upr
#> 1.838911 1.350269 2.276347
predict(fit, data.frame(wt = c(2.5, 3), hp = c(150, 170)))
#>        1        2
#> 21.55538 18.80785
predict(fit, data.frame(wt = c(2.5, 3), hp = c(150, 170)), confidence = TRUE)
#>        fit      lwr      upr
#> 1 21.55538 20.02457 22.48764
#> 2 18.80785 17.50654 19.71772