Test 2
You will receive two datasets: one called train, containing the response vector and the covariates, and one called test, containing only the covariates.
You will be given the same choice of 3 models for the base learners and 1 for the super learner.
I will need three files back: 1. the R code, 2. a doc file with a clear description of your models, and 3. the predicted probabilities of “yes” for the data in the test file.
You have to use R if most of your homework and the previous test were done in R.
If you used a different language in the project, you can choose between that language and R.
For R users
You have to open the file in the R environment and perform all preliminary manipulations there; do not rename the file outside the R environment. Otherwise your code will not run, because I will not have your manipulated file, and you will get 0.
If you do not follow the instructions and do not use the models I assigned, you will get half of the points.
If you ignore stacking altogether, you will get 0.
The task is to create a stacked model. There are two tiers.
Tier one – basic. For the assigned models, you use the basic stacking strategy discussed in class. If you missed the discussion, you can check the notes. You will get at most 85 points.
Tier two – advanced. You can get up to 110 points. Here you have to use a different strategy for stacking. Your strategy has to make better use of the data (more data for training) and at the same time address information leakage. In your doc file you have to defend your choice of strategy. An improved strategy adds 10 extra points, for a maximum of 95. You can earn up to 15 additional points for: 1. expanding the number of base learners (more than 6), where the learners have to be diverse enough to be counted; 2. adding further layers of stacking. These additional points will be counted only if you used an improved stacking strategy. Note: additional models should be DIVERSE (per the class discussion). Individual trees do not count as additional models.
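As an illustration of the advanced tier, one common way to train on more of the data while limiting information leakage is K-fold out-of-fold stacking: each base learner is fit on K-1 folds and predicts the held-out fold, so every row of train gets a prediction from a model that never saw it. The sketch below is in Python with two toy learners, purely for illustration. Whether it matches the strategy discussed in class is an assumption, and the names make_folds, out_of_fold_predictions, fit_prior and fit_threshold are hypothetical. The same fold logic carries over to R.

```python
import random


def make_folds(n, k, seed=0):
    """Shuffle the row indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def out_of_fold_predictions(X, y, learners, k=5, seed=0):
    """Return an n x len(learners) table of out-of-fold probabilities of "yes".

    Each learner is a (hypothetical) function fit(X_train, y_train) that
    returns a predict(X_new) function giving one probability per row.
    """
    n = len(X)
    oof = [[None] * len(learners) for _ in range(n)]
    for held_out in make_folds(n, k, seed):
        held = set(held_out)
        train_idx = [i for i in range(n) if i not in held]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_ho = [X[i] for i in held_out]
        for j, fit in enumerate(learners):
            predict = fit(X_tr, y_tr)  # this model never sees the held-out rows
            for i, p in zip(held_out, predict(X_ho)):
                oof[i][j] = p
    return oof


def fit_prior(X_train, y_train):
    """Toy learner: always predicts the training rate of "yes"."""
    rate = sum(y_train) / len(y_train)
    return lambda X_new: [rate for _ in X_new]


def fit_threshold(X_train, y_train):
    """Toy learner: rate of "yes" on each side of the mean of the first covariate."""
    cut = sum(row[0] for row in X_train) / len(X_train)

    def rate(labels):
        return sum(labels) / len(labels) if labels else 0.5

    low = rate([t for row, t in zip(X_train, y_train) if row[0] <= cut])
    high = rate([t for row, t in zip(X_train, y_train) if row[0] > cut])
    return lambda X_new: [high if row[0] > cut else low for row in X_new]


# Made-up data: one covariate, response 1 ("yes") for the larger values.
X = [[float(i)] for i in range(20)]
y = [1 if i >= 10 else 0 for i in range(20)]
oof = out_of_fold_predictions(X, y, [fit_prior, fit_threshold], k=5)
```

In a typical setup, the columns of oof become the inputs on which the super learner is trained, and the base learners are then refit on all of train before predicting on test. Every row contributes to training, yet no row's stacked input comes from a model that was fit on that row.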
Notes: I have discussed stacking in class. Please follow the material I introduced.