STAT 385 HW4
Due by 12:00 PM on 10/26/2019
HW 4 Problems
Below you will find problems for you to complete as an individual. It is fine to discuss the homework problems with classmates, but cheating is prohibited and will be harshly penalized if detected.
1. Modify the scaler function in the Week 7 notes.
We created a function called scalar in the Week 7 Notes. This function takes a numeric or integer vector as input (check this on your own) and returns the scaled vector. Make a new version of this function that (1) takes a dataframe as an argument, (2) scales all of the numeric and integer-valued columns of that dataframe while leaving all other variables alone, and (3) returns the original dataframe with scaled numeric and integer-valued columns. Verify that your function is working by testing it on two dataframes. The first dataframe can be anything that you want. The second dataframe should consist of only factor variables, i.e., the function should do nothing to this dataframe and also should not output an error message. Your verification checks should be readable; do not simply report the returned dataframes.
2. Using the Chicago Food Inspections Data, do the following:
a. create a visualization plot of at least two variables using this dataset
b. explain what is good and what is bad about the visualization
c. show a substantially improved visualization
d. describe the improvement and why the improved plot in part c helps the reader/viewer more than the original plot in part a.
3. Using the Chicago Food Inspections Data, do the following:
a. create a table of descriptive statistics of your choice
b. add one descriptive statistic to the plot in part 1c
c. write a brief explanatory narrative of the visualization in part 2b. In your explanation, be convincing and persuasive about your visualization. Attempt to highlight why this visualization is crucial to your imaginary supervisor.
4. Using the SBA Business Loans Data, do the following:
a. create a visualization plot of at least two variables using this dataset
b. explain what is good and what is bad about the visualization
c. show a substantially improved visualization
d. describe the improvement and why the improved plot in part c helps the reader/viewer more than the original plot in part a.
5. Using the SBA Business Loans Data, do the following:
a. create a table of descriptive statistics of your choice
b. add one descriptive statistic to the plot in part 1c
c. write a brief explanatory narrative of the visualization in part 2b. In your explanation, be convincing and persuasive about your visualization. Attempt to highlight why this visualization is crucial to your imaginary supervisor.
Select in-class tasks
Completion of select in-class tasks will be worth 1 point and will be graded largely by completion. Obvious errors and incomplete work will receive deductions.
1. Explain what the Riemannian sums simulator in Example 5b of the Week 7 Notes is doing. Why does it work reasonably well? Can you cook up an example where it will not work while keeping n = 10000? Hint: think of the relationship between a, b, and n.
2. Create a custom function that computes 3 different outlier detectors given a single vector of data. The function should return 6 cutoff values (2 for each outlier detector) from the following methods:
a. 3 Sigma Rule: $|x - \bar{x}| > 3 \cdot \hat{\sigma}$
b. 1.5*IQR Rule: $x < Q_1 - 1.5 \cdot IQR$ or $x > Q_3 + 1.5 \cdot IQR$
c. Hampel Identifier: $|x - \tilde{x}| > 3 \cdot \tilde{\sigma}$
$\tilde{x}$ is the median of $x$
$\tilde{\sigma} = 1.4826 \cdot \textrm{median}(|x - \tilde{x}|)$ is the median of the absolute deviation from the median (MADM or MAD) scale estimate.
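As a rough illustration of how the three rules turn into lower and upper cutoffs, here is a minimal sketch in Python using only the standard library. It is not the course's R solution. The function name `outlier_cutoffs` and the sample data are hypothetical. The sketch assumes the sample standard deviation for the 3 Sigma Rule and "inclusive" quartiles (which interpolate the way R's default `quantile` does) for the 1.5*IQR Rule.

```python
import statistics


def outlier_cutoffs(x):
    """Return (lower, upper) cutoffs for three outlier rules.

    Hypothetical helper for illustration; assumes x has at least two values.
    """
    # 3 Sigma Rule: mean +/- 3 * sample standard deviation (assumed estimator)
    mean = statistics.mean(x)
    sd = statistics.stdev(x)

    # 1.5*IQR Rule: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR
    # (method="inclusive" is an assumption about the quartile definition)
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    iqr = q3 - q1

    # Hampel Identifier: median +/- 3 * (1.4826 * median absolute deviation)
    med = statistics.median(x)
    mad_scale = 1.4826 * statistics.median([abs(v - med) for v in x])

    return {
        "three_sigma": (mean - 3 * sd, mean + 3 * sd),
        "iqr": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "hampel": (med - 3 * mad_scale, med + 3 * mad_scale),
    }


# Example with made-up data: print the two cutoffs for each rule.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for rule, (lower, upper) in outlier_cutoffs(sample).items():
    print(rule, lower, upper)
```

Under each rule, a value below the lower cutoff or above the upper cutoff would be flagged as an outlier. Returning the pairs keyed by rule name keeps the 6 values easy to read.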