____________________Name

____________________Name

_______________________Name
Fall 2018

1. (10 pts) With multi-category data, we often have to choose the type of correlation matrix we generate to describe relationships. In an article in the British Journal of Mathematical and Statistical Psychology, Conor Dolan indicated that a critical cutoff for determining whether a Pearson correlation is appropriate occurs when you have a variable with around 5 ordinal categories (Dolan, 1994). Given that, I would like you to consider the data set factdata.xls, which includes data in which students answered questions related to food preferences using four different scaling approaches (true/false, Likert from Strongly disagree to Strongly agree, semantic differential, Likert from Disagree to Agree). If you’re having trouble reading in an Excel spreadsheet, the first Likert scale necessary for this question is in the raw data file, fLikert.dat. The header includes the variable names. I’ve included the original questionnaire so you can see what the questions looked like.

For the first type of Likert variables, flkrt1 to flkrt20 (on the second page of the questionnaire), I would like you to pick variables from one of the following four subgroups:

Seafood I: flkrt1-flkrt5
Fast food: flkrt6-flkrt10
Challenging food: flkrt11-flkrt15
Seafood II: flkrt16-flkrt20

and create a function to generate descriptive statistics appropriate to an interval level variable: mean and standard deviation, and statistics appropriate to an ordinal level variable: median, minimum, maximum, and range, along with the N.

I would like you to put these statistics into an object similar to descripstat2() in the program scndprog.cowdata.R. Make sure to return a matrix and label the dimensions of the matrix appropriately. I’m including the formula for the median and some other statistics we will need later in the semester in a file called add.stats.R. When calculating the median, I want you to use this computational formula, not the median() function. The same goes for the other descriptive statistics: please use computational formulas, not the preprogrammed functions.

Compare your results to describe() in the “psych” package.
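As a rough sketch of what such a function could look like, here is one version run on made-up data. The function name descrip_stats, the toy data frame, and the choice to treat sum(), length(), and sort() as allowed building blocks are all assumptions on my part. The median uses the usual middle-value rule, which may not match the formula in add.stats.R.

```r
# Hypothetical helper (name and layout are assumptions, not descripstat2() itself).
# Uses computational formulas instead of mean(), sd(), median(), min(), max(), range().
descrip_stats <- function(x) {
  x <- as.matrix(x)
  stat_names <- c("N", "Mean", "SD", "Median", "Min", "Max", "Range")
  out <- matrix(NA_real_, nrow = ncol(x), ncol = length(stat_names),
                dimnames = list(variable = colnames(x), statistic = stat_names))
  for (j in seq_len(ncol(x))) {
    v <- sort(x[, j])                      # sort() drops missing values by default
    n <- length(v)
    avg <- sum(v) / n
    # Computational formula for the sample standard deviation
    sdev <- sqrt((sum(v^2) - sum(v)^2 / n) / (n - 1))
    half <- n %/% 2
    # Middle value for odd n, average of the two middle values for even n
    med <- if (n %% 2 == 1) v[half + 1] else (v[half] + v[half + 1]) / 2
    out[j, ] <- c(n, avg, sdev, med, v[1], v[n], v[n] - v[1])
  }
  out
}

# Made-up responses standing in for three Likert items
toy <- data.frame(flkrt1 = c(1, 2, 2, 4, 5, 3),
                  flkrt2 = c(5, 4, 4, 2, 1, 3),
                  flkrt3 = c(3, 3, 2, 5, 4, NA))
res <- descrip_stats(toy)
print(res)
```

With the psych package installed, the same toy data can be passed to describe() to check the columns against each other.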

Now, also in R, I would like you to create a table including three kinds of correlations: Pearson, Spearman, and Kendall. You can create this table by stacking the correlation matrices. Once you have all of the correlations in a single table, you will have to rename the dimensions (rows and columns) to let the reader know what is what.
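A minimal sketch of the stacking idea, using simulated responses rather than the real items. The toy data, the object names, and the "method:item" row labels are illustrative assumptions; any labelling that tells the reader which block is which would do.

```r
set.seed(1)
# Simulated 5-point responses standing in for three Likert items
toy <- data.frame(flkrt1 = sample(1:5, 30, replace = TRUE),
                  flkrt2 = sample(1:5, 30, replace = TRUE),
                  flkrt3 = sample(1:5, 30, replace = TRUE))

methods <- c("pearson", "spearman", "kendall")

# One correlation matrix per method, then stack them row-wise
cor_list <- lapply(methods, function(m) cor(toy, method = m))
stacked <- do.call(rbind, cor_list)

# Relabel rows so each block identifies its correlation type
rownames(stacked) <- paste(rep(methods, each = ncol(toy)), colnames(toy), sep = ":")
print(round(stacked, 3))
```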

The difference between the Spearman and Kendall coefficients involves assumptions regarding the underlying distributions of the variables. Spearman ρ assumes that the ranks are interval scales, while Kendall τ does not. So, does it matter here? Are the correlation coefficients different? What about the central tendency: does the median differ from the mean? What do you conclude?

Dolan, C. V. (1994). Factor analysis of variables with 2, 3, 5 and 7 response categories: A comparison of categorical variable estimators using simulated data. British Journal of Mathematical and Statistical Psychology, 47, 309-326.

2. (6 pts) I would like you to write a program for intensive regression to simultaneously regress dep4 of the portroy data onto dep1, dep2, and dep3. You can pull numbers representing regression weights from a random uniform distribution. Make sure to vary each coefficient between -1 and +1. Remember that you can place restrictions on the coefficients within the runif() function. Use the lm() function to check your results. Note that if you are using three variables to predict a fourth, the regression function would be:

lm(dep4 ~ dep1 + dep2 + dep3)

The standardized regression would be:

lm(scale(dep4) ~ scale(dep1) + scale(dep2) + scale(dep3))
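One way to read the task is as a random search: draw many candidate weight vectors with runif(), keep the one with the smallest sum of squared errors, and compare it with lm(). The sketch below does that for the standardized case on simulated data. The simulated dep1 to dep4, the number of draws, and the random-search reading itself are assumptions; the portroy data are not used here.

```r
set.seed(42)
# Simulated stand-ins for dep1..dep4 (not the portroy data)
n <- 100
dep1 <- rnorm(n)
dep2 <- rnorm(n)
dep3 <- rnorm(n)
dep4 <- 0.5 * dep1 - 0.3 * dep2 + 0.2 * dep3 + rnorm(n, sd = 0.5)

# Standardize so that no intercept is needed
z  <- scale(cbind(dep1, dep2, dep3))
zy <- as.vector(scale(dep4))

n_draws  <- 20000            # arbitrary choice for this sketch
best_sse <- Inf
best_b   <- NULL
for (i in seq_len(n_draws)) {
  b   <- runif(3, min = -1, max = 1)     # one candidate weight per predictor
  sse <- sum((zy - z %*% b)^2)           # sum of squared errors for this draw
  if (sse < best_sse) {
    best_sse <- sse
    best_b   <- b
  }
}
names(best_b) <- colnames(z)

# Check against the standardized least-squares solution
fit <- lm(scale(dep4) ~ scale(dep1) + scale(dep2) + scale(dep3))
print(rbind(random_search = best_b, lm = coef(fit)[-1]))
```

The random-search weights should land near the lm() estimates but never beat them on squared error, since least squares gives the minimum.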

3. (6 pts) For the 20 flkrt items, from either factdata.xls or fLikert.dat, take the wide data set and create a long data set with the dependent variable flkrt. There will be 20 observations per person, on four types of food: seafood_1, fast_food, challenging_food, and seafood_2.
Create a new indexing variable that identifies which type of food each item is measuring. So, the long data set should have

id food_type flkrt

This requires one trick that we didn’t discuss in class. The varying variables are going to be all 20 items. Pick a good v.names value; I picked flkrt. I called the timevar variable food_type. Rather than times, you want to provide the levels (values) for the food_type variable. There are 20 of them, with 5 of each type. You can create that variable using

times = c(rep(1,5),rep(2,5),rep(3,5),rep(4,5)),
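To show where that argument goes, here is a small sketch of the full reshape() call on a made-up wide data set with four people. The data values and the object names wide and long are assumptions; only the argument pattern is the point.

```r
set.seed(7)
# Made-up wide data: 4 people, 20 five-point items
n_id <- 4
wide <- data.frame(id = 1:n_id,
                   matrix(sample(1:5, n_id * 20, replace = TRUE), nrow = n_id))
names(wide)[-1] <- paste0("flkrt", 1:20)

long <- reshape(wide,
                direction = "long",
                varying   = paste0("flkrt", 1:20),   # all 20 items vary
                v.names   = "flkrt",                 # name of the stacked variable
                timevar   = "food_type",             # index variable
                times     = c(rep(1,5),rep(2,5),rep(3,5),rep(4,5)),
                idvar     = "id")

head(long, 8)
```

The result has one row per person per item, stacked item by item, so all food_type 1 rows come first.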

Once you create this long data set, print it. You will see that the data are sorted by food type and not by id. While the data are sorted by food_type, calculate the mean for each food type, pooling across items of that food type and individuals. You can use the indexing ability in R and the mean() function to calculate the mean of the first 135 lines (food_type 1), the next 135 lines (food_type 2), etc., to get the means for the four different food types. What are the means? Which one is smallest, and which one is largest?

Finally, sort the data by individual id, making all of each individual’s data contiguous. Make sure to include your R output to show that you have done all of this successfully.
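The indexing and sorting steps might look roughly like this on a tiny made-up long data set. Here there are 3 people, so each food type takes 15 rows rather than 135, and the flkrt values are constants chosen only so the means are easy to check by eye. All object names are assumptions.

```r
# Made-up long data, already sorted by food_type (3 people x 20 items)
n_id <- 3
long <- data.frame(id        = rep(1:n_id, times = 20),
                   food_type = rep(1:4, each = 5 * n_id),
                   flkrt     = rep(c(2, 4, 1, 3), each = 5 * n_id))

# Mean per food type by row indexing: rows 1-15, 16-30, 31-45, 46-60
rows_per_type <- 5 * n_id
food_means <- numeric(4)
for (k in 1:4) {
  idx <- ((k - 1) * rows_per_type + 1):(k * rows_per_type)
  food_means[k] <- mean(long$flkrt[idx])
}
names(food_means) <- c("seafood_1", "fast_food", "challenging_food", "seafood_2")
print(food_means)

# Sort so that each person's 20 rows are contiguous
long_by_id <- long[order(long$id, long$food_type), ]
head(long_by_id, 10)
```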

4. (5 pts) I would like you to write a function that will calculate a running sum for each of the two series given below. You will initialize each sum at the value of the first number in each series, then start the loop counter for each loop at 2 [Note that this is a hint]. If you look at the two series, you will see that they diverge. One simple way to show that they diverge is to calculate the difference between the two series and show that the difference increases. I would like you to write a function that returns the running sum for each series (the string of sums, not just the final sum), and the running set of differences. You can then look at the differences and see that they increase. Use the following series:

First: 1 2 3 5 4 3 6 4 3 5 7 7 9 8
Second: 2 4 5 8 7 10 10 11 11 14 17 18 21 24

Make sure to subtract the first series from the second, so that the differences are positive.
There are many different ways to do this. Any one that gets the correct answer is ok.
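As one illustration among those many ways, a loop-based sketch could look like the following. The function name running_sums and the list it returns are assumptions; it takes the differences between the two running sums, second minus first.

```r
# Hypothetical function: running sums of two series plus their running difference
running_sums <- function(first, second) {
  stopifnot(length(first) == length(second))
  n <- length(first)
  sum1 <- numeric(n)
  sum2 <- numeric(n)
  # Initialize each running sum at the first value of its series
  sum1[1] <- first[1]
  sum2[1] <- second[1]
  # Start the loop at 2; seq_len(n)[-1] is empty when n is 1
  for (i in seq_len(n)[-1]) {
    sum1[i] <- sum1[i - 1] + first[i]
    sum2[i] <- sum2[i - 1] + second[i]
  }
  list(first = sum1, second = sum2, difference = sum2 - sum1)
}

first  <- c(1, 2, 3, 5, 4, 3, 6, 4, 3, 5, 7, 7, 9, 8)
second <- c(2, 4, 5, 8, 7, 10, 10, 11, 11, 14, 17, 18, 21, 24)

res <- running_sums(first, second)
print(res)
```

The loop results can be checked against cumsum(), which computes the same running totals.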