程序代写代做代考 Excel FES 844b / STAT 660b 2003

FES 844b / STAT 660b 2003

FES758b / STAT660b
Multivariate Statistics

Homework #5 – MANOVA / GLM
Due : Tuesday, 4/11/17 on CANVAS (by midnight)

HOMEWORK WORKED OUT FOR SAMPLE DATASET
The example below is JUST FOR YOUR PRACTICE.

NOTHING TO TURN IN HERE!

DANIELA.csv contain data collected by Daniela Cusack from three plantations. Each
plantation was divided into areas with homogenous overstory tree species of six types.
In class we looked at factors which predicted the number of individual saplings in each
of three height classes. The saplings were also classified in terms of the dispersal
mechanism associated with that species. The three dispersal mechanisms (the
response variables) were birds, mammals, or other (includes wind, water, bats, gravity).

SAS Results

1. Look at interaction plots between plantation and overstory species for each of the

dispersal mechanisms. Discuss what you see.

Here is SAS code for this :

*MUST SORT DATA TO GET INTERACTION PLOTS!!!;

PROC SORT DATA=IN.overstory; BY treatment plantation;

PROC MEANS DATA=IN.overstory;

VAR mammals birds other;

BY treatment plantation;

OUTPUT OUT=overMEAN;

RUN;

DATA OVERMEAN; SET OVERMEAN; IF _STAT_=’MEAN’;

RUN;

SYMBOL1 VALUE=O C=BLACK H=2 W=5 I=JOIN;

SYMBOL2 VALUE=O C=RED H=2 W=5 I=JOIN;

SYMBOL3 VALUE=O C=GREEN H=2 W=5 I=JOIN;

PROC GPLOT DATA=overmean;

PLOT mammals*treatment=plantation;

PLOT birds*treatment=plantation;

PLOT other*treatment=plantation;

RUN;

Here are results :

These plots suggest that there may be
an interaction between plantation and
overstory species (i.e. treatment). Also
suggests that there may not be much
of a treatment effect (i.e. overstory type
doesn’t affect dispersal rate). Also
seems like overall that the S plantation
has generally higher rates that the
other plantations.

2. Run MANOVA for these two categorical factors. Discuss your results, both
univariate and multivariate.

Here is Code and results for the GLM model (i.e. two-way MANOVA) :

proc glm data=in.overstory;

class treatment plantation;

model mammals birds other=plantation treatment plantation*treatment /

solution;

manova h=plantation treatment plantation*treatment;

run;

Pl a n t a t i o n P Q S

o t h e r

0

1

2

T r e a t me n t

Cb Ha T a Vf Vg Vk

Pl a n t a t i o n P Q S

b i r d s

0

1

2

3

4

5

T r e a t me n t

Cb Ha T a Vf Vg Vk

Pl a n t a t i o n P Q S

ma mma l s

0

1

2

3

T r e a t me n t

Cb Ha T a Vf Vg Vk

Results follow (lots of output).

Univariate Results : For mammals, there are differences between plantations, there
are no overall observed differences due to overstory treatment effect; however, there is
evidence on an interaction effect.

The individual coefficients suggest that for overstory species, the TA species if different
from VK (and perhaps other) species – might want to test this as an indicator variable.

Similar results are observed for birds. No TA species effect thing observed for ‘other’
dispersal mechanisms.

Multivariate Results : Overall, there are differences between Plantations (all
multivariate statistics are significant). For overstory species, only Roy’s Greatest Root
is significant, which suggest that there is a single direction in multivariate space that
shows differences between overstory treatment groups (since Roy’s Greatest Root only
tests the first eigenvalue which is associated with the direction of maximum
discrimination’). Most of the multivariate tests suggest there is an interaction effect
between Plantation and overstory Treatment.

Dependent Variable: mammals mammals

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 17 55.6250000 3.2720588 2.79 0.0021

Error 54 63.2500000 1.1712963

Corrected Total 71 118.8750000

R-Square Coeff Var Root MSE mammals Mean

0.467928 83.78821 1.082264 1.291667

Source DF Type I SS Mean Square F Value Pr > F

Plantation 2 18.75000000 9.37500000 8.00 0.0009

Treatment 5 6.79166667 1.35833333 1.16 0.3410

Treatment*Plantation 10 30.08333333 3.00833333 2.57 0.0128

Source DF Type III SS Mean Square F Value Pr > F

Plantation 2 18.75000000 9.37500000 8.00 0.0009

Treatment 5 6.79166667 1.35833333 1.16 0.3410

Treatment*Plantation 10 30.08333333 3.00833333 2.57 0.0128

Standard

Parameter Estimate Error t Value Pr > |t|

Intercept 3.000000000 B 0.54113221 5.54 <.0001 Plantation P -2.000000000 B 0.76527652 -2.61 0.0116 Plantation Q -1.500000000 B 0.76527652 -1.96 0.0552 Plantation S 0.000000000 B . . . Treatment Cb -0.750000000 B 0.76527652 -0.98 0.3314 Treatment Ha -1.250000000 B 0.76527652 -1.63 0.1082 Treatment Ta -2.500000000 B 0.76527652 -3.27 0.0019 Treatment Vf -1.500000000 B 0.76527652 -1.96 0.0552 Treatment Vg -0.500000000 B 0.76527652 -0.65 0.5163 Treatment Vk 0.000000000 B . . . Treatment*Plantation Cb P 0.000000000 B 1.08226443 0.00 1.0000 Treatment*Plantation Cb Q 1.250000000 B 1.08226443 1.15 0.2532 Treatment*Plantation Cb S 0.000000000 B . . . Treatment*Plantation Ha P 2.250000000 B 1.08226443 2.08 0.0424 Treatment*Plantation Ha Q 0.000000000 B 1.08226443 0.00 1.0000 Treatment*Plantation Ha S 0.000000000 B . . . Treatment*Plantation Ta P 2.000000000 B 1.08226443 1.85 0.0701 Dependent Variable: mammals mammals Standard Parameter Estimate Error t Value Pr > |t|

Treatment*Plantation Ta Q 2.750000000 B 1.08226443 2.54 0.0140

Treatment*Plantation Ta S 0.000000000 B . . .

Treatment*Plantation Vf P 0.500000000 B 1.08226443 0.46 0.6459

Treatment*Plantation Vf Q 1.750000000 B 1.08226443 1.62 0.1117

Treatment*Plantation Vf S 0.000000000 B . . .

Treatment*Plantation Vg P -0.250000000 B 1.08226443 -0.23 0.8182

Treatment*Plantation Vg Q -0.500000000 B 1.08226443 -0.46 0.6459

Treatment*Plantation Vg S 0.000000000 B . . .

Treatment*Plantation Vk P 0.000000000 B . . .

Treatment*Plantation Vk Q 0.000000000 B . . .

Treatment*Plantation Vk S 0.000000000 B . . .

NOTE: The X’X matrix has been found to be singular, and a generalized inverse was used to solve

the normal equations. Terms whose estimates are followed by the letter ‘B’ are not

uniquely estimable.

Dependent Variable: birds birds

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 17 124.5694444 7.3276144 3.07 0.0009

Error 54 128.7500000 2.3842593

Corrected Total 71 253.3194444

R-Square Coeff Var Root MSE birds Mean

0.491748 71.72615 1.544105 2.152778

Source DF Type I SS Mean Square F Value Pr > F

Plantation 2 42.19444444 21.09722222 8.85 0.0005

Treatment 5 8.73611111 1.74722222 0.73 0.6020

Treatment*Plantation 10 73.63888889 7.36388889 3.09 0.0036

Source DF Type III SS Mean Square F Value Pr > F

Plantation 2 42.19444444 21.09722222 8.85 0.0005

Treatment 5 8.73611111 1.74722222 0.73 0.6020

Treatment*Plantation 10 73.63888889 7.36388889 3.09 0.0036

Standard

Parameter Estimate Error t Value Pr > |t|

Intercept 4.000000000 B 0.77205234 5.18 <.0001 Plantation P -2.750000000 B 1.09184689 -2.52 0.0148 Plantation Q -2.750000000 B 1.09184689 -2.52 0.0148 Plantation S 0.000000000 B . . . Treatment Cb 0.250000000 B 1.09184689 0.23 0.8198 Treatment Ha -1.750000000 B 1.09184689 -1.60 0.1148 Treatment Ta -2.500000000 B 1.09184689 -2.29 0.0260 Treatment Vf -0.500000000 B 1.09184689 -0.46 0.6488 Treatment Vg -0.250000000 B 1.09184689 -0.23 0.8198 Treatment Vk 0.000000000 B . . . Treatment*Plantation Cb P -0.750000000 B 1.54410468 -0.49 0.6291 Treatment*Plantation Cb Q 2.000000000 B 1.54410468 1.30 0.2007 Treatment*Plantation Cb S 0.000000000 B . . . Treatment*Plantation Ha P 3.250000000 B 1.54410468 2.10 0.0400 Treatment*Plantation Ha Q 0.750000000 B 1.54410468 0.49 0.6291 Treatment*Plantation Ha S 0.000000000 B . . . Treatment*Plantation Ta P 3.750000000 B 1.54410468 2.43 0.0185 Dependent Variable: birds birds Standard Parameter Estimate Error t Value Pr > |t|

Treatment*Plantation Ta Q 3.750000000 B 1.54410468 2.43 0.0185

Treatment*Plantation Ta S 0.000000000 B . . .

Treatment*Plantation Vf P -0.750000000 B 1.54410468 -0.49 0.6291

Treatment*Plantation Vf Q 2.250000000 B 1.54410468 1.46 0.1509

Treatment*Plantation Vf S 0.000000000 B . . .

Treatment*Plantation Vg P 0.250000000 B 1.54410468 0.16 0.8720

Treatment*Plantation Vg Q -0.500000000 B 1.54410468 -0.32 0.7473

Treatment*Plantation Vg S 0.000000000 B . . .

Treatment*Plantation Vk P 0.000000000 B . . .

Treatment*Plantation Vk Q 0.000000000 B . . .

Treatment*Plantation Vk S 0.000000000 B . . .

NOTE: The X’X matrix has been found to be singular, and a generalized inverse was used to solve

the normal equations. Terms whose estimates are followed by the letter ‘B’ are not

uniquely estimable.

Dependent Variable: other

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 17 26.12500000 1.53676471 2.46 0.0063

Error 54 33.75000000 0.62500000

Corrected Total 71 59.87500000

R-Square Coeff Var Root MSE other Mean

0.436326 99.86140 0.790569 0.791667

Source DF Type I SS Mean Square F Value Pr > F

Plantation 2 8.58333333 4.29166667 6.87 0.0022

Treatment 5 4.79166667 0.95833333 1.53 0.1950

Treatment*Plantation 10 12.75000000 1.27500000 2.04 0.0466

Source DF Type III SS Mean Square F Value Pr > F

Plantation 2 8.58333333 4.29166667 6.87 0.0022

Treatment 5 4.79166667 0.95833333 1.53 0.1950

Treatment*Plantation 10 12.75000000 1.27500000 2.04 0.0466

Standard

Parameter Estimate Error t Value Pr > |t|

Intercept 1.000000000 B 0.39528471 2.53 0.0144

Plantation P -1.000000000 B 0.55901699 -1.79 0.0792

Plantation Q -0.750000000 B 0.55901699 -1.34 0.1853

Plantation S 0.000000000 B . . .

Treatment Cb 1.000000000 B 0.55901699 1.79 0.0792

Treatment Ha -0.250000000 B 0.55901699 -0.45 0.6565

Treatment Ta -0.500000000 B 0.55901699 -0.89 0.3751

Treatment Vf 0.250000000 B 0.55901699 0.45 0.6565

Treatment Vg 1.000000000 B 0.55901699 1.79 0.0792

Treatment Vk 0.000000000 B . . .

Treatment*Plantation Cb P -0.250000000 B 0.79056942 -0.32 0.7530

Treatment*Plantation Cb Q -0.500000000 B 0.79056942 -0.63 0.5298

Treatment*Plantation Cb S 0.000000000 B . . .

Treatment*Plantation Ha P 1.000000000 B 0.79056942 1.26 0.2113

Treatment*Plantation Ha Q 0.000000000 B 0.79056942 0.00 1.0000

Treatment*Plantation Ha S 0.000000000 B . . .

Treatment*Plantation Ta P 1.250000000 B 0.79056942 1.58 0.1197

Dependent Variable: other

Standard

Parameter Estimate Error t Value Pr > |t|

Treatment*Plantation Ta Q 1.500000000 B 0.79056942 1.90 0.0631

Treatment*Plantation Ta S 0.000000000 B . . .

Treatment*Plantation Vf P -0.250000000 B 0.79056942 -0.32 0.7530

Treatment*Plantation Vf Q 1.000000000 B 0.79056942 1.26 0.2113

Treatment*Plantation Vf S 0.000000000 B . . .

Treatment*Plantation Vg P -0.750000000 B 0.79056942 -0.95 0.3470

Treatment*Plantation Vg Q -0.750000000 B 0.79056942 -0.95 0.3470

Treatment*Plantation Vg S 0.000000000 B . . .

Treatment*Plantation Vk P 0.000000000 B . . .

Treatment*Plantation Vk Q 0.000000000 B . . .

Treatment*Plantation Vk S 0.000000000 B . . .

NOTE: The X’X matrix has been found to be singular, and a generalized inverse was used to solve

the normal equations. Terms whose estimates are followed by the letter ‘B’ are not

uniquely estimable.

The GLM Procedure

Multivariate Analysis of Variance

Characteristic Roots and Vectors of: E Inverse * H, where

H = Type III SSCP Matrix for Plantation

E = Error SSCP Matrix

Characteristic Characteristic Vector V’EV=1

Root Percent mammals birds other

0.65908195 93.98 0.02545218 0.05607267 0.11657032

0.04220450 6.02 -0.16462472 0.10087871 0.01551757

0.00000000 0.00 -0.03795726 -0.03382330 0.12965599

MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall Plantation Effect

H = Type III SSCP Matrix for Plantation

E = Error SSCP Matrix

S=2 M=0 N=25

Statistic Value F Value Num DF Den DF Pr > F

Wilks’ Lambda 0.57833466 5.46 6 104 <.0001 Pillai's Trace 0.43775243 4.95 6 106 0.0002 Hotelling-Lawley Trace 0.70128645 6.02 6 67.584 <.0001 Roy's Greatest Root 0.65908195 11.64 3 53 <.0001 NOTE: F Statistic for Roy's Greatest Root is an upper bound. NOTE: F Statistic for Wilks' Lambda is exact. Characteristic Roots and Vectors of: E Inverse * H, where H = Type III SSCP Matrix for Treatment E = Error SSCP Matrix Characteristic Characteristic Vector V'EV=1 Root Percent mammals birds other 0.37140765 80.86 -0.12906445 0.08720092 0.12616486 0.07431415 16.18 0.09056686 0.02757895 0.04952936 0.01360850 2.96 -0.06580182 0.07810298 -0.11076723 MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall Treatment Effect H = Type III SSCP Matrix for Treatment E = Error SSCP Matrix S=3 M=0.5 N=25 Statistic Value F Value Num DF Den DF Pr > F

Wilks’ Lambda 0.66962536 1.50 15 143.95 0.1122

Pillai’s Trace 0.35342158 1.44 15 162 0.1335

Hotelling-Lawley Trace 0.45933030 1.56 15 93.113 0.0991

Roy’s Greatest Root 0.37140765 4.01 5 54 0.0036

NOTE: F Statistic for Roy’s Greatest Root is an upper bound.

Characteristic Roots and Vectors of: E Inverse * H, where

H = Type III SSCP Matrix for Treatment*Plantation

E = Error SSCP Matrix

Characteristic Characteristic Vector V’EV=1

Root Percent mammals birds other

0.96430157 83.05 0.02260670 0.06053308 0.11038347

0.12388040 10.67 -0.14514093 0.10390370 -0.03122217

0.07292461 6.28 -0.08725202 0.00210760 0.13221488

MANOVA Test Criteria and F Approximations for the Hypothesis

of No Overall Treatment*Plantation Effect

H = Type III SSCP Matrix for Treatment*Plantation

E = Error SSCP Matrix

S=3 M=3 N=25

Statistic Value F Value Num DF Den DF Pr > F

Wilks’ Lambda 0.42218474 1.75 30 153.31 0.0158

Pillai’s Trace 0.66910687 1.55 30 162 0.0449

Hotelling-Lawley Trace 1.16110658 1.97 30 111.02 0.0059

Roy’s Greatest Root 0.96430157 5.21 10 54 <.0001 NOTE: F Statistic for Roy's Greatest Root is an upper bound. 3. Perform (multivariate) contrasts to compare levels of a particular factor or combinations of factors. Discuss your results. To do this, we have to analyze this as a ONE WAY MANOVA with a combination treatment effect. The code below creates a new variable called trtcombine which just combines plantation code and treatment code. There are LOTS of possibilities here. I contrasted Plantations P and Q, and P and S. I contrasted overstory species Vf vs. Vg. Finally, I looked at the interaction between Vf and Vg between plots P and Q. data in.overstory; set in.overstory; trtcombine=trim(trim(plantation) || trim(treatment)); run; To run the actual model with contrasts use proc sort data=in.overstory; by trtcombine; proc glm data=in.overstory; class trtcombine; model mammals birds other=trtcombine; contrast 'P vs Q' trtcombine 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1 0 0 0 0 0 0; contrast 'P vs S' trtcombine 1 1 1 1 1 1 0 0 0 0 0 0 -1 -1 -1 -1 -1 -1; contrast 'Vf vs Vg' trtcombine 0 0 0 1 0 -1 0 0 0 0 1 0 -1 0 0 0 0 1 0 -1 0; contrast 'Vf vs Vg PQ interaction' trtcombine 0 0 0 1 0 -1 0 0 0 0 -1 0 1 0 0 0 0 0 0 0 0; manova h=trtcombine; run; Lots of output, I leave interpretation to you. 4. Daniela also measured the amount of light and the density of forest litter on each plot. Fit a model that includes there covariates as predictors of the number of saplings associated with each dispersal mechanism. Use PROC PLOT to make plots to check for linearity. For Daniela’s data, we add the possible covariate effects of light and litter. The SAS code is PROC GLM DATA=IN.OVERSTORY; CLASS TREATMENT PLANTATION; MODEL MAMMALS BIRDS OTHER=TREATMENT PLANTATION TREATMENT*PLANTATION LIGHT LITTER / SOLUTION; MANOVA H=TREATMENT PLANTATION TREATMENT*PLANTATION LIGHT LITTER; RUN; The output shows that neither appears to have a significant univariate or multivariate effect on dispersion mechanism counts. MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Light Effect H = Type III SSCP Matrix for Light E = Error SSCP Matrix S=1 M=0.5 N=22 Statistic Value F Value Num DF Den DF Pr > F

Wilks’ Lambda 0.91994901 1.33 3 46 0.2748

Pillai’s Trace 0.08005099 1.33 3 46 0.2748

Hotelling-Lawley Trace 0.08701677 1.33 3 46 0.2748

Roy’s Greatest Root 0.08701677 1.33 3 46 0.2748

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Litter Effect

H = Type III SSCP Matrix for Litter

E = Error SSCP Matrix

S=1 M=0.5 N=22

Statistic Value F Value Num DF Den DF Pr > F

Wilks’ Lambda 0.99465782 0.08 3 46 0.9693

Pillai’s Trace 0.00534218 0.08 3 46 0.9693

Hotelling-Lawley Trace 0.00537088 0.08 3 46 0.9693

Roy’s Greatest Root 0.00537088 0.08 3 46 0.9693

5. Check model assumptions by making a chi-square quantile plot of the residuals.

Modify your model as appropriate based on your findings.

I’ve already discussed how to make chi-square quantile plots. You just need the

residuals. The SAS code simply adds an OUTPUT option to PROC GLM

PROC GLM DATA=IN.OVERSTORY ;

CLASS TREATMENT PLANTATION;

MODEL MAMMALS BIRDS OTHER=TREATMENT PLANTATION TREATMENT*PLANTATION;

MANOVA H=TREATMENT PLANTATION TREATMENT*PLANTATION;

OUTPUT OUT=OUTSTAT RESIDUAL=RESIDUALA RESIDUALB RESIDUALC;

RUN;

%INCLUDE “C:\Documents and Settings\jon\My Documents\Classes\Multivariate

Articles Programs Resources\Software Programs\SAS programs\MULTNORM.SAS”;

%MULTNORM(VAR= RESIDUALA RESIDUALB RESIDUALC, DATA=OUTSTAT)

The resulting plot looks good – no evidence of serious departure from multivariate
normality.

SPSS Results

1. Look at interaction plots between plantation
and overstory species for each of the
dispersal mechanisms. Discuss what you
see.

To make interaction plots in SPSS (called Profile
Plots), use Analyze  Generalized Linear Model 
Multivariate. This will give you the plots and the model
needed for question two. Indicate that mammals, birds
and other the Dependent Variables and that Plantation
and Treatment are ‘Fixed Factors’.

Click on Plots, indicate that you’d like Treatment on
the Horizontal Axis and Plantation as Separate Lines,
then click ADD.

See SAS section for interpretation of results and
similar plots.

2. Run MANOVA for these two categorical factors. Discuss your results, both
univariate and multivariate.

3. Perform (multivariate) contrasts to compare levels of a particular factor or

combinations of factors. Discuss your results.

To make new
combination
treatment variable,
can do manually in
EXCEL, or in SPSS
use TRANSFORM
 COMPUTE and
use the
Concatenate
function : Make
sure you click here
and indicate that
the data type is
STRING
To run the
contrasts, click on the CONTRASTS button in SPSS when running Analyze  GLM.
However, this only gives UNIVARIATE contrasts. No multivariate contrasts available
that I know of – sorry!

4. Daniela also measured the amount of light and the density of forest litter on each
plot. Fit a model that includes there covariates as predictors of the number of
saplings associated with each dispersal mechanism.

In SPSS, just enter LIGHT and LITTER in the COVARIATES box in Analyze  General
Lineral Models. Results are discussed in SAS section.

5. Check model assumptions by making a chi-square quantile plot of the residuals.
Modify your model as appropriate based on your findings.

I’ve already discussed how to make chi-square quantile plots in
SPSS (see notes at the very end of Principle Components
Analysis).

In SPSS, when using Analyze  General Liner Models 
Multivariate, click on SAVE and then choose
UNSTANDARDIZED RESIDUALS

R Results (code only – for results see SAS section)

1. Look at interaction plots between plantation and overstory species for each of the
dispersal mechanisms. Discuss what you see.

Here is the R code : See SAS section above for results and interpretation.

#get the data

daniela=read.csv(“http://www.reuningscherer.net/stat660/data/Daniela.c

sv”, header=T)

#make an interaction plots

#this statement makes 4 plots per page

par(mfrow=c(2,2))

#this makes the plots

interaction.plot(daniela$Treatment,daniela$Plantation,daniela$mammals,

lwd=3,col=c(“red”,”blue”,”black”),xlab=”Species”,main=”Interaction

Plot for Mammals”)

interaction.plot(daniela$Treatment,daniela$Plantation,daniela$birds,

lwd=3,col=c(“red”,”blue”,”black”),xlab=”Species”,main=”Interaction

Plot for Birds”)

interaction.plot(daniela$Treatment,daniela$Plantation,daniela$other,

lwd=3,col=c(“red”,”blue”,”black”),xlab=”Species”,main=”Interaction

Plot for Other”)

2. Run MANOVA for these two categorical factors. Discuss your results, both
univariate and multivariate.

#fit linear model

mod1=manova(as.matrix(daniela[,8:10])~daniela$Treatment +

daniela$Plantation +daniela$Plantation*daniela$Treatment)

#get univariate results

summary.aov(mod1)

#get multivariate results

summary.manova(mod1)

summary.manova(mod1,test=”Wilks”)

3. Perform (multivariate) contrasts to compare levels of a particular factor or
combinations of factors. Discuss your results.

I’ll simply say here that contrasts in R are difficult and at present not pleasant. You can
see the comments I’ve made in the notes about contrasts – bu