BIA 652 Factor Analysis
Practical Multivariate Analysis, Afifi et al., Chapter 15
Overview of Class 12
• Recall Dimension Reduction and PCA – Chapter 14
• Factor Analysis – Chapter 15
• Ethics Cases
• HW (Hand In): 15.1, 15.2
• Class 12 – Further discussion of Project & Test
• Project outline on CANVAS
• Dates: Test (in class) – 4/27; Oral Project – 5/4 & 5/11; Written Project – Monday following Oral.
• http://psych.colorado.edu/~carey/Courses/PSYC7291/ClassDataSets.htm
• Wolves, Twins, UScrime
• Midges on CANVAS
2
Factor Analysis
(A special case of Structural Equation Modeling)
3
Goals
• Generalization of Principal Components Analysis
• Explain interrelationships among a set of
variables.
• Select a small number of factors to convey
essential information
• Perform additional analysis to improve
interpretation
• EFA vs CFA: http://www2.sas.com/proceedings/sugi31/200-31.pdf
4
Factor Model
• Start with P standardized variables
• Express each variable as a linear combination of m common factors plus a unique factor
• m << P; ideally m is known in advance
5
Examples
• Fifty test scores
• Each is a function of m = 3 factors
• Verbal, quantitative, analytical skills
• CESD items
• Each response is a function of some factors of
depression
6
Model Equations
X1 = l11 F1 + l12 F2 + ... + l1m Fm + e1
X2 = l21 F1 + l22 F2 + ... + l2m Fm + e2
...
Xp = lp1 F1 + lp2 F2 + ... + lpm Fm + ep
7
Terms
Xi = Σj lij Fj + ei
Fj = common or latent factors
ei = unique factors
lij = coefficients of the common factors = factor loadings
8
Implications
• Variance of any original (X) variable is composed of:
• Communality: the part due to the common factors, and
• Specificity: the part due to the unique factor
• Variance of Xi = V(Xi) = communality + specificity = hi² + ui²
• V(Xi) = 1 when the X's are standardized
9
Assumptions
• Each V (Fj ) = 1
• Fj ‘s are uncorrelated
• Fj ‘s and ei ‘s are uncorrelated
10
Steps on Factor Analysis
• Initial factor extraction:
• Estimate the loadings and communalities
• Factor “rotations” to improve interpretation
11
Example
100 data points generated from five variables with multivariate normal distribution.
12
Example data model (known)
X1 = 1 * F1 + 0 * F2 + e1
X2 = 1 * F1 + 0 * F2 + e2
X3 = 0 * F1 + 0.5 * F2 + e3
X4 = 0 * F1 + 1.5 * F2 + e4
X5 = 0 * F1 + 2 * F2 + e5
13
Example - Implications
• F1 , F2 and all ei ‘s are independent, normal variables
• Therefore: the first 2 X's are inter-correlated, and the last 3 X's are inter-correlated
• And: The first 2 X’s are not correlated with the last 3 X’s
14
Means and Correlations
Means: 0.163, 0.142, 0.098, -0.039, -0.013
Correlation Matrix
1.0
0.757  1.0
0.047  0.054  1.0
0.115  0.176  0.531  1.0
0.279  0.322  0.521  0.942  1.0
15
Steps on Factor Analysis
• Initial factor extraction:
• Estimate the loadings and communalities
• Factor “rotations” to improve interpretation
16
Initial Factor Extraction
17
Initial Factor Extraction
• Principal Component Factor Model
• Iterated Principal Component Factor Model
• Maximum Likelihood Model
18
Principal Component Factor Model
• Recall the Principal Component Model: C = AX, i.e., the PC's are functions of the X's
• We want the reverse: the X's as functions of the F's
19
Basic Idea
• X1, X2 – correlated
• Transform them into C1, C2 – uncorrelated
20
Inverse Model
• If C = 5X, then X = (1/5)C, or X = 5⁻¹C
• The inverse of the Principal Components model is X = A⁻¹C
• In this case, A is an orthogonal matrix, therefore A⁻¹ = Aᵀ and
X1 = a11 C1 + a21 C2 + ... + ap1 Cp
...
Xp = a1p C1 + a2p C2 + ... + app Cp
21
PC Factor Model Derivation
Xi = Σ(j = 1..p) aji Cj
Xi = Σ(j = 1..m) aji Cj + Σ(j = m+1..p) aji Cj
Xi = Σ(j = 1..m) lij Fj + ei
Fj = common or latent factors; ei = unique factors
lij = coefficients of the common factors = factor loadings
22
Interpretation
• Var(Cj) = λj, NOT 1
• Transform: Fj = Cj / λj^(1/2)
• Therefore: Var(Fj) = 1
• And the loadings are: lij = λj^(1/2) aji
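The transformation above can be carried out numerically: take the eigen-decomposition of the correlation matrix, then scale each eigenvector by the square root of its eigenvalue to get the loadings. The sketch below uses the sample correlation matrix from the earlier example slide.

```python
import numpy as np

# Sample correlation matrix from the example (lower triangle on the slide)
R = np.array([[1.0,   0.757, 0.047, 0.115, 0.279],
              [0.757, 1.0,   0.054, 0.176, 0.322],
              [0.047, 0.054, 1.0,   0.531, 0.521],
              [0.115, 0.176, 0.531, 1.0,   0.942],
              [0.279, 0.322, 0.521, 0.942, 1.0]])

# Eigen-decomposition: eigenvalues lambda_j are the PC variances
lam, A = np.linalg.eigh(R)
order = np.argsort(lam)[::-1]      # sort descending
lam, A = lam[order], A[:, order]

# Loadings l_ij = sqrt(lambda_j) * a_ji; keep m = 2 factors
L = A[:, :2] * np.sqrt(lam[:2])

# Communality of each variable = row sum of squared loadings
h2 = (L**2).sum(axis=1)
```

Since the X's are standardized, the eigenvalues sum to p = 5 and every communality is at most 1.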
23
Interpretation
lij is the correlation coefficient between variable Xi and factor Fj
24
In Previous Example
• Variances of the principal components are:
2.578, 1.567, 0.571, 0.241, 0.043
• Select m = 2 factors
25
Previous Example PC Method (p 386)
26
Reading the Output (p 385)
• The factor model:
X1 = 0.511 F1 + 0.782 F2 + e1
X2 = 0.553 F1 + 0.754 F2 + e2
...
• Communality:
For X1: h1² = 0.873
For X2: h2² = 0.875
• Specificity = 1 − Communality
27
Implications
• 1st row: h1² = 0.87 = 0.51² + 0.78², etc.
• 1st column: Variance Explained = 2.58 = 0.51² + 0.55² + ..., etc.
• Variance Explained = eigenvalue
• Total Variance Explained by the common factors = Σ hi² = 4.145 = 83% of total variance
28
Initial Factor Extraction Method 2: Iterated Principal Factors (IPF)
• Select common factors to maximize the total communality
• Use an iterative procedure:
1. Get initial communality estimates
2. Use these (instead of the original variances) to get principal components and factor loadings
3. Get new communality estimates
4. If there is appreciable change, go to step 2
5. Else, stop
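The iterative procedure above can be sketched as follows. The function name and the choice of squared multiple correlations as initial communality estimates are illustrative assumptions; the slides leave the initializer open.

```python
import numpy as np

def iterated_principal_factors(R, m, tol=1e-6, max_iter=200):
    """Sketch of the IPF loop: put current communality estimates on the
    diagonal of R, extract m principal factors, update the communalities,
    and repeat until they stabilize."""
    # Step 1: initial communalities (squared multiple correlations, a
    # common choice; assumed here, not specified on the slides)
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        Rr = R.copy()
        np.fill_diagonal(Rr, h2)            # step 2: reduced correlation matrix
        lam, A = np.linalg.eigh(Rr)
        order = np.argsort(lam)[::-1][:m]
        L = A[:, order] * np.sqrt(np.clip(lam[order], 0.0, None))
        h2_new = (L**2).sum(axis=1)         # step 3: new communalities
        changed = np.max(np.abs(h2_new - h2)) >= tol
        h2 = h2_new
        if not changed:                     # steps 4-5: stop when stable
            break
    return L, h2

# Demo on the example's sample correlation matrix
R = np.array([[1.0,   0.757, 0.047, 0.115, 0.279],
              [0.757, 1.0,   0.054, 0.176, 0.322],
              [0.047, 0.054, 1.0,   0.531, 0.521],
              [0.115, 0.176, 0.531, 1.0,   0.942],
              [0.279, 0.322, 0.521, 0.942, 1.0]])
L, h2 = iterated_principal_factors(R, m=2)
```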
29
Example: IPF Method (p 388)
30
Comparison: PCF vs IPF
31
Steps on Factor Analysis
• Initial factor extraction:
• Estimate the loadings and communalities
• Factor “rotations” to improve interpretation
32
Factor Rotations
33
Factor Rotations
• Find new factors that are easier to interpret:
• For each X, some high loadings and some low
• Varimax orthogonal rotation: maximize Var(lij² | Fj), i.e., vary the lij within each factor
• Quartimax orthogonal rotation: maximize Var(all lij²), i.e., vary all the lij
34
Factor Diagram for Principal Component Extraction Method (p 387)
35
Orthogonal Rotation for Principal Component Factor Diagram(p 391)
36
Factor Diagram for IPF Extraction Method (p 361)
37
Orthogonal Rotation for IPF Factor Diagram (p 392)
38
Comparison
39
Varimax for Principal Components (p 393)
40
Varimax for IPF (p 393)
41
Oblique Rotations
• No longer require Orthogonality
• Most common: direct quartimin method
42
Direct Quartimin Oblique Rotation for Principal Component Factor Diagram (p 394)
43
Direct Quartimin Oblique Rotation for IPF Factor Diagram (p 395)
44
Comparison
45
Comparison Orthogonal vs Oblique Rotations
Orthogonal
• Advantages: factors independent; communalities preserved
• Disadvantage: interpretation slightly less clear
Oblique
• Advantage: better interpretation
• Disadvantages: factors are correlated; communalities change
46
Example USCrime
PCA, Factor, Cluster
47
Example CESD See Page 398
48
Interpretation
Principal Component Analysis identified 5 PCs, but previous literature used 4 factors, not 5:
• Factor 1: loads heavily on items 1 – 7
• Factor 2: items 12 – 18 (somatic and retarded
activity)
• Factor 3: items 19 – 20 (interpersonal), plus
items 9 and 11 (positive-affect)
• Factor 4: no clear pattern: item 8 (positive
affect) has highest loading
49
Factor Scores
50
Factor Scores
• FA: each X = function of F’s
• Express each F = function of X’s.
51
Computing Factor Scores
• Recall Multiple Linear Regression: Y = A + B1 X1 + B2 X2 + ...
• B = Sxx⁻¹ Syx
• In Factor Analysis, the target is: F = A + B1 X1 + B2 X2 + ...
• B = Sxx⁻¹ SFx
• Here Sxx is the correlation matrix of the (standardized) X's, and SFx is the column of factor loadings
52
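The regression-method scores above reduce to one linear solve. A minimal sketch, assuming standardized data Z, correlation matrix R, and loading matrix L (the function name and demo inputs below are illustrative):

```python
import numpy as np

def factor_scores(Z, R, L):
    """Regression-method factor scores: B = R^-1 L, scores = Z @ B.
    Solves R B = L rather than inverting R explicitly."""
    B = np.linalg.solve(R, L)
    return Z @ B

# Demo with hypothetical data and loadings
rng = np.random.default_rng(1)
Z = rng.standard_normal((100, 5))      # standardized observations
R = np.corrcoef(Z, rowvar=False)       # Sxx for standardized X's
L = rng.standard_normal((5, 2))        # illustrative loadings, m = 2
F = factor_scores(Z, R, L)             # one score per subject per factor
```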
Uses of Principal Component and Factor Scores
• 1st Principal Component Score can summarize several variables.
• Can be used as dependent or independent variable in other analyses
• Factor scores can be used as dependent or independent variables in other analyses
53
Caveats
• Number of Factors should be chosen with care – check default options
• There should be at least two variables with non-zero weights per factor
• If Factors are correlated, try Oblique Factor Analysis
• Results usually evaluated by “reasonableness to investigator” as opposed to formal tests
• Motivate theory, not replace it.
54
A Bit of Review
• Preparation of Data: Outliers, Null Values, etc.
• Regression – Simple and Multiple
• Discriminant Analysis
• Logistic Regression
• Cluster Analysis – Hierarchical, K-Means
• Dimension Reduction
• Principal Component Analysis
• Factor Analysis
55
A more realistic example Depression – Using SAS
56