A nonparametric measure of correlation

• we have seen that Pearson’s correlation coefficient

– measures only linear association between variables

– can be greatly affected by outlying values

• Spearman’s correlation coefficient is designed to overcome these problems: it measures monotonic (not only linear) association and is less affected by outlying values

• to calculate Spearman’s rho

– rank the x and y values separately, giving tied values the average of the ranks they occupy

– calculate the usual (Pearson)
coefficient on the ranks
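The two steps above can be sketched in a few lines of code. This is a minimal illustration in plain Python (an assumption; the slides themselves use Minitab), and the function names `average_ranks`, `pearson` and `spearman` are hypothetical helpers, not part of any package. Tied values receive the average of the ranks they occupy.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Usual (Pearson) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: the Pearson coefficient computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# toy data (made up for illustration): y = x**3 is increasing but not linear
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(pearson(x, y), spearman(x, y))
```

For the toy data the ranks of x and y agree exactly, so Spearman's rho is 1, while the Pearson coefficient is below 1 because the relationship is curved.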

Example: The data on the diameter and useable volume of wood are given below, with the ranks calculated separately for each variable.

diameter  diameter rank  volume  volume rank
      36           15.0     192           15
      28           10.5     113           11
      28           10.5      88           10
      41           20.0     294           20
      19            3.5      28            4
      32           13.0     123           12
      22            6.0      51            6
      38           17.0     252           18
      25            8.5      56            7
      17            1.5      16            1
      31           12.0     141           13
      20            5.0      32            5
      25            8.5      86            9
      19            3.5      21            2
      39           18.5     231           17
      33           14.0     187           14
      17            1.5      22            3
      37           16.0     205           16
      23            7.0      57            8
      39           18.5     265           19

• the Pearson correlation (on the original values, stored in columns C1 and C2) is

MTB > corr c1 c2

Correlations: C1, C2

Pearson correlation of C1 and C2 = 0.976

• the Spearman correlation is the Pearson correlation of the ranks (stored in columns C3 and C4)

MTB > corr c3 c4

Correlations: C3, C4

Pearson correlation of C3 and C4 = 0.989

• the Spearman value is slightly larger (0.989 against 0.976), reflecting the curvature in the plot of the data: the relationship is increasing but not a straight line
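As a cross-check, both coefficients for the wood data can be recomputed outside Minitab. The sketch below uses plain Python (an assumption, not the software used in these notes); `midranks` and `pearson` are hypothetical helper names. It ranks each variable with the counting rule "number of smaller values plus half of (number of equal values + 1)", which gives tied values their average rank.

```python
from math import sqrt

# data from the table above
diameter = [36, 28, 28, 41, 19, 32, 22, 38, 25, 17,
            31, 20, 25, 19, 39, 33, 17, 37, 23, 39]
volume = [192, 113, 88, 294, 28, 123, 51, 252, 56, 16,
          141, 32, 86, 21, 231, 187, 22, 205, 57, 265]

def midranks(values):
    # rank = (number of smaller values) + (number of equal values + 1) / 2,
    # so tied values receive the average of the ranks they occupy
    return [sum(v < u for v in values) + (sum(v == u for v in values) + 1) / 2
            for u in values]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson(diameter, volume)                        # original values
r_s = pearson(midranks(diameter), midranks(volume))  # ranks

print(f"Pearson  r   = {r:.3f}")    # 0.976
print(f"Spearman r_s = {r_s:.3f}")  # 0.989
```

The ranks produced by `midranks` match the rank columns in the table, including the tied diameters (for example 10.5 for the two values of 28).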

Example: The bottom right panel of the
figure showing various correlations was
dominated by one disparate value. The values
and their ranks are shown below.

 x  x rank   y  y rank
 8       1   7       5
 9       2   6       4
10       3   5       3
11       4   4       2
12       5   3       1
20       6  15       6

• the Pearson correlation is r = 0.79

• the Spearman correlation is r_s = −0.14

• the Spearman measure has downweighted the unusual value: ranking moves the point (20, 15) to (6, 6), only one step beyond the other ranks

• when the Pearson and Spearman coefficients are quite different, it is important to investigate whether there are unusual values or a curved relationship
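The figures for this small example can be reproduced by hand or with a short script. The sketch below is plain Python (an assumption; the notes use Minitab) with hypothetical helper names `pearson` and `ranks`. It also uses the shortcut r_s = 1 − 6Σd²/(n(n² − 1)), where d is the difference between the two ranks of each point; this shortcut is exact only when there are no ties, as is the case here.

```python
from math import sqrt

x = [8, 9, 10, 11, 12, 20]
y = [7, 6, 5, 4, 3, 15]

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def ranks(values):
    # no ties in this example, so the rank is the position in sorted order
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

rx, ry = ranks(x), ranks(y)
r = pearson(x, y)      # on the original values
r_s = pearson(rx, ry)  # on the ranks

# shortcut formula, valid only without ties
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
shortcut = 1 - 6 * d2 / (n * (n ** 2 - 1))

# leaving out the unusual point (20, 15): the rest lie on a decreasing line
r_without = pearson(x[:-1], y[:-1])

print(f"r = {r:.2f}, r_s = {r_s:.2f}, shortcut = {shortcut:.2f}")
# r = 0.79, r_s = -0.14, shortcut = -0.14
print(f"Pearson without (20, 15) = {r_without:.2f}")  # -1.00
```

Here Σd² = 40, so the shortcut gives 1 − 240/210 ≈ −0.14, in agreement with the Pearson coefficient of the ranks. The remaining five points have correlation −1, which shows how strongly the single value (20, 15) drives the Pearson coefficient.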
