统计学习导论+基于R应用.pdf
An Introduction to Statistical Learning
with Applications in R
Gareth J ames )
Witten)
Hastie)
Robert Tibshirani)
2015.6
An 1ntroduction to Statistical Learning: with Applications in R
1SBN 978-7-111-49771-4
1V. C8
01-2013-7855
Translation from English language edition: An lntroduction to Statistical Learning by Gareth
]ames , Daniela Witten , Trevor Hastie and Robert Tibshirani
Copyright ( 2013 Springer_ Verlag New York , Inc
Springer is a part of Springer Science+ Business Media
All rights Reserved
Science+ Business Media
)
x 260mm 1/16
978-7-111-49771-4
(010) 88378991 88361066 (010) 88379604
(010) 68326294 88379649 68995259
When we wrote An Introduction to we had a single goal: to make key
concepts in statistical machine learning accessible to a very broad audience. We are thrilled that
Professor Xing Wang has taken the time to translate our book into Chinese , so that these concepts
will be made accessible to an even broader audience. We hope that the readers of this Chinese
translation will find our book to be a useful and informative introduction to a very exciting and im-
portant research area.
Sincerely!
Gareth James , Daniela Witten , Trevor Hastie and Robert Tibshirani
(The Elements of Statistical Learning ,
V
(The Elements of Statistical Learn-
ing , ESL) Tibshirani ,
(An to Statistical Learning ,
Pallavi
Basu , Alexandra Chouldechova , Patrick Danaher , Will Fithian , Luella Fu , Sam Gross , Max
Grazier G’Sell , Courtney Qiao , Elisa
Xin Lu Tano
Berra
Gareth James
Daniela Witten
Trevor Hastie
Robert Tibshirani
1. 1
1. 2 …………………… 4
1. 3 ……………………… 4
1. 4 …………… 6
1. 5 ………… 6
1. 6 ………………… 8
1. 7 ……… 9
1. 8
1. 9
2. 1
2.2
2. 3
2.4 37
3. 1
3.4
3.5
3.6
3. 7
4. 1
4.2
4.3
4.4
4.5
4.6
4. 7
5. 1
5.2
5.3
5.4
3.3 6. 1
6.2
6.3
6.4
6.5
6.6 ……… 173
6. 7
6. 8
7. 1
7.2
7.3
7.4
7.5
7.6
7. 7
7.8
7.9
8. 1
8.2
8.3
8.4
9. 1
9.2
9.3
9.4 ………………… 246
9.5
9.6
9. 7
10.1
10.2
10. 3
10.4
10. 5
10.6
10.7
1. 1
1. 1. 1
10
iA;,,
+
Education Level
CCN
2003 2006 2009
Year
• 2
2003
(
Yesterday Two Days Previous Three Days Previous
1. 1.2
s –
TT
i
–
T
u 0
Q)
eo
Q)
Down Up
Today’s Direction
Down Up
Today’s Direction
Down Up
Today’s Direction
3 •
component)
–
Up
Today’s Direction
1. 1
Down
N
pE –
o 0
H
…..
u
Q)
1. 1.3
A A
-r4
. . . . . . . . .
-40 -20 0 20
ZI
CN
C
NN R
o
T
o
?
. . .. .
. . . .
2 .
• .
-40 -20 0 20 40
ZI
.. _. ..,.. •• A ..
o j -_-.- “_fÞ .•
1. _ • •
“”,- .. NO I ..
o
60 40 60
4
1.2
statistical
model)
Fried-
Tibshirani
1.3
of Statistical
1. 3 • 5
(An Statistical
6
1.4
1.5
Wage
=3
age =
iable Narne
1. 5 • 7
i = 1 , 2 ,… , n; j = L
n) , p) 0
&P
ZZZ 12···n ZZZ Il-
zzz
X
I il \
I X ,,., I I i2 I
X.
, • ,
, • ,
,
11
.
11 /E\
•••
mu
zzz /’!lllIll—llt\
J
X
= 3
x = (x1 x 2 •• • xp )
transpose = (X i1 X i2 … X ip )
nnn
zz
…
z
ZZZ
T X
8
y
Yl) , Y2) , …, f
-iqF-hphn
/’!lllIll—llt\
a
0 A E X s 0
E ‘Xd , B E dxs
A=(; :),
1 1 2 \ 15 6 \ 11 X 5 + 2 X 7 1 X 6 + 2 X 8 \ 1 19 22 \
AB=I 11 1=1 1=1 1
501
1.6
1. 7 • 9
1.7
-1
Auto
Boston
Caravan
Carseats
College
Defalut
Khan
NCI60
OJ
Portfolio
Smarket
USArrests
Wage
Weekly
10
1.8
http://www – bcf. usc. edu/ gareth/ISL/
1.9
2.1
TV
N
o
m-c-
o 10 20 30 40 50
Radio
300
variable) , sales output
12
predictor
variable) re-
dependent variable)
(X) ,
Y = j( X) + B (2. 1)
term)
systematic
of
of
. o . .. . . .. o . . . . . o ugszH . • .
ogooa
. . o
o
•• o
lqL 10
0
8U la
c u
-MM
0
14s
lM
e
l2Y
·
-11 lo
10 12 14 16 18 20 22
Years of Educatio
of of educa-
of
ty
2.1 • 13
of
of
2. 1. 1
inference
Y = j(X) (2.2)
box)
14
E ( Y – y) 2 = E [j( X) + e – j( X) ] 2
= [j(X) – j(X) r + Var( B) (2.3)
2.1 • 15
2.1.2
training data)
= 1 , 2 ,… , = 1 ,
(xj’Yj) , (X U Y2)’ … , (Xn ,yJ = (x ij ‘X i2 ‘ … ,Xip)T O
( 1
j(X) (2.4)
model)
+
ordinary least squares)
parametric)
16
felexible
incorne X X seniori ty
of
2.1 • 17
18
2.1.3
additive model ,
GAM)
(
Subset Selection
ii Lasso
Least Squares
Generalized Additive Models
Trees
Bagging , Boosting
hFOJ Support Vector Machines
Low High
Flexibility
2.1 • 19
2.1.4
i = 1,… ,
regression)
unsupervised)
( cluster
20
23 4A
+ + +
f:,
!? 0_
u
Cè:>(j)
+
o 2 4 6 8 10 12
X 1
A
N –I
o 2 4 6
X 1
supervised
2.1.5
the value of a
the price of a
B,
2.2 • 21
2.2
2.2.1
squared error , MSE)
MSE – J(x;) ) 2 (2.5)
(
( history of
Yl) , Y2) , …,
, f( x 2 ) , … ,
22
xo) = Yo 0
Ave (f (xo) – YO)2 (2.6)
MSE)
o 0
ho u fcq p
E m qp
N
CCP >
o 20 40 60 80 100 2 5 10 20
X Flexibi1ity
2.2 • 23
of
(
2.2.2
E (Yo – J(XO))2 = Var(f(xo)) + [Bias (f(xo)) r + Var( B) (2.7)
m.N
o
N
o
N –
o –
• 24
N
20 10
Flexbility
5 2
nu nu
80 60
X
40 20
mHCH
o
N
o
o
o
o
20 10
Flexibility
5 2
hu
80 60 40
X
20
(xO)) test MSE)
test
(xO))
2.2 • 25
Var( B)
m.N
o
qdD p
2 5 10 20
Flexibility
F‘,J
cpp q
o
cqp p
2 5 10 20
Flexibility
o
2 5 10 20
Flexibility
26
trade-off)
2.2.3
Yl) , …,
error rate) ,
(2. 8)
indicator variable)
training
test
(2.9)
Pr (Y = j I x = xo) (2. 10)
2.2 • 27
Y = 1 I X = xo) > O.
Y = orange I
decision
X j
– maxjPr( Y = j I X =
1 – E ( max Pr (Y = j I X) ) (2. 11)
( Y = jX = xo) < 1
28
H
0, 1304 0
=
KNN: K=10
X
1
29 • 2.2
KNN: K=100 KNN: K=1
KNN
K
0
] 0 l -;
-............
\ I - Training Errors I
Test Errors
0.01 0.02 0.05 1.00 0.50 0.20 0.10
lIK
30
2.3
http:// cran. r- proj ect. orgl
2.3.1
(inputl ,
input2)
() (
3 ,
c(1 , 3 , 2 , 5)
[1] 1 3 2 5
c (1, 6 ,2)
> x
[1] 1 6 2
> y = c (1, 4 ,3)
xvd fkrtnu hh1 +U+U nqunquvd ee+
18
> 15 ()
[1] “x” “y”
> rm(x , y)
character (0)
2.3 • 31
> rm (l ist=ls ())
rnatrix
>
> x=matrix(data=c(1 , 2 , 3 ,4) , nrow=2 , ncol=2)
> x
[ , 1] [, 2]
[1 ,] 1 3
[2 ,] 2 4
=
> x=matrix(c(1 , 2 , 3 ,4) , 2 , 2)
=
E U R Ti –w o r vu b , nJL , , qu , 4i
,
([ c -1
,
+b m”
> sqrt(x)
[, 1] [, 2]
[1 ,] 1. 00 1. 73
[2 ,] 1. 41 2.00
> x^2
[, 1] [, 2]
[1 ,] 1 9
[2 ,] 4 16
rnorrn
()
> x=rnorm(50)
>
> cor(x , y)
[1] 0.995
32
rnorrn
seed seed
> set.seed(1303)
> rnorm (50)
[1] -1.1440 1.3421 2.1854 0 . 5364 0.0632 0.5022 -0.0004
seed
> set.seed(3)
> y=rnorm (100)
> mean(y)
[1] 0.0110
> var(y)
[1] 0.7329
> sqrt(var(y))
[1] 0.8561
> sd(y)
[1] 0.8561
2.3.2
plot (x ,
plot
> x=rnorm(100)
> y=rnorm(100)
> plot(x , y)
> plot(x , y , xlab=”this is the x-axis” , ylab=”this is the y-axis” ,
main=”Plot of X vs Y”)
n e
“r dH PA—14 eo rc gvd(C 1
,
fi
(t·d dlel ppdl
u
off
seq (a ,
seq(O , 1 , =
(3:
2.3 • 33
> x=seq (1, 10)
> x
[1] 1 2 3 4 5 6 7 8 9 10
> x=1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> x=seq(-pi , pi , length=50)
1. ,
2. ,
0
> y=x
> f=outer(x , y , function(x , y)cos(y)/(1+x-2))
> contour(x , y , f)
>
> fa=(f-t(f))/2
> contour(x , y , fa , nlevels=15)
irnage
persp
> image(x , y , fa)
> persp(x , y , fa)
>
> persp(x , y , fa , theta=30 , phi=20)
> persp(x , y , fa , theta=30 , phi=70)
> persp(x , y , fa , theta=30 , phi=40)
2.3.3
> A=matrix(1:16 , 4 ,4)
> A
[, 1] [, 2] [, 3] [, 4]
[1 , ] 1 5
[2 , ] 2 6
[3 , ] 3 7
[4 ,] 4 8
> A [2 , 3]
[1] 10
9 13
10 14
11 15
12 16
34
> A[c(1 , 3) , c(2 ,4)]
[, 1] [, 2]
[1 , ] 5 13
[2 , ] 7 15
> A[1:3 , 2:4]
[, 1] [, 2] [, 3]
[1 , ] 5 9 13
[2 , ] 6 10 14
[3 , ] 7 11 15
> A[1:2 ,]
[, 1] [, 2] [, 3] [, 4]
[1 , ] 1 5 9 13
[2 , ] 2 6 10 14
> A [ , 1: 2]
[, 1] [, 2]
[1 , ] 1 5
[2 , ] 2 6
[3 , ] 3 7
[4 , ] 4 8
> A [1 ,]
[1] 1 5 9 13
> A[-c(1 , 3) ,]
[, 1] [, 2] [, 3] [, 4]
[1 ,] 2 6 10 14
[2 ,] 4 8 12 16
> A[-c(1 , 3) , -c (1, 3 ,4)]
[1] 6 8
dirn
> dim(A)
[1] 4 4
2.3.4
read. table
te. table
Mac , U
table (
data frame) 0
fix
> Auto=read.table(“Auto.data”)
> fix(Auto)
2.3 • 35
table ( = T
header =
> strings=”?”)
> fix(Auto)
cav
> Auto=read.csv(“Auto.csv” , header=T , na.strings=”?”)
> fix(Auto)
> dim(Auto)
[1] 397 9
> Auto[1:4 ,]
dirn
orni t
> Auto=na.omit(Auto)
> dim(Auto)
[1] 392 9
> names(Auto)
[1] “mpg” “cylinders” “displacement” “horsepower”
[5] “weight”
[9] “name”
“acceleration” “year”
2.3.5
“origin”
> plot(cylinders , mpg)
Error in plot (cylinders , mpg) object’ not found
attach
> plot(Auto$cylinders , Auto$mpg)
> attach (Auto)
> plot(cylinders , mpg)
36
as. factor
> cylinders=as.factor(cylinders)
> plot(cylinders , mpg)
> plot(cylinders , mpg , col=”red”)
> plot(cylinders , mpg , col=”red” , varwidth=T)
> plot(cylinders , mpg , col=”red” , varwidth=T , horizontal=T)
> plot(cylinders , mpg , col=”red” , varwidth=T , xlab=”cylinders” ,
ylab=”MPG”)
hist
EU 4EA = s k a e r b
), =–1414 0O CC
PAP
‘
p
‘
mmm +U+U+U SSS ·l·-
-1
>>>
pairs
> pairs(Auto)
> mpg + displacement + horsepower + weight +
Auto)
()
> plot(horsepower , mpg)
> identify(horsepower , mpg , name)
summary
> summary(Auto)
mpg
Min. : 9.00
1 st Qu.: 17 . 00
Median :22.75
Mean : 23.45
3rd Qu.: 29 . 00
Max. : 46.60
horsepower
Min. : 46.0
1st Qu.: 75.0
Median : 93.5
Mean : 104.5
3rd Qu. :126.0
Max. : 230.0
t0004noo n-
…..
e851455 m605975 e11124-c··
………
.
a·· lunu PQaQ 5·ln· ·1ntdadz dlseera
M1MM3Mu nuhunun4nunu nununu7fnunu
s00040o r
…..
.
e344588 AU–
••••••••••
n -unu lQaQ vd··ln-cntdadx
-lseera M1MM3MU
weight acceleration
Min. :1613 Min. : 8.00
1st Qu.:2225 1st Qu . :13.78
Median :2804 Median :15.50
Mean : 2978 Mean : 15.54
3rd Qu.:3615 3rd Qu.:17.02
Max. :5140 Max. :24.80
37
year origin name
Min. :70.00 Min. :1 . 000 amc matador : 5
1st Qu.:73.00 1st Qu.: 1. 000 ford pinto : 5
Median :76.00 Median :1.000 toyota corolla : 5
Mean : 75.98 Mean : 1 . 577 amc gremlin 4
3rd Qu.: 79 . 00 3rd Qu . : 2 . 000 amc hornet 4
Max. : 82.00 Max. : 3.000 chevrolet chevette: 4
(Other) :365
> summary(mpg)
Min . 1 st Qu . Median Mean 3rd Qu . Max .
9 . 00 17 . 00 22 . 75 23 . 45 29 . 00 46 . 60
2.4
(
(
(
(
(
38
(
(
Obs. X j X2 X3
3
2 2
3 3
4 2
5 1
6 1 1
=X2 =X3
(
(
Y
Red
Red
Red
Green
Green
Red
of applications received)
of applicants accepted)
o of new students enrolled)
students from top 10% of high
school class)
students from top 25% of high
school class)
N umber of full- time undergraduates)
39
N umber of part – time undergraduates)
o tuition)
and board costs)
book costs)
personal spending)
of faculty with Ph. )
of )
ratio)
of alumni who )
Graduation rate)
( csv
> rownames (college) =college [, 1]
>
AE–, e e le ogb ce =14 e-i guo ec -irk –z cf >>
( c) (
A [, 1: 10
> Elite=rep(“No” , nrow(college))
> te [college$Top10perc >50] =” Yes”
> Elite=as.factor(Elite)
> college =data. frame (college , Eli te)
(
(rnfrow=c (2 ,
40
(
(
(
(
(
> library(MASS)
> Boston
> ?Boston
(
(
(
regression)
sales radio
(1
(3
42
3.1
linier
(3. 1)
Y on
X)
Sales X TV
Y
3. 1. 1
= 1,… , n ,
ei =Yi
residual
43 • 3.1
of
RSS = + + … +e:
RSS = +… + (3.3)
L
L (X i – x)
(3.4)
n i=i
squares coefficient
= 0.047 50
C’I
o
C’I
300 250 200 150
TV
nu
50
3.1.2
44
2 =
2
D
2 1
5 6 7
(3.5)
regression line)
least
squares line) (3.
y = 2 + 3X + B (3. 6)
=
2
45 •
/
3.1
m
o
– ‘
h
o
o
o 1 2 -2 -1 0
x x
=2
2 -2
systematically
= (3.7)
(3.8) =
L 2 l’U L
@
46
=
error) =
IRSS/(n
interval) 0 95
. ) (3.9)
[131 – ] (3. 10)
. (3. 11)
130 ,
[ O. 042 , O. 053
6
hypothesis) :
Ho:
alternative
=
(3. 12)
(3. 13)
3.1 • 47
– 0
(3.14 )
)
t 0
reject the null hypothesis)
TV
7.0325
0.047 5 0.0027
15.36
17.67
Intercept <0.000 1 <0.000 1 3.1.3 13) the extent to which the model fits the standard error , @ 48 = = - y;) 2 ''1 n - L ''1 n - L i=i (3. 15) 1. RSS = L (Yi - y;)2 (3.16) 3 260/14000 = of = 1 ,… , Yi Yi proportion) R2 = TSS - RSS RSS TSS TSS (3. 17) TSS = L (Yi _ Y ) sum of squares) TSS - (proportion of variability in Y that can be explained using X) 0 RSE 3.2 • 49 E: (x i - x) Cor(X , Y) = (3.18) JI J I 2 R2 = r2 3.2 Intercept 9.312 0.563 16.54 <0.000 1 0.203 0.020 9.92 <0.000 1 Intercept 12.351 0.621 19.88 <0.000 1 newspaper 0.055 0.017 3.30 <0.000 1 50 (3.5) (3.19) all other predictors X TV X radio + B (3.20) 3. 2. 1 Y (3.21) RSS = L (Yi _ y;)2 = L 2 (3.22) Y 3.2 • 51 Intercept 2.939 0.311 9 9.42 <0.000 1 TV 0.046 0.0014 32.81 <0.000 1 Radio 0.189 0.0086 21. 89 <0.000 1 newspaper -0.001 0.0059 -0.18 0.8599 newspaper TV , TV newspaper sales TV 0.0548 0.0567 0.7822 0.3541 0.5762 newspaper 1 0.2283 sales 1 52 3.2.2 ( 1 = 0 (TSS - RSS)/ F=P(3.23) RSS/(n - ) - y) El RSS/(n - f El (TSS - RSS)/pf E 1 (TSS - RSS) /p f @ 3.2 • 53 R2 1. 69 0.897 570 23 (RSSo - RSS)/q F (3.24) SS/( n - ) least one tors is = ) >
high –
@
54
which)
information criterion , Bayesian information cri-
terion ,
= 1 073 741
3.2 • 55
897
681
686
RSE = A RSS (3.25)
yn-p-l
interaction)
21
(1
56
Radio
squares plane)
true population regression plane)
f(X)
confidence
model
error) 0 in-
985 , 11 528
930 ,
@
3.3 •
3.3
3.3.1
quantitative )
qualitative) 0
t
age cards education incorne
lirnit
ethnici ty
g ; ; l | i i i j i l l I … I J I J l i
[ i I i ‘. :i: !
iU! j l | : ; : j U j i j i |
age , cards ,
come ,
57
58
factor)
r1
X. <
LO
(3.26)
8 + ++
8 + Z + (3. 27)
Intercept
gender
509.80
19.73
33.13
46.05
15.389
0.429
<0.0001
0.6690
r1
X. <
L - 1
+ Bi
Yi + Bi = {
+ Bi
ethnicity
3.3 • 59
FEI-EL
Z (3.28)
r1
LO
(3.29)
88 ++ 12z +++
....
8 + + Z + (3.30)
base-
= 0
Intercept
ethnicity
53 1. 00
-18.69
-12.50
46.32
65.02
56.68
11. 464
-0.287
-0.221
<0.000 1
0.7740
0.8260
contrast)
3.3.2
60
X
J
0
+ B
interaction
term)
X2 + B (3. 31 )
31
+ B
+ B (3. 32)
ers
ts = 1. 2 + 3.4 X lines + O. 22 X workers + 1. 4 X (lines X workers)
= 1. 2 + (3.4 + 1. 4 X workers) X lines + O. 22 X workers
+ 1. 4 X
sales X TV X radio X (radio X TV) + B
X radio) X TV X radio + B (3.33)
3.3 • 61
X
-89.7)/(100 -89.7)
radio) X 1 000 = 19 +
1. 1 X TV) X 1 000 =
29 + 1. 1 X
Intercept
TV
radio
TV X
6.7502
0.0191
0.0289
0.001 1
0.002
0.009
0.000
27.23
12.70
3.24
20.73
<0.000 1
<0.000 1
0.0014
<0.000 1
X incorne i +
LO
= þ) + { (3.34)
62
2 f .‘ D T P
sE E
E E 2 E
FCCE E 4 2 J
50 100 150 50 100 150
Income Income
+ X
X incorne i
(3.35)
e m o c × + , ...
..
+ e m o c × + ~~ e c a -i a b
polynomial
power
rnpg X horsepower X horsepower
2
+ B (3.36)
= horsepower ,
X 2 = horsepower
2
3.3 • 63
o
o
100 150
Horsepower
horsepower )
50 200
Intercept 56.900 1
-0.4662
0.0012
1. 800 4
0.031 1
0.0001
31. 6
-15.0
10.1
<0.000 1
<0.000 1
<0.000 1
horsepower
horsepower 2
regression)
3.3.3
of 0
(3 non- constant variance of eITor
outlier) 0
(5)
residual
= Yi -
fitted))
• 64
Residua1 Plot for Quadratic Fit
mlo--m--
Residua1 Plot for Linear Fit CNmHO-
cmlo--m-l
155 0
35 20 25 30
Fitted Va1ues
15 30 10 15 20 25
Fitted Values
65 • 3.3
0.0
mN-OHlmul
80 60
0.5
40 20
100 80
jfiJ:ljT
I ....0 J 0 , 0 0 - 0 V nO ....000 f"'l_ I .... 0
, ,
60
p= 0.9
40 20
m.-m.OMU.Olm.Hl
100 80 60
Observation
40 20
66
VAR(e;)
shape)
heteroscedasticity
Response Y Response log
3
:=: cF= a
iEi T
o
:=: 6710
OCC E
10 IS 20 2S 30 2.4 2.6 2.8 3.0 3.2 3.4
Fitted Values Fitted Values
=
67 • 3.3
o 0 0
'6
0 V o
-2 0 2 4 6
Fitted Values
20 0
'"
:-s! 0 0 0
08 0 00
I Io 0 0 0
-2 0 2 4 6
Fitted Values
200
þ.., N
o
-2 -1 2 o 1
X
0.000.050.100.150.200.25
Leverage
410
020
/
…,
?48J
0
41 0
v
O
-2 -1 0 1 2 3
X
o
o
2 O
X]
-1 4
• 68
h. = _!_ + 2 =
n L 2
(3.37 )
lirni
/
D
o
o 0 o 0
1"'\ 0 Q
0 0 e
o 0 o 0
2000 4000 6000 8000 12000 2000 4000 6000 8000
Limit Limit
12000
3.3 • 69
N
<'"l
o
0.16 0.17 0.18 0.19 -0.1 0.0 0.1 0.2
Credi
lirni
lirnit
70
Credi
Intercept 43.828 -3.957 <0.0001
Model1 age -2.292 0.672 -3.407 0.0007
0.173 0.005 34.496 <0.0001
Intercept -377. 537 45.254 -8.343 <0.0001
Mode12 2.202 0.952 2.312 0.0213
0.025 0.064 0.384 0.7012
multicollinearity)
inflation factor , VIF) 0 VIF
VIF(ß) =
01 ,
t worthiness 0
3.4
( 1
)
=
• 71
3. 1.
(3
1.
0.049) , 172 , 0.206) ,
- 0.013 , 0.011) 0
1.145 ,
21
Y=j(X)
(7
72
3.5
parametric
K- nearest neighbors regression) ( 0 K
0
f(xo)
bias- variance trade- off)
3.5 • 73
H >
-1.0 -0.5 0.0 0.5 1.0
X
þ…, N
-1.0 -0.5 0.0 0.5 1.0
X
?
K=
0
=
=
p
74
(“‘l
-1.0 -0.5 0.0 0.5 1.0
X
_0/
E
CZl
ã 8l
(1) _:
::E ‘-‘
0.2 0.5 1.0
lI K
,
rV’3 2| / crn p
Fd
3S uE m 3
CvP CCC2 E 2
-1.0 -0.5 0.0 0.5 1.0 0.2 0.5 1.0
X l/K
HV3 o
GeeD
2
3
; | /
CFD d
E
cCO = E
-1.0 -0.5 0.0 0.5 1.0 0.2 0.5 1.0
X l/K
= 1 =9
3.6 • 75
p=l
G,·E ·4 q,·= ·4
2
=
2
p
p=2 p=3 p=4
16 /|iv
o
————–1 0 0 …..…
o
o
o
o
0.2 0.5 1.0 0.2 0.5 1.0 0.2 0.5 1.0 0.2 0.5 1.0 0.2 0.5 1.0 0.2 0.5 1.0
1/K
neigh-
curse of
3.6
library )
cunn quTL AAnb MUT4 (( V
M
Vu
rr aa rr bb -1·-1414
stall
instal l. packages
76
3.6.2
age
lstat
> fix(Boston)
> names(Boston)
[1] “crim” “indus” “chas” “nox” “rm” “age”
[8] “dis” “rad” “tax” “ptratio” “black” “lstat” “medv”
Boston o
(y x , data)
>
Error in eval(expr , envir , enclos) : Object “medv” not found
>
> attach
>
> lm.fit
Call:
lm(formula = medv lstat)
t5 aQd +U-Shu —
s nt5 ep& ie4 cc3 ·1r 4iaLV ft en oI
> summary(lm.fit)
Call:
lm(formula = medv lstat)
Residuals:
Min 1Q Median 3Q Max
-15.17 -3.99 -1. 32 2 . 03 24 . 50
Estimate Std . Error t value PrC>ltl)
(Intercept) 34.5538 0.5626 61.4 <2e-16 ***
lstat -0 . 9500 0.0387 -24 . 5 <2e-16 ***
Signif. codes: 0 *** 0.001 ** 0 . 01 * 0.05 . 0 . 1
3.6 • 77
Residual standard error: 6.22 on 504 degrees of freedom
Multiple R-squared: 0.544 , Adjusted R-squared: 0.543
F-statistic: 602 on 1 and 504 DF , p-value: <2e-16
1m.
> names(lm.fit)
[1] “coeff icients” “residuals ” “effects”
[4] “rank”
[7] “qr”
[10] “call”
> coef (lm.fit)
“s n14H FDel –ve Bed s-4o axm s eH ul la au vd
-l
ds” ees trm t·r Ife fdt
(Intercept) lstat
34.55 -0.95
> confint(lm.fit)
2.5 % 97.5 %
(Intercept) 33.45 35.659
lstat -1.03 -0.874
predict
> predict (lm. fit , data. frame (l stat=(c (5 , 10 , 15))) ,
fit lwr upr
1 29.80 29.01 30.60
2 25.05 24.47 25.63
3 20.30 19.73 20.87
> predict(lm.fit , data.frame(lstat=(c(5 , 10 , 15))) ,
)
fit lwr upr
1 29.80 17.566 42.04
2 25.05 12.828 37.28
3 20.30 8.078 32.53
47 , 25. 63)
>v et m-1 +U-am +U14 14e t·-014 pa
ab1ine (a , b)
> abline(lm.fit , lwd=3)
> abline(lm.fit , lwd=3 , col=”red”)
> plot(lstat , medv , col=”red”)
78
))
nJhH·· ==44 PAPAC ,,
PA
VV
,
ddo eenJh mm·-,,
4i
+U+u
,
aao ss–+U+U+U 000 141414 >>>
(
p10t
par (mfrow = c (2 ,
, nJ& ct =
-l
w44 0· rm ro a14 p
‘
PA
> plot(predict(lm.fit) , residuals (lm.fit))
> plot(predict(lm.fit) , rstudent(lm.fit))
+b -1 44
)-1Jm +b14 84S
e
mu -414 sv et ua vz ta am h-tc lh P&WKU
7t
>>
3
which. max
3.6.3
(y xl + x2 +
(
>
> summary(lm.fit)
Call:
lm(formula = medv lstat + age , data =
Residuals
Max
-15.98 -3.98 -1. 28 1. 97 23.16
Coefficients:
Estimate Std. Error t value Pr(>ltl)
(Intercept) 33.2228 0.7308 45.46 <2e-16 ***
lstat -1.0321 0.0482 -21.42 <2e-16 ***
age 0.0345 0.0122 2.83 0.0049 **
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 6 . 17 on 503 degrees of freedom
Multiple R-squared: 0 . 551 , Adjusted R-squared: 0.549
F-statistic: 309 on 2 and 503 DF , p-value: <2e-16
3.6 • 79
> , data=Boston)
> summary(lm . fit)
Ca11:
1m(formu1a = medv . , data = Boston)
Residua1s:
Min 1Q Median 3Q Max
-15.594 -2.730 -0.518 1 . 777 26.199
Coefficients:
Estimate Std . Error t va1ue Pr(>ltl)
(Intercept) 3.646e+01 5 . 103e+00 7.144 3 . 28e-12 ***
cr l. m -1. 080e-01 3.286e-02 -3.287 0.001087 **
zn 4.642e-02 1. 373e-02 3.382 0.000778 ***
indus 2.056e-02 6.150e-02 0.334 0.738288
chas 2.687e+00 8.616e-01 3.118 0.001925 * *
nox -1.777e+01 3.820e+00 -4.651 4.25e-06 ***
rm 3.810e+00 9.116 < 2e -16 ***
age 6.922e-04 1. 321e-02 0.052 0.958229
dis -1.476e+00 1.995e-01 -7.398 6.01e-13 ***
rad 3.060e-01 6.635e-02 4.613 5.07e-06 ***
tax 3.761e-03 -3.280 0.001112 **
ptratio -9.527e-01 1.308e-01 -7.283 1 . 31e-12 ***
b1ack 9.312e-03 2.686e-03 3.467 0.000573 ***
lstat -10 . 347 < 2e -16 ***
Signif. codes : 0 C ***' 0.1 C , 1
Residua1 standard error: 4.745 on 492 degrees of freedom
Mu1tip1e R-Squared: 0.7406 , Adjusted R-squared : 0.7338
F-statistic: 108.1 on 13 and 492 DF , p-va1ue: < 2.2e-16
summary (1m. f i t ) $ r. summary(lm fit) vif
> 1ibrary(car)
> vif (lm.fit)
cr l. m indus chas rm age
1. 79 2.30 3.99 1. 07 4 . 39 1. 93 3 . 10
dis rad tax ptratio b1ack lstat
3.96 7.48 9.01 1. 80 1. 35 2.94
>
> summary(lm.fit1)
80
> lm.fit1=update(lm.fit ,
3.6.4
x
+ age + lstat:
>
Call:
lm(formula = medv lstat * age , data = Boston)
Residuals:
Min 1Q Max
-15.81 -4.04 -1. 33 2.08 27.55
Coefficients:
Estimate Std. Error t value Pr(>ltl)
(Intercept) 36.088536 1. 469835 24.55 < 2e-16 ***
lstat -1. 392117 0.167456 -8.31 8.8e-16 ***
age -0.000721 0.019879 -0.04 0.971
lstat:age 0.004156 0.001852 2.24 0.025 *
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' , 1
Residual standard error: 6.15 degrees of freedom
Multiple R-squared: 0.556 , Adjusted R-squared: 0.553
F-statistic: 209 on 3 and 502 DF , p-value: <2e-16
3.6.5
1m (X^2)
> )
> summary(lm.fit2)
Call:
lm(formula = medv lstat + I(lstat-2))
Residuals
Min 1Q Max
-15.28 -3.83 -0.53 2.31 25.41
Coefficients:
Estimate Std. Error t value Pr(>ltl)
(Intercept) 42.86201 0.87208 49.1 <2e-16 ***
lstat -2.33282 0.12380 -18.8 <2e-16 ***
I (l stat-2) 0.04355 0.00375 1 1. 6 <2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' , 1
Residual standard error: 5.52 on 503 degrees of freedom
Multiple R-squared: 0.641 , Adjusted R-squared: 0.639
F-statistic: 449 and 503 DF , p-value: <2e-16
3.6 • 81
>
> anova(lrn.fit , lrn.fit2)
Analysis of Variance Table
Model 1: rnedv lstat
Model 2: rnedv lstat + I(lstat-2)
Res.Df RSS Df Sum of Sq F Pr(>F)
504 19472
2 503 15347 1 4125 135 <2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' , 1
) ct =-l 0· rm f1 ro a--PP
(
> , 5))
> surnmary(lrn . fit5)
Call:
lm(formula = medv poly(lstat , 5))
Residuals:
Min 1Q Median
-13.543 -3.104 -0.705
3Q Max
2.084 27.115
Coefficients:
****** ****** ******
)666765 l1444444nunvn4 lleeeeeo >
222140
(-r
<<< 31o e042989 +U r255555 0311111 r· E055555 d +u qu t362555 a542042 m· .... t256221 ru-- 12345 ))))) +b+b+U+U+U+U PAaaaaa ettttt CSSSSE r---14141414 tvdvdvdvdvd n14141411 Ti00000 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' , 1 Residual standard error: 5.21 degrees of freedom Multiple R-squared: 0.682 , Adjusted R-squared: 0.679 F-statistic: 214 and 500 DF , p-value: <2e-16 82 > )
3.6.6
> fix(Carseats)
> names(Carseats)
[1] “Sales” “CompPrice”” Income” “Advertising”
[5] “Population” “Price” “ShelveLoc” “Age”
[9] “Education” “Urban” “US”
veloc
>
> summary(lm.fit)
Call:
lm(formula = Sales . + + Price:Age , data =
Carseats)
Residuals:
Min 1Q Median 3Q Max
-2.921 -0.750 0.018 0.675 3.341
Coefficients:
Estimate Std. Error t value Pr(>ltl)
(Intercept) 6.575565 1.008747 6.52 2.2e-10 ***
CompPrice 0.092937 0.004118 22.57 < 2e-16 ***
Income 0.010894 0.002604 4.18 3.6e-05 ***
0.070246 0.022609 3.11 0.00203 **
Population 0.000159 0.000368 0.43 0.66533
Price 0.007440 < ***
ShelveLocGood 4.848676 0.152838 31.72 < 2e-16 ***
ShelveLocMedium 1.953262 0.125768 15.53 < 2e-16 ***
Age -0.057947 0.015951 -3.63 0.00032 ***
Education -0.020852 0.019613 -1. 06 0.28836
UrbanYes 0.140160 0.112402 1. 25 0.21317
USYes -0.157557 0.148923 -1. 06 0.29073
Income:Advertising 0.000751 0.000278 2.70 0.00729 **
Price:Age 0.000107 0.000133 0.80 0.42381
Signif codes: o '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' , 1
Residual standard error: 1.01 degrees of freedom
Multiple R-squared: 0.876 , Adjusted R-squared: 0.872
F-statistic: 210 and 386 DF , p-value: <2e-16
3.6 • 83
> attach(Carseats)
> contrasts (ShelveLoc)
Good Medium
nunu nU4Anu
m u
dod aoe
R
veLoc-
3.6.7
> LoadLibraries
Error: obj ect ‘LoadLi braries found
> LoadLibraries()
Error: find
> LoadLibraries=function(){
+ library (I SLR)
+ library(MASS)
+ print(“The libraries have been loaded.”)
> LoadLibraries
library (I SLR)
library(MASS)
print(“The libraries have been loaded.”)
}
> LoadLibraries()
[1J “The libraries have been loaded.”
84
3. 7
X) = GPA , X 2 = IQ , X3 = Gender
X4 = Xs
=20 ,
=
(
i
( /
=
( +
(
(
Yi =
(3.38)
Yi =
0
85
( horsepower
(
i
mpg
(
(
( mpg
(
i
iii. year
(
(
(
( O?
(
(f)
(
(
86
> set.seed(l)
> x=rnorm(100)
> y=2*x+rnorm(100)
(
(y x +
(c)
=
z
ny–M
1.
2
(
(
=
( =
seed
(
0 , 0.25)
(
y = – 1 + O. 5X + B (3.39)
87
(
(
(
( a)
( a)
(
> set.seed(1)
> x1=runif(100)
> x2=0.5*x1+rnorm(100)/10
> y=2+2*x1+0 . 3*x2+rnorm(100)
(b)
(
= 0
=
(
(
(f) (c)
(
> x1=c(x1 , 0.1)
> x2=c(x2 , 0.8)
> y=c(y , 6)
88
(
=Ü?
(
+ B
,
L
,
4.1
(3
y ,), Y2)’ …,
?
overdose seizure
,\, stroke
>'” i 2 , drug overdose
t3″ epileptic, seizure
“‘,
fE drug
seizure
+” default
0666-VOOOON
J
4.2
(1. epileptic seizure
r “, l2. stroke
drug overdose
r “, [0 , stroke
” lj , drug
overdose
1]
4.3
h
500 1000 1500 2000 2500 0 500 1000 1500 2000 2500
Balancc Balancc
[1]4.2 èefault
4.3
92
Pr(default = Yes! balance)
Pr(default=Yes I JlJ
p(balance)
p(balance)
4.3.1
. (4.1)
=
<0 , >1
10
gistic function) ,
p(X) (4.2)
e–
maximum
= e!’w (4.3)
1 – p(X)
p(X)/[ 1 – p(X)
0.2
1 -0.2
O. 9
(
logl 1=
4.3
(4.4)
)
3)
4.3.2
likelihood function)
p(x,) n (l-p(x,.)) (4.5)
i’:yó’ =0
=
O. 005
Intercept
balance
l
10.6513 0.3612 -29.5
0, 005 5 0.000 2 24 , 9
<0.0001 <0.000 1 93 94 ) 4.3.3 1 + e!.+ß,X O. 586 , 58. 32 -e…+iq~e s) YN tt na ee u ss ss ee uu aa ff ee dd b student [Ye$] Q. 404 9 -49.55 3.52 <0.000 1 0.0004 4.3.4 log( (4.6) \1-p(X)1 p(X) (4.7) 1 + 4.3 • 95 income 00 income , [Yes] Intercept balance student (YesJ t 10.8690 -22.08 O. 005 7 0.000 2 24.74 Q. 003 0 O. 008 2 O. 37 -0.6468 0.2362 -2.74 <0.000 1 <0.000 1 Q. 711 5 0.0062 EE 5 II I H 500 1000 1500 2000 No y" Credit Card Balance Status 96 p(X) =0.058 1 + _ 8 ><0 p(X) = .- = o. 1 + e (4.8) (4.9) 4.3.5 overdose seizUre Y = stroke I - Pr( Y = stroke IX) - Pr( Y = drug overdose 4.4 I X (1 (3 4.4.1 4.4 density h( Y = k I X = x) (4.10) I 4.4.2 = \ ç-f } P.(x) -_ (4.12) expl- \ '" t"1 I } 141 mz m N -4 -2 0 2 4 0 2 3 4 '1T\,"" (4. 15) iiikj 1TA =nk/n (4.16) .î. 0' = function) ( 4‘ 17) 4.4 4.4.3 ...• P Cov(X) = f(x) - ; (4.18) (2,,)"" 1 ;E 100 8k (x) (4.19) -1 J..t1 (4.20) H N " H 4 -2 2 4 4 -2 2 4 X, X, =20 (1 4.4 p=4 , = 75. 9644 252 9896 23 81 104 9667 333 10000 x Pr(default I X = .) > 0.5 (4.21)
Pr( I X =.) > 0.2 (4.22)
• 102
9432 138
235 195 430
9667 333
M
AV
dF M H
AF
F
aF
AF
AV
AF
AV
aF
AF
J ,
. ,
0.4 0.2 0.3 0.1 0.0
operating
under the ROC
crnve , Lil–
4.4
ROC Curve
q
B I I f
2
0.0 0.2 0.4 0.6 0.8 1.0
FaIse posìtive
N
P
w P’
FP/N
TP/P
TP/p.
1
104
4.4.4
analysis.
+
+ 1
N N
? N
T
-4 -2 2 4
X, X,
k! =
4.5 105
4.5
p::
x =x
llJ (4.13)
g(Pi(Z)l E iPAZ) J = Co + C]X (4.24) 1 – p, (x)
1og( (4.25)
– Pl’
P >
cross-
SCENAR103 SCENARI02 SCENARIOl
6
106
m
N
SCENARI06
EUK
004-10
SCENARI04
R
O
004-11
4.6 <$0 107
x; , X(
4.6
4.6. j
> (I5LR)
> names
[11 “Year” “Lagl” “Lag2”
[6] “Lag5” “Volume” “Today”
> dim(Smarket)
(1] 1250 9
> summary(Smarket)
“Lag3” “Lag4”
Year
Min , :2001 Min. :-4.92200
1st Qu.:-0.63950 1s t Qu.:-0.S3950
Median : 2003
Mean :2003
3rd Qu.: 2004
Max. :2005
Median 0.03900 Median 0.03900
0.00383 Mean 0.00392
3rd Qu.: 0.59675 3rd Qu.: 0.59675
Max. Max. 6.73300
Lag3 Lag4 Lag5
Min , : -4.92200 Min. : -4 , 92200 Min. : -4.92200
Qu.:-O.64000 1st Qu.:-O.64000
0.03850 Median 0.03850
0.00172 Mean 0.00164 Mean 0.00561
3rd
Max. 5.73300
3rd Qu.: 0.59675 3rd Qu.: 0.59700
Max. 5 , 73300 Max. 6.73300
Volume
M1n. :0.356
:1. 257
: 1 .423
Mean :1.478
3rd Qu.:l.642
> (Smarket)
Today
: -0 , 63950
Q.03850
Mllan 0.00314
Qu.: 0.69675
Max. 5.13300
DOllo:602
U1> :648
cor ()
> cor (Smarket)
Error 1n cor{Sm j!.:rket) ‘x’ must be numeric
> cor (Smarltet [“,-9] )
Year Lag1 Lag2 Lag3 Lag4 Lag5
‘0.02970 0.03060 0.03319 0.03669 0.02979
Lag1 0 , 0297 ,.1.00000 -0.02629 -0.01080 -0.00299 -0.00667
Lag2 0.0306 ‘:”0.02629 1. 00000 -0.02590 -0.01085 -0.00366
Lag3 -0.02590 1. 00000 -0 , 02406 -0 , 01881
Lag4 0.0357′;”0.00299 -0.01085 -0.02708
Lag6 0.0298′:’:0 , 00567 -0.00356 -0.01881 -0 , 02708 1. 00000
Volume 0.5390 -0.04182 -0.04841 -0.02200
Today 0.0301 -0.02616 -0 , 01025 -0.03486
Volume Today
Year 0.5 3;90 0.03010
Lag1 0.0409 -0.02616
Lag2 -0.0434 -0.01025
-0 , 00245
Lag4 -0 , 0484 -0.00690
Lag6 -0 , 0220 -0.03486
Volume 1.0000 0.01459
Today 0 , 0146 1.00000
> attach(Smarket)
> plot (Volume)
4.6.2
l’J
linear model)
4.6 • 109
glm
>
• )
> summary(glm.fit)
Call:
glm(formula = ‘”” Lag1 + Lag2 Lag3 + Lag4 + Lag5
+ Volume. ‘7 Smarket)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.20 1.07 1.15 1.33
Estimate Std. Error z value Pr(>!z!)
-0.12600 0.24074 -0.52 0.60
Lag1 -0.07307 0.05017 -1. 46 0.15
Lag2 -0.04230 0.05009 -0.84 0.40
Lag3 0.01109 0.04994 0.22 0.82
Lag4 0.00936 0.04997 0.19 0.85
Lag5 0.01031 0.04951 0.21 0.83
0.13544 0.15836 0.86 0.39
(Dispersion p’aram.eter for family taken to be 1)
Null deviance: 1731.2 on 1249 degrees of
RGsidual deviance: 1727.6 6n 1243 degrees of freedom
AIC: 1742
Number of Fisher Scoring 3
> coef
(Intercllpt) Lag1 Lag2 Lag3 Lag4
-0.12600 -0.07307 -0.04230 0.01109 0 , 00936
Lag5 Volume
0.01031 0.13644
> summary (glm $coef
Error z value Pr{>!z!)
(Intercept) -0.12600 0.2407 -0.623 0.601
Lag1 -0.07307 0.0502 -1.457 0.145
Lag2 -0.04230 0.0501 -0.845 0.398
Lag3 0.01109 0.0499 0.222 0.824
Lag4 0.187 0 , 851
Lag6 0.01031 0.0495 0.208 0.835
Volume 0.13544 0.1584 0.855 0.392
> summary(glm.fit)$coef[ ,4]
(Intercept) Lag1 Lag2 Lag3 Lag4
0.145 0.398 0.824 0.851
Lag5
0.835 0.392
predict
> (glm” “)
:> glm.probs [1:10J
1 :2 3 4 5 6 7 8 9 10
0.507 0.481 0.481 0.515 0.511 0.507 0.493 0.609 0.518 0.489
>
Up
Down 0
Up 1
> glm. (” Down” , 1250)
> glm. pred (glm. probs
table
> table(glm.pred , Direction)
Diraction
glm . pred Down Up
00 101’0 145 141
Up 457 607
> (507+145) /1250
[1] 0.5216
>
[1] 0.5216
145
:>
:>
4.6 R
> (Smarket .2005)
(1] 252 9
> Direction. 200S”‘Direction (! trainl
vector)
! train
>
> (glm . fi t , Smarket respo_nse”)
> .252)
> glm. pred [glm . probs
>
Direction .2005
glm.pred Down Up
Down 77 97
Up 34 44
>
(l J 0.48
> .2006)
(1) 0.52
l a i m 1″ b@ zg ya zp ms ae fx ‘” kp xy at ma ss ao $2 a ‘@ 2k L?m +S -b gt aAA LAZ N az is ec r>
i
nie < ar axp g=s aLbnr u usu gg >>
112
> glm. .252)
> glm.pred[glm.probs>.5)=”Up”
> table (gllll.pred ,
Direction.2005
glm.pred Down Up
Do W”n 35 35
Up 76 106
>
(1] 0.56
> 106/(106+76)
(1] 0.582
> ,
(1.1 , -0.8)) , type””’ response ,,)
2
0.4791 0.4961
4.6.3
>>aa il aa rr = tm e st be us sb G KB ae ak sz am ts a du 2a gt aa Ld t’ 1′ g2 ag m+ t-cg rL i
S( saz <
zt
rzie affr r-baaeo iddl<1111a
ad
>>>
cl
probabili_ties of groups:
Down Up
0.492 0.508
G:roup means
Lag1 Lag2
Down 0.0428 O. -0339
-0.0313
Coefficients of linea:r
LDl
Lagl -0.642
Lag2 -0.514
> plot (lda.
X 0.642 x
4.6 • 113
Lagl – 0.514 x
O. 642 x Lagl
0.514 x
predict
> lda. Smarket .2005)
> names(lda.pred)
(1) “class” “posterior n “x”
> lda.
>
Direction .2005
lda. pred Doy ;u Up
Down 35 35
Up 76 106
> .2005)
$
> sum{lda.pred$posterior[.1]>=.6)
<1J 70
> sum(lda.pred$posterior[.lJ<.5)
[lJ 182
> lda.pred$posterior[1:20.11
> lda.class(1:20)
(lda.
(1] 0
> +Lag2 • data=Smarket • subset “‘train)
> qda.fit
Call:
“” Lagl + Lag2. data ‘” Smarket.
of
DOlln Up
0.492 0.508
114
Lag1 Lag2
Down 0.0428 0.0339
Up -0.0395 -0.0313
> qda. $claaa
> table (qda. clas3 , .2006)
.2006
qda.class Down Up
Down 30 20
Up 81 121
> møa:n (qda.
(1] 0.699
4.6.5
(1
Xo
(4)
column
> library(class)
> train.X=cbind{Lagl , Lag2) [train.l
> .J
>
,
> knn. pred=knn
>
Direction .2006
knn.pred Down Up
Down 43 58
Up 58 83
> /252
[1) 0.5
4.6 115
> knn.
Direction .2005
knn.pred Down Up
Down 48 54
Up 63 87
> .2005)
(lJ 0.536
4.6.6
> dim (Caravan)
(1) 5822 86
> attach(Caravan)
> $ummary(purchase)
No Yes
5474 348
> 348/5822
[1) 0.0598
> X=scale (Caravan [ • -86])
> var (Caravan ( , 1J)
[1) 165
> var(Caravan (, 2])
[1] 0.165
>
[1] 1
> X [ .2])
[1] 1
116
1000
> J
,J
> yCPllrchase
> set.seed(l)
> knn.pred-knn(train.X , test.X.train.Y.k=1)
> mean
[1] 0.118
>
(1) 0.059
>Y g e t -, daog ees ryy nr
to38
a8N76 ae8
>
kt97 1rny6 bp(o a/
a1
>
> (knn. pred ,
teat , Y
knn.pred No Yes
54
Yes 21 5
> 5/26
(1J 0.192
> (train. X,
> table(knn.pred , test.Y)
test.Y
knn.pred No Yes
No 930 55
117
> 4/15
tlJ 0.267
> (Purchaserv ,
message:
numerically 0 or 1 occurred
> glm. (glm- , fi t • Caravan ,J • type””’ response ,,)
> glm. , 1000)
> glm. pred
> table{glm.pred , test ,y)
test.Y
glm.pred No Yes
No 934 59
Yes 7 0
> , 1000)
> glm. pred [glm. probs >.25] “,” YGS”
> table(glm.pred , test.Y)
test .,y
glm.pred No YeB
48
YGS 22 11
> 11/(22+11)
[1] 0.333
4.7
(4.11)
of dimensionality)
( = 1
118
1] x [0 ,
=0.6 , X, =0.
= 1]
=
(
(
= -6 , ß, =0.05 ,
(
.. 119
(
(
(
(
(
(
(
frame
(
(
(
120
(
()
> Power2=function(x.a){
> Pouer2(3 ,8)
817 •
(result)
“,,”
(
> PlotPower(1:10 ,3)
10 , 23• …,
cross- validation)
el assessment) ,
5. 1
error error rate)
122 ..
1. 1
5.1.1
validation set
hold out
123 n
1
123 ‘.1
iiE . . . . . . . . \ \ i
2 4
> set.seed(l)
> traiu”‘samp1e (392 , 196)
>
> attach (Auto)
> (-trainJ-2)
(1) 26 , 14
> , data”‘Auto
> mean ((mpg-predict (1m.
(1] 19 , 82
> ,3) subset
> mean
(1] 19.78
> (2)
> (392 ,196)
> 1m.
> meau ((mpg-prediet (1m. fit_ , Auto) )
(lJ 23.30
> 1m.
> ,Auto)) -2)
[lJ 18.90
> , 3)
> 1-2)
[1] 19.26
5.3.2
= “binomial
5.3
1m
> (mpg……,horsepower
> coef
horsepower
39.936 -0.158
> .data”, Auto)
>
(Intercept)
39.936 -0.158
glm glm
> library(boot)
> .data “, Auto)
>
> cv.err$delta
1 1
24.23 24.23
cv. glm
> (0 , 5)
> fo1’
+
+
+ }
> cV.error
[1] 24.23 19.25 19.33
5.3.3
CV. glm
> set. seed(17)
> cv.error.l0″‘rep(O , 10)
(1 in ltl0){
+
+ $delta [1]
+ }
> cv , error.10
(1J 24.21 19.34 18.68 19.02 18.90 19.71 16.95 19.50
134 •
5.3.4
alpha. fn
>
+ X”‘data$X (index)
+ [indax]
+ return((var(Y)-cov(X , Y))/(var(X)+var(V)-2*cov(X ,Y)))
+ }
>
[1] 0.576
> (1)
> (100 .100 , replacG=T))
(1) 0.696
boot
> _, alpha. fn.
ORDINARY NONPARAMETRIC BOOTSTRAP
Call
5.3
‘” 1000)
bias error
-05 0.0886
> ,
> boot.fn(Àuto , 1:392)
(Intercept) horsepower
39.936
boot. fn
> set.
> (Auto , sample (392 , 392 ,
(
38.739
> )
40.038 -0.160
> boot. fn , 100Q}
ORDINARY BOQTSTRAP
Call :
‘” R “” 1000)
Statistics
bias std. arror
0.0297 0.8600
-0.158 -0.0003 0.0074
()
Error t value Pr(>!tj)
39.936 0.71750 55.7
horsepowar -0.158 7.03e-81
136 ..
>
+
subset””index) )
> set , S
ORDINARY , NONPARAMETRIC BOOTSTRAP
Call
boot statistic c ” , 1000)
std. error
6. 098
Esti m.ate Std. Pr(>1 t!)
(Intercept) 56.9001 1.80043 32 1.7
in 1: 10000) {
“””’4) >0
>
> mean(store)
(
(b)
i
i
E
138 ..
(
fn ()
( fn
(
cv. glm
glm
Up” I Lagl , Lag2)
(
i
, (1)
> y”‘ :t’norm (100)
“> x”‘ :t’norm (100)
>
(
i.
139
+ e
tii. Y +13
+e
frame
(
(
(
(
(Boston $
(
(
Y = ß, (6.1)
1
interpretability) 0
(,infinite)
feature
variable
1)
141
)
6. 1
6.1.1
selectìon)
10
, p
1e.?9
“‘,
+ Jiil
+
p
142
. , , , , ,
, , t , ,
•
J5í.
Ml. …
6, l.
lìIT
=
6 , 1.2
.. 143
• p-l
”
“‘,
1 , 2 , … – k) = 1 +
1. 3
+
144 •
113
l
1 1 rating
2 I income
3 \ rating , income , student
4I cards , income student , limit
“‘p , p-l , “‘, 1
.
6.1.3
1.
0
6, 1
MSE
(Akaike information m
+
E
S
E
N
B
E
E
4 6 8 10 2 4 6 8 10 2 4 6 8 10
Number ofPredictors Number ofPredictors
11’1 6-2
e Cp
146 $
AIC
BIC• (RSS + (6.3)
–
RSS/(n-d-1) = 1 (6.4)
n -d-1
BSS
-1
147 6.2
:;11 •
00
148 •
6.2
6.2.1
mzzb
Xjl =Xa
[49 6.2
– Income
.”. Limit
Rating
Student
0,2 0.4 0.6
IIPflUIIPlb
l.0 0.8 le+02 le+OO
•
IIß:
X1J
‘6
(6.6)
150
~\/
\
-”
ih
le-Ol 0.0 0.2 0.4 0.6 0.8 1.0
IIßflh/llßlh
f
6.2.2 lasso
151 6.2
=
(sparse model)
(6.7)
, ,
4 , , , ,
,
w
Limit
0 0
0.6 1.0 0.8 0.2 2000 5000 50 100200 500 20
(6.8)
(6.9) ,
152
+
} , 10
lasso
153 • 6.2
I +
s
ii
iz
0.02 0 .1 0
.J ~
1.0 0 ,2 0.4 0.6 0.8
00 Training Data
10.00 50.00 0 .50 2.00
A
——–
0.02 0.10 0.50 2.00 10.00 50.00
A
S
M
J
0.4 0.5 0.6 0.7 0.8 0.9 1.0
R2 on Training Data
f!P n
PTLMR
+
PTLHR
(6.12)
ß; = r/ (1
(6. 13)
(6. 14)
6.2
Ir,
m
M
dv
‘
-1.5 -0 .5 0.0 0.5 1.0 1.5
Y,
!…
-0.5 0.0 0.5 1.0 1.5
Y,
006-10
=
X) =f(YI
Y =ßo + … + Xpßp + e
156
d I I d
â
I d
/k ez
\
;;H / \ I ;J
d1 / “- I d
2
-3 -2 -1 0 1 2 3 -3 -2 -1 0 1 2 3
6.2.3.
jiij , gJ
5e-03 5e-02 5• 01 5e-02 5e-Ol
(signal variable) 0
s
15
iE
0.0
)57
0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6
IIp,’ II ,l lI pll , IIp,’I!, II! IÎI!,
1.0 0.8
X,
Z2. •..
P ZmzZM
m =
M
Yi = ()o + L (J”,Zirn + = l ,”‘,n
+
{>f M P P M Þ
(6. 16)
(6.17)
6.3
158 •
M
(6. 18)
Zz. •.•
6.3.1
components
PCA
pop
R
ZE
BE
20 30 60
Population
Z , = 0.839 x (pop -pop) +0.544 x (ad – ad) (6.19)
+
=
X (pop -pop) (ad –
839 x (pop, – pop) + O. 544 x
zu’ …,
(6.20)
159 6.3
•
• • • 0 10
Population 1st
20 20
= =
=0‘ 839 x (pop , – + O. 544 x ( ad, – ad) <
.' .... .; .
:
.... ••.
..
3
"
.·kt
.
. .. ". : .., . "
…3 > 2 -1 0 1 2 -2 -1 0 1 2
1st Princìpal Component 1st Principal Component
aE
2 2
s
160 •
Z, = 0.544 x (pop – pop) – O. – ad)
. .. . .
..
1·
··..-··
.•
‘P·t·-.
…. –
…
—
2··· ..a
,,e
•
.
. .
..
. .
.
.
l
7
Il’ I
‘”
s
.
-0.5 0.0 0.5 1.0
2nd Principal Componcnt
1.0 0.0 0.5 1.0
2nd Principal Component
1.0
prirúiì Zu.
12
• 161
jjh\\
o 10 20 30 40 0 10 20 30 40
Number ofComponents Number of Components
PCR
;:
Ridge Regression and Lasso
/
/
o 10 20 30 40 0.0 0.2 0.4 0.6 0.8 1.0
Number of Compouents Shrinkage Factor
e of Statîstìca}
@ 162
\–,-, Rating Student
2 4 6 8
Number of Components
:të:
2 4 6 8 10
Number of
“‘,
6.3.2
163 6.4
mEE#Z
60 50 40
Population
30
thogonal-
…
6 , 4
6.4.1
164 •
6.4.2
= 1
flexible)
m
$A e
..’ @
@
@
./ .
-1.5 0.0 0.5 1.0
X
• m
?
-0.5 0,0 0.5 1.0
X
1
s
5 10 15 5 10 15 5 10 15
Nurnber ofVariables NumberofVariablcs
6. L
6.4.3
=20 ,
of dimensionali-
@ 166
p=2000
N
p “, SO
m
N
p”’20
n
N
1 70 111
p
=2 las.
1 28 51
Degrees ofFreedom
1 16 21
6, 4, 4
6.5 167
=
6.5
6.5.1
na
> library (ISLR)
> fix (Hittars)
> names(Hitte ;ra)
[1J “AtBat” “HmRun” “Runs” “RBI”
[$] “Walka” “Yaars” “CAtBat”
“CRuns” “Laague” “Division”
[16) “PutOuts” “Asaists” “Errors” “Salary”
> dim (Hi tters)
(1] 322 20
> sum(is
[1] 59
>
> dim
[1] 263 20
>
(1] 0
regsubsets
> library (1eaps)
>
>
Subset abject
Call: regsubsets .formula(Salary ‘” .,
19 Variables
1 SUbSêts ot each ta 8
Selectian Algarithm: exhaustive
AtBat Hits HmRun Runs RBI Walks Years
• 168
s t
*G
J 8 1 N e u
“”””””””g
se -g CUUUHHH”””L ze
r
US Ri-
>}>>>>>>)>>>>>>>>>>>>>>>111111111111111111111111 <{<<<<<<<<<<<(<<<<<<<<<<12848678 12845678 12345678 subsets .nvmax=19) > reg. full)
summary
> nam ,es (reg . summary)
(1) “which” “rsq”
(7)
“bic” “cp” “adjr2 ” “rss”
6.5
56 34 65 00 96 24 85 00 46 14 58 00 95 04 58 00 15 94 45 00 84 74 45 00 14 54 45 0o
q 683 r24 $45 y roo a m106 m244 usss s eooo g e]]] r109
-11
>
EE
>.
> (reg. summary$rss , xlab=”Number of Variahles”. •
> summary$adjr2 of Variables” ,
yhb””’
points fg points
max
>
[1] 11
> (11 , reg. summary$adjr? [1 1], col””’red” , pch”’20)
> summary$cp , xlab=” Number of Variables” , ylab=”Cp” ,
type””l’)
> lõhich. min Creg. summary$cp)
(1) 10
> points (1 0 , reg. summary$cp [10] ,
> )
[1] 6
> plot (reg. summary$bic , xlab””Number of Variablas” , ylab=” BIC” ,
)
> pCh””20)
regsubsets
>” 2)
>
r
>”
2dpz racb “””” ZUEm ll-I aaaa cccc sggg ”” 1111 1111 UUUHU ffff tttt 11ii ffff gggg eeee rrrr <<<
t9 a6
>
88
6t ,A-
zm l u f t i f
>
2
gtz ep5 ze
“)
> 8ummary(regfit.fwd)
> , data=Hi tters • nvmax=19 •
“)
> summary( :regfit.bwd)
> coaf
Hits
79.451 1.283
CHmRun
1.442 -129.987
> coef
109.787 -1.959
CWalks
-0.305 -127.122
> coef(regfit.bwd ,7)
(Intercept)
106.649 -1.976
CWalks DivisionW
0.716 -116.169
Walks
3.227
PutOuts
0.237
Hits
7.450
PutOuts
0.253
Hits
6.767
0.303
6.5.3
CAtBat
-0.375
Walks
4.913
Walks
6.056
CHits
1.496
CRBI
0.854
CRuns
1.129
> set.aeed(l)
> train=sample(c(TRUE , FALSE).
> test”‘(!train)
6.5 1> 171
> (Salary””,. (traln ,J.
nvmax=19)
test. (Sala:ry”-‘. • [test .J)
mode l. matrix
> val. (NA , 19)
in 1:19){
+ id=i)
+ names
+
>
> val. errors
(lJ 220968 169157 178518 163426 168418 171271 162377 157909
[9] 164056 148162 151156 151742 152214 157359 158541 158743
[17] 159973 159860 150106
> which.min(val.errors)
(1) 10
> coe :f , 10)
AtBat
CHits
1. 105
PutOuts
0.238
CHmRun
1.384
7.163
CWalks
-0.748
Walks
3.643
CAtBat
LeagueN
> predict. regsubsets •
+ ([2J J)
+ matm lll; odel.matrix(form , newdata)
+ coefi=coef(object
+ mat [, xvars)%* Y. coefi
+ }
172 •
> CSalary””. , data=Hitters •
> coef(regfit ,best ,10)
(Intercept) AtBat
162.535 -2.169
CRuus CRBI
1.408 0.774
0.283
Hits
6.918
CWalks
0.831
Walks
6.773
-112.380
CAtBat
-0.130
PutOuts
0.297
> k=10
> set , se8d(1)
> pasteC1:19)))
in l:k){
+ best. fit”‘regsubsets (Salary…….. [folds l”‘j , J ,
nvmax=19)
+ 1:19){
+
+ (j , i] “‘mean ( (Hi tters$Salary [folds ='” j] -pred) -:2)
+ )
+ )
x
> , 2 , mean)
> mean.cv , errors
(1) 160093 151159 146841 138303 144346 130208
(9J 129460 125335 125154 133461 133975 131826 131883
(17J 132751 133096 132805
> par
> CV.
6.6 .. 173
> (Salary”-‘. •
> coef Creg. best , 11)
(Intercept) Walks CA tBat
135.751 -2.128 6.924 5.620 -0.139
CRuns CRsr CWalks DivisionW
1.455 -0.823 43..112
Assists
0.289
6.6
> Hitters) [, -1)
>
mode l. matrix
6.6.1
> library (glmnet)
> length””lOQ)
>
x
> dim(coef(ridge.mod))
(1) 20 100
498
174 -to
> ridge. lIlod$lambda [50]
t1 J 11498
> coef(ridge .mod) [, 50]
407.356 0 , 037 0.138
RBI Walka Yeara
0.240 0 , 290 1. 108
CR.uns CRBr
0.088 0.023 0.024
Assists
-6.215 0 , 016 0.003
> sqrt (sum )
[1] 6.36
HmRun
0.526
0.003
CWalks
0.025
Runs
0 , 231
CHits
0.012
0.085
NEl wLeagueN
0.301
> ridge. mod$lambda [60]
(1) 705
> coef(ridge.mod) [.60]
(Intercept)
64.325 0.112 0.656
RBI Walks Years
0.847 1. 320 2.596
CRuns CRBr
0.338 0.094 0.098
Assists
-54.659 0.119 0.016
> sqrt(sum(coef(ridge.mod) [-1 , 60]A2))
(1) 57.1
HmRun
1.180
CAtBat
0.011
CWalks
0.072
Errors
-0.704
0.938
0.047
LeagueN
13.684
NEl wLeagueN
8.612
>
AtBat HmRun
48 ‘ 766 -0.368 1. 969 -1. 278
RBI Years
0.804 0.005
CHmRun CRuns CRBI CWalks
0.624 0.221 0.219 -0.150
OivisionW PutOuts Assists Errors
-118.201 0.250 0.122 -3.279
Runs
1.146
CHits
0.106
LeagueN
45.926
Ne 1ol’LeagueN
-9.497
> (1)
nrow(x)/2)
>
>
175
“” n
> , J , y [train] , alpha=’O , lambda”‘ßrid ,
-12)
> ridge. prad”‘pradict )
(1] 101037
(1] 193253
> ridge.pred=predict(ridge.mod , s=le10 , newx=x[test .])
>
<1] 193253
> . mod , $=0 , newx=x [test , J ;
> maan((ridge , pred-y.teat)-2)
[1]
>
> predict . mOd , 5″‘0 , exact “‘T , type””” coaffic ients”) [1: 20 ,]
glmnet
cv. glmnet
> sat.saad(l)
> , y alpha=O)
> plot (cv.
> min
> ..
212
predict
glmnet
176
> .mod .l)
> -2)
(lJ 96016
Runs
1.1132
CHits
0.0649
LeagueN
27.1823
Ne l.TLèagueN
7.2121
> out=glmnet
> pred i.ct (out , type””’ coøff ic ients” • [1:20 ,]
(Intercept) AtBat Hits HmItun
9.8849 0.0314 1. 0058 0.1393
RBI Walks Years CAtBat
0 , 8732 1. 8041 0.1307 0.0111
CHmRun CRuns CRSI CWalks
0.4516 0 , 1290 0.1374 0.0291
DivisionW Errors
91. 6341 0.1915 0.0425 -1. 8124
lasso
“”
6.6.2
> , alpha”‘l ,
> plot(lasso.mod)
> set.seed( 1}
> cv. ,J ,
> plot(cv.out)
> bestlam=cv. out$lambda. min
> , neyx”‘x [test ,))
>
[1) 100743
6.7 @
>
s”‘bestlam) (1 :20.J
> lasso. cO ,ef
(Intercept) AtBat Hits HmRun R.uns
18.539 0.000 1.874 0.000 0.000
RBI Walks Yêars CAtBat CHits
0.000 0.000 0.000 0.000
CHmRun
0.000
-103.485
CRuns
0.207
0.220
caBr CWalks LeagueN
0.413 0.000 3.267
Assists Errors
0.000 0.000
Walks caBr
2.218 0.207 0.413
PutOuts
0.220
> lasso.coef[lasso.coef!=O]
Hits
18.539 1.874
LeagueN DivisionW
6. 7
1
> library{pls)
,
> pcr. (Salary””” .scale=TRUE ,
pcr
scale
“”
pcr
> summary(pcr.fit}
Data: X 19
Y dimension: 263 1
Fit svdpc
Number of considered: 19
VALIDATION: RMSEP
Cross -validated using 10 random segments
(Intsrcept) 1 comps 2 comps 3 comps 4 comps
CV 452 348.9 352.2 353.5 352.8
adjCV 452348.7351.8352.9352.1
538 p64 c84 8 390 p29 c84 8 $32 p02 4 $47 P81 m
@C74 n ig a l ps68 xp15 6m ec64 22 a ez r$13 apse vm
080
%c34
1
0 N ZY MUZ za A-Ra TXS
178 ..
pcr mean
error)
= 124
va l. type =”
>
M=19 ,
summary of
=p =
, (1)
> pcr. fi t”‘pcr (Salary”‘. , data=Hi ttars • subset =train , scale=TRUE ,
validation “,” CV”)
> validationplot (pcr. fit. val. type=” MSEP “)
> ,x(test ,J , ncomp=7)
-2)
[1] 96556
> pcr (y””‘X. , ncomp=7)
>
Data: X dimension: 263 19
Y dimansion: 263 1
6vdpc
Number of components 7
TRAINING: % variance explained
x
y
x
y
2 comps 3 4 comps
38.31 60.16 70.84 79.03
40.63 4 1. 58 42.17 43.22
7 comps
92.26
46.69
5 comps
84.29
44.9 0.
6 c_omps
88.63
46.48
179 6.8
()
::> ßet. seed(l)
> pls. fi t”‘pl .s r (Salary^’,. scale”‘TRUE ,
;> summary(pls.fit)
19
Y dimension: 131 1
kernelpls
Number consider
TRAINING: Y. variance explained
1 cOmps :2 comps 3 comps
38.12 53.46 66.05
33.58 38.96 4 1. 57
>2 P c ” 3 s e t [ x ‘>aL
>
ss le pt
>
>
Data: X dimension: 263 19
Y dimension: 263 1
method: kernelpls
Number of COmponents considered: 2
TRAINING: % variance explained
2 comps
38.0$ 51.03
43.05 46.40
X
Salary
180 @
6.8
1 , 2 , …
(k+
(k + 1
(k +
(k +
(k +
(
(
llJ
?ilJ
(
(
(
p=2 , X21 X 12 +Xn =0 ,
(3,
(
=ß20
(
(
182 ..
(c)
(,)
(
Xl , ,..
fram
(
+ 8
(
183
( n=l
(
(
(
additive
7.1
1
7.1
7. 1
+ S;
+ Si (7. 1)
“‘, 1)
Polynomial
s R O “.,
”
h
20 30 40 50 60 70 80
Age Age
J(x,) = (3, +
E
, ,
s
“”
186 ..
> 250 I x;) (7·3)
1 + exp(ßo + …
7.2
C,(X) = I(X
h(x ,g) = (x – g)’. = 1
10
X2 , g,), g,
(7.10)
s
‘’ ::p
”
7.4.4
50%.
191 7.4
;:1(;:j
Natural Cubic Spline
\
tlili–
ijL~
2 4 6 8 10
Degrees ofFreedom ofNatural Spline
2 4 6 8 10
Degrees of Freedom of Cuhic Spline
192 ..
7.4.5
11
s
B
a
“e
20 30 40 50 60 70 80
Ago
7.5
7.5.1
–
3
‘”
7.5 @
g’
7.5.2
effective degree of
S,y (7.12)
193
(7.13)
X! ,
I l’ = >: (x,))’ =
\J’ CA
7.6
Spline
EE
w w w w
Ag,
v
bL
Regression
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
.
“‘ßo
(7.14 )
=0.
7.1
additive model ,
196 •
7. 7
Local
E
20 30 40 50 60 70 80
Ag,
7.7.1
YI = ßf}
+ 8 1
(7.15)
+ + …+ !p(x/p) + 8 ,
wage + e (7.16)
R
20 30 4{) 50 60 70 80 2009 2005 2001 2003
IE
198 • UT R -• A
education
7.7.2
= IT
g(4L) 1= ßo + … + ß,X, (7.17)
1 -p(X)1
Y = 1IX)/P( Y
+ (7.18)
I
7.7
X
1 – p(X)
(7.19)
p(X) =Pr (wage >250! year , age. education)
N-
NUO
2003
education
2005 2007 2009
-?
HS
NO
?
,
2005 2007 2009 20 30 40 50 60 70 80
year age education
2003
200
7.8
> library , (ISLR)
> (Wage)
7.8. i
> fit=lm(wage”‘poly(age ,4)
> )
Estimate Std. Error t value Pr(>ltl)
(Intercept) 111.704 0.729 153.28 <2e -16
poly(age , 4) 1 447.068 39.915 1 1. 20 <2e-16
poly{age , 39.915 -1 1. 98
poly{age , 4)3 125.522 39.915 3.14 0.0017
poly{age , 4)4 -77.911 39.915 -1. 95 0.0510
age^2 ,
() age^2 ,
>
> coef(summary(fit2))
Std. )
(Intercept) -1. 84e +02 6.009+01 -3.07 0.002180
4 , 2.12e+01 5.8ge+00 3.61 0.000312
poly(age. 4 , raw ‘” 1)2 -5.64e-01 2.06e-01 -2.740.006261
poly(age , 4 , 6.81a-03 3.070-03 2.22 0 ,026398
poly(ago , 4 , raw “” 0.051039
> • data=Waga)
>
I(age-4)
-1 ,84e+02 2 ,12e+01 -3.20a-05
>
çbind
7.8
>
> age. grid=seq (from “, agelims [lJ [2] )
>
>
>
> , wage • :x:lim=agelims • cex”‘.5. col=” darkgrGy”)
> title (“Degree -4 Polynomial” .-outer”‘T)
> <:(1 1 =" blue ,,)
> (age. 56 . bands , 1wd”‘1 , co1=” blue” , 1 tY””3)
()
> preds2″‘predict
> i t))
[1]
anova
t . 1′” , data=Wage)
>
> fit. (age .3) ,data=Wage}
>
> fit
> .4 , fit .5)
of
Modal 1: wage ,…, age
Model 2: wage ,…, polyCage , 2)
Model 3: waga ‘”
Model 4: wage ‘” poly (age , 4)
Model 5: waga ,…, polyCage , 5)
Res . Df RSS Df SUlIl of Sq
2998 5022216
F Pr(>F)
670 -os e00 2 991 588 393 4 1 660 857 770 856 21 2 111 044 370 466 37k 977 777 444 765 999 999 222 234
5 2994 4770322 1 1283 0.80 0.3697
Signif. codes: 0 0.001 0.01 0.05 0 , 1 ‘ , 1
( < 202 .. > coaf(summary(fit.5))
poly (age. 5) 1
5)2
poly(age.6)3
poly (age. 5) 4
poly(age , 5)5
Std. Pr (> 1 t 1)
11 1. 70 0.7288 153 , 2780 O.OOOe+OO
447.07 39.9161 1 1. 2002 1. 4916-28
-478.32 39.9161 -1 1. 9830 2.3688-32
125.62 39.9161 3.1446 1. 67ge-03
-77.91 39.9161 -1. 9519
35.81 39.9161 -0.8972 3.697e-01
anova
:>
[1] 143.6
> 1-1m (wage”-‘education +age •
;. fit .2=lm(wage,,-,educatiou+poly(age ,2). , data=Wage)
> 1m (wage”-‘education +poly (age .3) • data=Wag El)
> .3)
000
> >250) “‘Po1y (age • data=Wage , family “, binomial)
predict
> (age=age. giid) , se””T)
=
P,(Y=IIX)
1 + exp(Xß)
7.8
> )
> se.bands.logit ‘”
preds$se .:f it}
> se.bands ‘” exp(se.bands.logit)/(l+Gxp(se.bands.logit))
> •
se”‘T)
> I (wage >250) • , tY )i) e=”n” , ylim=c (0 .‘ 2) )
> points(jitter(age} ,
col=” darkgrey “)
> Cage .grid col=”blue”)
(age . grid • S6 . bands , 1wd=1 , col=” blue” •
> table
(17.9 , 33.5] (33.5 ,49) (49 , 64.5] (64.5 , 80.1J
750 1399 779 72
> , 4) , data”‘Wage)
> coef(summary(fit))
cut (age , 4) (33.5.49]
cut(age , 4)(49 , 64.6)
cut
Estimate Error t value
94.16 1. 48 63.79 O.OOe+OO
13.15
23.66 2.07 11.44 1. 04e-29
7.64 4.99 1. 53
<33.
7.8.2
()
> library(splines)
> age ,knots”‘c (25 ,40 , 60) ) •
>
> wage.
> linas (age . grid , pred$:f
> , 1
> , lty=” dashed “)
204 •
> dim(bs{age , knots=c(25.40.60)))
[1J 3000 6
>
[1] 3000 6
> att x: (bs(age.df=6)
25%
33.8 42.0 51. 0
> fit2=lm(wage “-‘Ils(age ,df=4) .dat ð. =Wage’)
>
> pred2$f i t , col “,’1
ns
>
Spline”)
> fi t”‘smooth. spline (age , wage • df =16)
> (age ,yaga , cv”‘TRUE)
> fit2$df
[1J 6.8
> lines(fit ,
> , lwd”‘2)
> legend (., topright ” 16
cOl”‘c (“red”.
spline
> plot Cage , Vage , •
> title (“Local
> fit”‘lollss (wage”lage data=Wage)
>
> lines(age.grid , predictCfit :f rame (age “‘agG . grid)) ,
> lines (age. grid , ,data. :f rame (age”‘aga. grid)) ,
, lwd””2)
> legend (” topright ” ,” •
cOl”‘c (“red” , “blue “) lwd”‘2 , 8)
=0. “”‘0.
7.8
7.8.3 GAM
(
>
> library (gam)
>
> (1 ,3) )
> plot{gam.m3. , col””’blua”)
> plot. gam (gaml , se=TRUE , col “,” xed”)
gam
>
> gam. m2=gam (wage”-‘year+s (age .5) +edu.cat ioo • data=Wage)
> F”)
Analysis of Table
vaga s(age , 6) + education
Model 2: wage ‘” year + s (age education
3: wage ‘” s(year , 4) + s(age , 5) + aducation
Resid. Df Resid. Dev Df F Pr(>F)
1 2990 3711730
2 2989 3693841 1 17889 14.5 0.00014 ***
3 2986 3689770 3 4071 1.1 0.34857
Signif. codes: 0 0.001 ‘**” 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ , 1
206
> summary(gam.m3)
Call: gam (formula ‘”‘lage ,…., 4) + s(age. 5} +
data ‘” Wage)
Deviance Residuals:
Min 30
14 , 17
Max
213.48
(Dispersion Parameter 1236)
5222086 on 2999 degrees of freedom
Deviancê: 3689770 on 2986 degrees of
AIC: 29888
Number of Local Scoring 2
DF for Terms and F-values for Nonparametric Effects
(Intercept)
a (year , 4)
s{age , 6)
education
Df Npar F
1
Pr(F)
1
4
3
4
1. 1 0.35
32.4
Signif. codes 0.001 0.01 0.05 ‘.’ 0.1 ‘ , 1
>
>
data=Waga)
> p1ot. gam (gam .10. se=TRUE. co1″”’ green “)
>
> 1ibrary(akima)
> plot(gam.lo.i)
207
nomial o
>
> par(mfrow=c(1.3))
>
:;. I (wage >250))
ed’ucation FALSE TRUE
1. < HS Grad 26.
Grad 966 8
3. Some Col1age 643 7
4. College Grad 663 22
6. Advanced 381 45
:;. family'"
data"'Wage , subset "'(education < HS Grad"})
:;. plot(gam.lr.s , se"'T , col="green")
7.9
ß3' =ß, +ß3X3 (x
(
ßl. b,. C1 • djO
aZ + bzx + C2 ;:(,2 + d2::lIl
(
(
( !', W
( =
a1 +
2c1
(
208 •
m=l
( m=2
( m=3
( m=3
(X)
+ e
= 1, 13, = 1 , ßl = -
4 =1(0 ,,; X<;2) b, (X) = +
l( 4 X ;;,
+ß,b,(X) +8
5 g,
-
- g(X,))' (x
(
j obclass
(
(
(
(
(
11.
( 100
0
+ B
>
> (2J
Y-ß2X2 =ßo +e
>
> [2}
(
210
()
(
8. 1
8.1.1
<4.
Years <
x I Years < 4.5 1 • Rz ::::: I X ! Years'> “‘” IX
IYears> =4.5 , Hits>=117.5}o
000 x xe5. 999 :::::402 000 X e6. 740 :::::
845 346
212 •
Yea.r<4.5
5.11
il--
6.74
<
6.00
117.5
terminaI
<
J 1
1 45 Yean 24
Years..
(1 Xz ,
8 3 2
..
: ! ; ; u··E·E·-81·
·····2···288
8.1 • 213
R,o
xeRz.
Rp "',
Rz. .., R, o
L L (y,- (8.1)
IcR,
R,(j,s) = IXlx, =
214 ..
I -")
X,
R‘ R,
R,
R,
R,
t. t, .,
X,
R,
R ‘
2 6
Rz. "',
complexity
ing)
(Yl - YR)2 rl (8.4)
i
F
8.1 • 215
.",
(b)
. (cross
Year<4.S
6.189
216 •
2 4 6 8 10
Tree Size
8.1.2
occurring
(classifica-
tion error
EZI (8.5)
(8.6)
8.1 .. 217
Thal
Thal:
00
<
8.1.3
= ß. + r,Xßj (8.8)
M
• 1 (8.9)
218 ..
5 10
Tree Size
y"
15 No
8.1.4
8.2
X, x,
<
?
-2 -1 0 2 -1 0 2
X, X,
8.2
'1
8.
bootstrap aggregation)
220 •
,
RP
,
J"(x)
vote)
ES
Test:Bagging
SO 100 IS0 200 250 300
Number ofTrees
8.2 "
8.2.2
Fb,
ExAng
S
> library (ISLR)
> )
> High=ifalae (Sales
frame
> (Carseats
> -5a1es , Carseats)
> summary(tree.carseats)
Classification trae
tree(formula = • – 5a1es , data ‘” Ca.rseats)
Variabl (-J s used in tree
(1) “Priçe” “Income” “CompPrice ”
(5)
Number of nodes; 27
Residual mean deviance: 170.7 I 373
rate: 0.09 ‘” 36 ( 400
-27
text
>
>
<
226 •
:> tree.carseats
node) , aplit , n , deviance , yval , (yprob)
* node
root 400 54 1:5 No ( 0.5’90 0.410 )
2) ShelveLoc: Bad , Medium 315 390.6 No ( 0.689 0.311 )
4) Price < 92.5 46 56.53 Yes ( 0.696 )
8) Income < 12.22 No ( 0.700 0.300 )
="
:> set.seed(2)
:> 200)
:>
:>
:> tree.
:> tree.
:>
Higb.
tree. pred No Yes
No 86 27
Yes 30 57
:> 1200
[1] 0.715
tree
tree
cv. tree
:> set.seed(3)
:> cv.
> names(cv ‘ carseats)
[1 J “k” “ll1ethod”
:> cv. carseats
[1] 19 17 14 13 9 7 3 2 1
$dev
[lJ 55 55 53 52 50 56 69 65 80
$k
[1) – Inf 0.0000000 0.6666667 1 ‘ 0000000 1.7500000
2.0000000 4.2500000
(8) 5.0000000 23.0000000
(1J
[1 J “tree. se CJ.uence”
227 • 8.3
> par
> , CV.
> plot (cv. • cv . carseats$dev
> prune.
> ‘plot (prune . carseats)
carseats ,
> tree. .
> table
High. test
tree.pred No Yes
No 94 24
Yes 22 60
> (94+60) /200
[1] 0.77
> prune. best=15)
> plot(pruna.carseats)
>
>
> ,pred
.pred No Yes
No 86 22
Yes 30 62
> (86+62) /200
[lJ 0.74
8.3.2
> library (MASS)
> set. seed
= nrow (Boston) /2}
> tree. (medv,””.
> summary
Regression tree
‘” medv .. subset ‘”
used
[1] “dis”
nodes: 8
deviance: 12.65″ 3099 / 245
Distribution of residuals
Min. 1
-2.0420 -0.0536 12.6000
3rd Qu
1.9600
Mean
0.0000
228 •
> plot(tree , boston)
<9 ,
tree
> boston)
>
prune , tree
>
> plot
> , pretty”‘O)
> , newdata””Boston (-train ,])
>
> plot{yhat , boston.test)
> a’bline (0 ,1)
>
[1] 25.05
8.3.3
>
> set. seed (1)
> bag. (madv……. ,. dat a.””Boston.
> bag. boston
Call:
‘” ., data ‘” Boston.
importance ‘” TaUE. eub’sat .. train)
Type of random
Numbar 01 trees: 500
No. of variables tried at each split: 13
Mean of squared 10.77
86.96
.. 229
> yhat. bag ‘” predict (bag. bo’ston , (-train ,J)
> plot (yhat. bag.
> abline (0 ,1)
> mean ((
[1] 13.16
randomForest
>
ntree”’25)
> yhat. bag ‘” predict (bag.
>
[1} 13.31
randomForest
> set.sGGd(l)
> rf , (medv”-‘. , • sllbset “‘train ,
mtry”‘6 , importance “‘TRUE)
> predict J )
(1) 11. 31
> importance (rf.boston)
%IncMSE
zn 2.103 50.31
indus 8.390 1017.64
chas 2.294 66.32
12.791 1107.31
rm 30.754 5917.26
age 10.334 552.27
dis 14 , 641 1223.93
rad 3.583 84 , 30
tax 8.139 435.71
ptratio 11. 274 817.33
black 8.097 367.00
30.962 7713.63
>
230
8.3.4
=” gaussian”
tion =” trees “” 5
> libr ll.ry{gbm)
> set.seed(1)
> .data=Boston(train ,]
“gaussian” ,n. tre l’l s””5000 , interaction. depth=4)
>
var rel ‘ in’
45.96
2 rm 31.22
3 dis 6 ‘ 81
4 crim 4.07
5 nox 2.56
6 ptxatio 2.27
7 black 1.80
8 age 1.64
9 tax 1. 36
10 indus 1.27
11 chas 0.80
12 rad 0.20
zn 0.015
ence
> par (mfrow”‘c 2) )
> plot
> plot i””’ )
> yhat.
ll.txaes=5000)
>
[lJ 1 1. 8
(8.
> b005t.
,n. txees”‘5000 , shxinkaga =0.2.
vaxbose “‘F)
> yhat.
n. txaes =5000)
> mean-((yhat
(1) 11. 5
231 8.4
8.4
R2 •
t’}..
2.
1 –
0.1 , 0.15 , 0.2 , 0.2 , 0.55 , 0.6 , 0.6 , 0.65 , 75
X2<1 -1.06 0.63 -'1.80 232 • ( ( ( ( ( ( ( tree ( ( ( ( ( ( 233 , ( ( margin vector vector machine) 9. 1 9. 1. 1 +ß,X, = 0 (9.1) 1) (X" 9.1 235 , )(,) +ß2X:Z + ... = 0 (9.2) X" "', X,)' + ßzXz + ... < 0 (9.4) (9.3) UU ZZ(::) RP YI' ... E 1 -1 ,1} (xt = -1 + "11 …………JY -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 X, +2X! +2X1 +3Xz + 2X1 (9.5) 9. j.2 (9.6) < 0 , 0::: 1 , + ß,',,) > 0 (9.8)
=ßo +ß2XZ.
(9.7)
1
236 •
N . .
1
X,
2 3 -1 2 3
00 9.2
9.1.3
maximal margin hyperplane)
separating
margin
) +
9.1
vector)
p
9.1.4
Xn E Y2′ .,. Yn E 1 – 1,
. .
X,
maximizeM (9.9)
P..{J”….fJ,
I.ß! = 1 (9.10)
r,(ßo + ß1XIl +… + ßpxip) i = (9. 11)
Yj(ßq M , i = 1’…..n
+ … >
(9.9)
238
.
.
aw–
.
1 2 3
.
. . . .
. .
.
.
. .
.
.
mOOM
‘”
50ft
tor
9.1.5
9.2
9.2.1
. . m
-… . . . . . . .
F
3 2 3 2 1
9.2
vector
? 6$J
i 4 5
8
-0.5 0.0 0.5 1.0 1.5 2.0 0.0 0.5 1.0 1.5 2.0 2.5
9.2.2
maximize M
…
1
HO , ZMC
(9.12)
(9.13)
(9.14)
(9.15)
11
(9. 8″ variahle)
– (9.
(9.12) –
– (9.11)0
C
–
241 9.3 •
..:. ,
J‘.
•
3
-AV
-J-//
M;’
N
4tII .1 ;:..f 0
, . ” , , , , , , ,
,
2
.
• 0/.0
‘
M; •
N
N
2
. . .
.
2
. .
• . ”
4
0 x;
x;
.
H
? N
2
– (9.
2
9.3
9.3.1
• 242
”
. .
.
.. ”al
..•
,8··· …
-14
.. .
,. . . .
.’\ e!
1i ‘:,
f
-2
‘*
N
?
0 4 2 4
x; ,
….
maximize M
1
(9 , 16)
9, 3, 2
9.3 .. 243
– inner product)
(9.17)
(9.18)
i = ,
= A + E
(9.20)
K(x”x ,.) = (1 (9.22)
kernel)
244 •
>
N
‘;>1 .
@
-4 -2 2 4 -4 -2 2 4
X, X,
kemel)
K(xnxi.) = (9.24)
T
)T
(x/
9.3
9.3.3
=
>
<
u U U M U ,. U U U M U ,.
False False positive rate
10-3 •
246 @
"
q
"
•
• SVM: 'FIO.l
0.2 0.4 0.6 0,&
False positive
0 10
•
..LDA
02 0.4 0.6 0.8
False positive rate
e
0.0 1.0
10
1.0
9.4
9.4.1
9.4.2
9.5
+ ... +
9.5
=ß. + - (9.15) ilJ
.1
15
I . (9.26)
L(X ,
y)o
(9.25
]
1088)
+ ... ===
00
248 @
• SVM Loss
Regression Loss
-6 -4 -2 0 2
+ ". +
+ßIXiJ +
vector
ß,
9.6
9.6.1
"linear".
9, 6
> 6Gt. aead
> x_””matrix{rnorm ‘(20*2). ncol”2)
> y=c(rep(-1.10) , rep(1 ,10)}
> + 1
> plot(x , col=(3-y))
>
> library(el071)
> svmfit=svm(y””. , data”‘dat. kernel=”linear”.
=
=TRUE o
> dat)
plot. svm
()
> svmfit$inde>:
(1) 1 2 5 7 14 16 17
>
Call
$vm(formula y = cost ‘” 10 ,
scal <'l '" FALSE) Parameters SVM-Kernel: linear cost: 10 gamma: 0.6 Nuwber of Support ( 43) Nurnber of Classes: 2 Levels -1 1 > svm.f i t “”svm (Y””‘, data=dat , kernel””’ linear”.
> dat)
>
[1] 1 2 3 4 5 7 9 10 12 13 14 15 16 17 18 20
jiii
> set .seed (1)
(8vm , Y””‘” • data”‘dat , kernel “,” linear” •
0.1.
> ßllmmary(tune.out)
tuning of
cross
– best
– best performancG: 0.1
performance result .s:
cost error
1 1e-03 0.70 0.422
2 0 , 70 0.422
3 0.10 0.211
4 0 ‘ 15 0.242
5 5e+00 0.15 0.242
6 16+01 0.15 0.242
7 0.15
=0.
> bestmod=tllne.out$best .model
> sllmmary
predict
, {rnorm
> 20 , rep=TRUE)
> + 1
> (ytest))
>
>
truth
predict -1 1
-1 11 1
1 0 8
9.6
= Q. 01
> data”‘dat. kernel=”linaar” , .01 ,
>
truth
1 11 2
1
>
> pch”’19)
>
> (y””‘” cost”‘l(5)
> summary(svmfit)
Call
svm .• ‘” “linear” ,
+05)
?arameters:
SVM-Type:
linea”r
coat: 1e+05
gamma: 0.6
Number of Support :3
( 12)
Number 2
Levels
-1 1
> plot
> (y””. ,
> summary(svmfit)
>
cost
252 •
9.6.2 Support Vector Machine

To fit an SVM with a non-linear kernel, we once again use the svm() function, now with kernel="polynomial" (together with a degree argument) or kernel="radial" (together with a gamma argument). We first generate some data with a non-linear class boundary:

> set.seed(1)
> x=matrix(rnorm(200*2), ncol=2)
> x[1:100,]=x[1:100,]+2
> x[101:150,]=x[101:150,]-2
> y=c(rep(1,150), rep(2,50))
> dat=data.frame(x=x, y=as.factor(y))
> plot(x, col=y)

The data are randomly split into training and testing groups, and we fit the training data using svm() with a radial kernel and γ = 1:

> train=sample(200, 100)
> svmfit=svm(y~., data=dat[train,], kernel="radial", gamma=1,
    cost=1)
> plot(svmfit, dat[train,])

The plot shows that the resulting SVM has a decidedly non-linear boundary.

> summary(svmfit)
Call:
svm(formula = y ~ ., data = dat[train, ], kernel = "radial",
    gamma = 1, cost = 1)
Parameters:
   SVM-Type: C-classification
 SVM-Kernel: radial
       cost: 1
      gamma: 1
Number of Support Vectors: 37
( 17 20 )
Number of Classes: 2
Levels:
 1 2

There are a fair number of training errors in this SVM fit. We can reduce the number of training errors by increasing the value of cost, at the price of a more irregular decision boundary that seems to be at risk of overfitting the data:

> svmfit=svm(y~., data=dat[train,], kernel="radial", gamma=1,
    cost=1e5)
> plot(svmfit, dat[train,])
We can perform cross-validation using tune() to select the best choice of γ and cost for an SVM with a radial kernel:

> set.seed(1)
> tune.out=tune(svm, y~., data=dat[train,], kernel="radial",
    ranges=list(cost=c(0.1, 1, 10, 100, 1000),
    gamma=c(0.5, 1, 2, 3, 4)))
> summary(tune.out)
Parameter tuning of 'svm':
- sampling method: 10-fold cross validation
- best parameters:
 cost gamma
    1     2
- best performance: 0.12
- Detailed performance results:
    cost gamma error dispersion
1  1e-01   0.5  0.27     0.1160
2  1e+00   0.5  0.13     0.0823
3  1e+01   0.5  0.15     0.0707
4  1e+02   0.5  0.17     0.0823
5  1e+03   0.5  0.21     0.0994
6  1e-01   1.0  0.25     0.1354
7  1e+00   1.0  0.13     0.0823
...

The best choice of parameters involves cost=1 and gamma=2. We can view the test set predictions for this model by applying predict() to the held-out data:

> table(true=dat[-train,"y"],
    pred=predict(tune.out$best.model, newdata=dat[-train,]))
9.6.3 ROC Curves

The ROCR package can be used to produce ROC curves. We first write a short function to plot an ROC curve given a vector containing a numerical score for each observation, pred, and a vector containing the class label for each observation, truth:

> library(ROCR)
> rocplot=function(pred, truth, ...){
+   predob = prediction(pred, truth)
+   perf = performance(predob, "tpr", "fpr")
+   plot(perf, ...)}

SVMs and support vector classifiers output class labels for each observation, but it is also possible to obtain the fitted values: the numerical scores f(X) = β̂₀ + β̂₁X₁ + ... + β̂ₚXₚ whose sign determines the predicted class. To obtain the fitted values from an svm() fit, we use decision.values=TRUE when fitting; predict() will then output them as an attribute:

> svmfit.opt=svm(y~., data=dat[train,], kernel="radial",
    gamma=2, cost=1, decision.values=T)
> fitted=attributes(predict(svmfit.opt, dat[train,],
    decision.values=TRUE))$decision.values

Now we can produce the ROC plot on the training data, and compare it to a more flexible fit with γ = 50:

> par(mfrow=c(1,2))
> rocplot(fitted, dat[train,"y"], main="Training Data")
> svmfit.flex=svm(y~., data=dat[train,], kernel="radial",
    gamma=50, cost=1, decision.values=T)
> fitted=attributes(predict(svmfit.flex, dat[train,],
    decision.values=T))$decision.values
> rocplot(fitted, dat[train,"y"], add=T, col="red")

Increasing γ produces a more flexible fit and apparently improves accuracy on the training data. However, we are really more interested in prediction accuracy on the test data, where the model with γ = 2 appears to provide the most accurate results:

> fitted=attributes(predict(svmfit.opt, dat[-train,],
    decision.values=T))$decision.values
> rocplot(fitted, dat[-train,"y"], main="Test Data")
> fitted=attributes(predict(svmfit.flex, dat[-train,],
    decision.values=T))$decision.values
> rocplot(fitted, dat[-train,"y"], add=T, col="red")
9.6.4 SVM with Multiple Classes

If the response is a factor containing more than two levels, the svm() function performs multi-class classification using the one-versus-one approach. We explore that setting here by generating a third class of observations:

> set.seed(1)
> x=rbind(x, matrix(rnorm(50*2), ncol=2))
> y=c(y, rep(0,50))
> x[y==0,2]=x[y==0,2]+2
> dat=data.frame(x=x, y=as.factor(y))
> par(mfrow=c(1,1))
> plot(x, col=(y+1))

We now fit an SVM to the data:

> svmfit=svm(y~., data=dat, kernel="radial", cost=10, gamma=1)
> plot(svmfit, dat)
9.6.5 Application to Gene Expression Data

We now examine the Khan data set, which consists of a number of tissue samples corresponding to four distinct types of small round blue cell tumors. For each tissue sample, gene expression measurements are available. The data set consists of training data, xtrain and ytrain, and testing data, xtest and ytest:

> library(ISLR)
> names(Khan)
[1] "xtrain" "xtest"  "ytrain" "ytest"
> dim(Khan$xtrain)
[1]   63 2308
> dim(Khan$xtest)
[1]   20 2308
> length(Khan$ytrain)
[1] 63
> length(Khan$ytest)
[1] 20

The data set consists of expression measurements for 2308 genes; the training and test sets contain 63 and 20 observations, respectively.

> table(Khan$ytrain)
 1  2  3  4
 8 23 12 20
> table(Khan$ytest)
1 2 3 4
3 6 6 5

We will use a support vector approach to predict cancer subtype from the gene expression measurements. Since there are a very large number of features relative to the number of observations, we should use a linear kernel; the added flexibility of a polynomial or radial kernel is unnecessary here.

> dat=data.frame(x=Khan$xtrain, y=as.factor(Khan$ytrain))
> out=svm(y~., data=dat, kernel="linear", cost=10)
> summary(out)
Call:
svm(formula = y ~ ., data = dat, kernel = "linear", cost = 10)
Parameters:
   SVM-Type: C-classification
 SVM-Kernel: linear
       cost: 10
      gamma: 0.000433
Number of Support Vectors: 58
( 20 20 11 7 )
Number of Classes: 4
Levels:
 1 2 3 4
> table(out$fitted, dat$y)
     1  2  3  4
  1  8  0  0  0
  2  0 23  0  0
  3  0  0 12  0
  4  0  0  0 20

There are no training errors. This is not surprising: the large number of variables relative to the number of observations makes it easy to find hyperplanes that fully separate the classes. We are most interested in the performance on the test observations:

> dat.te=data.frame(x=Khan$xtest, y=as.factor(Khan$ytest))
> pred.te=predict(out, newdata=dat.te)
> table(pred.te, dat.te$y)
pred.te 1 2 3 4
      1 3 0 0 0
      2 0 6 2 0
      3 0 0 4 0
      4 0 0 0 5

Using cost=10 yields two test set errors on this data.
9.7 Exercises

1. This problem involves hyperplanes in two dimensions.
(a) Sketch the hyperplane 1 + 3X₁ - X₂ = 0. Indicate the set of points for which 1 + 3X₁ - X₂ > 0, as well as the set of points for which 1 + 3X₁ - X₂ < 0.
(b) On the same plot, sketch the hyperplane -2 + X₁ + 2X₂ = 0. Indicate the set of points for which -2 + X₁ + 2X₂ > 0, as well as the set of points for which -2 + X₁ + 2X₂ < 0.

2. We have seen that in p = 2 dimensions, a linear decision boundary takes the form β₀ + β₁X₁ + β₂X₂ = 0. We now investigate a non-linear decision boundary.
(a) Sketch the curve (1 + X₁)² + (2 - X₂)² = 4.
(b) On your sketch, indicate the set of points for which (1 + X₁)² + (2 - X₂)² > 4, as well as the set of points for which (1 + X₁)² + (2 - X₂)² ≤ 4.
(c) Suppose that a classifier assigns an observation to the blue class if (1 + X₁)² + (2 - X₂)² > 4, and to the red class otherwise. To what classes are the observations (0, 0), (-1, 1), (2, 2), and (3, 8) assigned?
(d) Argue that while the decision boundary in (c) is not linear in terms of X₁ and X₂, it is linear in terms of X₁, X₁², X₂, and X₂².

One of the applied exercises has you generate a data set with a visible but non-linear separation between two classes:

> x1=runif(500)-0.5
> x2=runif(500)-0.5
> y=1*(x1^2-x2^2 > 0)

and then compare logistic regression and support vector machines fit using linear and non-linear functions of the predictors (such as X₁², X₁ × X₂) as features.
The remaining exercises have you explore the roles of the cost and gamma parameters on simulated data, and apply support vector classifiers and SVMs with linear, polynomial, and radial kernels to the Auto and OJ data sets, using tune() to select tuning parameters by cross-validation and comparing training and test error rates.
This chapter concerns unsupervised learning: we have only a set of features X₁, X₂, ..., Xₚ measured on n observations, and no associated response Y. The goal is to discover interesting things about the measurements, such as informative ways to visualize the data, or subgroups among the variables or the observations. We focus on two particular techniques: principal components analysis, a tool used for data visualization or data pre-processing, and clustering, a broad class of methods for discovering unknown subgroups in data.

10.1 The Challenge of Unsupervised Learning

Unsupervised learning is often more challenging than supervised learning. The exercise tends to be more subjective, as there is no simple goal such as prediction of a response, and there is no universally accepted mechanism like cross-validation for checking our work, because we do not know the true answer.

10.2 Principal Components Analysis

When faced with a large set of correlated variables, principal components allow us to summarize the set with a smaller number of representative variables that collectively explain most of the variability in the original set. Principal components analysis (PCA) refers to the process by which principal components are computed, and the subsequent use of these components in understanding the data. PCA is an unsupervised approach, since it involves only the features X₁, X₂, ..., Xₚ and no response.

10.2.1 What Are Principal Components?

The first principal component of a set of features X₁, X₂, ..., Xₚ is the normalized linear combination of the features

Z₁ = φ₁₁X₁ + φ₂₁X₂ + ... + φₚ₁Xₚ   (10.1)

that has the largest variance. By normalized, we mean that Σⱼ φⱼ₁² = 1. We refer to the elements φ₁₁, ..., φₚ₁ as the loadings of the first principal component; together, they make up the first principal component loading vector φ₁ = (φ₁₁, φ₂₁, ..., φₚ₁)ᵀ.
Given an n × p data set X, we assume that each of the variables has been centered to have mean zero. We then look for the linear combination of the sample feature values of the form

zᵢ₁ = φ₁₁xᵢ₁ + φ₂₁xᵢ₂ + ... + φₚ₁xᵢₚ   (10.2)

that has largest sample variance, subject to the constraint that Σⱼ φⱼ₁² = 1. In other words, the first principal component loading vector solves the optimization problem

maximize over φ₁₁, ..., φₚ₁: { (1/n) Σᵢ (Σⱼ φⱼ₁xᵢⱼ)² } subject to Σⱼ φⱼ₁² = 1.   (10.3)

The objective in (10.3) is simply the sample variance of the n values z₁₁, ..., zₙ₁, which we call the scores of the first principal component. The problem can be solved via an eigen decomposition of the sample covariance matrix. The second principal component Z₂ is the linear combination of X₁, ..., Xₚ that has maximal variance out of all linear combinations that are uncorrelated with Z₁; its scores take the form

zᵢ₂ = φ₁₂xᵢ₁ + φ₂₂xᵢ₂ + ... + φₚ₂xᵢₚ,   (10.4)

where φ₂ is the second principal component loading vector. Constraining Z₂ to be uncorrelated with Z₁ turns out to be equivalent to constraining φ₂ to be orthogonal to φ₁.
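To make (10.2)-(10.3) concrete, a short sketch (ours, not the chapter's lab) computes the first loading vector as the top eigenvector of the sample covariance matrix of the centered and scaled data, and checks it against prcomp(), up to sign:

> X=scale(USArrests)               # center and scale the data
> phi1=eigen(cov(X))$vectors[,1]   # first principal component loading vector
> z1=X %*% phi1                    # scores z11,...,zn1 as in (10.2)
> head(cbind(z1, prcomp(USArrests, scale=TRUE)$x[,1]))  # agree up to sign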
We illustrate PCA on the USArrests data set. For each of the 50 states, the data set contains the number of arrests per 100,000 residents for each of three crimes (Assault, Murder, and Rape), as well as UrbanPop, the percent of the population living in urban areas. The principal component score vectors have length n = 50, and the loading vectors have length p = 4. PCA was performed after standardizing each variable to have mean zero and standard deviation one. Table 10.1 displays the first two loading vectors:

Table 10.1. The first two principal component loading vectors for the USArrests data.
              PC1         PC2
Murder    0.5358995  -0.4181809
Assault   0.5831836  -0.1879856
UrbanPop  0.2781909   0.8728062
Rape      0.5434321   0.1673186

[Figure 10.1: Biplot of the first two principal components of the USArrests data, showing both the state scores and the variable loading vectors.]

In Figure 10.1, the first loading vector places approximately equal weight on Assault, Murder, and Rape, with much less weight on UrbanPop: this component roughly corresponds to a measure of overall rates of serious crimes. The second loading vector places most of its weight on UrbanPop, and so corresponds to the level of urbanization of the state. The crime-related variables are located close to each other in the biplot, while UrbanPop is far from the other three: the crime variables are correlated with each other, and UrbanPop is less correlated with them.
10.2.2 Another Interpretation of Principal Components

Principal component loading vectors can also be interpreted geometrically: the first loading vector defines the line in p-dimensional space that is closest to the n observations, using average squared Euclidean distance as the measure of closeness. Likewise, the first two principal components span the plane that is closest to the observations, and so on for higher dimensions.

[Figure 10.2: Ninety observations simulated in three dimensions, together with the plane spanned by the first two principal component loading vectors; the projections of the observations onto this plane have the largest possible variance.]
Using this interpretation, the first M principal component score vectors and loading vectors together provide the best M-dimensional approximation (in terms of Euclidean distance) to the data:

xᵢⱼ ≈ Σₘ zᵢₘφⱼₘ, where the sum is over m = 1, ..., M.   (10.5)

When M is sufficiently large, this approximation is quite good, and when M = min(n - 1, p), the representation is exact: xᵢⱼ = Σₘ zᵢₘφⱼₘ.
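A short sketch of the approximation (10.5) (ours): reconstruct the scaled USArrests data from the first M = 2 score and loading vectors, and measure the average squared error:

> X=scale(USArrests)
> pr.out=prcomp(X)
> M=2
> approx=pr.out$x[,1:M] %*% t(pr.out$rotation[,1:M])  # sum over m of z_im*phi_jm
> mean((X-approx)^2)   # small; exactly zero when M = min(n-1, p)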
10.2.3 More on PCA

Scaling the Variables. The results obtained when we perform PCA depend on whether the variables have been individually scaled. In the USArrests data, the variables are measured in different units, and their variances differ wildly: Murder, Rape, Assault, and UrbanPop have variances of 18.97, 87.73, 6945.16, and 209.5, respectively. If we perform PCA on the unscaled variables, the first principal component loading vector places almost all of its weight on Assault, simply because that variable has by far the largest variance. It is therefore typical to scale each variable to have standard deviation one before performing PCA.

[Figure 10.3: Biplots of the USArrests data with scaled (left) and unscaled (right) variables; without scaling, Assault dominates the first principal component.]
The Proportion of Variance Explained. How much of the information in a given data set is lost by projecting the observations onto the first few principal components? We are interested in the proportion of variance explained (PVE) by each principal component. Assuming the variables have been centered, the total variance in the data set is

Σⱼ Var(Xⱼ) = Σⱼ (1/n) Σᵢ xᵢⱼ²,   (10.6)

and the variance explained by the mth principal component is

(1/n) Σᵢ zᵢₘ² = (1/n) Σᵢ (Σⱼ φⱼₘxᵢⱼ)².   (10.7)

Therefore, the PVE of the mth principal component is the positive quantity

Σᵢ (Σⱼ φⱼₘxᵢⱼ)² / Σⱼ Σᵢ xᵢⱼ².   (10.8)

The PVEs of all the components sum to one, and the cumulative PVE of the first M components is the sum of the first M PVEs. In the USArrests data, the first principal component explains 62.0% of the variance, and the second explains 24.7%; together, they explain almost 87% of the variance. A scree plot displays the PVE of each component and can be used to decide how many components are needed.

[Figure 10.4: Left: a scree plot of the proportion of variance explained by each of the four principal components of the USArrests data. Right: the cumulative PVE.]
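In R, (10.8) is conveniently computed from the sdev component of a prcomp() fit; the lab in Section 10.4 carries this out in detail on the USArrests data. A two-line sketch:

> pr.out=prcomp(USArrests, scale=TRUE)
> pve=pr.out$sdev^2/sum(pr.out$sdev^2)  # PVE of each component, as in (10.8)
> cumsum(pve)                           # cumulative PVE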
10.2.4 Other Uses for Principal Components

Many statistical techniques, such as regression, classification, and clustering, can be adapted to use the n × M matrix whose columns are the first M ≤ p principal component score vectors in place of the full n × p data matrix, often yielding less noisy results.
10.3 Clustering Methods

Clustering refers to a very broad set of techniques for finding subgroups, or clusters, in a data set. We seek a partition of the observations into distinct groups such that the observations within each group are quite similar to each other. The two best-known approaches are K-means clustering, in which we partition the observations into a pre-specified number of clusters, and hierarchical clustering, in which we do not know in advance how many clusters we want, and instead obtain a tree-like representation of the observations called a dendrogram.

10.3.1 K-Means Clustering

[Figure 10.5: A simulated data set with 150 observations in two dimensions, clustered by K-means with K = 2, K = 3, and K = 4.]
Let C₁, ..., C_K denote sets containing the indices of the observations in each cluster. These sets satisfy two properties:

(1) C₁ ∪ C₂ ∪ ... ∪ C_K = {1, ..., n}: each observation belongs to at least one of the K clusters.
(2) Cₖ ∩ Cₖ′ = ∅ for all k ≠ k′: no observation belongs to more than one cluster.

If the ith observation is in the kth cluster, then i ∈ Cₖ. A good K-means clustering makes the total within-cluster variation as small as possible; that is, we want to solve

minimize over C₁, ..., C_K: { Σₖ W(Cₖ) }.   (10.9)

The most common choice of within-cluster variation uses squared Euclidean distance:

W(Cₖ) = (1/|Cₖ|) Σ_{i,i′∈Cₖ} Σⱼ (xᵢⱼ - xᵢ′ⱼ)²,   (10.10)

where |Cₖ| denotes the number of observations in the kth cluster, so that (10.9) becomes

minimize over C₁, ..., C_K: { Σₖ (1/|Cₖ|) Σ_{i,i′∈Cₖ} Σⱼ (xᵢⱼ - xᵢ′ⱼ)² }.   (10.11)

Solving (10.11) exactly would require searching over almost Kⁿ partitions, but a simple local-search algorithm provides good solutions:

Algorithm 10.1: K-Means Clustering
1. Randomly assign a number, from 1 to K, to each of the observations, as an initial cluster assignment.
2. Iterate until the cluster assignments stop changing:
(a) For each of the K clusters, compute the cluster centroid: the vector of the p feature means for the observations in the kth cluster.
(b) Assign each observation to the cluster whose centroid is closest, in Euclidean distance.

The algorithm is guaranteed to decrease the objective at each step, because of the identity

(1/|Cₖ|) Σ_{i,i′∈Cₖ} Σⱼ (xᵢⱼ - xᵢ′ⱼ)² = 2 Σ_{i∈Cₖ} Σⱼ (xᵢⱼ - x̄ₖⱼ)²,   (10.12)

where x̄ₖⱼ = (1/|Cₖ|) Σ_{i∈Cₖ} xᵢⱼ is the mean of feature j in cluster Cₖ. Because the result depends on the random initial assignment in Step 1, it is important to run the algorithm multiple times from different random starts and keep the solution with the smallest objective (10.11).
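A compact sketch (ours) of the objective: the function below evaluates (10.11) for any assignment, and by the identity (10.12) it returns exactly twice the tot.withinss value reported by R's kmeans():

> W.total=function(x, cluster){
+   s=0
+   for(k in unique(cluster)){
+     xk=x[cluster==k, , drop=FALSE]
+     s=s+sum(as.matrix(dist(xk))^2)/nrow(xk)  # (1/|Ck|)*sum of squared distances
+   }
+   return(s)}
> set.seed(1)
> x=matrix(rnorm(50*2), ncol=2)
> km.out=kmeans(x, 3, nstart=20)
> c(W.total(x, km.out$cluster), 2*km.out$tot.withinss)  # identical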
[Figure 10.6: The progress of the K-means algorithm with K = 3 on a simulated data set: the data, the initial random assignment, Step 2(a) of iteration 1 (centroids computed), Step 2(b) of iteration 1 (observations reassigned), iteration 2, and the final results after convergence.]

[Figure 10.7: K-means clustering performed six times with K = 3, each with a different random initial assignment. Above each plot is the value of the objective (10.11): three different local optima were obtained, the best having objective value 235.8 (versus 320.9 and 310.9 for the others).]
10.3.2 Hierarchical Clustering

One potential disadvantage of K-means clustering is that it requires us to pre-specify the number of clusters K. Hierarchical clustering is an alternative approach that does not require a particular choice of K, and that results in an attractive tree-based representation of the observations, called a dendrogram. We describe bottom-up or agglomerative clustering, in which the dendrogram is built starting from the leaves and combining clusters up to the trunk.

[Figure: Simulated data in two dimensions, used to illustrate hierarchical clustering.]
Each leaf of the dendrogram represents one observation. As we move up the tree, leaves fuse into branches, and branches fuse with other branches. Fusions that occur lower in the tree indicate groups of observations that are quite similar to each other, while fusions near the top indicate observations that can differ substantially. The height at which two observations fuse, measured on the vertical axis, indicates how different they are; we cannot draw conclusions about similarity from proximity along the horizontal axis. To identify clusters, we make a horizontal cut across the dendrogram: the distinct sets of observations beneath the cut are the clusters, and the height of the cut controls the number of clusters obtained.

[Figures 10.9-10.10: An illustration of how to interpret a dendrogram for nine observations in two-dimensional space, and the clusters obtained by cutting the dendrogram at different heights.]
The hierarchical clustering algorithm is simple. Begin with n observations, a measure (such as Euclidean distance) of all the n(n-1)/2 pairwise dissimilarities, and treat each observation as its own cluster. Then, for i = n, n-1, ..., 2: examine all pairwise inter-cluster dissimilarities among the i clusters, fuse the two clusters that are least dissimilar (the dissimilarity at which they fuse is the height in the dendrogram), and compute the new pairwise inter-cluster dissimilarities among the remaining i-1 clusters. The notion of dissimilarity between groups of observations is defined by the linkage: complete (maximal pairwise dissimilarity), single (minimal pairwise dissimilarity), average (mean pairwise dissimilarity), or centroid (dissimilarity between the cluster centroids). Complete and average linkage tend to yield more balanced dendrograms and are generally preferred.

The choice of dissimilarity measure also matters: besides Euclidean distance, a correlation-based distance considers two observations to be similar if their features are highly correlated. The choice should be driven by the data and the scientific question. For an online retailer clustering shoppers by purchase histories, for instance, Euclidean distance would group together infrequent shoppers, while correlation-based distance would group shoppers with similar preferences (such as socks versus computers), and scaling the variables can prevent frequently purchased items from dominating the result.

[Figure: The socks and computers example: the effect of scaling the variables, and of Euclidean versus correlation-based distance, on clustering shoppers.]
10.3.3 Practical Issues in Clustering

Clustering involves many decisions with large consequences. Should the observations or features be standardized? For hierarchical clustering, what dissimilarity measure and what type of linkage should be used, and where should the dendrogram be cut? For K-means clustering, how many clusters should we look for? In practice, we try several different choices and look for patterns that consistently emerge. Because clustering methods are not robust to perturbations of the data, we should also cluster subsets of the data to assess the robustness of the clusters obtained. Above all, clusters should be reported as a starting point for the development of hypotheses, not as absolute truth.

10.4 Lab 1: Principal Components Analysis

In this lab, we perform PCA on the USArrests data set, which is part of the base R package. The rows of the data set contain the 50 states, in alphabetical order.
> states=row.names(USArrests)
> names(USArrests)
[1] "Murder"   "Assault"  "UrbanPop" "Rape"

The columns of the data set contain the four variables. We first briefly examine the data, and notice that the variables have vastly different means:

> apply(USArrests, 2, mean)
 Murder Assault UrbanPop    Rape
   7.79  170.76    65.54   21.23
On average there are three times as many rapes as murders, and more than eight times as many assaults as rapes. We can also use apply() to examine the variances:

> apply(USArrests, 2, var)
 Murder Assault UrbanPop    Rape
   19.0  6945.2    209.5    87.7

The variables also have vastly different variances. If we failed to scale the variables before performing PCA, the principal components would mostly be driven by the Assault variable. We now perform principal components analysis using the prcomp() function, scaling the variables to have standard deviation one by means of the scale=TRUE option:

> pr.out=prcomp(USArrests, scale=TRUE)

The output from prcomp() contains a number of useful quantities:

> names(pr.out)
[1] "sdev"     "rotation" "center"   "scale"    "x"

The center and scale components correspond to the means and standard deviations of the variables that were used for scaling prior to PCA:

> pr.out$center
 Murder Assault UrbanPop    Rape
   7.79  170.76    65.54   21.23
> pr.out$scale
 Murder Assault UrbanPop    Rape
   4.36   83.34    14.47    9.37
The rotation matrix provides the principal component loadings; each of its columns contains the corresponding loading vector:

> pr.out$rotation
            PC1     PC2     PC3     PC4
Murder   -0.536   0.418  -0.341   0.649
Assault  -0.583   0.188  -0.268  -0.743
UrbanPop -0.278  -0.873  -0.378   0.134
Rape     -0.543  -0.167   0.818   0.089

We see that there are four distinct principal components. Rather than having to multiply the data by the loadings ourselves, the 50 × 4 matrix x has as its columns the principal component score vectors:

> dim(pr.out$x)
[1] 50  4

We can plot the first two principal components as follows:

> biplot(pr.out, scale=0)

The scale=0 argument to biplot() ensures that the arrows are scaled to represent the loadings. Because principal components are only unique up to a sign change, we can reproduce Figure 10.1 by flipping the signs:

> pr.out$rotation=-pr.out$rotation
> pr.out$x=-pr.out$x
> biplot(pr.out, scale=0)
The prcomp() function also outputs the standard deviation of each principal component:

> pr.out$sdev
[1] 1.575 0.995 0.597 0.416

The variance explained by each principal component is obtained by squaring these:

> pr.var=pr.out$sdev^2
> pr.var
[1] 2.480 0.990 0.357 0.173

To compute the proportion of variance explained by each principal component, we divide by the total variance explained by all four:

> pve=pr.var/sum(pr.var)
> pve
[1] 0.6201 0.2474 0.0891 0.0434

The first principal component explains 62.0% of the variance in the data, the next principal component 24.7%, and so forth. We can plot the PVE explained by each component, as well as the cumulative PVE:

> plot(pve, xlab="Principal Component", ylab="Proportion of
    Variance Explained", ylim=c(0,1), type='b')
> plot(cumsum(pve), xlab="Principal Component", ylab="
    Cumulative Proportion of Variance Explained", ylim=c(0,1),
    type='b')

Note that the function cumsum() computes the cumulative sum of the elements of a numeric vector. For instance:

> a=c(1, 2, 8, -3)
> cumsum(a)
[1]  1  3 11  8
10.5 Lab 2: Clustering

10.5.1 K-Means Clustering

The function kmeans() performs K-means clustering in R. We begin with a simple simulated example in which there truly are two clusters: the first 25 observations have a mean shift relative to the next 25.

> set.seed(2)
> x=matrix(rnorm(50*2), ncol=2)
> x[1:25,1]=x[1:25,1]+3
> x[1:25,2]=x[1:25,2]-4

We now perform K-means clustering with K = 2:

> km.out=kmeans(x, 2, nstart=20)

The cluster assignments of the 50 observations are contained in km.out$cluster:

> km.out$cluster
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1
[30] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

The K-means clustering perfectly separated the observations into two clusters, even though we did not supply any group information to kmeans().
We can plot the data, with each observation colored according to its cluster assignment:

> plot(x, col=(km.out$cluster+1), main="K-Means Clustering
    Results with K=2", xlab="", ylab="", pch=20, cex=2)

Here the observations are easy to plot because they are two-dimensional; with more than two variables we could instead plot the first two principal component score vectors. In this example we knew that there were really two clusters, because we generated the data. For real data we do not, and we could instead have performed K-means clustering with K = 3:

> set.seed(4)
> km.out=kmeans(x, 3, nstart=20)
> km.out
K-means clustering with 3 clusters of sizes 10, 23, 17

Cluster means:
        [,1]        [,2]
1  2.3001545         ...
2 -0.3820397 -0.08740753
3  3.7789567         ...

Clustering vector:
...

Within cluster sum of squares by cluster:
[1] 19.56137      ... 25.74089
 (between_SS / total_SS =  79.3 %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"
[5] "tot.withinss" "betweenss"    "size"

When K = 3, K-means clustering splits up the two clusters. We can plot this solution as well:

> plot(x, col=(km.out$cluster+1), main="K-Means Clustering
    Results with K=3", xlab="", ylab="", pch=20, cex=2)
To run the kmeans() function with multiple initial cluster assignments, we use the nstart argument. If a value of nstart greater than one is used, then K-means clustering will be performed using multiple random assignments in Step 1 of Algorithm 10.1, and kmeans() will report only the best result. Here we compare nstart=1 and nstart=20:

> set.seed(3)
> km.out=kmeans(x, 3, nstart=1)
> km.out$tot.withinss
[1] 104.3319
> km.out=kmeans(x, 3, nstart=20)
> km.out$tot.withinss
[1] 97.9793

Note that km.out$tot.withinss is the total within-cluster sum of squares, which we seek to minimize by performing K-means clustering (10.11); the individual within-cluster sums of squares are contained in the vector km.out$withinss. We strongly recommend always running K-means clustering with a large value of nstart, such as 20 or 50, since otherwise an undesirable local optimum may be obtained. It is also important to set a random seed with set.seed(), so that the initial cluster assignments can be reproduced and the results will be fully reproducible.
10.5.2 Hierarchical Clustering

The hclust() function implements hierarchical clustering in R. We use the same data as above and begin by clustering the observations using complete linkage; the dist() function computes the 50 × 50 inter-observation Euclidean distance matrix:

> hc.complete=hclust(dist(x), method="complete")

We could just as easily perform hierarchical clustering with average or single linkage instead:

> hc.average=hclust(dist(x), method="average")
> hc.single=hclust(dist(x), method="single")

We can now plot the dendrograms, with the numbers at the bottom of each plot identifying the observations:

> par(mfrow=c(1,3))
> plot(hc.complete, main="Complete Linkage", xlab="", sub="",
    cex=.9)
> plot(hc.average, main="Average Linkage", xlab="", sub="",
    cex=.9)
> plot(hc.single, main="Single Linkage", xlab="", sub="",
    cex=.9)
To determine the cluster labels associated with a given cut of the dendrogram, we use the cutree() function:

> cutree(hc.complete, 2)
> cutree(hc.average, 2)
> cutree(hc.single, 2)

For this data, complete and average linkage generally separate the observations into their correct groups, whereas single linkage identifies one point as belonging to its own cluster. A more sensible answer is obtained when four clusters are selected, although there are still two singletons:

> cutree(hc.single, 4)
To scale the variables before performing hierarchical clustering of the observations, we use the scale() function:

> xsc=scale(x)
> plot(hclust(dist(xsc), method="complete"), main="Hierarchical
    Clustering with Scaled Features")

Correlation-based distance can be computed using the as.dist() function, which converts an arbitrary square symmetric matrix into a form that hclust() recognizes as a distance matrix. However, this only makes sense for data with at least three features, since the absolute correlation between any two observations with measurements on two features is always 1. We therefore cluster a three-dimensional data set:

> x=matrix(rnorm(30*3), ncol=3)
> dd=as.dist(1-cor(t(x)))
> plot(hclust(dd, method="complete"), main="Complete Linkage
    with Correlation-Based Distance", xlab="", sub="")
10.6 Lab 3: NCI60 Data Example

Unsupervised techniques are often used in the analysis of genomic data. We illustrate these techniques on the NCI60 cancer cell line microarray data, which consists of 6830 gene expression measurements on 64 cancer cell lines:

> library(ISLR)
> nci.labs=NCI60$labs
> nci.data=NCI60$data
> dim(nci.data)
[1]   64 6830

Each cell line is labeled with a cancer type. We do not make use of the cancer types in performing PCA and clustering, as these are unsupervised techniques; but after performing PCA and clustering, we will check to see the extent to which the cancer types agree with the results. We begin by examining the cancer types for the cell lines:

> nci.labs[1:4]
[1] "CNS"   "CNS"   "CNS"   "RENAL"
> table(nci.labs)
nci.labs
     BREAST         CNS       COLON K562A-repro K562B-repro
          7           5           7           1           1
   LEUKEMIA MCF7A-repro MCF7D-repro    MELANOMA       NSCLC
          6           1           1           8           9
    OVARIAN    PROSTATE       RENAL     UNKNOWN
          6           2           9           1
10.6.1 PCA on the NCI60 Data

We first perform PCA on the data after scaling the variables (genes) to have standard deviation one, although one could reasonably argue that it is better not to scale the genes:

> pr.out=prcomp(nci.data, scale=TRUE)

We now plot the first few principal component score vectors, in order to visualize the data. The observations (cell lines) corresponding to a given cancer type will be plotted in the same color. We first create a simple function that assigns a distinct color to each element of a vector:

> Cols=function(vec){
+   cols=rainbow(length(unique(vec)))
+   return(cols[as.numeric(as.factor(vec))])
+ }

Note that the rainbow() function takes as its argument a positive integer and returns a vector containing that number of distinct colors. We now can plot the principal component score vectors:

> par(mfrow=c(1,2))
> plot(pr.out$x[,1:2], col=Cols(nci.labs), pch=19,
    xlab="Z1", ylab="Z2")
> plot(pr.out$x[,c(1,3)], col=Cols(nci.labs), pch=19,
    xlab="Z1", ylab="Z3")

On the whole, cell lines corresponding to a single cancer type tend to have similar values on the first few principal component score vectors. We can obtain a summary of the proportion of variance explained of the first few principal components:

> summary(pr.out)
Importance of components:
                          PC1     PC2     PC3     PC4     PC5
Standard deviation     27.853 21.4814 19.8205 17.0326 15.9718
Proportion of Variance  0.114  0.0676  0.0575  0.0425  0.0374
Cumulative Proportion   0.114  0.1812  0.2387  0.2812  0.3185

We can also plot the variance explained by the first few principal components:

> plot(pr.out)
[Figure: Projections of the NCI60 cell lines onto the first three principal component score vectors (Z1 versus Z2, and Z1 versus Z3), with cell lines colored by cancer type; cell lines of the same cancer type tend to lie near each other.]
It is more informative to plot the PVE of each principal component (a scree plot) and the cumulative PVE:

> pve=100*pr.out$sdev^2/sum(pr.out$sdev^2)
> par(mfrow=c(1,2))
> plot(pve, type="o", ylab="PVE", xlab="Principal Component",
    col="blue")
> plot(cumsum(pve), type="o", ylab="Cumulative PVE",
    xlab="Principal Component", col="brown3")

(Note that the elements of pve can also be computed directly from the summary, as summary(pr.out)$importance[2,], and the cumulative PVE as summary(pr.out)$importance[3,].) We see that together, the first seven principal components explain around 40% of the variance in the data. Examining the scree plot, there is an elbow after approximately the seventh principal component, suggesting there may be little benefit to examining more than seven or so components.

[Figure: The PVE of the principal components of the NCI60 data (left) and the cumulative PVE (right).]
10.6.2 Clustering the Observations of the NCI60 Data

We now proceed to hierarchically cluster the cell lines, with the goal of finding out whether the observations cluster into distinct types of cancer. We first standardize the variables:

> sd.data=scale(nci.data)

We perform hierarchical clustering using complete, single, and average linkage, with Euclidean distance as the dissimilarity measure:

> par(mfrow=c(1,3))
> data.dist=dist(sd.data)
> plot(hclust(data.dist), labels=nci.labs, main="Complete
    Linkage", xlab="", sub="", ylab="")
> plot(hclust(data.dist, method="average"), labels=nci.labs,
    main="Average Linkage", xlab="", sub="", ylab="")
> plot(hclust(data.dist, method="single"), labels=nci.labs,
    main="Single Linkage", xlab="", sub="", ylab="")

The choice of linkage clearly affects the results: single linkage tends to yield trailing clusters, while complete and average linkage yield more balanced clusters. We will use complete linkage for the analysis that follows, and cut the dendrogram at the height that yields four clusters:

> hc.out=hclust(dist(sd.data))
> hc.clusters=cutree(hc.out, 4)
> table(hc.clusters, nci.labs)

All the leukemia cell lines fall in one cluster, while the breast cancer cell lines are spread out over three different clusters. We can plot the cut on the dendrogram that produces these four clusters:

> par(mfrow=c(1,1))
> plot(hc.out, labels=nci.labs)
> abline(h=139, col="red")

The abline() function draws a straight line on top of an existing plot; the argument h=139 plots a horizontal line at height 139 on the dendrogram, which is the height that results in four distinct clusters. Printing the output of hclust gives a useful brief summary of the object:

> hc.out
Call:
hclust(d = dist(sd.data))

Cluster method   : complete
Distance         : euclidean
Number of objects: 64
[Figure 10.17: The NCI60 cell lines clustered with complete, average, and single linkage.]

How do these hierarchical clustering results compare to what we get if we perform K-means clustering with K = 4?

> set.seed(2)
> km.out=kmeans(sd.data, 4, nstart=20)
> km.clusters=km.out$cluster
> table(km.clusters, hc.clusters)
           hc.clusters
km.clusters  1  2  3  4
          1 11  0  0  9
          2  0  0  8  0
          3  9  0  0  0
          4 20  7  0  0

The four clusters obtained by the two methods are somewhat different. Cluster 2 in K-means clustering is identical to cluster 3 in hierarchical clustering, but the other clusters differ: for instance, cluster 4 in K-means clustering contains a portion of the observations assigned to cluster 1 by hierarchical clustering, as well as all of the observations assigned to cluster 2.

Rather than performing hierarchical clustering on the entire data matrix, we can perform hierarchical clustering on the first few principal component score vectors:

> hc.out=hclust(dist(pr.out$x[,1:5]))
> plot(hc.out, labels=nci.labs, main="Hier. Clust. on First
    Five Score Vectors")
> table(cutree(hc.out,4), nci.labs)

Not surprisingly, these results are different from those obtained on the full data set. Sometimes performing clustering on the first few principal component score vectors gives better results, since the principal components can be viewed as denoising the data. We could also perform K-means clustering on the score vectors rather than on the full data.
10.7 Exercises

One conceptual exercise supposes that we have four observations, for which we compute the dissimilarity matrix

      0.3  0.4  0.7
 0.3       0.5  0.8
 0.4  0.5       0.45
 0.7  0.8  0.45

so that, for instance, the dissimilarity between the first and second observations is 0.3. (a)-(b) Sketch the dendrograms that result from hierarchically clustering these four observations using complete linkage and using single linkage, indicating on each the height at which each fusion occurs. (c)-(d) Identify the clusters obtained by cutting each dendrogram so that two clusters result.

In another exercise, you perform K-means clustering manually, with K = 2, on a small example with n = 6 observations and p = 2 features:

 Obs.  X₁  X₂
   1    1   4
   2    1   3
   3    0   4
   4    5   1
   5    6   2
   6    4   0

(a) Plot the observations. (b) Randomly assign a cluster label to each observation and report the assignments. (c) Compute the centroid for each cluster. (d) Assign each observation to the cluster whose centroid is closest, in Euclidean distance. (e) Repeat (c) and (d) until the answers stop changing. (f) Color your plot from (a) according to the cluster labels obtained.
The remaining applied exercises have you generate simulated data and compare PCA and K-means cluster labels to the true class labels; re-compute the proportion of variance explained on the USArrests data both with prcomp() and directly from Equation (10.8); cluster the states in the USArrests data with complete-linkage hierarchical clustering, with and without scaling the variables; and analyze a gene expression data set (provided as a .csv file on the book website) with 40 tissue samples, using correlation-based hierarchical clustering to determine whether the genes separate the healthy samples from the diseased samples.