MAST90083 Computational Statistics & Data Mining Bootstrap Methods

Tutorial & Practical 9: Solutions

Question 1

1. Given $X = \{x_1, \ldots, x_n\}$ with $\mu = E(x_i)$, the plug-in estimator of $\theta_0 = \mu^3$ is

\[
\hat\theta = \theta(F_1) = \left[\int x \left(\frac{1}{n}\sum_{i=1}^{n} \delta(x - x_i)\right) dx\right]^3 = \left[\frac{1}{n}\sum_{i=1}^{n} x_i\right]^3 = \bar{x}^3
\]

2. To find the bias $b_1 = E(\hat\theta - \theta_0) = E(\hat\theta) - \theta_0$ we need $E(\hat\theta) = E(\bar{x}^3)$. Let $\sigma^2 = \mathrm{var}(x_i) = E(x_i - \mu)^2$ and let $\gamma = E(x_i - \mu)^3$ be the third cumulant of $x_i$. The third cumulant of $\bar{x}$ is

\[
E(\bar{x} - \mu)^3 = E\left(\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) - \mu\right)^3 = E\left(\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)\right)^3 = \frac{1}{n^3}\, E\left(\sum_{i=1}^{n} (x_i - \mu)\right)^3 = \frac{1}{n^3}\sum_{i=1}^{n} E(x_i - \mu)^3 = \frac{n\gamma}{n^3} = \frac{\gamma}{n^2}.
\]

Similarly, $E(\bar{x} - \mu)^2 = \mathrm{var}(\bar{x}) = \sigma^2/n$. For both the variance and the third cumulant, the cross terms vanish because the samples are i.i.d.

\[
E(\bar{x}^3) = E(\bar{x} - \mu + \mu)^3 = E\left(\mu^3 + 3\mu^2(\bar{x} - \mu) + 3\mu(\bar{x} - \mu)^2 + (\bar{x} - \mu)^3\right) = \mu^3 + \frac{3\mu\sigma^2}{n} + \frac{\gamma}{n^2}
\]

Therefore

\[
b_1 = E\left(\hat\theta\right) - \theta_0 = E\left(\hat\theta\right) - \mu^3 = \frac{3\mu\sigma^2}{n} + \frac{\gamma}{n^2} = E[\theta(F_1) - \theta(F_0) \mid F_0]
\]

3. The bootstrap estimate b̂1 is given by

\[
\hat b_1 = E[\theta(F_2) - \theta(F_1) \mid F_1] = \frac{3\bar{x}\hat\sigma^2}{n} + \frac{\hat\gamma}{n^2}
\]

where


\[
\bar{x} = \int x \, dF_1(x) = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat\sigma^2 = \int (x - \bar{x})^2 \, dF_1(x) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2
\]
and
\[
\hat\gamma = \int (x - \bar{x})^3 \, dF_1(x) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3
\]

4. The bootstrap bias-reduced estimate is
\[
\hat\theta_1 = \theta(F_1) - \hat b_1 = 2\theta(F_1) - E[\theta(F_2) \mid F_1] = \bar{x}^3 - \frac{3\bar{x}\hat\sigma^2}{n} - \frac{\hat\gamma}{n^2}
\]
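The closed-form plug-in quantities above are easy to compute directly from a sample. A minimal sketch (assuming NumPy is available; the function name is illustrative, not part of the course material) evaluates $\hat b_1$ and the bias-reduced estimate $\hat\theta_1$:

```python
import numpy as np

def bias_reduced_cube_mean(x):
    """Plug-in bootstrap bias estimate and bias-reduced estimator for
    theta-hat = xbar^3, using b1-hat = 3*xbar*s2/n + g/n^2 with the
    empirical (1/n) moment estimates s2 and g."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    sigma2_hat = np.mean((x - xbar) ** 2)   # (1/n) sum (xi - xbar)^2
    gamma_hat = np.mean((x - xbar) ** 3)    # (1/n) sum (xi - xbar)^3
    theta_hat = xbar ** 3
    b1_hat = 3 * xbar * sigma2_hat / n + gamma_hat / n ** 2
    theta1_hat = theta_hat - b1_hat         # bias-reduced estimate
    return theta_hat, b1_hat, theta1_hat
```

The same $\hat b_1$ could instead be approximated by Monte Carlo resampling from $F_1$; the closed form above is what that simulation converges to as the number of resamples grows.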

5. The bias $b_2$ requires the derivation of the expression of $E(\hat\theta_1)$. Using $n = 1$ in the expression for $E(\bar{x}^3)$ gives $E(x_1^3) = \mu^3 + 3\mu\sigma^2 + \gamma$. Therefore, for any $j$,

\[
E\left(x_j \sum_{i=1}^{n} x_i^2\right) = E\left(x_j^3\right) + \sum_{i=1,\, i \neq j}^{n} E\left(x_j x_i^2\right) = \mu^3 + 3\mu\sigma^2 + \gamma + (n-1)\mu\left(\mu^2 + \sigma^2\right)
\]
Averaging over $j$ gives

\[
E\left(\bar{x} \sum_{i=1}^{n} x_i^2\right) = \mu^3 + 3\mu\sigma^2 + \gamma + (n-1)\mu\left(\mu^2 + \sigma^2\right)
\]
Since

\[
\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2
\]
we obtain
\[
E\left(\bar{x}\hat\sigma^2\right) = E\left[\frac{1}{n}\,\bar{x}\sum_{i=1}^{n} x_i^2 - \bar{x}^3\right] = E\left[\frac{1}{n}\,\bar{x}\sum_{i=1}^{n} x_i^2\right] - E\left[\bar{x}^3\right] = \mu\sigma^2 + \frac{1}{n}\left(\gamma - \mu\sigma^2\right) - \frac{\gamma}{n^2}
\]

For the derivation of E (γ̂) note that


\[
E\left[(x_j - \mu)^2(\bar{x} - \mu)\right] = \frac{1}{n}\, E(x_j - \mu)^3 = \frac{\gamma}{n}
\]
\[
E\left[(x_j - \mu)(\bar{x} - \mu)^2\right] = \frac{1}{n}\sum_{i=1}^{n} E\left[(x_i - \mu)(\bar{x} - \mu)^2\right] = E(\bar{x} - \mu)^3 = \frac{\gamma}{n^2}
\]

Using these two relations we have

\[
E(\hat\gamma) = E\left(x_i - \bar{x}\right)^3 = E\left\{(x_i - \mu)^3 - 3(x_i - \mu)^2(\bar{x} - \mu) + 3(x_i - \mu)(\bar{x} - \mu)^2 - (\bar{x} - \mu)^3\right\} = \gamma\left(1 - \frac{3}{n} + \frac{2}{n^2}\right)
\]
Using these expressions we get

\[
b_2 = E\left(\hat\theta_1 - \theta_0\right) = \frac{3}{n^2}\left(\mu\sigma^2 - \gamma\right) + \frac{6\gamma}{n^3} - \frac{2\gamma}{n^4}
\]

6. The bias of the estimator $\hat\theta_1$ is of order $1/n^2$, compared with a bias of order $1/n$ for $\hat\theta$.
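This order comparison can be checked by simulation. The sketch below (illustrative, assuming NumPy) uses Exponential(1) samples, for which $\mu = \sigma^2 = 1$ and $\gamma = 2$, so $\theta_0 = 1$ and $b_1 = 3/n + 2/n^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 20, 200_000
x = rng.exponential(size=(reps, n))       # mu = sigma^2 = 1, gamma = 2

xbar = x.mean(axis=1)
sigma2_hat = ((x - xbar[:, None]) ** 2).mean(axis=1)
gamma_hat = ((x - xbar[:, None]) ** 3).mean(axis=1)

theta_hat = xbar ** 3                     # naive estimator, bias O(1/n)
theta1_hat = theta_hat - 3 * xbar * sigma2_hat / n - gamma_hat / n ** 2

bias_naive = theta_hat.mean() - 1.0       # theory: 3/n + 2/n^2 = 0.155 at n = 20
bias_reduced = theta1_hat.mean() - 1.0    # theory: O(1/n^2), much smaller
print(bias_naive, bias_reduced)
```

With these settings the Monte Carlo bias of $\hat\theta$ lands near the theoretical $0.155$, while the bias of $\hat\theta_1$ is an order of magnitude smaller, as the $1/n$ versus $1/n^2$ rates predict.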

Question 2

1. The empirical estimator is given by

\[
\hat\theta = \theta(F_1) = \left[\int x \, dF_1(x)\right]^3 = \left[\frac{1}{n}\sum_{i=1}^{n} x_i\right]^3 = \bar{x}^3
\]

2. The evaluation of the bias requires the computation of

\[
E\left(\hat\theta\right) = \mu^3 + \frac{3\mu\sigma^2}{n} \qquad\text{and}\qquad b_1 = E\left(\hat\theta\right) - \theta_0 = \frac{3\mu\sigma^2}{n}
\]
since the population is normal, so $\gamma = 0$.

3. The bootstrap bias-reduced estimate is

\[
\hat\theta_1 = \theta(F_1) - \hat b_1 = 2\theta(F_1) - E[\theta(F_2) \mid F_1] = \bar{x}^3 - \frac{3\bar{x}\hat\sigma^2}{n}
\]

4. The bias $b_2$ requires the derivation of the expression of $E(\hat\theta_1)$:
\[
b_2 = \mu^3 + \frac{3\mu\sigma^2}{n} - \frac{3}{n}\left[\mu\sigma^2 - \frac{1}{n}\mu\sigma^2\right] - \mu^3 = \frac{3\mu\sigma^2}{n^2}
\]


5. If we use $\tilde\sigma^2$ instead of $\hat\sigma^2$, we have
\[
\tilde\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n}{n-1}\,\hat\sigma^2
\]
and
\[
\hat\theta_1 = \bar{x}^3 - \frac{3\bar{x}\tilde\sigma^2}{n} = \bar{x}^3 - \frac{3\bar{x}\hat\sigma^2}{n-1}
\]

and the bias is given by

\[
b_2 = \mu^3 + \frac{3\mu\sigma^2}{n} - \frac{3}{n-1}\left[\mu\sigma^2 - \frac{1}{n}\mu\sigma^2\right] - \mu^3 = 0
\]
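The exact cancellation produced by the $n-1$ denominator can be confirmed symbolically. A minimal check (assuming SymPy is available) evaluates the $b_2$ expression above:

```python
import sympy as sp

mu, sigma2, n = sp.symbols('mu sigma2 n', positive=True)

# For normal data gamma = 0 and E(xbar * sigma-hat^2) = mu*sigma2*(1 - 1/n),
# so E(xbar * sigma-tilde^2) = (n/(n-1)) * mu*sigma2*(1 - 1/n) = mu*sigma2.
e_xbar_s2_tilde = (n / (n - 1)) * (mu * sigma2 - mu * sigma2 / n)

# b2 = E(xbar^3) - (3/n) * E(xbar * sigma-tilde^2) - mu^3
b2 = mu**3 + 3 * mu * sigma2 / n - (3 / n) * e_xbar_s2_tilde - mu**3
print(sp.simplify(b2))  # prints 0
```

The $1/n$ shrinkage of $\hat\sigma^2$ is exactly undone by the $n/(n-1)$ factor, which is why the unbiased variance estimator removes the second-order bias entirely in the normal case.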

Question 3

Let $q_\alpha$ be the $\alpha$-percentile of the bootstrap distribution:
\[
P\left(\hat\theta^* - \hat\theta \le q_\alpha\right) = P\left(\hat\theta^* \le \hat\theta + q_\alpha\right) = \alpha
\]

and denote $\hat\theta^*_\alpha = \hat\theta + q_\alpha$. On the other hand, let $q_{1-\alpha}$ be the $(1-\alpha)$-percentile of the bootstrap distribution:
\[
P\left(\hat\theta^* - \hat\theta \le q_{1-\alpha}\right) = P\left(\hat\theta^* \le \hat\theta + q_{1-\alpha}\right) = 1 - \alpha
\]

and denote $\hat\theta^*_{1-\alpha} = \hat\theta + q_{1-\alpha}$. The relation
\[
\hat H^{-1}(\alpha) \le \hat\theta - \theta \le \hat H^{-1}(1-\alpha)
\]
corresponds to

\[
1 - 2\alpha = P\left(q_\alpha \le \hat\theta - \theta \le q_{1-\alpha}\right) = P\left(\hat\theta^*_\alpha - \hat\theta \le \hat\theta - \theta \le \hat\theta^*_{1-\alpha} - \hat\theta\right) = P\left(2\hat\theta - \hat\theta^*_\alpha \ge \theta \ge 2\hat\theta - \hat\theta^*_{1-\alpha}\right)
\]
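This pivoting step is what the so-called basic (reverse-percentile) bootstrap interval implements. A sketch (assuming NumPy; the function name and defaults are illustrative):

```python
import numpy as np

def basic_bootstrap_ci(x, stat=np.mean, alpha=0.05, B=5000, seed=0):
    """Basic bootstrap interval [2*theta-hat - q*_{1-alpha}, 2*theta-hat - q*_alpha],
    where q* are percentiles of the bootstrap replicates theta-hat*.
    Nominal coverage is 1 - 2*alpha, matching the derivation above."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    theta_hat = stat(x)
    # Resample from F1 (the empirical distribution) B times
    boot = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(B)])
    lo_q, hi_q = np.quantile(boot, [alpha, 1 - alpha])
    return 2 * theta_hat - hi_q, 2 * theta_hat - lo_q
```

Note the reflection: the upper bootstrap percentile gives the lower endpoint and vice versa, exactly as in $2\hat\theta - \hat\theta^*_{1-\alpha} \le \theta \le 2\hat\theta - \hat\theta^*_\alpha$.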
