THE UNIVERSITY OF CALGARY DEPARTMENT OF GEOMATICS ENGINEERING
ENGO 363: Estimation & Statistical Testing Winter 2019
Due date: 11:59 pm on Tue, Feb 5, 2019
Lab# 1: Single and multivariate statistics, and confidence intervals
Part I:
Objective: To familiarize the students with the theory of errors and the measures of error
1. The same distance was measured 50 times by two different observers “A” and “B”. The field results of the two observers are given in the following table. Note that the measurement units are metres.
Meas No.  Observer A  Observer B    Meas No.  Observer A  Observer B    Meas No.  Observer A  Observer B
1         153.0       149.6         18        151.7       148.4         35        155.1       150.6
2         152.9       151.4         19        154.5       150.3         36        150.3       151.6
3         149.0       149.0         20        155.0       149.0         37        146.1       148.7
4         149.2       150.2         21        144.3       149.2         38        150.4       151.0
5         145.5       15.2          22        149.8       151.1         39        152.1       149.4
6         147.3       150.0         23        152.7       151.0         40        149.6       149.5
7         147.1       149.4         24        151.2       151.0         41        157.9       150.7
8         147.7       151.3         25        151.6       150.2         42        146.9       149.0
9         153.3       150.5         26        151.7       148.2         43        142.7       150.4
10        149.2       150.7         27        142.5       147.9         44        152.9       150.5
11        147.7       149.7         28        149.4       151.1         45        147.3       151.8
12        152.2       150.2         29        149.8       150.0         46        148.7       149.8
13        150.0       150.4         30        151.5       151.3         47        147.5       148.7
14        155.6       148.7         31        148.1       148.9         48        147.8       151.2
15        149.7       152.5         32        150.2       149.8         49        146.8       151.3
16        143.9       151.7         33        149.4       150.6         50        151.6       150.0
17        144.5       149.6         34        154.4       150.6
______________________________________________________________________________ Winter 2019 ENGO 363: Estimation & Statistical Testing Lab # 1
In C/C++
a. For each of the two random samples of observations above, compute the range, median, and mean to two decimal places.
b. Using the mean as the best estimate, compute the residuals and their sum, mean, variance, and standard deviation, as well as the standard deviation of the best estimate, for each of the two data sets.
c. From the results for both observers, compute the weighted mean of the distance together with its standard deviation.
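The quantities in (a) to (c) can be sketched roughly as follows. This is a minimal illustration of the arithmetic only, not the required solution: it uses the C++ standard library rather than Eigen so that it stays self-contained, and the names `SampleStats`, `computeStats`, `WeightedMean`, and `combine` are hypothetical. It assumes the sample variance uses the divisor n - 1, that a residual is taken as observation minus mean, and that the weighted mean uses weights 1/sigma^2 built from the standard deviation of each observer's mean; check each of these against the course notes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Summary of one sample. Field names are illustrative, not prescribed by the lab.
struct SampleStats {
    double range;
    double median;
    double mean;
    double residualSum;   // close to zero when the mean is the estimate
    double variance;      // sample variance, divisor n - 1 (assumption)
    double stdDev;        // standard deviation of a single observation
    double stdDevOfMean;  // stdDev / sqrt(n)
};

// Takes the observations by value because they are sorted locally for the median.
SampleStats computeStats(std::vector<double> obs) {
    assert(obs.size() >= 2);
    const double n = static_cast<double>(obs.size());
    SampleStats s{};
    s.mean = std::accumulate(obs.begin(), obs.end(), 0.0) / n;

    std::sort(obs.begin(), obs.end());
    s.range = obs.back() - obs.front();
    const std::size_t mid = obs.size() / 2;
    s.median = (obs.size() % 2 == 0) ? 0.5 * (obs[mid - 1] + obs[mid]) : obs[mid];

    double sumSq = 0.0;
    for (double x : obs) {
        const double v = x - s.mean;  // residual; sign convention is an assumption
        s.residualSum += v;
        sumSq += v * v;
    }
    s.variance = sumSq / (n - 1.0);
    s.stdDev = std::sqrt(s.variance);
    s.stdDevOfMean = s.stdDev / std::sqrt(n);
    return s;
}

struct WeightedMean {
    double value;
    double stdDev;
};

// Weighted mean of two independent estimates with weights p = 1 / sigma^2.
// The returned standard deviation is 1 / sqrt(pA + pB), which assumes the
// input sigmas are taken at face value (no a posteriori variance factor).
WeightedMean combine(double xA, double sigmaA, double xB, double sigmaB) {
    assert(sigmaA > 0.0 && sigmaB > 0.0);
    const double pA = 1.0 / (sigmaA * sigmaA);
    const double pB = 1.0 / (sigmaB * sigmaB);
    return {(pA * xA + pB * xB) / (pA + pB), 1.0 / std::sqrt(pA + pB)};
}
```

In this sketch the mean of the residuals is `residualSum` divided by the number of observations, and `combine` would be called with each observer's `mean` and `stdDevOfMean`.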
In Matlab
d. State the best estimate for each of the two random samples with its corresponding standard deviation, expressed at the 68%, 95%, and 99% confidence levels.
e. Plot a “time series” of the observations, the best estimate, and the weighted mean for each data set; make sure to show the 68%, 95%, and 99% confidence intervals.
f. Plot the residual probability density histogram for the given samples; make sure to show the mean and the 68%, 95%, and 99% confidence intervals.
g. Do the plots exhibit any systematic issues? Why or why not?
h. Are there any outlying residuals/observations? Why or why not? Explain what you would do to mitigate the problem if there are outliers.
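The confidence intervals in (d) to (f) and the outlier check in (h) reduce to comparing against a multiple of a standard deviation. The sketch below shows that arithmetic in C++ for illustration only (the plots themselves are to be produced in Matlab). It assumes normally distributed errors, so the multipliers are the standard normal values of about 1.0, 1.960, and 2.576 for the 68%, 95%, and 99% levels; the course may instead prescribe Student t values. The names `Interval`, `confidenceInterval`, and `flagOutliers` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Multipliers under a normal distribution (assumption): 1.0 sigma covers
// about 68.3%, 1.960 sigma covers 95%, and 2.576 sigma covers 99%.
constexpr double kNormal68 = 1.0;
constexpr double kNormal95 = 1.960;
constexpr double kNormal99 = 2.576;

struct Interval {
    double lower;
    double upper;
};

// Symmetric interval centre +/- k * sigma.
Interval confidenceInterval(double centre, double sigma, double k) {
    assert(sigma >= 0.0 && k >= 0.0);
    return {centre - k * sigma, centre + k * sigma};
}

// Returns the zero-based indices of observations whose residual from `mean`
// exceeds k * sigma in magnitude. Whether such a value is truly an outlier
// remains a judgment call; this only flags candidates.
std::vector<std::size_t> flagOutliers(const std::vector<double>& obs,
                                      double mean, double sigma, double k) {
    std::vector<std::size_t> flagged;
    for (std::size_t i = 0; i < obs.size(); ++i) {
        if (std::fabs(obs[i] - mean) > k * sigma) {
            flagged.push_back(i);
        }
    }
    return flagged;
}
```

For an interval on the best estimate, `sigma` would be the standard deviation of the mean; for flagging individual observations, it would be the standard deviation of a single observation.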
Part II:
Objective: To familiarize the students with the concepts of multivariate statistics
2. To set specifications for selecting new members of a sports club's football team, a random sample of ten of the old forward players was chosen. The weight, height, speed, and number of goals scored in one season were recorded for each player.
No   Weight w (kg)   Height h (m)   Speed s (m/sec)   # of scored goals g
1    73              1.80           7.2               13
2    77              1.70           7.2               7
3    71              1.86           8.0               19
4    68              1.70           7.3               10
5    82              1.78           5.0               7
6    73              1.71           7.4               14
7    80              1.70           5.1               8
8    82              1.90           5.2               8
9    71              1.65           7.1               12
10   85              1.79           5.9               6
In C/C++
a. Compute the mean, variance, and standard deviation of the multivariate sample $N = \{L_1, L_2, L_3, L_4\}$, where
$L_1 = \{w_1, w_2, \ldots, w_{10}\}$, $L_2 = \{h_1, h_2, \ldots, h_{10}\}$,
$L_3 = \{s_1, s_2, \ldots, s_{10}\}$, $L_4 = \{g_1, g_2, \ldots, g_{10}\}$.
b. Establish the variance-covariance matrix $C_N$ of the multivariate sample.
c. Compute the correlation matrix of the multivariate sample $N$.
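The chain from the mean vector to the variance-covariance matrix and on to the correlation matrix can be sketched as below. This is only an illustration of the arithmetic under stated assumptions: it uses the standard library instead of Eigen to stay self-contained, it assumes the sample covariance uses the divisor n - 1, and the names `Matrix`, `columnMeans`, `covariance`, and `correlation` are hypothetical. Each row of `data` holds one player and each column one component.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Mean of each column; data[i][j] is observation i of component j.
std::vector<double> columnMeans(const Matrix& data) {
    assert(!data.empty());
    const std::size_t m = data[0].size();
    std::vector<double> mean(m, 0.0);
    for (const auto& row : data) {
        assert(row.size() == m);
        for (std::size_t j = 0; j < m; ++j) mean[j] += row[j];
    }
    for (double& v : mean) v /= static_cast<double>(data.size());
    return mean;
}

// Sample variance-covariance matrix with divisor n - 1 (assumption).
// The diagonal holds the variances of the individual components.
Matrix covariance(const Matrix& data) {
    assert(data.size() >= 2);
    const std::size_t n = data.size();
    const std::size_t m = data[0].size();
    const std::vector<double> mean = columnMeans(data);
    Matrix c(m, std::vector<double>(m, 0.0));
    for (const auto& row : data)
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t k = 0; k < m; ++k)
                c[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]);
    for (auto& r : c)
        for (double& v : r) v /= static_cast<double>(n - 1);
    return c;
}

// Correlation coefficients r_jk = c_jk / sqrt(c_jj * c_kk).
Matrix correlation(const Matrix& c) {
    const std::size_t m = c.size();
    Matrix r(m, std::vector<double>(m, 0.0));
    for (std::size_t j = 0; j < m; ++j) {
        assert(c[j].size() == m && c[j][j] > 0.0);
        for (std::size_t k = 0; k < m; ++k)
            r[j][k] = c[j][k] / std::sqrt(c[j][j] * c[k][k]);
    }
    return r;
}
```

The standard deviation of each component is the square root of the corresponding diagonal element of the covariance matrix, and each covariance carries the product of the units of its two components, while the correlation coefficients are unitless.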
In Matlab
d. Produce a scatter plot for each pair of individual components.
e. Given the correlation matrix and the scatter plots, discuss the degree of correlation between each pair of individual components.
Note: Units must be included in all the above computed values / produced plots.
Write-up / Deliverables
The write-up should include the results for the required steps (presented in a tabular and/or graphical format) and the answers to any questions.
Program Source Code
This is an individual lab assignment and as such all results presented in the write-up must be obtained from your own program(s).
Computations must be performed in the C/C++ language using the Eigen library.
Data visualization should be performed in Matlab.
Source code will be evaluated on modularity, style, readability, and use of comments. The use of functions is mandatory. Do not underestimate the value of good, well-documented code in terms of long-term usefulness and contribution to your grade.
Data for the program(s) must come from external files. Hard coding of data should be avoided.
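One way to keep data out of the source code is to read it from a text file at run time. The sketch below is only an example under assumptions the lab does not specify: it supposes a plain-text file with one row per line, whitespace-separated numeric columns, and optional comment lines starting with `#`. The function names `readTable` and `readTableFromFile` are hypothetical, and the result would still need to be copied into Eigen matrices for the computations.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Reads rows of whitespace-separated numbers from a stream.
// Empty lines, lines starting with '#', and lines that do not begin with a
// number are skipped. Parsing of a line stops at the first non-numeric token.
// Throws if a row does not contain exactly nCols numbers.
std::vector<std::vector<double>> readTable(std::istream& in, std::size_t nCols) {
    assert(nCols > 0);
    std::vector<std::vector<double>> rows;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        std::istringstream fields(line);
        std::vector<double> row;
        double value = 0.0;
        while (fields >> value) row.push_back(value);
        if (row.empty()) continue;
        if (row.size() != nCols) {
            throw std::runtime_error("unexpected column count in line: " + line);
        }
        rows.push_back(row);
    }
    return rows;
}

// Convenience wrapper that opens a file by path (the path is supplied by the caller).
std::vector<std::vector<double>> readTableFromFile(const std::string& path,
                                                   std::size_t nCols) {
    std::ifstream file(path);
    if (!file) throw std::runtime_error("cannot open file: " + path);
    return readTable(file, nCols);
}
```

Separating the stream-based reader from the file-opening wrapper lets the parsing be tested on an in-memory string before any data file exists.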