Basics Hypotheses Tests P-value Errors Confidence interval Examples using SAS

CORPFIN 2503 – Business Data Analytics:
Statistical tests (supplementary material)

£ius

Week 3: August 9th, 2021

£ius CORPFIN 2503, Week 3 1/61

Outline

Basics

Hypotheses

Tests

P-value

Errors

Confidence interval

Examples using SAS

Introduction
Hypothesis testing is probably the most important part of
quantitative analysis.

Hypothesis testing determines which of two mutually exclusive
statements about a population is better supported by the sample
data.

In short, it asks which of the two statements the data support better.

One can test whether:

• a particular trading strategy is profitable
• the active fund outperformed its peers or benchmark
• the performance of stock A is higher than the performance of
stock B

• etc.

Introduction II

Steps of hypothesis testing:

1. State the null hypothesis on the population

2. Select the sample

3. Calculate the test statistic value

4. Make a final inference based on the test statistic results.
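
The four steps can be sketched in SAS as follows. This is only a minimal illustration and not part of the course material: the data set work.demo and its ten return values are made up so that the code runs on its own, and the hypothesised mean of 0 is an arbitrary choice.

```sas
/* Step 1: H0: the mean monthly return is 0 (H1: it is not 0).        */

/* Step 2: select the sample. work.demo and its values are             */
/* hypothetical, made up only so that this sketch is self-contained.   */
data work.demo;
    input return @@;
    datalines;
1.2 -0.8 2.5 0.4 -1.1 3.0 0.9 -0.3 1.7 0.6
;
run;

/* Steps 3 and 4: PROC TTEST reports the t statistic and its p-value;  */
/* H0 is rejected if the p-value is below the chosen threshold.        */
proc ttest data=work.demo h0=0;
    var return;
run;
```

The test statistic and the p-value both appear in the PROC TTEST output, so the final inference in step 4 is a matter of comparing that p-value with the threshold chosen beforehand.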

Sample vs population

A population includes all members of a certain group:

• all stocks traded on a particular stock exchange (e.g., ASX)

A sample is a part of the population:

• stocks included in S&P/ASX 200.

If we want to make a judgment about the whole population, then
the sample chosen should represent the population.

Null hypothesis

The null hypothesis (H0) is a statement or the initial assumption
or claim about the overall population.

We need to be careful while stating the null hypothesis because we
are going to perform the rest of the testing based on the
assumption that the null hypothesis is true.

In most cases, we expect to reject the null hypothesis.

We reject the null hypothesis if we find substantial evidence
against it.

The null hypothesis should be grounded in theory.

Null hypothesis II

Examples of null hypotheses:

• a particular trading strategy is not profitable (based on the
Efficient Market Hypothesis)

• the performance of a particular active fund is the same as the
performance of its peers or benchmark

• the performance of stock A is the same as the performance of
stock B

• vaccination and flu are independent of each other
• etc.

Alternative hypothesis

An alternative hypothesis (H1) is the second hypothesis, the one
we turn to if the null hypothesis is rejected.

The null hypothesis and the alternative hypothesis are mutually
exclusive.

In most cases, the alternative hypothesis is the opposite of the null
hypothesis.

If we reject the null hypothesis, then we accept the alternative
hypothesis.

An alternative hypothesis is not always needed.

Alternative hypothesis II

Examples of alternative hypotheses:

• a particular trading strategy is profitable
• the performance of a particular active fund is not the same as
the performance of its peers or benchmark

• the performance of stock A is not the same as the
performance of stock B

• vaccination and flu depend on each other.

One- and two-tailed tests

A one-tailed test allows us to determine if the mean of one
sub-sample is greater than the mean of another sub-sample, or
if it is smaller, but not both.

The direction of a one-tailed test must be chosen prior to testing.

For example, H0: the performance of stock A is no better than the
performance of stock B (so H1: stock A performs better).
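
In SAS, the direction of a one-tailed t-test is chosen with the SIDES= option of PROC TTEST. The sketch below is an illustration only: it uses the one-sample case with the work.mqg data set from the examples at the end of these slides (variable return), and the hypothesised value of 0 is an assumption made for the example.

```sas
/* Upper-tailed test: H0: mean return <= 0, H1: mean return > 0.   */
/* sides=U requests the upper tail, sides=L the lower tail;        */
/* sides=2 (the default) gives the usual two-tailed test.          */
proc ttest data=work.mqg h0=0 sides=U;
    var return;
run;
```

Because the direction is part of the PROC TTEST statement, it has to be fixed before the output is seen, which matches the requirement that the direction be chosen prior to testing.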

One- and two-tailed tests II

A two-tailed test allows us to determine if the means of the two
sub-samples are different from one another.

For the two-tailed test, we do not have to specify a direction prior
to testing.

For example, H0: the performance of stock A is the same as the
performance of stock B.

Theory

Central Limit Theorem: The distribution of the means of large
samples tends to be normal, regardless of the distribution of the
parent population.

Distributions:

• tests for means: normal vs t-distributions
• test for independence (between two categorical variables from
a single population): Chi-square distribution

• tests for variances: F-distribution.
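
The Central Limit Theorem can be illustrated with a small simulation. The following is a sketch under stated assumptions rather than course code: the exponential parent population, the seed, the sample size of 50 and the 1,000 repetitions are all arbitrary choices, and work.clt is a hypothetical data set name.

```sas
/* Draw 1,000 samples of size 50 from a skewed (exponential)       */
/* population and store the mean of each sample.                   */
data work.clt;
    call streaminit(2503);              /* arbitrary seed          */
    do sample = 1 to 1000;
        total = 0;
        do i = 1 to 50;
            total = total + rand('EXPONENTIAL');
        end;
        xbar = total / 50;              /* mean of this sample     */
        output;
    end;
    keep sample xbar;
run;

/* Histogram of the sample means with a fitted normal curve.       */
proc univariate data=work.clt;
    var xbar;
    histogram xbar / normal;
run;
```

The histogram of xbar should look roughly bell-shaped even though the parent population is strongly skewed.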

Normal vs t-distribution

The t-distribution becomes equivalent to the normal when the
number of observations becomes large.

If the population standard deviation is known, then use normal.

If the population standard deviation is not known:

• If the number of observations is large (≥ 30), then use normal.
• If the number of observations is small (< 30), then use the t-distribution.

Normal distribution

[Figure: normal distribution]

P-value

After calculating a test statistic, we find the corresponding
p-value and then either reject or fail to reject the null hypothesis.

If the p-value (the probability) is above a certain threshold, then
the null hypothesis is not rejected.

Generally, 5% is taken as the industry-standard threshold for the p-value.

For a p-value < 5%:

• the null hypothesis is rejected
• the sample statistic is significantly different from the
hypothesised population parameter.

P-value II

[Figure: P-value]

P-value for one-tailed test

[Figure: 5% critical region for one-tailed test]

P-value for two-tailed test

[Figure: 2.5% critical region in each tail]

A 5% tolerance means 2.5% on either side of the null hypothesis value.

P-value III

[Figure: tail area α on each side, central area 1 − 2×α]

α = tail area    Central area = 1 − 2×α    z
0.1              0.8                       1.28
0.05             0.9                       1.65
0.025            0.95                      1.96
0.01             0.98                      2.33
0.005            0.99                      2.58

P-value IV

Suppose our rejection rule is that the p-value must be lower than 5%:

• z = 1.65 for a one-tailed test
• z = 1.96 for a two-tailed test.

=⇒ it is easier to reject null hypotheses with one-tailed tests.
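
The z values quoted for each tail area can be reproduced with the QUANTILE function. This is a minimal sketch added for illustration; the data set name work.zcrit is hypothetical.

```sas
/* z is the (1 - alpha) quantile of the standard normal            */
/* distribution, where alpha is the area in one tail.              */
data work.zcrit;
    do alpha = 0.10, 0.05, 0.025, 0.01, 0.005;
        central = 1 - 2 * alpha;        /* area between -z and +z  */
        z = quantile('NORMAL', 1 - alpha);
        output;
    end;
run;

proc print data=work.zcrit noobs;
    format z 6.3;
run;
```

With a 5% threshold, the one-tailed test puts all 5% in one tail (alpha = 0.05), while the two-tailed test puts 2.5% in each tail (alpha = 0.025), which is why the two critical values differ.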
Degrees of freedom

Degrees of freedom describe the number of values in the final
calculation of a statistic that are free to vary ('Glossary of
Statistical Terms', Animated Software, retrieved on 2018-08-20).

For the t-distribution, the degrees of freedom are equal to the sample
size − 1 (i.e., n − 1).

Why −1?

Degrees of freedom II

Example using the mean (average):

• suppose we need to pick 3 numbers that have a mean of 10
• once we have chosen the first two numbers, the third is fixed
• the third number can be computed as (mean × 3 − the sum of the
two chosen numbers)
• so only the first two numbers are free to vary
• if we pick 8 and 12 then the third number is 10
• if we pick 5 and 24 then the third number is 1
• so the degrees of freedom for a set of 3 numbers is 2.

t distribution critical values

                      Upper tail probability
Degrees of freedom    0.05     0.025    0.005
5                     2.015    2.571    4.032
10                    1.812    2.228    3.169
15                    1.753    2.131    2.947
20                    1.725    2.086    2.845
25                    1.708    2.06     2.787
30                    1.697    2.042    2.75
50                    1.676    2.009    2.678
100                   1.66     1.984    2.626
1000                  1.646    1.962    2.581
z∗                    1.645    1.96     2.576

A t distribution converges to a normal distribution when the number of
degrees of freedom (n − 1) becomes large.

Type I and type II errors

A type I error is the rejection of a null hypothesis which is actually
true (a.k.a. 'false positive').

A type II error is the failure to reject a null hypothesis which is
false (a.k.a. 'false negative').
In other words:

• due to a type I error, we will incorrectly infer the existence of
something that does not exist
• due to a type II error, we will incorrectly infer the absence of
something that exists.

Type I and type II errors: Example

We want to test whether a newly proposed trading strategy is profitable.

We develop the following hypotheses:

• Null hypothesis (H0): the new strategy is not profitable.
• Alternative hypothesis (H1): the new strategy is profitable.

Type I and type II errors: Example II

Type I error: we reject the null hypothesis and conclude that the
trading strategy is profitable; however, the strategy is actually not
profitable.

Type II error: we fail to reject the null hypothesis and conclude that
the trading strategy is not profitable, whereas the strategy is actually
profitable.

Type I and type II errors III

In general, there is a trade-off in statistical tests between:

• the acceptable level of false positives and
• the acceptable level of false negatives.

It all depends on the level of significance (the p-value threshold below
which the null hypothesis is rejected):

1. if it is 10%, the probability of a type I error is relatively high but
the probability of a type II error is relatively low
2. if it is 1%, the probability of a type I error is small but the
probability of a type II error is relatively high.

Confidence interval

The confidence interval gives an estimated range of values in which a
population parameter is likely to lie.

The lower and upper limits of the confidence interval are called
confidence limits.
If 5% is the rejection region, then the remaining 95% is the
non-rejection region.

=⇒ this corresponds to a 95% confidence interval.

Confidence interval II

For the normal distribution, the confidence interval is:

\[
\left( \bar{x} - z^{*} \frac{\sigma}{\sqrt{n}},\; \bar{x} + z^{*} \frac{\sigma}{\sqrt{n}} \right)
\]

where

• x̄ is the sample mean
• σ is the standard deviation
• z∗ is the critical value of the standard normal distribution
• n is the number of observations.

Sample

Monthly returns (in %) on the 17 largest ASX stocks for the last 60 months.

Source: Eikon.

Sample II

[Figure]

Descriptive statistics

    proc univariate data=work.mqg plots;
        var return;
    run;

work.mqg is the data set with only two variables: date and return
(monthly return on MQG stock).

Descriptive statistics II

[Figure]

Descriptive statistics III

[Figure]

Distributional properties

Let's run normality tests:

    proc univariate data=work.mqg plots normaltest;
        var return;
        histogram / kernel normal;
    run;

Distributional properties II

[Figure]

Distributional properties III

Normality tests suggest that MQG returns are not normally distributed.
However, given the relatively small number of observations and the shape
of the histogram, I would not worry too much about this.

One-sample t-test

Let's run a one-sample t-test using the returns on MQG stock.

• Null hypothesis (H0): the mean return is 0.
• Alternative hypothesis (H1): the mean return is not 0.

    proc ttest data=work.mqg;
        var return;
    run;

One-sample t-test II

[Figure]

One-sample t-test III

[Figure]

One-sample t-test IV

Let's amend our hypotheses.

• Null hypothesis (H0): the mean return is 1.5%.
• Alternative hypothesis (H1): the mean return is not 1.5%.

    proc ttest data=work.mqg H0=1.5;
        var return;
    run;

One-sample t-test V

[Figure]

One-sample t-test VI

[Figure]

Two-sample t-test

Let's test whether monthly returns on ANZ and CBA stocks are
statistically different:

• Null hypothesis (H0): returns are the same.
• Alternative hypothesis (H1): returns are different.

We will use the work.cba_anz2 data set.
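
A two-sample PROC TTEST with a CLASS statement expects one row per observation, with a grouping variable that has exactly two levels. The slides do not show how work.cba_anz2 was built, so the following is only a hypothetical sketch of such a layout. It assumes a wide data set work.asx2 with one row per month and columns date, ANZ_AX and CBA_AX; the date column, the ticker values and the output name work.cba_anz2_sketch are all assumptions.

```sas
/* Hypothetical reshaping step from wide to long format.            */
/* Assumes work.asx2 has columns date, ANZ_AX and CBA_AX.           */
/* The output name is chosen so that the course data set            */
/* work.cba_anz2 is not overwritten.                                */
data work.cba_anz2_sketch;
    set work.asx2 (keep=date ANZ_AX CBA_AX);
    length ticker $ 3;
    ticker = 'ANZ'; ret = ANZ_AX; output;   /* one row for ANZ      */
    ticker = 'CBA'; ret = CBA_AX; output;   /* one row for CBA      */
    keep date ticker ret;
    rename ret = return;                    /* name used by TTEST   */
run;
```

Each month then contributes two rows, one per ticker, so that ticker can serve as the two-level CLASS variable and return as the analysis variable.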
Two-sample t-test II

[Figure]

Two-sample t-test III

SAS code:

    proc ttest data=work.cba_anz2;
        class ticker;
        var return;
    run;

Two-sample t-test IV

[Figure]

Two-sample t-test V

[Figure]

Two-sample t-test VI

We fail to reject the null hypothesis (returns are the same).

Thus, the monthly returns of CBA and ANZ are not statistically different
over the last 60 months.

Two-sample t-test VIII

Let's run a two-sample t-test for CBA and MQG stocks for the same set of
hypotheses:

• Null hypothesis (H0): returns are the same.
• Alternative hypothesis (H1): returns are different.

Two-sample t-test IX

[Figure]

Two-sample t-test X

[Figure]

Two-sample t-test XI

The p-value is slightly higher than 6%:

• if our significance threshold is 5%, then we fail to reject the null
hypothesis
• if our significance threshold is 10%, then we reject the null
hypothesis.

Paired t-test

Consider the 17 largest stocks on ASX.
Let's test whether their monthly returns on 30/06/2018 and 31/07/2018
are different.

Some statistical properties:

    proc means data=work.asx;
        var _30_06_2018 _31_07_2018;
    run;

work.asx is the original data set imported from Eikon (not transposed).

Paired t-test II

Some statistical properties:

[Figure]

Paired t-test III

The actual test:

    proc ttest data=work.asx;
        paired _30_06_2018 * _31_07_2018;
    run;

Paired t-test IV

Test results:

[Figure]

Paired t-test V

More results:

[Figure]

Statistical tests for correlations

From previous lectures, we already know that we can test whether
correlation coefficients are statistically significant or not.

Let's consider 4 stocks from our sample of 17 and generate the
correlation matrix:

    proc corr data=work.asx2;
        var ANZ_AX CBA_AX RIO_AX MQG_AX;
    run;

Statistical tests for correlations II

[Figure]

Significant coefficients at the 1% level: ANZ & CBA, ANZ & MQG, CBA & MQG.

Recommended reading

Konasani, V. R. and Kadre, S. (2015). "Practical Business Analytics
Using SAS: A Hands-on Guide": chapter 8.