Basics Hypotheses Tests P-value Errors Confidence interval Examples using SAS
CORPFIN 2503 – Business Data Analytics: Statistical tests (supplementary material)
Week 3: August 9th, 2021
£ius CORPFIN 2503, Week 3 1/61
Introduction
Hypothesis testing is arguably the most important part of quantitative analysis.
Hypothesis testing determines which of two mutually exclusive statements about a population is best supported by the sample data.
One can test whether:
• a particular trading strategy is profitable
• the active fund outperformed its peers or benchmark
• the performance of stock A is higher than the performance of stock B
Introduction II
Steps of hypothesis testing:
1. State the null hypothesis on the population
2. Select the sample
3. Calculate the test statistic value
4. Make a final inference based on the test statistic results.
Sample vs population
Population includes all members of a certain group:
• all stocks traded on a particular stock exchange (e.g., ASX)
A sample is a part of the population:
• stocks included in S&P/ASX 200.
If we want to make a judgment about the whole population, then the sample chosen should represent the population.
Null hypothesis
The null hypothesis (H0) is the initial assumption or claim about the overall population.
We need to be careful when stating the null hypothesis because the rest of the testing is performed under the assumption that the null hypothesis is true.
In most cases, we expect to reject the null hypothesis.
We reject the null hypothesis if we find substantial evidence against it.
The null hypothesis should be based on theory.
Null hypothesis II
Examples of null hypotheses:
• a particular trading strategy is not profitable (based on the Efficient Market Hypothesis)
• the performance of a particular active fund is the same as the performance of its peers or benchmark
• the performance of stock A is the same as the performance of stock B
• vaccination and flu are independent of each other
Alternative hypothesis
The alternative hypothesis (H1) is the competing statement that is considered against the null hypothesis.
The null hypothesis and the alternative hypothesis are mutually exclusive.
In most cases, the alternative hypothesis is the opposite of the null hypothesis.
If we reject the null hypothesis, then we accept the alternative hypothesis.
An alternative hypothesis is not always needed.
Alternative hypothesis II
Examples of alternative hypotheses:
• a particular trading strategy is profitable
• the performance of a particular active fund is not the same as the performance of its peers or benchmark
• the performance of stock A is not the same as the performance of stock B
• vaccination and flu depend on each other.
One- and two-tailed tests
A one-tailed test allows us to determine whether the mean of one sub-sample is greater than, or alternatively smaller than, the mean of another sub-sample, but not both.
The direction of a one-tailed test must be chosen prior to testing.
For example, H0: the performance of stock A is not better than the performance of stock B; H1: the performance of stock A is better than the performance of stock B.
One- and two-tailed tests II
A two-tailed test allows us to determine if the means of the two sub-samples are different from one another.
For the two-tailed test, we do not have to specify a direction prior to testing.
For example, H0: the performance of stock A is the same as the performance of stock B.
Central Limit Theorem: The distribution of the means of large samples tends to be normal, regardless of the distribution of the parent population.
Distributions:
• tests for means: normal vs t-distributions
• test for independence (between two categorical variables from a single population): Chi-square distribution
• tests for variances: F-distribution.
Normal vs t-distribution
The t-distribution becomes equivalent to the normal when the number of observations becomes large.
If the population standard deviation is known, then use normal.
If the population standard deviation is not known:
• If the number of observations is large (≥ 30), then use normal.
• If the number of observations is small (< 30), then use t-distribution.
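The rule of thumb above can be written as a small decision function. This is a minimal Python sketch for illustration only (the slides themselves use SAS); the function name `choose_distribution` and its arguments are hypothetical.

```python
def choose_distribution(sigma_known: bool, n: int) -> str:
    """Return the reference distribution suggested by the rule of thumb."""
    if sigma_known:
        # Population standard deviation known: use the normal distribution.
        return "normal"
    # Population standard deviation unknown: decide by sample size.
    return "normal" if n >= 30 else "t"


# Hypothetical cases: 17 observations vs 60 observations, sigma unknown.
print(choose_distribution(sigma_known=False, n=17))
print(choose_distribution(sigma_known=False, n=60))
```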
Normal distribution
[Figure: normal distribution]
P-value
After calculating a test statistic, we match it with the corresponding p-value and then either reject or fail to reject the null hypothesis.
If the p-value (the probability of observing a test statistic at least this extreme when the null hypothesis is true) is above a chosen threshold, then the null hypothesis is not rejected.
Generally, 5% is taken as the standard threshold for the p-value.
For a p-value < 5%:
• the null hypothesis is rejected
• the sample statistic is significantly different from the hypothesized population parameter.
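To make the decision rule concrete, here is a minimal Python sketch (standard library only, not the SAS workflow used in these slides) that converts a z test statistic into a two-sided p-value and compares it with the 5% threshold. The statistic `z_stat = 2.1` is a hypothetical value, and the sketch assumes the test statistic is standard normal under the null hypothesis.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


z_stat = 2.1          # hypothetical test statistic
threshold = 0.05      # 5% threshold from the slide
p_value = two_sided_p_value(z_stat)
reject_null = p_value < threshold
print(round(p_value, 4), reject_null)
```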
P-value for one-tailed test
[Figure: 5% critical region for a one-tailed test]
P-value for two-tailed test
A 5% tolerance means 2.5% on either side of the null hypothesis value.
[Figure: 2.5% critical region in each tail for a two-tailed test]
P-value III
α = tail area    Central area = 1 − 2×α    z
0.1              0.8                       1.28
0.05             0.9                       1.65
0.025            0.95                      1.96
0.01             0.98                      2.33
0.005            0.99                      2.58
P-value IV
Suppose our rejection rule is that the p-value must be lower than 5%. The critical values are then:
• z=1.65 for one-tailed test
• z=1.96 for two-tailed test.
=⇒ it is easier to reject null hypotheses for one-tailed tests.
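The two critical values can be reproduced from the standard normal quantile function. The following is a minimal Python sketch (standard library only; the variable names are illustrative): a one-tailed test puts the whole 5% in one tail, while a two-tailed test puts 2.5% in each tail.

```python
from statistics import NormalDist

alpha = 0.05  # rejection threshold from the slide

# One-tailed: all of alpha sits in a single tail.
z_one_tailed = NormalDist().inv_cdf(1 - alpha)
# Two-tailed: alpha is split equally between the two tails.
z_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3), round(z_two_tailed, 3))
```

Because the one-tailed critical value is smaller, a given positive test statistic crosses it more easily.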
Degrees of freedom
Degrees of freedom describe the number of values in the final calculation of a statistic that are free to vary ('Glossary of Statistical Terms', Animated Software, retrieved 2018-08-20).
For the t-distribution, the degrees of freedom are equal to the sample size minus 1 (i.e., n − 1).
Degrees of freedom II
Example using the mean (average)
• suppose we need to pick 3 numbers that have a mean of 10
• once we have chosen the first two numbers, the third is fixed
• the third number can be computed as (mean × 3 − the sum of both chosen numbers)
• so only the first two numbers are free to vary
• if we pick 8 and 12 then the third number is 10
• if we pick 5 and 24 then the third number is 1
• so the degrees of freedom for a set of 3 numbers are 2.
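The example above can be checked with a few lines of Python. This is a minimal sketch; the helper name `last_value` is hypothetical and simply applies the formula (mean × n − sum of the freely chosen numbers).

```python
def last_value(target_mean: float, chosen: list) -> float:
    """Value forced on the final number once the others are chosen."""
    n = len(chosen) + 1           # total count, including the forced value
    return target_mean * n - sum(chosen)


# The two cases from the slide: three numbers with a mean of 10.
print(last_value(10, [8, 12]))
print(last_value(10, [5, 24]))
```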
t distribution critical values
                      Upper tail probability
Degrees of freedom    0.05     0.025    0.005
5                     2.015    2.571    4.032
10                    1.812    2.228    3.169
15                    1.753    2.131    2.947
20                    1.725    2.086    2.845
25                    1.708    2.060    2.787
30                    1.697    2.042    2.750
50                    1.676    2.009    2.678
100                   1.660    1.984    2.626
1000                  1.646    1.962    2.581
z∗                    1.645    1.960    2.576
A t distribution converges to a normal distribution when the number of degrees of freedom (n − 1) becomes large.
Type I and type II errors
A Type I error is the rejection of a null hypothesis that is actually true (a.k.a. 'false positive').
A Type II error is the failure to reject a null hypothesis that is false (a.k.a. 'false negative').
In other words:
• due to a type I error, we will incorrectly infer the existence of something that does not exist
• due to a type II error, we will incorrectly infer the absence of something that exists.
Type I and type II errors: Example
We want to test whether a newly proposed trading strategy is profitable.
We develop the following hypotheses:
• Null hypothesis (H0): the new strategy is not profitable.
• Alternative hypothesis (H1): the new strategy is profitable.
Type I and type II errors: Example II
Type I error: we reject the null hypothesis and assume that the trading strategy is profitable; however, the strategy is actually not profitable.
Type II error: we fail to reject the null hypothesis and assume that the trading strategy is not profitable, whereas the strategy is actually profitable.
Type I and type II errors III
In general, there is a trade-off in statistical tests between:
• the acceptable level of false positives and
• the acceptable level of false negatives.
It all depends on the level of significance (the threshold below which the p-value leads to rejecting the null hypothesis):
1. if it is 10%, the probability of a Type I error is relatively high but the probability of a Type II error is relatively low
2. if it is 1%, the probability of a Type I error is small but the probability of a Type II error is relatively high.
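The trade-off can be illustrated numerically. The sketch below is a minimal Python example under stated assumptions: a one-sided z-test, and a true mean that lies `shift` standard errors above the null value (the value 2.0 is hypothetical). The function name `type_ii_error` is illustrative, not part of any library.

```python
from statistics import NormalDist


def type_ii_error(alpha: float, shift: float) -> float:
    """Probability of failing to reject H0 in a one-sided z-test.

    alpha: significance level (probability of a Type I error).
    shift: distance of the true mean from the H0 value, in standard errors.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Under the alternative the test statistic is centred at `shift`,
    # so we fail to reject whenever it falls below the critical value.
    return NormalDist().cdf(z_crit - shift)


for alpha in (0.10, 0.01):
    print(alpha, round(type_ii_error(alpha, shift=2.0), 3))
```

Lowering the significance level from 10% to 1% reduces the chance of a false positive but raises the chance of a false negative for the same true effect.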
Confidence interval
The confidence interval gives a range of values that a population parameter is likely to lie in.
The lower and upper limits of the confidence interval are called confidence limits.
If 5% is the rejection region, then the remaining 95% is the non-rejection region.
=⇒ this gives the 95% confidence interval.
Confidence interval II
For a normal distribution, the confidence interval is:
$$\left( \bar{x} - z^{*} \frac{\sigma}{\sqrt{n}},\; \bar{x} + z^{*} \frac{\sigma}{\sqrt{n}} \right),$$
where
x̄ is the sample mean
σ is the standard deviation
z∗ is the critical value of the standard normal distribution
n is the number of observations.
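As a worked illustration of the formula, the following minimal Python sketch computes a 95% interval. The inputs (sample mean 1.0, standard deviation 5.0, 100 observations) are hypothetical numbers, not taken from the ASX data, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def normal_confidence_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval x_bar ± z* · sigma / sqrt(n)."""
    # Critical value z* leaves (1 - level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


lower, upper = normal_confidence_interval(x_bar=1.0, sigma=5.0, n=100)
print(round(lower, 2), round(upper, 2))
```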
Monthly returns (in %) on the 17 largest ASX stocks for the last 60 months.
Source: Eikon.
Descriptive statistics
proc univariate data=work.mqg plots;
  var return;
run;
work.mqg is the data set with only two variables: date and return (monthly return on MQG stock).
Distributional properties
Let's run normality tests:
proc univariate data=work.mqg plots normaltest;
  var return;
  histogram / kernel normal;
run;
Distributional properties III
Normality tests suggest that MQG returns are not normally distributed.
However, given the relatively small number of observations and the shape of its histogram, I would not worry too much about this.
One-sample t-test
Let's run a one-sample t-test using the returns on MQG stock.
Null hypothesis (H0): the mean return is 0.
Alternative hypothesis (H1): the mean return is not 0.
proc ttest data=work.mqg;
  var return;
run;
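To show what the one-sample t-test computes, here is a minimal Python sketch of the test statistic t = (x̄ − μ0) / (s / √n). The six returns are made-up numbers (not the MQG data), and the critical value 2.571 is the tabulated two-tailed 5% value for 5 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """t statistic for H0: population mean equals mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)   # sample std dev uses n - 1
    return (mean(values) - mu0) / standard_error


returns = [2.0, -1.0, 3.0, 0.5, 1.5, 4.0]  # hypothetical monthly returns, in %
t_stat = one_sample_t(returns, mu0=0.0)
t_crit = 2.571  # upper tail 0.025, 5 degrees of freedom
reject_null = abs(t_stat) > t_crit
print(round(t_stat, 3), reject_null)
```

Passing a different `mu0` corresponds to testing a non-zero hypothesized mean.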
One-sample t-test IV
Let's amend our hypotheses.
• Null hypothesis (H0): return is 1.5%.
• Alternative hypothesis (H1): return is not 1.5%.
proc ttest data=work.mqg H0=1.5;
  var return;
run;
Two-sample t-test
Let's test whether monthly returns on ANZ and CBA stocks are statistically different:
• Null hypothesis (H0): returns are the same.
• Alternative hypothesis (H1): returns are different.
We will use the work.cba_anz2 data set.
Two-sample t-test III
proc ttest data=work.cba_anz2;
  class ticker;
  var return;
run;
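For intuition about the two-sample comparison, the following minimal Python sketch computes both the pooled (equal-variance) and the unequal-variance (Welch/Satterthwaite-style) t statistics for two groups. The two return series are hypothetical, not the CBA and ANZ data, and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b):
    """Return (pooled t, unequal-variance t) for H0: equal means."""
    n_a, n_b = len(a), len(b)
    var_a, var_b = variance(a), variance(b)   # sample variances (n - 1)
    diff = mean(a) - mean(b)
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    t_pooled = diff / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # Unequal-variance version keeps the two variances separate.
    t_unequal = diff / sqrt(var_a / n_a + var_b / n_b)
    return t_pooled, t_unequal


stock_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical returns, in %
stock_b = [2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical returns, in %
t_pooled, t_unequal = two_sample_t(stock_a, stock_b)
print(round(t_pooled, 3), round(t_unequal, 3))
```

With equal group sizes and equal sample variances, as in this toy data, the two statistics coincide.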
Two-sample t-test VI
We fail to reject the null hypothesis (returns are the same).
Thus, monthly returns on CBA and ANZ were not statistically different during the last 60 months.
Two-sample t-test VIII
Let's run a two-sample t-test for CBA and MQG stocks for the same set of hypotheses:
• Null hypothesis (H0): returns are the same.
• Alternative hypothesis (H1): returns are different.
Two-sample t-test XI
The p-value is slightly higher than 6%:
• if our significance threshold is 5%, then we fail to reject the null hypothesis
• if our significance threshold is 10%, then we reject the null hypothesis.
Paired t-test
Consider the 17 largest stocks on the ASX.
Let's test whether their monthly returns differ between 30/06/2018 and 31/07/2018.
Some statistical properties:
proc means data=work.asx;
  var _30_06_2018 _31_07_2018;
run;
work.asx is the original data set imported from Eikon (not transposed).
Paired t-test III
The actual test:
proc ttest data=work.asx;
  paired _30_06_2018 * _31_07_2018;
run;
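A paired t-test is a one-sample t-test applied to the per-stock differences between the two dates. The minimal Python sketch below illustrates this with hypothetical returns for four stocks (not the ASX data); the function name `paired_t` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic for H0: the mean of the paired differences is zero."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


june = [1.0, 2.0, 3.0, 4.0]   # hypothetical returns on the first date, in %
july = [1.5, 2.5, 2.5, 5.0]   # hypothetical returns on the second date, in %
t_paired = paired_t(june, july)
degrees_of_freedom = len(june) - 1
print(round(t_paired, 3), degrees_of_freedom)
```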
Statistical tests for correlations
From previous lectures, we already know that we can test whether correlation coefficients are statistically significant or not.
Let's consider 4 stocks from our sample of 17 and generate a correlation matrix:
proc corr data=work.asx2;
  var ANZ_AX CBA_AX RIO_AX MQG_AX;
run;
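One common way to test whether a Pearson correlation differs from zero is the statistic t = r·√(n − 2) / √(1 − r²), compared with a t distribution with n − 2 degrees of freedom. The minimal Python sketch below applies it to a hypothetical correlation of 0.5 over 60 monthly observations; the critical value 2.678 is the tabulated upper-tail 0.005 value for 50 degrees of freedom, used here as a conservative stand-in for 58 degrees of freedom.

```python
from math import sqrt


def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: the population correlation is zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r = 0.5    # hypothetical sample correlation
n = 60     # number of monthly observations
t_stat = correlation_t_stat(r, n)
significant_at_1pct = abs(t_stat) > 2.678  # conservative two-tailed 1% cut-off
print(round(t_stat, 3), significant_at_1pct)
```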
Statistical tests for correlations II
Significant coefficients at the 1% level: ANZ & CBA, ANZ & MQG, CBA & MQG.
Recommended reading
Konasani, V. R. and Kadre, S. (2015). Practical Business Analytics Using SAS: A Hands-on Guide: chapter 8.