TOPIC 2 Notes: Models and Regression
© Copyright University of Wales 2020. All rights reserved.
Course materials subject to Copyright
UNSW Sydney owns copyright in these materials (unless stated otherwise). The material is subject to copy-
right under Australian law and overseas under international treaties. The materials are provided for use by
enrolled UNSW students. The materials, or any part, may not be copied, shared or distributed, in print or
digitally, outside the course without permission. Students may only copy a reasonable portion of the material
for personal research or study or for criticism or review. Under no circumstances may these materials be
copied or reproduced for sale or commercial purposes without prior written permission of UNSW Sydney.
Statement on class recording
To ensure the free and open discussion of ideas, students may not record, by any means, classroom lectures,
discussion and/or activities without the advance written permission of the instructor, and any such recording
properly approved in advance can be used solely for the student's own private use.
WARNING: Your failure to comply with these conditions may lead to disciplinary action, and may give rise
to a civil action or a criminal offence under the law.
THE ABOVE INFORMATION MUST NOT BE REMOVED FROM THIS MATERIAL.
THE LINEAR REGRESSION MODEL
1. Introduction
The linear regression model represents one of the most powerful tools used to model and
test financial theories. To highlight its importance in finance, a number of applications
are presented.
(a). The Capital Asset Pricing Model
Consider the Capital Asset Pricing Model (CAPM), which relates the return on the ith
asset at time t, $R_{i,t}$, to the return on the market portfolio, $R_{m,t}$. Both returns are adjusted by
a risk-free rate of return, $R_{f,t}$. The adjusted returns are called excess returns.
The risk characteristics of an asset are determined by its β-coefficient
$$\beta = \frac{\mathrm{cov}(R_{i,t} - R_{f,t},\; R_{m,t} - R_{f,t})}{\mathrm{var}(R_{m,t} - R_{f,t})}$$
The risk properties of the asset are summarized as follows:
1. $\beta > 1$: the asset exhibits greater risk than the market portfolio, as its returns exhibit
relatively greater variability. Such a stock is commonly referred to as an
aggressive stock.
2. $\beta = 1$: the asset exhibits the same risk as the market portfolio, as its returns exhibit
relatively the same variability.
3. $0 < \beta < 1$: the asset exhibits less risk than the market portfolio, as its returns exhibit relatively less variability. Such a stock is commonly referred to as a conservative stock.
4. $-1 < \beta < 0$: the asset returns move in the opposite direction to the returns on the market portfolio. The stock in this case represents an imperfect hedge against movements in the market portfolio.
5. $\beta = -1$: the asset returns move in the opposite direction to the returns on the market portfolio. The stock in this case represents a perfect hedge against movements in the market portfolio, as down (up) movements in the market portfolio are matched on average by up (down) movements in the asset.
The CAPM relationship is conveniently summarized by the linear regression model
$$R_{i,t} - R_{f,t} = \alpha + \beta (R_{m,t} - R_{f,t}) + u_t$$
where $u_t$ is a disturbance term. A test of the CAPM for a given asset is a test of
$$\alpha = 0$$
(b). Arbitrage Pricing Theory
A generalization of the CAPM is based on Arbitrage Pricing Theory (APT). A simple form of this model, due to Chen, Roll and Ross (“Economic Forces and the Stock Market”, Journal of Business, 1986), extends the CAPM equation by including a set of unanticipated changes, or news. The APT equation becomes
$$R_{i,t} - R_{f,t} = \alpha + \beta (R_{m,t} - R_{f,t}) + \gamma_1 X_{unanticipated,t} + u_t$$
where $X_{unanticipated,t}$ represents the unanticipated change at time t (for example, unexpected returns on some commodity or unexpected output growth), while the remaining variables are defined as above.
(c). Term Structure of Interest Rates
Consider the relationship between the return on a 3-month bond, $R_{3,t}$, and the return on a 1-month bond, $R_{1,t}$. Under the pure expectations hypothesis
$$(1 + R_{3,t})^3 = (1 + R_{1,t})(1 + E_t R_{1,t+1})(1 + E_t R_{1,t+2})$$
where $E_t(R_{1,t+j})$ represents the conditional expectation of $R_{1,t+j}$ based on information at time t.
Take the natural log of both sides of the above equation and use the approximation $\ln(1 + E_t R_{1,t+j}) \approx E_t R_{1,t+j}$ to get
$$R_{3,t} = \frac{1}{3}\left(R_{1,t} + E_t R_{1,t+1} + E_t R_{1,t+2}\right)$$
Suppose that $R_{1,t}$ follows a random walk
$$R_{1,t} = R_{1,t-1} + \nu_t$$
where $\nu_t$ is a disturbance term. Then
$$E_t(R_{1,t+j}) = R_{1,t}$$
Substituting into the above equation for $R_{3,t}$ gives
$$R_{3,t} = R_{1,t}$$
This suggests that the term structure of interest rates can be modeled by the following linear regression model
$$R_{3,t} = \alpha + \beta R_{1,t} + u_t$$
where $u_t$ is a disturbance term. A test of the expectations hypothesis is a test of the following hypotheses
$$\alpha = 0, \quad \beta = 1$$
(d). Present Value Model
According to the Gordon model, the price of a stock is equal to the expected discounted dividend stream
$$P_t = \sum_{j=1}^{\infty} \frac{E_t(D_{t+j})}{(1+R)^j}$$
where $D_t$ is the dividend payment, $R$ is the constant discount factor and $E_t(D_{t+j})$ represents the expectation of $D_{t+j}$ conditional on information at time t.
Suppose that $D_t$ follows a random walk,
$$D_t = D_{t-1} + \nu_t$$
where $\nu_t$ is a disturbance term. Then
$$E_t(D_{t+j}) = D_t$$
Substituting into the present value relationship gives
$$P_t = D_t\left[\frac{1}{1+R} + \frac{1}{(1+R)^2} + \cdots\right] = \frac{D_t}{1+R}\cdot\frac{1}{1 - 1/(1+R)} = \frac{D_t}{R}$$
where the properties of a geometric progression are used, with $\lambda = 1/(1+R)$:
$$1 + \lambda + \lambda^2 + \cdots = \frac{1}{1-\lambda}, \quad 0 < \lambda < 1$$
Alternatively, the present value model can be expressed as a linear relationship by taking the natural logarithms of both sides
$$\log(P_t) = -\log(R) + \log(D_t)$$
This suggests that the present value model can be represented by the following linear regression model
$$\log(P_t) = \alpha + \beta \log(D_t) + u_t$$
where $u_t$ is a disturbance term. A test of the present value model is a test of the following hypothesis
$$\beta = 1$$
Note that an estimate of the discount factor is obtained by noting that $\alpha = -\log(R)$. Rearranging gives the discount factor $R = \exp(-\alpha)$.
2.
Formulation and Estimation of the General Linear Regression Model
The examples above show that the relationships between financial series can be represented in general by the following linear regression model
$$Y_t = \beta_0 + \beta_1 X_{1,t} + \beta_2 X_{2,t} + \cdots + \beta_K X_{K,t} + u_t \qquad \text{(eqn.1)}$$
where the sample period is $t = 1, 2, \ldots, T$. Here $Y$ is the dependent variable, $X_1, X_2$ to $X_K$ is a set of explanatory variables, $\beta_k$, $k = 0, 1, \ldots, K$, are the unknown population coefficients and $u_t$ is a disturbance term.
The same equation in vector-matrix notation:
$$\mathbf{Y}_{T\times 1} = \mathbf{X}_{T\times(K+1)}\,\boldsymbol{\beta}_{(K+1)\times 1} + \mathbf{u}_{T\times 1} \qquad \text{(eqn.1*)}$$
where bold letters represent the corresponding vectors and matrices, and the subscripts indicate their dimensions. Tracking dimensions aids understanding and makes it easy to check that each matrix multiplication is valid.
Note: if the model includes an intercept, the first column of the matrix $\mathbf{X}_{T\times(K+1)}$ is a column of 1s.
The sample counterparts of (1) and (1*) are
$$Y_t = b_0 + b_1 X_{1,t} + b_2 X_{2,t} + \cdots + b_K X_{K,t} + e_t \qquad \text{(eqn.2)}$$
$$\mathbf{Y}_{T\times 1} = \mathbf{X}_{T\times(K+1)}\,\mathbf{b}_{(K+1)\times 1} + \mathbf{e}_{T\times 1} \qquad \text{(eqn.2*)}$$
where $b_k$ (vector $\mathbf{b}$) is the sample estimate of $\beta_k$ (vector $\boldsymbol{\beta}$); $e_t$ (vector $\mathbf{e}$) is known as the residual or error
$$e_t = Y_t - \hat{Y}_t, \qquad \mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}}$$
and the fitted values are given by
$$\hat{Y}_t = b_0 + b_1 X_{1,t} + b_2 X_{2,t} + \cdots + b_K X_{K,t}, \qquad \hat{\mathbf{Y}}_{T\times 1} = \mathbf{X}_{T\times(K+1)}\,\mathbf{b}_{(K+1)\times 1}$$
The $b_k$'s (vector $\mathbf{b}$) are estimated by minimizing the sum of squared errors
$$\sum_{t=1}^{T} e_t^2$$
or, in vector notation, simply $\mathbf{e}'\mathbf{e}$.
$$\mathbf{e}'\mathbf{e} = (\mathbf{Y} - \mathbf{Xb})'(\mathbf{Y} - \mathbf{Xb}) = \mathbf{Y}'\mathbf{Y} - \mathbf{Y}'\mathbf{Xb} - \mathbf{b}'\mathbf{X}'\mathbf{Y} + \mathbf{b}'\mathbf{X}'\mathbf{Xb}$$
$$\min_{\mathbf{b}} \; \mathbf{e}'\mathbf{e}$$
$$\text{FOC:} \quad \frac{\partial(\mathbf{e}'\mathbf{e})}{\partial \mathbf{b}} = 0 \;\Rightarrow\; -\mathbf{X}'\mathbf{Y} - \mathbf{X}'\mathbf{Y} + 2\mathbf{X}'\mathbf{Xb} = 0 \;\Rightarrow\; -2\mathbf{X}'\mathbf{Y} + 2\mathbf{X}'\mathbf{Xb} = 0$$
$$\mathbf{X}'\mathbf{Xb} = \mathbf{X}'\mathbf{Y} \;\Rightarrow\; (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Xb} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y} \;\Rightarrow\; \mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$$
$$\text{SOC:} \quad \frac{\partial^2(\mathbf{e}'\mathbf{e})}{\partial \mathbf{b}\,\partial \mathbf{b}'} = 2\mathbf{X}'\mathbf{X}, \text{ positive definite} \;\Rightarrow\; \text{minimum}$$
The rule by which the coefficients $\boldsymbol{\beta}$ are estimated is called the ordinary least squares (OLS) estimator, while the numerical values it produces for the observed realization of a random sample are the OLS estimates. In vector notation, the OLS estimator is given by
$$\mathbf{b}_{(K+1)\times 1} = (\mathbf{X}'_{(K+1)\times T}\,\mathbf{X}_{T\times(K+1)})^{-1}\,\mathbf{X}'_{(K+1)\times T}\,\mathbf{Y}_{T\times 1}$$
Assumptions about disturbance term u
(i) The disturbance term has zero mean.
(ii) The variance of the disturbance term is constant for all observations (homoskedasticity assumption).
(iii) The disturbances corresponding to different observations have zero correlation (no autocorrelation).
(iv) The disturbance at time t is uncorrelated with the values of the explanatory variables at time t or, formally, $E(\mathbf{X}'\mathbf{u}) = 0$. (In this case, the explanatory variables are said to be contemporaneously exogenous.) Alternatively, we could assume that $\mathbf{X}$ is non-stochastic (deterministic, so that it can be taken outside of the expectation operator).
(v) The disturbances are assumed to be normally distributed (not crucial for large T).
(vi) There is no perfect linear relationship between the explanatory variables (no multicollinearity).
(vii) The dependent and independent variables are stationary (that is, the variables do not contain random walk components, which we shall discuss later).
Under the above assumptions, the OLS estimator is asymptotically (for large T) consistent, efficient and normally distributed. Further, the usual OLS standard errors, t-statistics, F-statistics, and LM statistics discussed below are asymptotically valid.
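The estimator derived above can be sketched numerically. This is a minimal illustration, assuming NumPy and simulated data; the function name `ols`, the seed, the sample size and the "true" coefficients are all hypothetical choices made for the example, not part of the notes. It solves the normal equations $\mathbf{X}'\mathbf{Xb} = \mathbf{X}'\mathbf{Y}$ and checks the first-order condition $\mathbf{X}'\mathbf{e} = 0$.

```python
import numpy as np

def ols(X, y):
    """Return b solving the normal equations X'X b = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solving the linear system avoids forming (X'X)^{-1} explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical simulated sample: T observations, intercept plus K = 2 regressors.
rng = np.random.default_rng(0)
T = 500
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])  # first column of 1s
beta = np.array([0.5, 1.2, -0.7])      # assumed population coefficients
u = rng.normal(scale=0.1, size=T)      # zero-mean, constant-variance disturbances
y = X @ beta + u

b = ols(X, y)      # OLS estimates
e = y - X @ b      # residuals
print("b   =", b)
print("X'e =", X.T @ e)  # first-order condition: numerically close to zero
```

Solving the linear system rather than inverting $\mathbf{X}'\mathbf{X}$ is a numerical convenience only; the result is the same estimator $\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$.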
If $\mathbf{X}$ is non-stochastic, the OLS estimator $\mathbf{b}$ is unbiased and efficient. Unbiasedness, i.e. $E(\mathbf{b}) = \boldsymbol{\beta}$, is easy to prove by substituting $\mathbf{Y}$ from eqn.1* into the estimator:
$$\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}\boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u} = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}$$
$$E(\mathbf{b}) = E\left[\boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}\right] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(\mathbf{u}) = \boldsymbol{\beta} + 0 = \boldsymbol{\beta}$$
Note: $\mathbf{b} - \boldsymbol{\beta} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}$
The variance-covariance matrix of $\mathbf{b}$ is given by:
$$E\left[(\mathbf{b} - \boldsymbol{\beta})(\mathbf{b} - \boldsymbol{\beta})'\right] = E\left[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u}\mathbf{u}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\right] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(\mathbf{u}\mathbf{u}')\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$$
$$= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\sigma^2\mathbf{I}\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}$$
Note: here we used (ii) and (iii), which imply that $E(\mathbf{u}\mathbf{u}') = \sigma^2\mathbf{I}$, where $\mathbf{I}$ is the identity matrix. This variance-covariance matrix of the disturbances has $\sigma^2$ on the diagonal (constant variance) and 0s off the diagonal (no serial correlation).
The variance of the disturbance term, $\sigma^2$, is not observed and needs to be estimated. Its estimate is given by $s^2$ below.
3. Diagnostics and Tests in the General Linear Regression Model
There exist a number of diagnostics which can be used to determine whether the estimated model is specified correctly. In particular, if there is no information contained in the estimated residuals $e_t$, this is evidence that no information has been excluded and that the chosen model is correctly specified.
(a). Sum of Squared Residuals
The objective of OLS is to minimize the sum of squared residuals. The sum of squares can be used to compute the variance of the residuals
$$s^2 = \frac{\sum_{t=1}^{T} e_t^2}{T - (K+1)}$$
Note: we need to divide by $T - (K+1)$ (rather than by $T$, as in the sample mean estimator) to obtain an unbiased estimator of the variance $\sigma^2$. This accounts for the fact that K+1 parameters are estimated in this regression.
The standard error of the regression is given by
$$s = \sqrt{s^2}$$
Relatively large values of $s$ indicate that a substantial amount of change in the dependent variable cannot be explained by changes in the independent variables.
(b). The Coefficient of Determination
The coefficient of determination is a measure of the goodness of fit of the model. It measures the proportion of variation in the dependent variable Y that is explained by the regression equation. It is computed as
$$R^2 = \frac{\text{Explained sum of squares}}{\text{Total sum of squares}}$$
$$\text{Explained sum of squares} = \sum_{t=1}^{T} (\hat{Y}_t - \bar{Y})^2$$
This is the sum of squared deviations of the regression values of Y, $\hat{Y}$, about the mean of Y.
$$\text{Total sum of squares} = \sum_{t=1}^{T} (Y_t - \bar{Y})^2$$
This is the total sum of squared deviations of the sample values of Y about the mean of Y.
Interpretation
If the regression equation contains a constant term, $R^2$ is between zero and one. The closer $R^2$ is to one, the better the fit. For example, $R^2 = 0.9162$ means that 91.62% of the variation in the dependent variable is explained by the regression equation. This is considered to be a good fit. On the other hand, $R^2 = 0.21$ means that only 21% of the variation in the dependent variable is explained by the regression equation. The fit is not particularly good and suggests that the regression equation has excluded important explanatory variables.
It can be shown that $R^2$ will never decrease when another variable is added to the regression equation. Hence there may be a tendency to keep adding explanatory variables to the regression equation so as to increase $R^2$ without reference to any underlying economic theory. To circumvent this problem, the adjusted $R^2$ is computed as
$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{T - 1}{T - (K+1)}$$
(c). Test of Coefficients (t-tests)
To test the importance of an explanatory variable in the regression equation, the associated parameter estimate can be tested to see if it is zero. A t-test is used to do this.
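A minimal numerical sketch of such a t-test follows, assuming NumPy and simulated data. The function name `ols_t_stats`, the seed and the simulated coefficients are illustrative assumptions, not part of the notes. Each statistic is the OLS estimate divided by its standard error, with the standard errors taken from the diagonal of $s^2(\mathbf{X}'\mathbf{X})^{-1}$.

```python
import numpy as np

def ols_t_stats(X, y):
    """Return OLS estimates, standard errors and t-statistics for beta_k = 0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, n_coef = X.shape                   # n_coef = K + 1 with an intercept column
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                 # b = (X'X)^{-1} X'Y
    e = y - X @ b                         # residuals
    s2 = (e @ e) / (T - n_coef)           # s^2 = e'e / (T - (K + 1))
    se = np.sqrt(s2 * np.diag(XtX_inv))   # SE(b_k) from s^2 (X'X)^{-1}
    return b, se, b / se

# Hypothetical data: x1 matters, x2 has a true coefficient of zero.
rng = np.random.default_rng(1)
T = 200
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
X = np.column_stack([np.ones(T), x1, x2])
y = 0.2 + 1.0 * x1 + rng.normal(size=T)

b, se, t = ols_t_stats(X, y)
for name, bk, sk, tk in zip(["const", "x1", "x2"], b, se, t):
    print(f"{name}: b = {bk:.3f}, SE = {sk:.3f}, t = {tk:.2f}")
```

In this simulated setting the t-statistic on `x1` should be far outside the range of -2 to 2, while the one on `x2` will usually, though not always, fall inside it.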
The null and alternative hypotheses are, respectively,
$$H_0: \beta_k = 0$$
$$H_1: \beta_k \neq 0$$
The test statistic is
$$t\text{-statistic} = \frac{b_k}{SE(b_k)}$$
where $b_k$ is the OLS estimate of $\beta_k$ and $SE(b_k)$ is the corresponding OLS standard error. The t-statistic is distributed as Student t with $T - (K+1)$ degrees of freedom. For large T and a two-sided test, values of the t-statistic in the range of -2 to 2 represent a failure to reject the null hypothesis at approximately the 5% level. Alternatively, p-values less than $\alpha = 0.05$ constitute rejection of the null hypothesis at the 5% level.
(d). Robust standard errors
The OLS standard error of $b_k$ (denoted $SE(b_k)$) is not valid if the errors in the regression model are heteroscedastic and/or serially correlated. White (1980) derived the correct formula for the standard error of $b_k$ when the errors are heteroscedastic of unknown form and are not autocorrelated. These standard errors are known as White or heteroscedasticity-consistent standard errors. Denote the White or heteroscedasticity-consistent standard error of $b_k$ as $SE_W(b_k)$. If heteroscedasticity but not autocorrelation is present in the estimated residuals, a t-test of the significance of $b_k$ should be undertaken using the statistic
$$t\text{-statistic} = \frac{b_k}{SE_W(b_k)}$$
Newey and West (1987) generalized the formula of White to cover the case of both heteroscedasticity and serial correlation of unknown form in the residuals. Denote the Newey-West or heteroscedasticity and autocorrelation consistent standard error of $b_k$ as $SE_{NW}(b_k)$. If heteroscedasticity and autocorrelation are present in the estimated residuals, a t-test of the significance of $b_k$ should be undertaken using the statistic
$$t\text{-statistic} = \frac{b_k}{SE_{NW}(b_k)}$$
To calculate the Newey-West standard error of $b_k$ (that is, $SE_{NW}(b_k)$), a lag truncation parameter, which represents the number of autocorrelations used in accounting for the persistence in the OLS residuals, must be chosen.
Newey and West suggest taking the lag truncation parameter as the integer part of $4(T/100)^{2/9}$. EViews adopts this suggestion. Others have suggested $T^{1/4}$. Note that if the lag truncation parameter is chosen to be zero, $SE_{NW}(b_k)$ corrects only for heteroscedasticity and is identical to $SE_W(b_k)$.
(e). F-test
A joint test of all the explanatory variables is provided by the F-test. For the case where there is an intercept, the null and alternative hypotheses are, respectively,
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_K = 0$$
$$H_1: \text{at least one } \beta_k \text{ is not zero}$$
The F-statistic is computed as
$$F = \frac{R^2 / K}{(1 - R^2) / (T - (K+1))}$$
This is distributed as $F_{K,\,T-(K+1)}$; the critical value at significance level $\alpha$ is denoted $F^{\alpha}_{K,\,T-(K+1)}$. Large values of F constitute rejection of the null hypothesis in favour of the alternative. Alternatively, p-values less than $\alpha = 0.05$ constitute rejection of the null hypothesis.
(f). Testing Linear Restrictions
A variant of the F-test discussed immediately above arises when it is necessary to test a subset of the parameters. In the case of testing APT, the restrictions are
$$\beta_i = \beta_j = 0$$
where $\beta_i$ and $\beta_j$ are the coefficients associated with the unanticipated variables. To perform the test,
1. Estimate the APT model and retrieve the unrestricted sum of squared residuals SSU.
2. Estimate the CAPM and retrieve the restricted sum of squared residuals SSR. (Note that the CAPM is the restricted model since it is a special case of the APT model where the coefficients on the unanticipated variables are zero.)
3. Compute the F-statistic
$$F = \frac{(SSR - SSU) / R}{SSU / (T - (K+1))}$$
where R is the number of restrictions and K+1 is the number of coefficients estimated in the unrestricted model. The statistic is distributed as $F_{R,\,T-(K+1)}$.
4. A large value of the statistic (larger than the critical value) constitutes rejection of the null hypothesis that the restrictions are valid.
(g). Durbin-Watson Test of Autocorrelation
One way of testing the adequacy of the regression specification is to examine whether there are any patterns in the residuals. A common statistic used for this purpose is the Durbin-Watson (DW) statistic.
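As a rough illustration of what such a residual-based statistic looks like, the sketch below assumes the textbook definition $DW = \sum_{t=2}^{T}(e_t - e_{t-1})^2 / \sum_{t=1}^{T} e_t^2$, which the notes have not stated at this point. The function name `durbin_watson` and the toy residual series are illustrative assumptions.

```python
import numpy as np

def durbin_watson(e):
    """Textbook DW statistic: sum of squared first differences over sum of squares."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Toy residual series (hypothetical, chosen so the arithmetic is easy to follow).
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # residuals flip sign each period
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # residuals perfectly persistent

print(dw_alternating)  # 12 / 4 = 3.0
print(dw_constant)     # 0 / 4 = 0.0
```

Under this definition, values near 2 are consistent with no first-order autocorrelation, values toward 0 point to positive autocorrelation, and values toward 4 point to negative autocorrelation.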
The null and alternative hypotheses are respectively:
$H_0$: No autocorrelation
$H_1$: Autocorrelation (positive)
The DW sta