Basics LPM Logit models Probit models
CORPFIN 2503 – Business Data Analytics:
Applications of logit and probit models
čius
Week 5: August 23rd, 2021
čius CORPFIN 2503, Week 5 1/53
Outline
Basics
LPM
Logit models
Probit models
Introduction
We use simple and multiple linear regressions if the dependent
variable is continuous.
For example, the dependent variable is the car price.
What if the dependent variable is a dummy variable, such as
SEDAN=1 if a car type is sedan and 0 otherwise?
Introduction II
Other interesting issues:
• dividend payers vs non-payers
• firms going bankrupt and not
• firms issuing equity vs debt securities
• credit rating downgrades and upgrades
• will a firm become an M&A target or not
• etc.
Introduction III
If we use OLS regressions in these cases, we would estimate linear
probability models (LPMs).
It is better to estimate logit or probit models.
LPM example
Let’s estimate an LPM where the dependent variable equals 1 if a
firm pays dividends using ASX data from Workshop 3:
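/* Create the size measure and the positive-RE/equity dummy */
data work.asx;
    set work.asx;
    label dividend_payer="Dividend payer";
    ln_assets=log(assets);
    re_equity_d=0;
    /* observations with missing re_equity keep the value 0 */
    if re_equity>0 & re_equity ne . then re_equity_d=1;
run;
/* OLS with a 0/1 dependent variable is a linear probability model */
PROC REG DATA=work.asx;
    MODEL dividend_payer=re_equity_d ln_assets;
RUN;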
LPM example II
[Figure: PROC REG output, not reproduced here.]
The results suggest that:
• firms with a positive RE/equity ratio have a 25.6 percentage point
higher probability of paying dividends
• an increase in ln(assets) by 1 raises the probability of paying
dividends by 9.4 percentage points.
If RE/equity > 0 and ln(assets) = 15, then a firm has an 18%
probability of paying dividends (–1.48192 + 0.25610 + 15 ×
0.09350 = 0.17668).
Let’s look at the residuals.
LPM example III
[Figure: residual plots for the LPM, not reproduced here.]
Slide #61 of Lecture 4
[Figure not reproduced here.]
LPM example IV
Residuals are not normally distributed and are subject to
heteroscedasticity.
Introduction II
LPMs have three problems:
1. heteroskedasticity (standard errors will be wrong, and
hypothesis tests will be incorrect)
2. non-normal residuals (for given values of the independent
variables, a residual can take only two possible values)
3. the predicted values for the dependent variable might be
greater than 1 or lower than 0 (as the LPM assumes a linear
impact of the independent variables):
• if the dependent variable equals 1 if a firm pays dividends and
0 otherwise, values outside the interval [0, 1] are not logical
• the probability of paying dividends cannot be negative.
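The third problem can be checked with the LPM coefficients quoted earlier (–1.48192, 0.25610 and 0.09350). The DATA step below is only a sketch: the ln(assets) values 10 and 27 are hypothetical, chosen to push the fitted line outside [0, 1], and the variable names b0, b_re and b_size are made up for the illustration.

```sas
/* Sketch: LPM fitted values at a few ln(assets) levels.
   Coefficients are the PROC REG estimates quoted on the slides;
   ln(assets) = 10 and 27 are hypothetical illustration values. */
data _null_;
    b0     = -1.48192;   /* intercept                     */
    b_re   =  0.25610;   /* positive RE/equity dummy      */
    b_size =  0.09350;   /* ln(assets)                    */
    re_equity_d = 1;
    do ln_assets = 10, 15, 20, 27;
        p_hat = b0 + b_re*re_equity_d + b_size*ln_assets;
        put ln_assets= p_hat= 8.5;   /* written to the SAS log */
    end;
run;
```

With these coefficients the fitted value is –0.29082 at ln(assets) = 10 and 1.29868 at ln(assets) = 27, neither of which is a valid probability; at ln(assets) = 15 it is the 0.17668 from the LPM example.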
The solution is to use logit (logistic) or probit models.
Theory
Suppose p is the probability that Y = 1:
\[ p = P(Y = 1). \]
The functional form of an LPM with 2 independent variables is:
\[ p = \beta_0 + \beta_1 x_1 + \beta_2 x_2. \]
The LHS of the LPM can range from 0 to 1, but the RHS can vary
from −∞ to ∞.
=⇒ We need to transform the dependent variable to eliminate the
0 to 1 constraint.
We can eliminate the upper bound (p = 1) by using the ratio $p/(1-p)$.
Theory II
$p/(1-p)$ is the odds of an event occurring.
Let’s assume that the probability of success of some event is 0.75.
Then the probability of failure is 1 – 0.75 = 0.25.
The odds of success are defined as the ratio of the probability of
success over the probability of failure.
=⇒ The odds of success are 0.75/0.25 = 3.
=⇒ The odds of success are 3 to 1.
Similarly, if the probability of success is 0.5, i.e., 50-50 percent
chance, then the odds of success are 1 to 1.
Theory III
$p/(1-p)$ is the odds of an event occurring.
p         Odds
0.0001    0.0001
0.01      0.0101
0.1       0.1111
0.25      0.3333
0.5       1
0.75      3
0.9       9
0.99      99
0.9999    9999
• When probability is either very small or very big, even large
proportional changes in odds hardly impact probability (e.g., odds
rising from 99 to 9999 move p only from 0.99 to 0.9999).
• When probability is between 0.1 and 0.9, changes in odds
substantially impact probability.
• The range of odds is between 0 and ∞.
Theory IV
We can eliminate the lower bound of 0 by taking the natural
logarithm of the odds.
[Figure: y = ln(Odds) plotted against Odds.]
Theory V
The log odds of the event occurring is
\[ \ln\text{Odds} = \ln\left[\frac{p}{1-p}\right]. \]
p         Odds      ln(Odds)
0.0001    0.0001    –9.210
0.01      0.0101    –4.595
0.1       0.1111    –2.197
0.25      0.3333    –1.099
0.5       1          0
0.75      3          1.099
0.9       9          2.197
0.99      99         4.595
0.9999    9999       9.210
• When probability is either very small or very big, changes in log
odds hardly impact probability.
• When probability is between 0.1 and 0.9, changes in log odds
substantially impact probability.
• If p < 0.5, then odds < 1 and log odds < 0.
• If p = 0.5, then odds = 1 and log odds = 0.
• If p > 0.5, then odds > 1 and log odds > 0.
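The table can be reproduced with a short DATA step. This is a sketch only; the dataset name work.odds_table and the variable names are made up for the illustration.

```sas
/* Sketch: rebuild the p / odds / ln(odds) table shown above. */
data work.odds_table;
    do p = 0.0001, 0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 0.9999;
        odds    = p / (1 - p);   /* odds of the event          */
        ln_odds = log(odds);     /* natural log of the odds    */
        output;
    end;
run;

proc print data=work.odds_table noobs;
    format odds 10.4 ln_odds 8.3;
run;
```

The rows are symmetric around p = 0.5: replacing p with 1 − p inverts the odds and flips the sign of the log odds.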
Theory VI
Functional form of the logit (logistic) model with 2 independent
variables:
\[ \ln\text{Odds} = \ln\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2. \]
Both sides of the equation can now vary between −∞ and ∞.
Theory VII
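Let’s derive the predicted value of p:
\[ \ln\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2, \]
\[ \frac{p}{1-p} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}, \]
\[ p = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}, \]
where e ≈ 2.71828.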
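The last step of the derivation skips some algebra. Writing z as shorthand for the linear index (the symbol z is introduced here for convenience and is not used on the slides), the intermediate steps can be sketched as:

```latex
\[
z = \beta_0 + \beta_1 x_1 + \beta_2 x_2, \qquad
\frac{p}{1-p} = e^{z}
\;\Longrightarrow\; p = (1-p)\,e^{z}
\;\Longrightarrow\; p\,(1 + e^{z}) = e^{z}
\;\Longrightarrow\; p = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}.
\]
```

The final form, 1/(1 + e^{−z}), is the same function and is often more convenient for hand calculation.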
Theory VIII
The impact of Q on the predicted probability depends on Q:
• if Q is low, the impact is small
• if Q is neither low nor high, the impact is big
• if Q is high, the impact is small.
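One way to see this pattern, assuming Q enters a logit model linearly with a positive coefficient β_Q (the slides do not state the model behind the figure, so this is an assumption), is to differentiate the logistic function:

```latex
\[
p = \frac{e^{\beta_0 + \beta_Q Q}}{1 + e^{\beta_0 + \beta_Q Q}}
\quad\Longrightarrow\quad
\frac{\partial p}{\partial Q} = \beta_Q \, p\,(1-p).
\]
```

The factor p(1 − p) peaks at 0.25 when p = 0.5 and shrinks towards 0 as p approaches 0 or 1, so the effect of Q is largest in the middle of the range and small at both ends.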
Theory IX
In case of logit models, predicted probability is never < 0 or > 1,
and the line is not straight.
Theory X
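Functional form of the logit (logistic) model:
\[ p = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}. \]
Suppose that $e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} = 1{,}000{,}000$; then:
\[ p = \frac{1{,}000{,}000}{1 + 1{,}000{,}000} < 1. \]
Suppose that $e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} = 0.0000001$; then:
\[ p = \frac{0.0000001}{1 + 0.0000001} > 0. \]
=⇒ p is always between 0 and 1.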
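The same point can be checked numerically. The sketch below evaluates the logistic function over a range of values of the linear index; the name z and the grid from –15 to 15 are arbitrary choices for the illustration.

```sas
/* Sketch: the logistic function stays strictly between 0 and 1. */
data _null_;
    do z = -15 to 15 by 5;          /* z = b0 + b1*x1 + b2*x2 */
        p = exp(z) / (1 + exp(z));
        put z= p= 12.10;            /* written to the SAS log */
    end;
run;
```

At z = 0 the function returns exactly 0.5; for large negative or positive z it approaches 0 or 1 without reaching either bound.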
Logit models
In SAS, there are several procedures to estimate logit models:
• LOGISTIC (our main choice)
• QLIM
• GENMOD
• PROBIT
• MDC
• PHREG and
• CATMOD.
Logit models II
Now let’s estimate a logit model using the same dataset:
PROC LOGISTIC DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = re_equity_d ln_assets;
RUN;
The option (EVENT='1') makes SAS model the probability of paying
dividends (dividend_payer=1); by default, PROC LOGISTIC models the
lower ordered value (here 0).
Logit models III to V
[Figures: PROC LOGISTIC output, not reproduced here.]
Logit models VI
Coefficient estimates for firm size and for positive RE/equity ratio
dummy are significantly positive.
The results suggest that larger firms as well as firms with positive
RE/equity ratio are more likely to pay dividends:
• a 1 unit increase in the RE/equity ratio dummy results in a
1.6120 increase in the log odds of paying dividends (if there are 2
firms with identical ln(assets), the log odds for the one with a
positive RE/equity ratio would be 1.6120 greater than the log
odds for the firm without a positive RE/equity ratio)
• a 1 unit increase in ln(assets) results in a 1.1059 increase in
the log odds of paying dividends.
Logit models VII
To compute the probability of paying dividends for a particular firm,
we simply need to plug in its ln(assets) and RE/equity dummy
value in the equation below:
\[ p = \frac{e^{-22.4746 + 1.6120 \times \text{RE/equity dummy} + 1.1059 \times \ln(\text{assets})}}{1 + e^{-22.4746 + 1.6120 \times \text{RE/equity dummy} + 1.1059 \times \ln(\text{assets})}}. \]
Suppose ln(assets)=20 and RE/equity dummy=1, then p=0.778.
Suppose ln(assets)=17 and RE/equity dummy=0, then p=0.025.
Suppose ln(assets)=20 and RE/equity dummy=0, then p=0.412.
Suppose ln(assets)=17 and RE/equity dummy=1, then p=0.113.
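These four probabilities can be reproduced in a DATA step. The sketch below hard-codes the logit coefficients from the formula above; the dataset name work.logit_pred and the variable names are made up for the illustration.

```sas
/* Sketch: predicted probabilities from the estimated logit model. */
data work.logit_pred;
    b0 = -22.4746;  b_re = 1.6120;  b_size = 1.1059;
    input ln_assets re_equity_d;
    z = b0 + b_re*re_equity_d + b_size*ln_assets;   /* log odds */
    p = exp(z) / (1 + exp(z));                      /* probability */
    datalines;
20 1
17 0
20 0
17 1
;
run;

proc print data=work.logit_pred noobs;
    var ln_assets re_equity_d z p;
    format z 8.4 p 6.3;
run;
```

The p column should show 0.778, 0.025, 0.412 and 0.113, matching the four cases listed above.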
Descriptive statistics
Let’s look at the properties of ln_assets and re_equity_d:
proc univariate data=work.asx plots;
var ln_assets;
run;
proc freq data=work.asx;
tables dividend_payer * re_equity_d /
norow nocol nopercent;
run;
Descriptive statistics II: ln_assets
[Figure: PROC UNIVARIATE output for ln_assets, not reproduced here.]
Descriptive statistics III: ln_assets
ln(assets)=20: assets = $e^{20}$ ≈ 485,165,195. Very large firms!
ln(assets)=17: assets = $e^{17}$ ≈ 24,154,952. Average firms.
The former firms are about 20 times larger than the latter firms ($e^{3}$ ≈ 20.1).
Descriptive statistics IV: re_equity_d
Two-way table of dividend_payer by re_equity_d:
[Figure: PROC FREQ output, not reproduced here.]
Odds ratios
Odds ratio estimates are used to see the exact impact of each
individual variable on the odds of the positive outcome of the
model.
E.g., the odds ratio estimate for RE/equity dummy indicates the
impact of RE/equity dummy on the odds of paying dividends:
• What is the change in the odds when there is a unit change in
the independent variable?
Odds ratios II
The odds ratio for the RE/equity dummy is 5.013.
Suppose ln(assets)=20 and RE/equity dummy=1:
• p(div. payer)=0.778 & p(div. non-payer)=1 – 0.778=0.222
• odds of paying over not paying dividends = 0.778/0.222 ≈ 3.51
(from unrounded probabilities).
Suppose ln(assets)=20 and RE/equity dummy=0:
• p(div. payer)=0.412 & p(div. non-payer)=1 – 0.412=0.588
• odds of paying over not paying dividends = 0.412/0.588 ≈ 0.70.
The change in odds with a unit change in the RE/equity dummy is
3.51/0.70 ≈ 5.013 (using unrounded odds).
Odds ratios III
For 2 otherwise identical firms, the odds of paying dividends:
• for the firm with a positive RE/equity ratio would be
exp(1.6120) = 5.013 times greater
• for the firm with ln(assets) greater by 1 unit would be
exp(1.1059) = 3.022 times greater.
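The link between coefficients and odds ratios can be verified directly. The sketch below uses the logit estimates quoted on the slides; the variable names are made up for the illustration.

```sas
/* Sketch: odds ratios are the exponentiated logit coefficients. */
data _null_;
    or_re   = exp(1.6120);     /* RE/equity dummy */
    or_size = exp(1.1059);     /* ln(assets)      */
    /* odds at ln(assets) = 20 with the dummy switched on and off */
    odds_d1 = exp(-22.4746 + 1.6120*1 + 1.1059*20);
    odds_d0 = exp(-22.4746 + 1.6120*0 + 1.1059*20);
    ratio   = odds_d1 / odds_d0;
    put or_re= 8.3 or_size= 8.3 odds_d1= 8.3 odds_d0= 8.3 ratio= 8.3;
run;
```

The ratio of the two odds equals exp(1.6120) by construction, whatever value of ln(assets) is held fixed, which is why a single odds ratio can be reported per variable.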
Marginal effects
Interpreting the impacts on log odds and odds might be tricky.
Why not look at the impact of a variable on the probability of paying
dividends, holding all other variables in the model constant?
We can: this is known as a marginal effect.
However, it depends on the values of the variables.
Thus, we compute marginal effect at each observation and then
calculate the sample average of individual marginal effects to obtain
the overall marginal effect.
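For a logit model and a continuous independent variable x_k, the averaging described above can be written as follows. The notation (i for observations, N for the sample size, ME for marginal effect) is introduced here and does not appear on the slides.

```latex
\[
\text{ME}_{ik} = \frac{\partial p_i}{\partial x_{ik}} = \beta_k \, p_i \,(1 - p_i),
\qquad
\overline{\text{ME}}_k = \frac{1}{N} \sum_{i=1}^{N} \beta_k \, p_i \,(1 - p_i).
\]
```

Because p_i differs across firms, each firm has its own marginal effect even though β_k is common to all of them.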
Marginal effects II
The SAS procedure LOGISTIC does not compute marginal effects, but the
procedure QLIM does:
PROC QLIM DATA=work.asx;
    MODEL dividend_payer = re_equity_d ln_assets
        / discrete(d=logistic);
    OUTPUT OUT=work.marginal_effects MARGINAL;
RUN;
PROC MEANS DATA=work.marginal_effects mean min max maxdec=3;
    VAR Meff_P2_re_equity_d Meff_P2_ln_assets;
    title 'Average of the Individual Marginal Effects (Logit Model)';
RUN;
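As a cross-check, an average marginal effect for ln_assets can also be approximated by hand from PROC LOGISTIC output, using the derivative β × p × (1 − p) of the logistic function. This is a sketch under assumptions: the coefficient 1.1059 is hard-coded from the slides rather than read from the output, the dataset and variable names (work.logit_fitted, p_hat, me_ln_assets) are made up, and it is assumed, not confirmed by the slides, that QLIM uses the same derivative formula.

```sas
/* Sketch: manual average marginal effect of ln_assets (logit). */
proc logistic data=work.asx noprint;
    model dividend_payer (event='1') = re_equity_d ln_assets;
    output out=work.logit_fitted p=p_hat;   /* fitted P(dividend_payer=1) */
run;

data work.logit_me;
    set work.logit_fitted;
    /* derivative of the logistic function times the slide's coefficient */
    me_ln_assets = 1.1059 * p_hat * (1 - p_hat);
run;

proc means data=work.logit_me mean min max maxdec=3;
    var me_ln_assets;
run;
```

If the assumption holds, the mean of me_ln_assets should be close to the mean of Meff_P2_ln_assets from PROC QLIM.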
Marginal effects III
The results:
On average:
• having positive RE/equity ratio increases the probability of
paying dividends by 0.106
• ln(assets) greater by 1 unit increases the probability of paying
dividends by 0.073.
These values are smaller than those from the LPM (0.25610 &
0.09350).
Model fit statistics
If we estimate several logit models, how do we know which one is
the best?
AIC (Akaike Information Criterion) and SC (Schwarz Criterion) are
used to compare two or more models and pick the best one.
The model with the lowest AIC and SC values is preferred. Both criteria:
• penalise additional independent variables and
• reward a better fit to the data.
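The trade-off between fit and model size is visible in the standard definitions of the two criteria. The symbols below (L for the maximised likelihood, k for the number of estimated parameters including the intercept, n for the number of observations) are introduced here and are not used on the slides.

```latex
\[
\text{AIC} = -2 \ln L + 2k,
\qquad
\text{SC} = -2 \ln L + k \ln(n).
\]
```

A better fit lowers −2 ln L, while every extra parameter adds a penalty of 2 under AIC and ln(n) under SC, so SC penalises model size more heavily once n exceeds about 7 (ln(n) > 2).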
Model fit statistics II
Let’s estimate 4 logit models and compare their fit statistics:
/* Model 1: RE/equity dummy and ln(assets) */
PROC LOGISTIC DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = re_equity_d ln_assets;
RUN;
/* Model 2: RE/equity dummy and assets in levels */
PROC LOGISTIC DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = re_equity_d assets;
RUN;
/* Model 3: RE/equity dummy only */
PROC LOGISTIC DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = re_equity_d;
RUN;
/* Model 4: ln(assets) only */
PROC LOGISTIC DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = ln_assets;
RUN;
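Rather than reading AIC and SC off each listing, the fit statistics can be captured in datasets and stacked. The sketch below does this for two of the four models; the dataset names (work.fit_m1, work.fit_m3, work.fit_compare) and the variable model_id are made up, and the column names Criterion, InterceptOnly and InterceptAndCovariates are assumed to be those of the PROC LOGISTIC FitStatistics ODS table.

```sas
/* Sketch: collect the fit statistics of two models in one table. */
ods output FitStatistics=work.fit_m1;
proc logistic data=work.asx;
    model dividend_payer (event='1') = re_equity_d ln_assets;
run;

ods output FitStatistics=work.fit_m3;
proc logistic data=work.asx;
    model dividend_payer (event='1') = re_equity_d;
run;

data work.fit_compare;
    length model_id $8;
    set work.fit_m1 (in=from_m1) work.fit_m3;
    if from_m1 then model_id = 'Model 1';
    else model_id = 'Model 3';
run;

proc print data=work.fit_compare noobs;
    var model_id Criterion InterceptOnly InterceptAndCovariates;
run;
```

The same pattern extends to the other two models by adding one ODS OUTPUT statement per PROC LOGISTIC call.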
Model fit statistics III
Summary of results:
• Model 1 is the best (its indep. var.:
RE/equity dummy and ln(assets)).
• Model 3 is the worst (its indep.
var.: RE/equity dummy).
• Models with covariates are better
than the model without any
covariate.
In general, AIC and SC lead to the same
model being selected.
Probit models
The key difference between probit and logit models is their
functional form.
For logit (logistic) models, it is the cumulative standard logistic
distribution function:
\[ p = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}. \]
For probit models, it is the cumulative standard normal probability
distribution function:
\[ p = \Phi(\beta_0 + \beta_1 x_1 + \beta_2 x_2). \]
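The two functional forms can be compared side by side. The sketch below evaluates both cumulative distribution functions on the same grid of index values; the dataset name work.cdf_compare and the variable names are made up for the illustration.

```sas
/* Sketch: logistic vs standard normal CDF at the same index value. */
data work.cdf_compare;
    do z = -3 to 3 by 1;
        p_logit  = exp(z) / (1 + exp(z));   /* standard logistic CDF */
        p_probit = probnorm(z);             /* standard normal CDF   */
        output;
    end;
run;

proc print data=work.cdf_compare noobs;
    format p_logit p_probit 6.4;
run;
```

Both curves are S-shaped and equal 0.5 at z = 0, but the logistic distribution has the larger variance (π²/3 versus 1), so for the same data logit coefficients are typically larger in magnitude than probit coefficients while the predicted probabilities stay close.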
Probit models II
Now let’s estimate a probit model using the same dataset:
PROC PROBIT DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = re_equity_d ln_assets;
RUN;
or
PROC LOGISTIC DATA=work.asx;
    MODEL dividend_payer (EVENT='1') = re_equity_d ln_assets
        / LINK=PROBIT;
RUN;
Probit models III to V
[Figures: probit estimation output, not reproduced here.]
Probit models VI
Coefficient estimates for firm size and for positive RE/equity ratio
dummy are significantly positive.
The results suggest that larger firms as well as firms with positive
RE/equity ratio are more likely to pay dividends.
Interpretation of the coefficients in probit regression is not as
straightforward as the interpretations of coefficients in LPM or logit
models.
If the LOGISTIC procedure is used, then one gets AIC and SC values
that can be used to identify the best model (which is the one with
the lowest AIC and SC values).
Probit models VII
To compute the predicted probability of paying dividends for a
particular firm, we simply need to plug in its ln(assets) and
RE/equity dummy value in the equation below:
\[ p = \Phi(-11.7732 + 0.8823 \times \text{RE/equity dummy} + 0.5767 \times \ln(\text{assets})), \]
where Φ is the cumulative standard normal distribution function
(Excel function: NORM.S.DIST with the cumulative argument set to TRUE).
Suppose ln(assets)=20 and RE/equity dummy=1, then p=0.740.
Suppose ln(assets)=17 and RE/equity dummy=0, then p=0.024.
Suppose ln(assets)=20 and RE/equity dummy=0, then p=0.405.
Suppose ln(assets)=17 and RE/equity dummy=1, then p=0.139.
The results are very similar to those of the logit model.
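The probit probabilities can be reproduced in SAS with the PROBNORM function, which returns the standard normal CDF. The sketch below hard-codes the probit coefficients from the formula above; the dataset name work.probit_pred and the variable names are made up for the illustration.

```sas
/* Sketch: predicted probabilities from the estimated probit model. */
data work.probit_pred;
    b0 = -11.7732;  b_re = 0.8823;  b_size = 0.5767;
    input ln_assets re_equity_d;
    z = b0 + b_re*re_equity_d + b_size*ln_assets;   /* probit index */
    p = probnorm(z);                                /* Phi(z)       */
    datalines;
20 1
17 0
20 0
17 1
;
run;

proc print data=work.probit_pred noobs;
    var ln_assets re_equity_d z p;
    format z 8.4 p 6.3;
run;
```

The p column should show 0.740, 0.024, 0.405 and 0.139, matching the four cases listed above.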
Marginal effects
The SAS procedure PROBIT does not compute marginal effects either;
thus, we can use the procedure QLIM:
PROC QLIM DATA=work.asx;
    MODEL dividend_payer = re_equity_d ln_assets
        / discrete(d=probit);
    OUTPUT OUT=work.marginal_effects2 MARGINAL;
RUN;
PROC MEANS DATA=work.marginal_effects2 mean min max
    maxdec=3;
    VAR Meff_P2_re_equity_d Meff_P2_ln_assets;
    title 'Average of the Individual Marginal Effects (Probit Model)';
RUN;
Marginal effects II
The results:
On average:
• having positive RE/equity ratio increases the probability of
paying dividends by 0.109
• ln(assets) greater by 1 unit increases the probability of paying
dividends by 0.071.
These values are almost identical to those from the logit model.
Probit models VIII
The probit results are very similar to those of the logit model.
Logit and probit models lead to essentially the same results in most
cases.
Logit models tend to converge a little bit faster.
=⇒ Thus, use either of them.
Required reading
Konasani, V. R. and Kadre, S. (2015). “Practical Business
Analytics Using SAS: A Hands-on Guide”: chapter 11.