Introduction Data Model Predictions Other issues
CORPFIN 2503 – Business Data Analytics:
Applications of multinomial logit models
čius
Week 6: August 30th, 2021
čius CORPFIN 2503, Week 6 1/35
Outline
Introduction
Data
Model
Predictions
Other issues
Introduction
The dependent variable in logit regressions has 2 possible
outcomes (e.g., to pay dividends or not).
The dependent variable in multinomial logit regressions has more
than 2 possible unordered outcomes.
E.g., for corporate bond issues:
• currency: AUD, USD, EUR, GBP
• coupon: 0, fixed, variable
• credit rating agency: Moody’s, Fitch, S&P.
E.g., buying a car:
• type: sedan, truck, SUV
• brand: BMW, Tesla, Toyota
• colour: blue, white, red.
Multinomial logit regressions
Suppose we have n possible outcomes.
One outcome is chosen as the “base” outcome.
Each of the other n − 1 outcomes is separately compared against the
“base” outcome.
Thus, n − 1 logit equations are estimated.
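In symbols, the setup above can be sketched as follows. The notation is an assumption made for illustration: outcomes are labelled 0, 1, ..., n − 1 with 0 as the base, and there are K predictors X_1, ..., X_K.

```latex
\[
\ln\frac{\Pr(Y=j)}{\Pr(Y=0)} = \alpha_j + \sum_{k=1}^{K} \beta_{jk} X_k,
\qquad j = 1, \dots, n-1 .
\]
```

Each non-base outcome j gets its own intercept and its own set of slope coefficients.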
Multinomial logit regressions II
Let’s assume that we have 3 possible outcomes (Y = 0, 1, 2). Base
outcome is Y = 0.
There are 2 independent variables: X1 and X2.
Then the following models will be estimated:
\[
\ln\frac{\Pr(Y=1)}{\Pr(Y=0)} = \alpha_1 + \beta_1 X_1 + \gamma_1 X_2,
\]
\[
\ln\frac{\Pr(Y=2)}{\Pr(Y=0)} = \alpha_2 + \beta_2 X_1 + \gamma_2 X_2.
\]
Data
US corporate bond data from Workshop 4.
Variables of interest:
• Currency
• Maturity
• Credit rating.
Goal:
To predict a bond’s currency using its maturity and credit rating.
In other words:
Does a bond’s currency depend on its maturity and credit rating?
Data II
Let’s consider 3 currencies: USD, EUR, and other.
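data work.bonds;
    set work.bonds;
    /* start from the original currency label */
    currency2 = currency;
    /* pool the less common currencies into one category */
    if currency in ("Australia" "British P" "Canadian"
        "Swiss Fra") then currency2 = "Other";
run;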
currency2 will be our dependent variable.
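For readers who want to check the recoding logic outside SAS, here is a minimal Python sketch of the same rule. The function name `recode_currency` and the short sample list are hypothetical and only illustrate the mapping; the currency labels are the ones used in the SAS code.

```python
# Labels that the SAS data step pools into "Other".
OTHER_LABELS = {"Australia", "British P", "Canadian", "Swiss Fra"}


def recode_currency(currency):
    """Return "Other" for the pooled currencies, else the original label."""
    return "Other" if currency in OTHER_LABELS else currency


# Illustrative labels only, not rows from the bond data set.
sample = ["US Dollar", "Euro", "British P", "Canadian"]
currency2 = [recode_currency(c) for c in sample]
print(currency2)
```

Any label outside the pooled set passes through unchanged, which is what leaves three categories in total.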
Data III
Let’s generate a frequency distribution table for currency2.
proc freq data=work.bonds;
tables currency2;
run;
Data IV
Frequency distribution table for currency2:
Data V
The independent variables include:
1. bond maturity ln_maturity2
2. credit rating dummy cr_rating_d.
SAS code:
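data work.bonds;
    set work.bonds;
    /* time to maturity in years, measured from today's date */
    maturity2 = (maturity - today()) / 365;
    ln_maturity2 = log(maturity2);
    /* dummy = 1 for S&P ratings of A- or better */
    cr_rating_d = 0;
    if s_p in ("AAA" "AA+" "AA" "AA-" "A+" "A" "A-")
        then cr_rating_d = 1;
run;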
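The variable construction above can be mirrored in a few lines of Python, which may help when checking individual values by hand. This is a sketch under assumptions: the function `prepare_bond` and its `as_of` argument are hypothetical (SAS uses the run date via today()), and the example dates are made up for illustration.

```python
import math
from datetime import date

# S&P ratings treated as "good" in the SAS code (A- or better).
HIGH_RATINGS = {"AAA", "AA+", "AA", "AA-", "A+", "A", "A-"}


def prepare_bond(maturity_date, s_p, as_of):
    """Return (maturity2, ln_maturity2, cr_rating_d) for one bond."""
    years = (maturity_date - as_of).days / 365
    # The log is undefined for bonds that have already matured;
    # None stands in for the missing value SAS would produce.
    ln_years = math.log(years) if years > 0 else None
    dummy = 1 if s_p in HIGH_RATINGS else 0
    return years, ln_years, dummy


# Hypothetical bond maturing ten years after the lecture date.
years, ln_years, dummy = prepare_bond(date(2031, 8, 30), "AA-", as_of=date(2021, 8, 30))
print(years, ln_years, dummy)
```

Because the SAS version uses today(), maturity2 and ln_maturity2 change with the date on which the code is run.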
Data VI
Let’s get descriptive statistics of bond maturity:
PROC MEANS DATA=work.bonds mean std min p25 median
p75 max maxdec=3;
VAR maturity2 ln_maturity2;
title 'Descriptive statistics of bond maturity';
RUN;
Data VII
Data VIII
Let’s generate a two-way table for currency and credit rating
dummy:
proc freq data=work.bonds;
tables currency2*cr_rating_d / norow nocol nopercent;
title 'Two-way table for currency and credit rating dummy';
run;
Data IX
Model
We assume that currency2 = f (ln_maturity2, cr_rating_d).
SAS code for multinomial logit model:
proc logistic data = work.bonds;
model currency2 = ln_maturity2 cr_rating_d
/ link = glogit;
output out=work.bonds_pred predprobs=(individual);
run;
Alternatively, we could use SAS procedure CATMOD.
The results
SAS produces a few tables:
The results II
More tables:
The results III
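This tests the null hypothesis that the coefficients on all predictors
in both models are zero. Rejecting it means that at least one predictor
in at least one of the models has a non-zero coefficient.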
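One common form of this test is the likelihood ratio test, sketched below. It is an assumption that this is the statistic of interest here (SAS typically reports score and Wald versions alongside it); the symbols L_0 and L_1 are introduced only for this sketch, and α, β, γ follow the two-equation model defined earlier.

```latex
\[
H_0:\ \beta_1 = \gamma_1 = \beta_2 = \gamma_2 = 0,
\qquad
LR = -2\left(\ln L_0 - \ln L_1\right) \sim \chi^2_{4} \ \text{under } H_0,
\]
```

where L_0 is the maximised likelihood of the intercept-only model and L_1 that of the full model. The 4 degrees of freedom come from 2 predictors in each of the 2 equations.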
The results IV
Not over yet . . .
Null hypothesis:
There is no relation between the predictor variable and the outcome
(i.e., the coefficients on the predictor in both of the fitted models
are 0).
If the p-value is less than the specified α (e.g., 0.1), then this null
hypothesis can be rejected.
The results V
Main results:
Coefficient on ln_maturity2 (EUR vs. USD) = −1.7201:
• bonds with shorter maturities are more likely to be in EUR than in USD
• a 1-unit increase in ln_maturity2 is associated with a 1.7201 decrease
in the relative log odds of issuing the bond in EUR vs. USD.
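Because the regressor is the log of maturity, a 1-unit increase in ln_maturity2 corresponds to multiplying maturity by e ≈ 2.718. As an illustration that is not on the original slide, the effect of doubling maturity can be worked out from the same coefficient:

```latex
\[
\Delta \ln\frac{\Pr(\text{EUR})}{\Pr(\text{USD})}
= -1.7201 \times \ln 2 \approx -1.192,
\qquad
\text{odds factor} = 2^{-1.7201} \approx 0.30 .
\]
```

Under this model, doubling a bond's maturity reduces the odds of EUR vs. USD to roughly 30% of their previous level, holding the credit rating dummy fixed.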
The results VI
Main results:
Coefficient on ln_maturity2 (other currency vs. USD) = −1.0010:
• bonds with shorter maturities are more likely to be in another
currency than in USD
• a 1-unit increase in ln_maturity2 is associated with a 1.0010 decrease
in the relative log odds of issuing the bond in another currency vs. USD.
The results VII
Main results:
cr_rating_d = 1 if a firm has a good credit rating (S&P rating of A− or
better).
Coefficient on cr_rating_d (EUR vs. USD) = 0.6414:
• bonds with good credit ratings are more likely to be in EUR than in USD
• a good credit rating increases the relative log odds of issuing the
bond in EUR vs. USD by 0.6414.
The results VIII
Main results:
cr_rating_d = 1 if a firm has a good credit rating.
Coefficient on cr_rating_d (other currency vs. USD) = −0.5957:
• bonds with good credit ratings are less likely to be in another
currency than in USD, but the impact is statistically insignificant
• a good credit rating decreases the relative log odds of issuing the
bond in another currency vs. USD by 0.5957.
The results IX
Last table (odds ratios):
ln_maturity2:
• 0.179; if ln_maturity2 increases by 1 unit, the odds of issuing the
bond in EUR vs. USD are expected to be multiplied by 0.179, i.e., to
decrease by about 82% (e^{−1.7201} = 0.179)
• 0.367; if ln_maturity2 increases by 1 unit, the odds of issuing the
bond in another currency vs. USD are expected to be multiplied by 0.367,
i.e., to decrease by about 63% (e^{−1.0010} ≈ 0.367).
The results X
Last table (odds ratios):
cr_rating_d:
• 1.899; a good credit rating is expected to increase the odds of
issuing the bond in EUR vs. USD by a factor of 1.899 (e^{0.6414} = 1.899)
• 0.551; a good credit rating is expected to multiply the odds of
issuing the bond in another currency vs. USD by 0.551
(e^{−0.5957} = 0.551). The impact is statistically insignificant.
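The odds ratios in the last table are the exponentiated coefficients, which can be checked with a short Python sketch. The dictionary layout and names are hypothetical; the four coefficients are the rounded values shown on the earlier results slides.

```python
import math

# Rounded coefficients from the results slides:
# (predictor, outcome compared against USD) -> coefficient
coefs = {
    ("ln_maturity2", "Euro"): -1.7201,
    ("ln_maturity2", "Other"): -1.0010,
    ("cr_rating_d", "Euro"): 0.6414,
    ("cr_rating_d", "Other"): -0.5957,
}

# An odds ratio is exp(coefficient).
odds_ratios = {key: math.exp(b) for key, b in coefs.items()}

for (predictor, outcome), ratio in odds_ratios.items():
    print(f"{predictor}, {outcome} vs. USD: {ratio:.4f}")
```

Because the inputs are rounded to four decimals, the results agree with the reported odds ratios only up to rounding in the last digit.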
Predictions
SAS generated the work.Bonds_pred data set, which includes the following
new variables:
_FROM_: the actual value of the dependent variable (currency2)
_INTO_: the predicted value of the dependent variable
IP_Euro: the predicted probability that currency2 = "Euro"
IP_Other: the predicted probability that currency2 = "Other"
IP_US_Dollar: the predicted probability that currency2 = "US Dollar".
IP_Euro + IP_Other + IP_US_Dollar = 1.
_INTO_ is equal to the outcome whose predicted probability is the
highest.
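The rule for _INTO_ is simply "pick the outcome with the largest predicted probability". A minimal Python sketch of that rule follows; the dictionary is a hypothetical stand-in for one row of the output, filled with the probabilities from the worked example for the first observation in these slides.

```python
# Predicted probabilities for one bond (first observation in the slides).
probs = {"Euro": 0.1427, "Other": 0.0458, "US Dollar": 0.8115}

# The predicted outcome is the key with the highest probability.
predicted = max(probs, key=probs.get)
print(predicted, sum(probs.values()))
```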
work.Bonds_pred
Predictions II
Model with 2 independent variables and 3 possible outcomes:
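\[
\ln\frac{\Pr(Y=1)}{\Pr(Y=0)} = \alpha_1 + \beta_1 X_1 + \gamma_1 X_2,
\]
\[
\ln\frac{\Pr(Y=2)}{\Pr(Y=0)} = \alpha_2 + \beta_2 X_1 + \gamma_2 X_2.
\]
Predicted probabilities are:
\[
\Pr(Y=1) = \frac{e^{\alpha_1 + \beta_1 X_1 + \gamma_1 X_2}}
{1 + e^{\alpha_1 + \beta_1 X_1 + \gamma_1 X_2} + e^{\alpha_2 + \beta_2 X_1 + \gamma_2 X_2}},
\]
\[
\Pr(Y=2) = \frac{e^{\alpha_2 + \beta_2 X_1 + \gamma_2 X_2}}
{1 + e^{\alpha_1 + \beta_1 X_1 + \gamma_1 X_2} + e^{\alpha_2 + \beta_2 X_1 + \gamma_2 X_2}},
\]
\[
\Pr(Y=0) = 1 - \Pr(Y=1) - \Pr(Y=2).
\]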
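The step from the log-odds equations to the probability formulas can be sketched as follows. The shorthand η_j is an assumption introduced here for the linear part of equation j; it does not appear on the slides.

```latex
\[
\eta_j = \alpha_j + \beta_j X_1 + \gamma_j X_2
\quad\Longrightarrow\quad
\Pr(Y=j) = \Pr(Y=0)\, e^{\eta_j}, \qquad j = 1, 2 .
\]
\[
\Pr(Y=0) + \Pr(Y=1) + \Pr(Y=2) = 1
\quad\Longrightarrow\quad
\Pr(Y=0)\left(1 + e^{\eta_1} + e^{\eta_2}\right) = 1
\quad\Longrightarrow\quad
\Pr(Y=0) = \frac{1}{1 + e^{\eta_1} + e^{\eta_2}} .
\]
```

Substituting this expression for Pr(Y = 0) back into Pr(Y = j) = Pr(Y = 0) e^{η_j} gives the formulas for Pr(Y = 1) and Pr(Y = 2).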
Predictions III
Let’s predict the probabilities for the first observation:
ln_maturity2 = 2.0808104673 ≈ 2.081
cr_rating_d = 0.
Predicted probabilities are:
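\[
\Pr(Y=\text{Euro}) = \frac{e^{1.8417 - 1.7201 \times 2.081 + 0.6414 \times 0}}
{1 + e^{1.8417 - 1.7201 \times 2.081 + 0.6414 \times 0} + e^{-0.7925 - 1.0010 \times 2.081 - 0.5957 \times 0}} = 0.1427,
\]
\[
\Pr(Y=\text{Other}) = \frac{e^{-0.7925 - 1.0010 \times 2.081 - 0.5957 \times 0}}
{1 + e^{1.8417 - 1.7201 \times 2.081 + 0.6414 \times 0} + e^{-0.7925 - 1.0010 \times 2.081 - 0.5957 \times 0}} = 0.0458,
\]
\[
\Pr(Y=\text{US Dollar}) = 1 - 0.1427 - 0.0458 = 0.8115.
\]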
Predictions IV
Pr(Y = "US Dollar") is the highest.
Thus, the predicted currency is US Dollar.
Our calculations are consistent with the results in the
work.Bonds_pred data set.
Predictions V
Let’s predict the probabilities for the bond with:
ln_maturity2 = 1.4
cr_rating_d = 1.
Predicted probabilities are:
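\[
\Pr(Y=\text{Euro}) = \frac{e^{1.8417 - 1.7201 \times 1.4 + 0.6414 \times 1}}
{1 + e^{1.8417 - 1.7201 \times 1.4 + 0.6414 \times 1} + e^{-0.7925 - 1.0010 \times 1.4 - 0.5957 \times 1}} = 0.5038,
\]
\[
\Pr(Y=\text{Other}) = \frac{e^{-0.7925 - 1.0010 \times 1.4 - 0.5957 \times 1}}
{1 + e^{1.8417 - 1.7201 \times 1.4 + 0.6414 \times 1} + e^{-0.7925 - 1.0010 \times 1.4 - 0.5957 \times 1}} = 0.0287,
\]
\[
\Pr(Y=\text{US Dollar}) = 1 - \Pr(Y=\text{Euro}) - \Pr(Y=\text{Other}) \approx 0.4674
\]
(using unrounded probabilities).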
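Both worked examples can be reproduced with a short Python sketch. The function `predict_probs` and the data layout are hypothetical; the coefficients are the rounded estimates shown on the slides, with US Dollar as the base outcome.

```python
import math

# Rounded estimates from the slides: (intercept, ln_maturity2, cr_rating_d).
COEFS = {
    "Euro": (1.8417, -1.7201, 0.6414),
    "Other": (-0.7925, -1.0010, -0.5957),
}
BASE = "US Dollar"  # base outcome; its linear index is fixed at 0


def predict_probs(ln_maturity2, cr_rating_d):
    """Return predicted probabilities for all three currencies."""
    exp_scores = {
        outcome: math.exp(a + b * ln_maturity2 + g * cr_rating_d)
        for outcome, (a, b, g) in COEFS.items()
    }
    denom = 1.0 + sum(exp_scores.values())
    probs = {outcome: s / denom for outcome, s in exp_scores.items()}
    probs[BASE] = 1.0 / denom
    return probs


# First observation, then the bond with ln_maturity2 = 1.4 and a good rating.
for inputs in [(2.081, 0), (1.4, 1)]:
    p = predict_probs(*inputs)
    best = max(p, key=p.get)
    print(inputs, {k: round(v, 4) for k, v in p.items()}, "->", best)
```

Computing Pr(US Dollar) as 1/denominator rather than by subtracting rounded values avoids small rounding differences in the last digit.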
Predictions VI
Pr(Y = ”Euro”) is the highest:
Thus, the predicted currency is Euro.
Marginal effects
What about marginal effects?
It is possible to compute them, but there is no built-in command in SAS.
⇒ We do not need to know how.
Multinomial probit model
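Multinomial probit model:
• is similar to the multinomial logit model
• is often regarded as superior to the multinomial logit model because
it does not impose the independence of irrelevant alternatives (IIA)
assumption
• does not allow predictions to be made easily.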
Ordered logit/probit models
If the values of the dependent variable can be ordered (e.g., low,
medium, high) then one should use:
• ordered logit/probit models
• OLS if there are 5 or more categories (not an ideal method)
• multinomial logit models (but one would fail to use some of
the information available).