CORPFIN 2503 – Business Data Analytics
2021 S2, Workshop 4: Applications of multivariate regressions
£ius
1 Corporate bond prices
Let’s download corporate bond prices and related information from Eikon using the
following procedure:
1. in Eikon search window, type `corporate bond prices’ and select app `Corporate
Bond Prices’ (CBPX) from a few suggestions
2. apply the following �lters (some �lters are not shown in the default view):
(a) Sector: Consumer Goods, Manufacturing, Telephone, Transportations
(b) Domicile: United States
(c) Amount Outstanding: >= U$350,000,000
(d) Coupon: >= 0.5
(e) Maturity: > 31-August-2029
3. make sure that the following �lters are visible on the left hand side:
(a) Currency
(b) Convertible
(c) Credit rating
(d) Callable
(e) Putable
(f) Issue date
(g) Seniority
1
4. click `Update’ button on left bottom corner in order to apply the selected
�lters
5. you should end up with around 1,000 bond issues
6. click the Excel icon on the right top corner in order to download the bond
price information as an Excel �le.
2 Multiple linear regression and multicollinearity
Multicollinearity is a phenomenon due to a high interdependency between the inde-
pendent variables. If we include highly correlated independent variables in the same
regression model, then this could cause multicollinearity. Let’s use SAS provided
car data to investigate the relation between car price and car length and to explore
the consequences of multicollinearity.
/* Creating data file: */
DATA work.car_data;
SET SAShelp.Cars;
RUN;
/* Correlation: */
proc corr data=work.car_data;
var invoice enginesize cylinders horsepower MPG_City MPG_Highway
Length weight wheelbase;
run;
A lot of variables are highly correlated. Let’s estimate �rst a simple regression
where the dependent variable is car price and independent variable is car length:
/* OLS regression: */
PROC REG DATA=work.car_data;
MODEL invoice=length;
RUN;
Now let’s include `Horsepower’ as an additional independent variable. What are
the di�erences in results?
2
PROC REG DATA=work.car_data;
MODEL invoice=length Horsepower;
RUN;
Then re-estimate the regression with `EngineSize’ as an additional independent
variable. Are the results di�erent?
PROC REG DATA=work.car_data;
MODEL invoice=length Horsepower EngineSize;
RUN;
Lastly, include `Wheelbase’ as an additional independent variable:
PROC REG DATA=work.car_data;
MODEL invoice=length Horsepower EngineSize Wheelbase;
RUN;
Have the results changed again?
3 Corporate bond yields (at home)
Suppose you are a bond analyst and you have been asked to estimate the yield for
a bond with the following characteristics:
� maturity: 10 years
� coupon: 1%
� amount: $1,000,000,000
� currency: US dollars
� seniority: senior unsecured
� S&P credit rating: AA
� sector: Food Processors
� domicile: USA
� convertible: no
3
� callable: yes
� puttable: no.
Suggested steps:
1. create a copy of the excel �le generated in Task #1 and use it in the following
steps
2. the following variables�`Bid’, `B Yld’, `Date’, and `Time’�are formulas; copy
and paste them as values
3. delete the following columns:
� Issuer Name
� RIC
� Bid
� Date
� Time
� Domicile
� Moody’s (Issue)
� Fitch (Issue)
� S&P (Issuer)
� Moody’s (Issuer)
� Fitch (Issuer)
rename the remaining columns as
� coupon
� maturity
� isin
� yield
� sector
� currency
4
� issue_date
� convertible
� credit_rating
� callable
� putable
� seniority
� amount
4. make sure that `amount’ variable values are in `General’ format; that is, com-
mas are not used to indicate thousands, millions etc.
5. save �le in CSV format (e.g., Bonds.csv) and import it into SAS
6. tabulate all categorical variables to see their distributions
7. in SAS create the following variables:
(a) years to maturity:
� SAS code: maturity2=(maturity-today())/365;
(b) amount in billions of USD (amount2)
(c) a natural logarithm of amount (ln_amount)
(d) a dummy if bond’s currency is USD
(e) a dummy if a bond is convertible
(f) a dummy if a bond is callable
(g) a dummy if a bond is putable
(h) a dummy if `seniority’ is `Senior Unsecured’
(i) the following dummy variables:
� aaa_d if credit rating is AAA; SAS code:
aaa_d=0;
if credit_rating=”AAA” then aaa_d=1;
� aa_d if credit rating is AA+, AA, AA-; SAS code:
aa_d=0;
if credit_rating in (“AA+” “AA” “AA-“) then aa_d=1;
5
� a_d if credit rating is A+, A, A-
� bbb_d if credit rating is BBB+, BBB, BBB-
� bb_d if credit rating is BB+, BB, BB-
� b_d if credit rating is B+, B, B-
8. check whether the dependent variable and potential independent variables have
outliers
9. estimate multiple regression model where the dependent variable is yield and
the independent variables are: amount (amount2), coupon, maturity2, and
newly created dummy variables
10. re-estimate multiple regression model but this time without outliers (i.e., you
may use WHERE statement; e.g., WHERE yield>0;)
11. identify the di�erences in results between the two regressions in terms of co-
e�cient estimates, their signi�cance, and R-squared
12. re-estimate multiple regression model without outliers but controlling for sec-
tor �xed e�ects
13. use the obtained coe�cient estimates to predict the yield for our bond
14. repeat the last two steps using ln_amount instead of amount2; in this case,
you do not need to exclude observations with outliers in amount (if any).
6
Corporate bond prices
Multiple linear regression and multicollinearity
Corporate bond yields (at home)