FM 9528 – Banking Analytics Coursework 2
Coursework 2 – Credit Risk Analytics
Credit card lending is one of the most common offerings in modern Fintechs. Traditionally granted by a
bank, these products are now often granted by a Fintech that acts as a front for a bank that actually
takes the risk. Deciding whom to grant these services to is an interesting problem under these
circumstances.
In this coursework, you will develop a fully compliant probability of default (PD) model from the data
made available, going from the raw data to the level 2 calibration, using what you have learned in the
lectures. The objective of the coursework is to estimate the capital requirements for the credit card
company as if it were a bank.
You are given information on approximately 50,000 credit cards. The data includes information
from credit card applications made in Brazil during 2007, some of which can be used to predict
the performance of the loan. The variable description is available in the Excel file
“CC_VariablesList.xls”.
With this information, the dataset, and your knowledge from the course, answer the following
questions:
1. (15%) Clean the dataset so it is ready to apply models to it. Discuss all your decisions. Design
three variables from the dataset that you think could improve prediction (using e.g. ratios,
averages, trends, aggregations, etc.). Please note that normal cleaning does not count. Explain
your rationale for choosing those variables.
2. (15%) Calculate the WoE and perform the variable selection procedures you see fit. Explain
your decisions.
3. (20%) Construct a scorecard which can model the probability of default for the credit card
applications. Discuss your choice of variables, embedded selection methods, choice of
parameters of these and your final performance in terms of AUC. How many variables do
you recommend using?
4. (20%) Compare your scoring model with an XGBoost model and a Random Forest model
trained over the data without the WoE transformation. Use cross-validation to determine
your optimal parameters if necessary, and discuss the accuracy metrics you deem relevant.
Compare the performance of the three models and discuss your findings.
5. (10%) Discuss the variable importance for all models. Do they agree? Why?
6. (10%) Assume the company gives every approved borrower a credit card with a one-month
salary credit limit. Furthermore, assume that the interest rate the credit card charges is 20%
per year over the used limit, with an average utilization of 32% of the approved limit. Design
a two-cut-off point strategy for your scorecard and discuss its results.
7. (Extra credit, 15%. See the extra submission tab in OWL.) Using the monthly macroeconomic
information you consider relevant (see for example
https://tradingeconomics.com/brazil/indicators), calibrate a long-run PD for the credit
cards granted. For this, first segment your scorecard curve into 7 to 15 groups, then regress
your monthly PDs (grouped from your objective variable) against the macroeconomic
variables and the past PDs as discussed in the additional material left in OWL. Use the long-term
forecasts you can find online from reputable sources for your long-term calibrated
values. If you cannot find them, assume a value which makes sense to you and explain why.
Analyse your results.
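As an illustration of the Weight of Evidence (WoE) step in question 2, the following is a minimal sketch, not a prescribed method. It assumes the feature has already been binned, that the target is coded 1 for default and 0 otherwise, and that WoE is defined as ln(% good / % bad); the opposite sign convention is also common, so follow the one used in the lectures. The function name `woe_table`, the smoothing parameter `eps`, and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def woe_table(feature: pd.Series, target: pd.Series, eps: float = 0.5) -> pd.DataFrame:
    """WoE and Information Value contribution per category of a binned feature.

    target: 1 = default (bad), 0 = non-default (good).
    eps: count added to every cell so empty cells do not produce log(0)
         (an assumption made for this sketch; set eps=0.0 to disable).
    """
    # Count goods and bads per category; make sure both columns exist.
    counts = pd.crosstab(feature, target).reindex(columns=[0, 1], fill_value=0)
    good = counts[0] + eps
    bad = counts[1] + eps

    # Share of all goods / all bads that falls in each category.
    dist_good = good / good.sum()
    dist_bad = bad / bad.sum()

    woe = np.log(dist_good / dist_bad)
    iv = (dist_good - dist_bad) * woe
    return pd.DataFrame({"good": counts[0], "bad": counts[1], "woe": woe, "iv": iv})


# Toy data, made up purely for illustration (not from the coursework dataset).
toy = pd.DataFrame(
    {
        "bin": ["a", "a", "a", "a", "b", "b", "b", "b"],
        "default": [0, 0, 0, 1, 0, 1, 1, 1],
    }
)
table = woe_table(toy["bin"], toy["default"], eps=0.0)
print(table)
print("Information Value:", table["iv"].sum())
```

Summing the `iv` column gives the Information Value of the feature, which is one possible input to the variable selection that question 2 asks you to justify.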
The remaining 10% is given by the format and style as discussed in the rubric.
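For question 6, the per-borrower arithmetic implied by the stated figures can be sketched as follows. The 20% annual rate, the 32% utilization, and the one-month-salary limit come from the question. The one-year horizon, the loss given default (`lgd`), the rule that a defaulting borrower pays no interest, and the function names are assumptions made only for this illustration; the brief does not specify them.

```python
ANNUAL_RATE = 0.20  # interest charged per year over the used limit (from question 6)
UTILIZATION = 0.32  # average share of the approved limit that is used (from question 6)


def expected_annual_profit(monthly_salary: float, pd_estimate: float, lgd: float = 1.0) -> float:
    """Expected one-year profit for one approved borrower.

    Assumptions (not in the brief): a one-year horizon, a defaulting borrower
    pays no interest and loses lgd * used balance, and funding costs are ignored.
    """
    limit = monthly_salary          # credit limit equals one month of salary
    used = UTILIZATION * limit      # average balance drawn
    interest = ANNUAL_RATE * used   # yearly interest if the borrower does not default
    return (1.0 - pd_estimate) * interest - pd_estimate * lgd * used


def break_even_pd(lgd: float = 1.0) -> float:
    """PD at which expected profit is zero under the assumptions above."""
    return ANNUAL_RATE / (ANNUAL_RATE + lgd)


# Hypothetical borrower: monthly salary of 1000 and an estimated PD of 5%.
print(expected_annual_profit(1000.0, 0.05))
print(break_even_pd())
```

Under these assumptions the break-even PD does not depend on salary, which gives one reference point when placing the two cut-offs; a different `lgd` or cost structure will move it.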
Conditions of the coursework
Software: You must use Python to run the numerical calculations over your portfolio. A copy of your
Jupyter notebook must be attached to the coursework as an appendix in readable format, and a link
to the notebook must also be included. Instructions on how to export to PDF can be found here:
https://stackoverflow.com/questions/52588552/google-co-laboratory-notebook-pdf-download.
The notebook text MUST be machine readable (so no screenshots of the notebook, please);
otherwise a 25% discount will apply.
Word Limit: 2000 words. A deviation of +/-10% either side of the word count is deemed acceptable. Any text
that exceeds the additional 10% will not attract any marks. The relevant word count includes items
such as cover page, executive summary, title page, table of contents, tables, figures, in-text citations
and section headings, if used. The relevant word count excludes your list of references and any
appendices at the end of your coursework submission (including the code).
You should always include the word count (from your word processor, not Turnitin) at the
end of your coursework submission, before your list of references.
Title/Cover Page: You must include a title/cover page that includes your Student ID, Course Code,
Assignment Title, and Word Count. This assignment will be marked anonymously; please ensure that
your name does not appear on any part of your assignment, otherwise a discount will be applied.
Submission Deadline: November 22nd, 23:59.
Turnitin Submission: The assignment MUST be submitted electronically via OWL. All required
papers may be subject to submission for textual similarity review to the commercial plagiarism
detection software under license to the University for the detection of plagiarism. All papers
submitted for such checking will be included as source documents in the reference database for the
purpose of detecting plagiarism of papers subsequently submitted to the system. Use of the service
is subject to the licensing agreement, currently between The University of Western Ontario and
Turnitin.com (http://www.turnitin.com).
Late Submission: Late submissions are possible up to seven days after the deadline. There is a linear
10% penalty per day of late submission (Final mark = Original mark – 10% × days late), subtracted directly
from the final mark. Submissions after the seven days are not accepted and will be considered a
non-submission.
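The late penalty rule can be illustrated with a short sketch. It assumes that marks are on a 0 to 100 scale, that "10%" means 10 percentage points per whole day late, and that the result is floored at zero; the function name `late_mark` and the use of `None` for a non-submission are hypothetical choices for this example, and the rubric remains the authority.

```python
from typing import Optional

PENALTY_PER_DAY = 10.0  # percentage points per day late (assumed reading of "10%")
MAX_LATE_DAYS = 7       # submissions later than this are not accepted


def late_mark(original_mark: float, days_late: int) -> Optional[float]:
    """Final mark after the linear late penalty, or None for a non-submission."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_LATE_DAYS:
        return None  # treated as a non-submission
    # Flooring at zero is an assumption; the brief does not state it.
    return max(0.0, original_mark - PENALTY_PER_DAY * days_late)


print(late_mark(80.0, 2))  # two days late
print(late_mark(80.0, 8))  # past the seven-day window
```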