
MANG 2043 – Analytics for Marketing

MAT012 – Credit Risk Scoring
Lecture 2b


This Lecture’s Learning Contents
Statistical methods for scorecard development
Classification methods in credit scoring
Simple linear regression
Fisher discriminant approach
Logistic regression
Other generalised linear approaches

Methods used in credit granting
Judgemental evaluation
5Cs: Character, Capital, Collateral, Capacity, Condition
Statistical methods
Naïve Bayes method
Discriminant analysis/linear regression
Logistic regression
Classification trees
Nearest neighbour methods
Maximising divergence
Operational research methods
Linear programming
Heuristic methods
Neural network algorithms
Support vector machines
Genetic algorithms

How scorecards are developed – statistical methods
Building a scoring system means classifying applicants into goods/bads.
Take a sample of previous applicants (for credit scoring) or previous borrowers (for behavioural scoring).
For each applicant in the sample, classify the subsequent credit history as:
Acceptable ('good')
Not acceptable ('bad' – e.g. missing 3 consecutive months of payments)
'Indeterminate' – these are ignored in the subsequent analysis
Divide the set of characteristics (attribute combinations), A, into:
AB: attributes of those who were bad
AG: attributes of those who were good
Accept those who gave answers in AG and reject those in AB (but this assumes the future is like the past).
Estimate the probability of being good for each attribute combination.
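The last step can be done directly from the sample counts. A minimal pandas sketch, assuming a hypothetical sample table with columns age_band, residence and a good flag (1 = good, 0 = bad), with indeterminates already dropped:

```python
import pandas as pd

# Hypothetical sample of past applicants with their subsequent outcome
# (1 = good, 0 = bad); indeterminates are assumed to have been removed already.
sample = pd.DataFrame({
    "age_band":  ["30-", "30+", "30-", "30+", "30-", "30+"],
    "residence": ["owner", "owner", "not owner", "not owner", "owner", "not owner"],
    "good":      [1, 1, 0, 1, 1, 0],
})

# Estimated probability of being good for each attribute combination
p_good = sample.groupby(["age_band", "residence"])["good"].mean()
print(p_good)
```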

Why build a linear scorecard?

Graph of a simple scorecard on age and income
[Figure: a linear cut-off in the age/income plane. A more complex boundary is a better classifier, but has lots more parameters; the linear cut-off is not perfect, but has only two parameters.]
e.g. Let the cut-off be a score of 30 (Age 25, Income £25k = Score 30):
Age 30, Income £50k = Score 60, so accept
Age 20, Income £10k = Score 26, so reject
In practice it is better to split continuous variables into bands.
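A minimal sketch of the two-parameter scorecard being discussed; the weights below are an assumption chosen only so that the two example scores (60 and 26) are reproduced, they are not the slide's actual scorecard:

```python
# Illustrative linear scorecard: score = w_age * age + w_income * income (in £k).
# The weights are assumptions, picked to reproduce the example scores above.
def linear_score(age_years: float, income_k: float,
                 w_age: float = 1.0, w_income: float = 0.6) -> float:
    return w_age * age_years + w_income * income_k

CUT_OFF = 30.0

for age, income in [(30, 50), (20, 10)]:
    score = linear_score(age, income)
    decision = "accept" if score >= CUT_OFF else "reject"
    print(f"age={age}, income=£{income}k -> score={score:.0f}, {decision}")
```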

What is a score?
Returning to the idea of estimating probability. Given characteristics x = (x1, x2, …, xq), we want to estimate p(G | x).
If the score s(x) is a sufficient statistic, it gives all the information that the data provide about the problem. Hence
P(G | x) = P(G | s(x))

e.g. if you know s(x), the detailed values of x do not provide any more information.

Example of a scorecard
Gives a score to each attribute (answer) of a characteristic (application form question).

If the score is 40 or above, accept; if below 40, reject.
Not all application form questions will be scored, and some other data – e.g. a credit bureau reference – may also be scored.
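A minimal sketch of applying such a points-based scorecard with the cut-off of 40; the characteristics and attribute points below are hypothetical, not taken from a real card:

```python
# Hypothetical points for each attribute (answer) of each characteristic (question).
scorecard = {
    "age_band":  {"30-": 10, "30+": 25},
    "residence": {"owner": 20, "not owner": 5},
    "bureau":    {"clean": 15, "adverse": -10},
}
CUT_OFF = 40

def total_score(applicant: dict) -> int:
    # Sum the points for the applicant's attribute on every scored characteristic.
    return sum(points[applicant[characteristic]]
               for characteristic, points in scorecard.items())

applicant = {"age_band": "30+", "residence": "owner", "bureau": "clean"}
score = total_score(applicant)
print(score, "accept" if score >= CUT_OFF else "reject")   # 60 -> accept
```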

Philosophy of credit scoring
It is predictive, not explanatory.
Anything that helps predict can be used (e.g. first letter of name, insurance).
It has to interact easily with the organisation's information systems,
so variables that are used must be easily and cheaply obtained and automatically updated.
A whole raft of systems is now used:
application (credit) scoring
performance (behavioural) scoring – set credit limits or allow extra credit products
collection scoring – which defaulters will repay and how to get them to repay
response scoring – who will respond to direct mailing
usage scoring – who will use a credit product
attrition scoring – who will change to another product

Log odds score
Define the log odds score by
s(x) = ln( p(G | x) / p(B | x) )

If pG, pB are the proportions of Goods and Bads in the population, and the characteristics x1, …, xq are independent, then
s(x) = ln( pG / pB ) + Σi ln( p(xi | G) / p(xi | B) ) = spop + Σi woe(xi)

This is the naïve Bayes score.
It does not usually work, because the xi are not independent,
but it shows how scores can be built up.

Naïve Bayes way & simple log odds scorecard
            Owner          Not owner      Total
            G      B       G      B       G      B
Age 30-     100    10      200    40      300    50
Age 30+     500    10      100    40      600    50
Total       600    20      300    80      900    100

Building simple log odds scorecard
Attribute scores are the weights of evidence, woe(x) = ln( p(x | G) / p(x | B) ):
woe(30-) = ln( (300/900) / (50/100) ) = ln(2/3) = -0.41
woe(30+) = ln( (600/900) / (50/100) ) = ln(4/3) = 0.29
woe(owner) = ln( (600/900) / (20/100) ) = ln(10/3) = 1.20
woe(not owner) = ln( (300/900) / (80/100) ) = ln(5/12) = -0.88
Population score: spop = ln(900/100) = 2.20

Total score = spop + woe(age) + woe(residence):
30-, Owner:      2.20 - 0.41 + 1.20 = 2.99
30-, Not owner:  2.20 - 0.41 - 0.88 = 0.91
30+, Owner:      2.20 + 0.29 + 1.20 = 3.69
30+, Not owner:  2.20 + 0.29 - 0.88 = 1.61

This only works because age and home ownership are assumed independent;
if not, use regression or other methods.
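A short Python check of this scorecard, computing the population score, the weights of evidence and the four total scores directly from the Good/Bad counts in the table:

```python
import math

# Good/Bad counts taken from the table above.
nG, nB = 900, 100
age_counts = {"30-": (300, 50), "30+": (600, 50)}           # (Goods, Bads)
res_counts = {"owner": (600, 20), "not owner": (300, 80)}   # (Goods, Bads)

def woe(goods: int, bads: int) -> float:
    """Weight of evidence: ln( p(attribute | G) / p(attribute | B) )."""
    return math.log((goods / nG) / (bads / nB))

s_pop = math.log(nG / nB)                                   # ln(900/100) = 2.20

for age, (ag, ab) in age_counts.items():
    for res, (rg, rb) in res_counts.items():
        score = s_pop + woe(ag, ab) + woe(rg, rb)
        print(f"{age:4s} {res:9s} total score = {score:.2f}")
```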

Linear regression & Logistic regression
The reality is that one needs to deal with dependence between the characteristics; the linear and logistic regression approaches adjust the coefficients to allow for this.
The first scorecards were linear regression/discriminant analysis approaches. Now the main approach is logistic regression. (Logistic regression is defined to give a log odds score, but in fact linear regression, after scaling, can be shown to give almost a log odds score as well.)
After coarse classifying, either:
use a binary variable for each attribute class (so lots of variables for one characteristic), or
use weights of evidence to define a value on each attribute class (one variable for each characteristic) – both options are sketched below.
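A minimal pandas sketch of the two options on a hypothetical coarse-classified characteristic age_band (the data are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical coarse-classified characteristic and good/bad outcome.
age_band = pd.Series(["30-", "30-", "30+", "30+", "30+"], name="age_band")
good     = pd.Series([0, 1, 1, 1, 0], name="good")

# Option 1: one binary (dummy) variable per attribute class.
dummies = pd.get_dummies(age_band, prefix="age")

# Option 2: one weights-of-evidence variable for the whole characteristic.
tab = pd.crosstab(age_band, good)                    # counts of bads (0) and goods (1)
woe = np.log((tab[1] / tab[1].sum()) / (tab[0] / tab[0].sum()))
woe_variable = age_band.map(woe)

print(dummies)
print(woe_variable)
```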

Linear regression: Bayes decision rule
L is the lost profit from classifying a 'good' as a 'bad'; D is the debt incurred by classifying a 'bad' as a 'good'.
pG (pB) is the proportion of 'goods' ('bads') in the population.
p(x | G) (p(x | B)) is the probability of observing data x given, respectively, a 'good' ('bad') applicant.
p(G | x) is the probability of finding a 'good' if the application data are x; by Bayes' rule, p(G | x) ∝ p(x | G) pG.

Linear regression: Bayes decision rule
The expected loss from accepting the set AG (and rejecting its complement AB) is
L pG P(x ∈ AB | G) + D pB P(x ∈ AG | B)

Our goal is to minimise this loss, which is achieved by accepting
AG = { x | L p(x | G) pG ≥ D p(x | B) pB }

If the distributions are multivariate normal with means mG, mB and common covariance S, this reduces to the linear rule
AG = { x | w1x1 + w2x2 + … + wqxq > c }, where w = (mG - mB)^T S^(-1)
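A minimal sketch of the accept/reject comparison behind this rule; the probabilities and the values of L, D, pG, pB below are made up for illustration:

```python
# Accept when the expected loss from rejecting (losing a good's profit) is at
# least as big as the expected loss from accepting (a bad's debt):
#   L * p(x|G) * pG  >=  D * p(x|B) * pB
def accept(p_x_given_G: float, p_x_given_B: float,
           pG: float = 0.9, pB: float = 0.1,
           L: float = 100.0, D: float = 500.0) -> bool:
    loss_if_rejected = L * p_x_given_G * pG    # lost profit on a good
    loss_if_accepted = D * p_x_given_B * pB    # debt incurred on a bad
    return loss_if_rejected >= loss_if_accepted

print(accept(p_x_given_G=0.02,  p_x_given_B=0.01))   # True: accept
print(accept(p_x_given_G=0.001, p_x_given_B=0.03))   # False: reject
```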

Linear regression: Fisher discriminant rule
Discriminant analysis asks what linear combination of the characteristics
Y = w1X1 + … + wqXq
best separates the goods from the bads.
Our goal is to choose w to maximise
(distance between the group means)^2 / (variance within the groups)
= ( w·mG - w·mB )^2 / ( w^T S w ),
since w·mG = E(Y | G), w·mB = E(Y | B) and w^T S w = Var(Y).
This is maximised by w = (mG - mB)^T S^(-1), the same direction as the Bayes rule.
The midpoint between the two means is c = 0.5 w^T (mG + mB).
Classify as good if w^T x ≥ c.
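A minimal numpy sketch of this rule on simulated data; the group means and the common covariance used to generate the sample are arbitrary assumptions:

```python
import numpy as np

# Simulated goods and bads with two characteristics (e.g. age, income in £k).
rng = np.random.default_rng(0)
goods = rng.multivariate_normal([35, 40], [[40, 10], [10, 100]], size=500)
bads  = rng.multivariate_normal([28, 22], [[40, 10], [10, 100]], size=100)

mG, mB = goods.mean(axis=0), bads.mean(axis=0)
# Pooled (common) covariance estimate S.
S = ((len(goods) - 1) * np.cov(goods, rowvar=False) +
     (len(bads)  - 1) * np.cov(bads,  rowvar=False)) / (len(goods) + len(bads) - 2)

w = np.linalg.solve(S, mG - mB)      # w = S^(-1) (mG - mB)
c = 0.5 * w @ (mG + mB)              # midpoint between the two group means

x_new = np.array([30.0, 45.0])       # a new applicant's characteristics
print("accept" if w @ x_new >= c else "reject")
```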

Linear regression: regression line on prob.
Discriminant analysis (LDF) is equivalent to linear regression when there are only two classification groups: regress the good/bad indicator on the characteristics,
y = β0 + β1x1 + β2x2 + … + βqxq + e
Again, it turns out that the fitted coefficients point in the same direction as the Fisher/Bayes solution, w ∝ S^(-1)(mG - mB).
Since this is a regression that can use the least squares approach, the coefficients are given by an analytic (closed-form) expression.

Linear regression: statistical test
For a linear regression,
p = w0 + w1x1 + … + wqxq + e
Use ordinary least squares regression to estimate the coefficients w0, w1, …, wq.
To estimate the probability that an observation will be a 'good' ('bad'), we simply set the target variable for each sample observation to 1 if it was a good and 0 if it was a bad.
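A minimal numpy sketch of this least squares fit with a 0/1 target; the data are made up:

```python
import numpy as np

# Made-up sample: columns are age and income in £k; target 1 = good, 0 = bad.
X = np.array([[35, 40], [42, 55], [23, 18], [30, 30], [51, 60], [19, 12]], dtype=float)
y = np.array([1, 1, 0, 1, 1, 0], dtype=float)

X1 = np.column_stack([np.ones(len(X)), X])      # add an intercept column
w, *_ = np.linalg.lstsq(X1, y, rcond=None)      # OLS estimate of (w0, w1, w2)

p_hat = X1 @ w                                  # fitted "probabilities"
print(np.round(w, 3))
print(np.round(p_hat, 2))                       # note: not guaranteed to lie in [0, 1]
```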

Linear regression: statistical test

Problems: the right-hand side of the linear equation can take any real value, but the left-hand side (a probability) is bounded between 0 and 1, and the assumption of normally distributed error terms no longer holds.
To tackle this, we can use a bounding function to limit the model outcome to between 0 and 1.

Linear regression: further notes

R²: this tells us how much of the variation in the probability can be explained by the characteristics, i.e. the strength of the linear relationship.
Wilks' Λ (likelihood ratio test): assuming the group variances are the same, this can be used to test the hypothesis that the goods and bads do not differ.
t-test: to check whether the coefficients of the variables are non-zero, in order to judge whether those variables should be included in the scorecard.
D² (sample Mahalanobis distance, with an F-distribution): the measure in the Fisher approach,
(distance between the means)^2 / (variance of the populations).

Logistic regression (LR)
Logistic regression assumes
ln( p / (1 - p) ) = w0 + w1x1 + … + wqxq, where p = p(G | x)
If the variables are multivariate normal with common covariance, the discriminant approach implies that this logistic form holds. But the logistic form also holds for a much wider class of models, including independent binary variables, log-linear models, etc.
Classify as good if
w0 + w1x1 + … + wqxq ≥ c
However, we cannot use least squares to estimate the wi. Instead, we have to use maximum likelihood: maximise
Π pi^yi (1 - pi)^(1 - yi), or equivalently maximise the log-likelihood Σ [ yi ln pi + (1 - yi) ln(1 - pi) ],
where yi = 1 for a good and 0 for a bad.

Since this is non-linear, it needs iterative procedures such as Newton-Raphson. The results prove to be similar to those of discriminant analysis.
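A minimal numpy sketch of this Newton-Raphson (iteratively reweighted least squares) fit on made-up overlapping data; in practice a library such as statsmodels or scikit-learn would be used:

```python
import numpy as np

# Simulated goods and bads with two characteristics (e.g. age, income in £k).
rng = np.random.default_rng(1)
goods = rng.multivariate_normal([35, 40], [[60, 20], [20, 150]], size=200)
bads  = rng.multivariate_normal([28, 25], [[60, 20], [20, 150]], size=100)
X  = np.vstack([goods, bads])
y  = np.concatenate([np.ones(200), np.zeros(100)])   # 1 = good, 0 = bad
X1 = np.column_stack([np.ones(len(X)), X])           # add an intercept column

w = np.zeros(X1.shape[1])
for _ in range(15):
    p = 1.0 / (1.0 + np.exp(-X1 @ w))                # current fitted probabilities
    grad = X1.T @ (y - p)                            # gradient of the log-likelihood
    info = X1.T @ (X1 * (p * (1 - p))[:, None])      # information matrix (-Hessian)
    w = w + np.linalg.solve(info, grad)              # Newton-Raphson update

print(np.round(w, 3))                                # estimates of w0, w_age, w_income
```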

Other generalised linear model transformation

The logit is not the only transformation that links a probability to the linear score w0 + w1x1 + … + wqxq. Alternatives include:
Linear probability: pi = w0 + w1x1 + … + wqxq (no transformation, so fitted values can fall outside [0, 1]).
Probit: N^(-1)(pi) = w0 + w1x1 + … + wqxq, where N(·) is the standard normal cumulative distribution function.
Clipped (truncated) linear: pi = 1 if w^T xi > 1; pi = w^T xi if 0 ≤ w^T xi ≤ 1; pi = 0 if w^T xi < 0.

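A small sketch comparing the three transformations on a grid of linear-predictor values z = w0 + w1x1 + … + wqxq; purely illustrative:

```python
import numpy as np
from scipy.stats import norm

z = np.linspace(-3, 3, 7)                  # hypothetical linear-predictor values

logit_p   = 1.0 / (1.0 + np.exp(-z))       # logistic transformation
probit_p  = norm.cdf(z)                    # probit: p = N(z)
clipped_p = np.clip(z, 0.0, 1.0)           # clipped linear, bounded to [0, 1]

for zi, lp, pp, cp in zip(z, logit_p, probit_p, clipped_p):
    print(f"z={zi:5.2f}  logit={lp:.2f}  probit={pp:.2f}  clipped={cp:.2f}")
```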