Case: Home Equity Line of Credit
Executive Summary:
Financial services are now faced with increasing possibility of default on loans. Therefore, an efficient prediction model is necessary to avoid the default, guaranteeing the profit of the loan service. In this case study, a logistic regression model will be built implementing the JMP software to predict the potential default line of credit.
Objective:
Build the logistic regression model based on JMP
Build personal logistic regression model based on JMP with a subset of predictors
Clarify the insight of the my best model from both the business and statistical perspective.
JMP Model: (write a short description)
Input:
Y: BAD
X: All predictors
After choosing y, all predictors are input into the fit model. From the output, we can make a decision whether a variable can make sense in this prediction model as well as paving the road for testing another data set.
Key Insights:
Logistic regression, JMP model, all predictors
Your Best Model: (write a short description)
Input:
Y: BAD
X: DEROG, DELINQ, CLAGE, DEBTINC
Enter some predictors into the fit model and using stepwise to process these predictors first, compare the value based on different predictors. Eventually, we choose the four factors above because their value is smaller compared to others. Then the model will be made via the Go-Run Model option.
Key Changes Made:
Instead of using all predictors, we choose a subset of them: DEROG, DELINQ, CLAGE, DEBTINC.
Key Insights:
Logistic regression, subset of predictors, value
Why your model is better? (Grill it)
From the statistical perspective, after using stepwise logistic regression with different subsets several times, we can decide the appropriate subset of these predictors based on the value of .
From the business perspective, the four factors actually make sense, and the impact of each predictor is generally consistent with the statistics in MyBestModel. For example, the estimate for DEROG is negative, indicating that those customers who have more derogatory reports are more inclined to have higher credit risk.
What is the lift (as a ratio) provided by your model compared to Baseline Model for both training and testing? What is the increase in net dollar amount compared to the Baseline Model for both training and testing?
The lift provided by my model compared to Baseline Model for both training and testing is
1.49:1.
The total increase in net dollar amount compared to the Baseline Model for both training and testing is $2,200,000, increased by 4.27%.
Conclusion:
In a nutshell, a baseline JMP model can be built with the whole predictors. Moreover, a more effective model can use only a subset of these factors. Eventually, we find that the four elements DEROG, DELINQ, CLAGE, DEBTINC make more sense than the other predictors.
JMP Logistic Model
Build the Logistic Model using JMP (Go option) on the following conditions,
Y = BAD
X = All predictors
Cutoff Probability for mailing = 0.15
Note: It may not be possible to obtain some values for Validation data in that case ignore it.
1. Statistical KPIs of JMP Model – From JMP Printout
Measure
Training
Validation
Definition
Entropy RSquare
0.2641
0.3905
1-Loglike(model)/Loglike(0)
Generalized RSquare
0.3253
0.4691
(1-(L(0)/L(model))^(2/n))/(1-L(0)^(2/n))
Mean -Log p
0.2226
0.1981
∑ -Log(ρ[j])/n
RMSE
0.2420
0.2259
√ ∑(y[j]-ρ[j])²/n
Mean Abs Dev
0.1202
0.1061
∑ |y[j]-ρ[j]|/n
Misclassification Rate
0.0720
0.0680
∑ (ρ[j]≠ρMax)/n
N
1000
500
n
Statistical KPIs of JMP Model – From Excel Printout
Training
Validation
Accuracy %
87.70%
True Positive Rate
56.70%
False Positive Rate
8.97%
Sensitivity ( True Positive Rate)
56.70%
Specificity (True Negative Rate)
91.03%
ii) a) Business KPIs of JMP Model – Training (copy & paste from Excel)
Predicted number of Good Loans
=
8640
Upper limit for Loans
=
10000
Actual number of approved loans
=
8640
Propensity of Good Loan
=
95.139%
Propensity of Bad Loan
=
4.861%
Total Profit
=
$24,480,000
b) Business KPIs of JMP Model – Testing (copy & paste from Excel)
Predicted number of Good Loans
=
8730
Upper limit for Loans
=
10000
Actual number of approved loans
=
8730
Propensity of Good Loan
=
96.220%
Propensity of Bad Loan
=
3.780%
Total Profit
=
$27,000,000
iii) Interpret the Model (decision tree) – From Business Point of view & Statistical Point of view
From the statistics of the JMP model, we can see that the estimate for DEROG, DELINQ, DEBTINC is negative, indicating that those who have more derogatory reports or delinquent trade lines, higher debt to income ratio tend to be classified into the bad credit category. This result also fit well with the business common sense that customers with failure in the above factors are more likely to cause loss to lenders.
Other factors seem to have less effect because the amount of loan or the number of trade lines can hardly reflect whether a customer ‘s credit is bad or good.
iv) Confusion Matrix for Training (copy & paste)
GoodLoan
BadLoan
GoodLoan
822
81
903
BadLoan
42
55
97
864
136
1000
iv) Confusion Matrix for Testing (copy & paste)
GoodLoan
BadLoan
GoodLoan
840
70
910
BadLoan
33
57
90
873
127
1000
v) Lift Table (copy & paste from Excel)
Lift Table in Propensity
Training
Testing
Lift with respect to Baseline – JMP Model
9.808132875
0
Lift with respect to Baseline – My Best Model
9.756347698
0
Lift Table in Dollars
Training
Testing
Lift with respect to Baseline – JMP Model
1.464114833
0.000522129
Lift with respect to Baseline – My Best Model
1.486733976
0.000547965
Lift with respect to JMP Model – My Contribution
0.022615646
0.000025827
Overall Lift with respect to Baseline -My Best Model
1.486733976
0.000547965
vi) Attach JMP Printout (Remove unwanted parts – Copy and Paste then edit it.)
My Best Logistic Model – Repeat the same process as above.
2. Statistical KPIs of JMP Model – From JMP Printout
Measure
Training
Validation
Definition
Entropy RSquare
0.2178
0.2400
1-Loglike(model)/Loglike(0)
Generalized RSquare
0.2750
0.3040
(1-(L(0)/L(model))^(2/n))/(1-L(0)^(2/n))
Mean -Log p
0.2491
0.2537
∑ -Log(ρ[j])/n
RMSE
0.2576
0.2610
√ ∑(y[j]-ρ[j])²/n
Mean Abs Dev
0.1351
0.1389
∑ |y[j]-ρ[j]|/n
Misclassification Rate
0.0800
0.0710
∑ (ρ[j]≠ρMax)/n
N
1000
500
n
Statistical KPIs of JMP Model – From Excel Printout
Training
Validation
Accuracy %
87.70%
True Positive Rate
56.70%
False Positive Rate
8.97%
Sensitivity ( True Positive Rate)
56.70%
Specificity (True Negative Rate)
91.03%
ii) a) Business KPIs of MyBest Model – Training
Predicted number of Good Loans
=
8820
Upper limit for Loans
=
10000
Actual number of approved loans
=
8820
Propensity of Good Loan
=
95.465%
Propensity of Bad Loan
=
4.535%
Total Profit
=
$25,680,000
b) Business KPIs of JMP Model – Testing (copy & paste from Excel)
Predicted number of Good Loans
=
8800
Upper limit for Loans
=
10000
Actual number of approved loans
=
8800
Propensity of Good Loan
=
96.591%
Propensity of Bad Loan
=
3.409%
Total Profit
=
$28,000,000
iii) Interpret the Model (decision tree) – From Business Point of view & Statistical Point of view
From the statistics of the MyBestModel, we can see that the estimate for DEROG, DELINQ, DEBTINC is negative, indicating that those who have more derogatory reports or delinquent trade lines, higher debt to income ratio tend to be classified into the bad credit category. The estimate for CLAGE is positive. This may imply that the age of the oldest trade line has something to do with credit. However, this factor seems less critical than the other three from the statistical perspective.
This result also fit well with the business common sense that customers with failure in the above factors are more likely to cause loss to lenders.
iv) Confusion Matrix for Training (copy & paste)
GoodLoan
BadLoan
GoodLoan
842
61
903
BadLoan
40
57
97
882
118
1000
iv) Confusion Matrix for Testing (copy & paste)
GoodLoan
BadLoan
GoodLoan
850
60
910
BadLoan
30
60
90
880
120
1000
v) Lift Table (copy & paste from Excel)
Lift Table in Propensity
Training
Testing
Lift with respect to Baseline – JMP Model
9.808132875
0
Lift with respect to Baseline – My Best Model
9.756347698
0
Lift Table in Dollars
Training
Testing
Lift with respect to Baseline – JMP Model
1.464114833
0.000522129
Lift with respect to Baseline – My Best Model
1.486733976
0.000547965
Lift with respect to JMP Model – My Contribution
0.022615646
0.000025827
Overall Lift with respect to Baseline -My Best Model
1.486733976
0.000547965
vi) Attach JMP Printout (Remove unwanted parts – Copy and Paste then edit it.)
Key information Cutoff Probability for mailing = 0.17