Case: Home Equity Line of Credit
Executive Summary:
Financial services are now faced with increasing possibility of default on loans. Therefore, an
efficient prediction model is necessary to avoid the default, guaranteeing the profit of the loan
service. In this case study, a logistic regression model will be built implementing the JMP
software to predict the potential default line of credit.
Objective:
Build the logistic regression model based on JMP
Build personal logistic regression model based on JMP with a subset of predictors
Clarify the insight of the my best model from both the business and statistical perspective.
JMP Model: (write a short description)
Input:
Y: BAD
X: All predictors
After choosing y, all predictors are input into the fit model. From the output, we can make a
decision whether a variable can make sense in this prediction model as well as paving the road
for testing another data set.
Key Insights:
Logistic regression, JMP model, all predictors
Your Best Model: (write a short description)
Input:
Y: BAD
X: DEROG, DELINQ, CLAGE, DEBTINC
Enter some predictors into the fit model and using stepwise to process these predictors first,
compare the 𝑝 −value based on different predictors. Eventually, we choose the four factors
above because their 𝑝 −value is smaller compared to others. Then the model will be made via the
Go-Run Model option.
Key Changes Made:
Instead of using all predictors, we choose a subset of them: DEROG, DELINQ, CLAGE,
DEBTINC.
Key Insights:
Logistic regression, subset of predictors, 𝑝 −value
Why your model is better? (Grill it)
From the statistical perspective, after using stepwise logistic regression with different subsets
several times, we can decide the appropriate subset of these predictors based on the value of 𝑝.
From the business perspective, the four factors actually make sense, and the impact of each
predictor is generally consistent with the statistics in MyBestModel. For example, the estimate
for DEROG is negative, indicating that those customers who have more derogatory reports are
more inclined to have higher credit risk.
What is the lift (as a ratio) provided by your model compared to Baseline Model for both
training and testing? What is the increase in net dollar amount compared to the Baseline
Model for both training and testing?
The lift provided by my model compared to Baseline Model for both training and testing is
1.49:1.
The total increase in net dollar amount compared to the Baseline Model for both training and
testing is $2,200,000, increased by 4.27%.
Conclusion:
In a nutshell, a baseline JMP model can be built with the whole predictors. Moreover, a more
effective model can use only a subset of these factors. Eventually, we find that the four elements
DEROG, DELINQ, CLAGE, DEBTINC make more sense than the other predictors.
JMP Logistic Model
Build the Logistic Model using JMP (Go option) on the following conditions,
Y = BAD
X = All predictors
Cutoff Probability for mailing = 0.15
Note: It may not be possible to obtain some values for Validation data in that case ignore it.
1. Statistical KPIs of JMP Model – From JMP Printout
Measure Training Validation Definition
Entropy RSquare 0.2641 0.3905 1-Loglike(model)/Loglike(0)
Generalized RSquare 0.3253 0.4691
(1-(L(0)/L(model))^(2/n))/(1-
L(0)^(2/n))
Mean -Log p 0.2226 0.1981 ∑ -Log(ρ[j])/n
RMSE 0.2420 0.2259 √ ∑(y[j]-ρ[j])²/n
Mean Abs Dev 0.1202 0.1061 ∑ |y[j]-ρ[j]|/n
Misclassification Rate 0.0720 0.0680 ∑ (ρ[j]≠ρMax)/n
N 1000 500 n
Statistical KPIs of JMP Model – From Excel Printout
Training Validation
Accuracy % 87.70%
True Positive Rate 56.70%
False Positive Rate 8.97%
Sensitivity ( True Positive Rate) 56.70%
Specificity (True Negative Rate) 91.03%
ii) a) Business KPIs of JMP Model – Training (copy & paste from Excel)
Predicted number of Good Loans = 8640
Upper limit for Loans = 10000
Actual number of approved loans = 8640
Propensity of Good Loan = 95.139%
Propensity of Bad Loan = 4.861%
Total Profit = $24,480,000
b) Business KPIs of JMP Model – Testing (copy & paste from Excel)
Predicted number of Good Loans = 8730
Upper limit for Loans = 10000
Actual number of approved loans = 8730
Propensity of Good Loan = 96.220%
Propensity of Bad Loan = 3.780%
Total Profit = $27,000,000
iii) Interpret the Model (decision tree) – From Business Point of view & Statistical Point
of view
From the statistics of the JMP model, we can see that the estimate for DEROG, DELINQ,
DEBTINC is negative, indicating that those who have more derogatory reports or
delinquent trade lines, higher debt to income ratio tend to be classified into the bad credit
category. This result also fit well with the business common sense that customers with
failure in the above factors are more likely to cause loss to lenders.
Other factors seem to have less effect because the amount of loan or the number of trade
lines can hardly reflect whether a customer ‘s credit is bad or good.
iv) Confusion Matrix for Training (copy & paste)
GoodLoan BadLoan
GoodLoan 822 81 903
BadLoan 42 55 97
864 136 1000
iv) Confusion Matrix for Testing (copy & paste)
GoodLoan BadLoan
GoodLoan 840 70 910
BadLoan 33 57 90
873 127 1000
v) Lift Table (copy & paste from Excel)
Lift Table in Propensity Training Testing
Lift with respect to Baseline – JMP Model 9.808132875 0
Lift with respect to Baseline – My Best Model 9.756347698 0
Lift Table in Dollars Training Testing
Lift with respect to Baseline – JMP Model 1.464114833 0.000522129
Lift with respect to Baseline – My Best Model 1.486733976 0.000547965
Lift with respect to JMP Model – My Contribution 0.022615646 0.000025827
Overall Lift with respect to Baseline -My Best Model 1.486733976 0.000547965
vi) Attach JMP Printout (Remove unwanted parts – Copy and Paste then edit it.)
My Best Logistic Model – Repeat the same process as above.
2. Statistical KPIs of JMP Model – From JMP Printout
Measure Training Validation Definition
Entropy RSquare 0.2178 0.2400 1-Loglike(model)/Loglike(0)
Generalized RSquare 0.2750 0.3040
(1-(L(0)/L(model))^(2/n))/(1-
L(0)^(2/n))
Mean -Log p 0.2491 0.2537 ∑ -Log(ρ[j])/n
RMSE 0.2576 0.2610 √ ∑(y[j]-ρ[j])²/n
Mean Abs Dev 0.1351 0.1389 ∑ |y[j]-ρ[j]|/n
Misclassification Rate 0.0800 0.0710 ∑ (ρ[j]≠ρMax)/n
N 1000 500 n
Statistical KPIs of JMP Model – From Excel Printout
Training Validation
Accuracy % 87.70%
True Positive Rate 56.70%
False Positive Rate 8.97%
Sensitivity ( True Positive Rate) 56.70%
Specificity (True Negative Rate) 91.03%
ii) a) Business KPIs of MyBest Model – Training
Predicted number of Good Loans = 8820
Upper limit for Loans = 10000
Actual number of approved loans = 8820
Propensity of Good Loan = 95.465%
Propensity of Bad Loan = 4.535%
Total Profit = $25,680,000
b) Business KPIs of JMP Model – Testing (copy & paste from Excel)
Predicted number of Good Loans = 8800
Upper limit for Loans = 10000
Actual number of approved loans = 8800
Propensity of Good Loan = 96.591%
Propensity of Bad Loan = 3.409%
Total Profit = $28,000,000
iii) Interpret the Model (decision tree) – From Business Point of view & Statistical Point
of view
From the statistics of the MyBestModel, we can see that the estimate for DEROG, DELINQ,
DEBTINC is negative, indicating that those who have more derogatory reports or
delinquent trade lines, higher debt to income ratio tend to be classified into the bad credit
category. The estimate for CLAGE is positive. This may imply that the age of the oldest
trade line has something to do with credit. However, this factor seems less critical than the
other three from the statistical perspective.
This result also fit well with the business common sense that customers with failure in the
above factors are more likely to cause loss to lenders.
iv) Confusion Matrix for Training (copy & paste)
GoodLoan BadLoan
GoodLoan 842 61 903
BadLoan 40 57 97
882 118 1000
iv) Confusion Matrix for Testing (copy & paste)
GoodLoan BadLoan
GoodLoan 850 60 910
BadLoan 30 60 90
880 120 1000
v) Lift Table (copy & paste from Excel)
Lift Table in Propensity Training Testing
Lift with respect to Baseline – JMP Model 9.808132875 0
Lift with respect to Baseline – My Best Model 9.756347698 0
Lift Table in Dollars Training Testing
Lift with respect to Baseline – JMP Model 1.464114833 0.000522129
Lift with respect to Baseline – My Best Model 1.486733976 0.000547965
Lift with respect to JMP Model – My Contribution 0.022615646 0.000025827
Overall Lift with respect to Baseline -My Best Model 1.486733976 0.000547965
vi) Attach JMP Printout (Remove unwanted parts – Copy and Paste then edit it.)
Key information → Cutoff Probability for mailing = 0.17
Case: Home Equity Line of Credit