University of Cardiff
MAT012 Credit Risk Scoring (2015/16)
Lab Session 2

1. Using SAS to plot graphs
Work with variables and get to know the dataset in SAS.
Create a library ‘crs’ and import the file ‘german.csv’ into Base SAS. The dataset consists of 1000 applicants for credit, together with 20 characteristics of each applicant. 700 of the applicants turned out to be Good (‘1’ in the Good column) and 300 turned out to be Bad (‘1’ in the Bad column). A data dictionary is given later in this handout. Check that you understand what each variable in the dataset means.
a) Create a histogram for the variable Age
b) Create a histogram for the variable Purpose
c) Create a box plot for the variable Age by Good/Bad

2. Coarse classifying classes
The following data describe the attributes of the characteristic “time with bank” in a credit scoring sample.

Time with bank   Under 6 months   6-12 months   12-18 months   18-24 months   24-36 months   5-10 years
No. of Goods
No. of Bads

Use this information to find the best coarse classes into which to split this characteristic, using:
a. The Chi-square statistic
b. The information statistic
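
The two statistics can be cross-checked outside SAS. The sketch below is in Python rather than SAS and assumes the usual definitions: the Pearson chi-square of the grouped goods/bads table against the pooled good:bad mix, and the information statistic as the sum over groups of (g_i/G - b_i/B) * ln((g_i/G) / (b_i/B)). The counts and the helper names (merge, chi_square, information_statistic, best_split) are hypothetical and are not the figures from the table above. Only adjacent bands are merged, and splits are compared for a fixed number of groups.

```python
from itertools import combinations
from math import log

# Hypothetical (goods, bads) counts per band, in band order.
# Placeholders for illustration only, NOT the handout's figures.
bands = ["<6m", "6-12m", "12-18m", "18-24m", "24-36m", "5-10y"]
goods = [200, 300, 400, 500, 600, 1000]
bads = [100, 100, 80, 70, 50, 50]


def merge(counts, cuts):
    """Sum adjacent counts into groups; `cuts` lists the indices where a new group starts."""
    edges = [0, *cuts, len(counts)]
    return [sum(counts[a:b]) for a, b in zip(edges, edges[1:])]


def chi_square(g, b):
    """Pearson chi-square of the grouped table against the pooled good:bad mix."""
    G, B = sum(g), sum(b)
    N = G + B
    total = 0.0
    for gi, bi in zip(g, b):
        n = gi + bi
        expected_g, expected_b = n * G / N, n * B / N
        total += (gi - expected_g) ** 2 / expected_g + (bi - expected_b) ** 2 / expected_b
    return total


def information_statistic(g, b):
    """Sum of (g_i/G - b_i/B) * ln((g_i/G) / (b_i/B)); needs non-zero counts in every group."""
    G, B = sum(g), sum(b)
    return sum((gi / G - bi / B) * log((gi / G) / (bi / B)) for gi, bi in zip(g, b))


def best_split(g, b, k, stat):
    """Try every way of merging adjacent bands into k groups; return the cuts maximising `stat`."""
    candidates = combinations(range(1, len(g)), k - 1)
    return max(candidates, key=lambda cuts: stat(merge(g, cuts), merge(b, cuts)))


for k in (2, 3):
    for name, stat in (("chi-square", chi_square), ("information", information_statistic)):
        cuts = best_split(goods, bads, k, stat)
        value = stat(merge(goods, cuts), merge(bads, cuts))
        print(k, "groups,", name, "-> new groups start at", [bands[c] for c in cuts], round(value, 4))
```

Both statistics can only stay the same or fall when bands are merged, so comparing splits with different numbers of groups on the raw value would always favour the finest split. That is why the search above fixes k.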

Data dictionary for german.csv

Age in years
Amount of loan
Status of existing checking account: 1: < 0 DM; 2: 0 to < 200 DM; 3: >= 200 DM / salary assignments for at least 1 year; 4: no checking account
Other debtors/guarantors: 1: none; 2: co-applicant; 3: guarantor
Number of dependents
Duration in months
Present employed since: 1: unemployed; 2: < 1 year; 3: 1 to < 4 years; 4: 4 to < 7 years; 5: >= 7 years
Number of existing credits at this bank
Foreign worker: 1: yes; 2: no
Good/bad payer
Credit history: 0: no credits taken/all credits paid back duly; 1: all credits at this bank paid back duly; 2: existing credits paid back duly till now; 3: delay in paying off in the past; 4: critical account/other credits existing (not at this bank)
Housing: 1: rent; 2: own; 3: for free
Instalment rate in percentage of disposable income
Job: 1: unemployed/unskilled – non-resident; 2: unskilled – resident; 3: skilled employee/official; 4: management/self-employed/highly qualified employee/officer
Marital status: 1: male: divorced/separated; 2: female: divorced/separated/married; 3: male: single; 4: male: married/widowed; 5: female: single
Other instalment plans: 1: bank; 2: stores; 3: none
Property: 1: real estate; 2: if not 1: building society savings agreement/life insurance; 3: if not 1/2: car or other, not in attribute 6; 4: unknown/no property
Purpose of loan: 0: car (new); 1: car (used); 2: furniture/equipment; 3: radio/television; 4: domestic appliances; 5: repairs; 6: education; 7: vacation; 8: retraining; 9: business; X: others
Date beginning permanent residence
Savings account/bonds: 1: < 100 DM; 2: 100 to < 500 DM; 3: 500 to < 1000 DM; 4: >= 1000 DM; 5: unknown/no savings account
Telephone: 1: none; 2: yes, registered under the customer’s name

3. Using logistic regression to build a scorecard
a. Prior to building the scorecard, we need to first recode the following variables:

Use the variable ‘checking’ and the five newly coded variables above to build a scorecard with logistic regression. You can use either ‘Bad’ or ‘Good’ as the target variable [Hint: use proc logistic with the option param=glm].

Consider the following:
i) Why do we need to recode all variables?
ii) Examine the results and explain whether they make sense for each variable.
iii) How would you interpret these estimates? Why has SAS not estimated a parameter for the last attribute of each variable?
