Credit Risk modelling with Machine Learning

Jun 9, 2022

Credit risk is the possibility that a person or company fails to meet contractual obligations, such as mortgages, credit card debts, and other types of loans.

Minimizing the risk of default is a major concern for lenders. For this reason, commercial and investment banks, venture capital funds, asset management companies and insurance firms are increasingly relying on technology to predict which clients are more likely to stop honoring their debts.

Machine learning models have been helping these companies improve the accuracy of their credit risk analysis: identifying defaulters while also minimizing false positives, so that clients are not wrongly classified as defaulters.

Credit Risk components:

Exposure (A), Recovery rate (r), Probability of default (D)

Expected loss = D * A * (1 - r)
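As a quick worked example of the formula above, here is a minimal Python sketch. It assumes D is the probability of default, A is the exposure (the amount outstanding), and r is the recovery rate. The function name `expected_loss` and the sample numbers are illustrative assumptions, not figures from a real portfolio.

```python
def expected_loss(default_prob: float, exposure: float, recovery_rate: float) -> float:
    """Expected loss = D * A * (1 - r)."""
    if not 0.0 <= default_prob <= 1.0:
        raise ValueError("default_prob must be between 0 and 1")
    if not 0.0 <= recovery_rate <= 1.0:
        raise ValueError("recovery_rate must be between 0 and 1")
    return default_prob * exposure * (1.0 - recovery_rate)


# Hypothetical loan: 5% chance of default, 10,000 outstanding,
# and 40% of the exposure recovered if the client defaults.
loss = expected_loss(default_prob=0.05, exposure=10_000, recovery_rate=0.40)
print(f"Expected loss: {loss:.2f}")  # prints "Expected loss: 300.00"
```

Raising the recovery rate or lowering the probability of default both shrink the expected loss, which is why the two are usually modelled separately.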

Factors affecting interest rate:

Features / data points for the ML model:

Credit Score
Credit History
Financial Ratios
Business Metrics: DAU/MAU/Ticket Size/M1 Retention
Loan size
Length of term

Machine Learning models

K Nearest Neighbours

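K-nearest neighbors is a simple model: it classifies a client by looking at the k most similar clients in the training data. With only a few features it is quite simplistic, but with thousands of data points and many features it can be quite powerful.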
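To make the idea concrete, here is a minimal pure-Python sketch of a k-nearest-neighbors classifier. The helper name `knn_predict`, the two features and the six training rows are all hypothetical, chosen only for illustration. The features are assumed to be scaled to a comparable range already, since the method relies on distances.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, already-scaled features: (credit score, debt-to-income ratio).
train_X = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15),
           (0.30, 0.80), (0.20, 0.90), (0.25, 0.70)]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = defaulted, 0 = repaid

print(knn_predict(train_X, train_y, query=(0.28, 0.75)))  # close to past defaulters
print(knn_predict(train_X, train_y, query=(0.82, 0.18)))  # close to past repayers
```

In practice a library implementation would normally be used instead; this version only shows the mechanics of the vote.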

Logistic Regression

For logistic regression, the target data must be binary, which means that it can only have two outcomes. In credit risk classification, the target data is indeed binary: a person can either default or not default on a loan.
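The sketch below shows one way this could look in plain Python: a small logistic regression fitted by batch gradient descent. The function names, learning rate, number of epochs and the toy data are assumptions made for illustration, not a production setup. Note that although the target is binary, the fitted model returns a probability between 0 and 1, which is then compared with a threshold (0.5 here) to get the default / no-default label.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, target in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            error = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - target
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(weights, bias, row):
    """Estimated probability that the client defaults."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


# Hypothetical, already-scaled features: (credit score, debt-to-income ratio).
X = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15),
     (0.30, 0.80), (0.20, 0.90), (0.25, 0.70)]
y = [0, 0, 0, 1, 1, 1]  # binary target: 1 = defaulted, 0 = repaid

weights, bias = fit_logistic(X, y)
p_risky = predict_proba(weights, bias, (0.25, 0.80))
p_safe = predict_proba(weights, bias, (0.85, 0.10))
print(f"risky client: {p_risky:.2f}, predicted default: {p_risky >= 0.5}")
print(f"safe client:  {p_safe:.2f}, predicted default: {p_safe >= 0.5}")
```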

Decision Trees

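A single decision tree, used as a plain classifier, returns a binary result: the client either defaults or does not. Binary results are not that useful for credit modeling, because we can’t use a binary result to set a client’s interest rate.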

The solution is to use random forests. A random forest is simply a collection of decision trees, all structured differently. To make a prediction, each tree in the forest “votes” on whether or not it thinks the client will default on the loan. The client’s credit risk is the percentage of trees that say the client will default.
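The voting step can be sketched in a few lines of Python. In a real random forest each tree is learned from the data (typically from a bootstrap sample with a random subset of features); here, purely for illustration, each "tree" is a hand-written rule standing in for a trained tree. The rule names, thresholds and client fields are all hypothetical.

```python
# Each "tree" is a hand-written rule standing in for a trained decision tree.
# Thresholds and field names are hypothetical, chosen only for illustration.
def tree_credit_score(client):
    return client["credit_score"] < 600


def tree_debt_ratio(client):
    return client["debt_to_income"] > 0.45


def tree_history(client):
    return client["late_payments"] >= 2


def tree_loan_size(client):
    large_loan = client["loan_size"] > 5 * client["monthly_income"]
    return large_loan and client["credit_score"] < 680


forest = [tree_credit_score, tree_debt_ratio, tree_history, tree_loan_size]


def default_risk(client, trees):
    """Fraction of trees that vote 'will default'."""
    votes = [tree(client) for tree in trees]
    return sum(votes) / len(votes)


client = {
    "credit_score": 640,
    "debt_to_income": 0.50,
    "late_payments": 1,
    "loan_size": 30_000,
    "monthly_income": 4_000,
}
print(default_risk(client, forest))  # 2 of 4 trees vote "default" -> 0.5
```

Unlike a single yes/no answer, the resulting score is graded, so it can feed into decisions such as pricing the loan.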

Gradient boosting algorithms: XGBoost, LightGBM, CatBoost

Model Evaluation Metrics

With regard to the evaluation of the models, we should consider Precision, Recall and F1 Score as evaluation metrics, for the following reasons:

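Precision gives us the proportion of positive identifications that were actually correct. It can be defined as:

Precision = TP / (TP + FP)

where TP is the number of true positives (defaulters correctly flagged) and FP is the number of false positives (good clients wrongly flagged as defaulters).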

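Recall gives us the proportion of real positives that were correctly identified, and it can be defined as:

Recall = TP / (TP + FN)

where TP is the number of true positives (defaulters correctly flagged) and FN is the number of false negatives (defaulters the model missed).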

F1 Score is a metric that is useful when we need to seek a balance between precision and recall. It is the harmonic mean of the two, defined as:

F1 = 2 * (Precision * Recall) / (Precision + Recall)
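As an illustration of the three metrics, the following Python sketch computes them from a list of true labels and a list of predictions. The function names and the ten sample labels are made up for the example; label 1 is assumed to mean "defaulter".

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN), treating label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels: 4 real defaulters among 10 clients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]  # 3 caught, 1 missed, 2 false alarms

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here the model catches three of the four defaulters (recall 0.75) but two of its five alarms are wrong (precision 0.60), and F1 sits between the two.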

The best possible model would be one that minimizes false negatives, identifying all defaulters among the client base, while also minimizing false positives, so that clients are not wrongly classified as defaulters.