Credit Risk modelling with Machine Learning

Credit Risk components:

Exposure, Recovery (r), Default (D)
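These three components are commonly combined into an expected loss figure: the probability of default, times the exposure, times the share of the exposure that is not recovered. A minimal sketch, assuming r is a recovery rate between 0 and 1 and D is a default probability; the function name and the example numbers are hypothetical.

```python
def expected_loss(exposure: float, recovery_rate: float, default_prob: float) -> float:
    """Expected loss = P(default) * exposure * (1 - recovery rate)."""
    if not 0.0 <= recovery_rate <= 1.0:
        raise ValueError("recovery_rate must be between 0 and 1")
    if not 0.0 <= default_prob <= 1.0:
        raise ValueError("default_prob must be between 0 and 1")
    # Share of the exposure that is lost if the borrower defaults.
    loss_given_default = 1.0 - recovery_rate
    return default_prob * exposure * loss_given_default


# Hypothetical loan: 100,000 outstanding, 40% recoverable, 5% chance of default.
print(expected_loss(100_000, 0.40, 0.05))
```

A higher recovery rate or a lower default probability both reduce the expected loss, which is why the two are modelled separately from the exposure.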

Factors affecting interest rate:

Features/Data Points for ML Model:

  • Credit History
  • Financial Ratios
  • Business Metrics: DAU/MAU/Ticket Size/M1 Retention
  • Loan size
  • Length of term

Machine Learning models

K Nearest Neighbours

K-nearest neighbours is a fairly simple model when there are only a few features, but with thousands of data points and many features it can be quite powerful.
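To make the idea concrete, here is a minimal sketch of a k-nearest-neighbours classifier using only the standard library. The feature names, the toy data and the `knn_predict` helper are hypothetical; in practice a library implementation would normally be used, and the features would be scaled first because the method relies on distances.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Pair each training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, already-scaled features: (debt-to-income, share of missed payments).
train_X = [(0.10, 0.00), (0.20, 0.10), (0.15, 0.05),
           (0.80, 0.60), (0.90, 0.70), (0.85, 0.90)]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = defaulted, 0 = repaid

print(knn_predict(train_X, train_y, (0.82, 0.65)))  # 1
print(knn_predict(train_X, train_y, (0.12, 0.02)))  # 0
```

The model stores the training data and does all its work at prediction time, so more data points give it more neighbours to compare against.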

Logistic Regression

Logistic regression requires a binary target, meaning it can take only two outcomes. In credit risk classification the target is indeed binary: a borrower either defaults on a loan or does not.
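As an illustration of a binary target in practice, the sketch below fits a logistic regression by plain batch gradient descent and returns a default probability between 0 and 1. The toy data, learning rate, epoch count and function names are assumptions chosen for the example, not a recommended configuration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            error = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - target
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict_default_probability(weights, bias, row):
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


# Hypothetical features: (debt-to-income, share of missed payments); 1 = default.
X = [(0.10, 0.00), (0.20, 0.10), (0.15, 0.05),
     (0.80, 0.60), (0.90, 0.70), (0.85, 0.90)]
y = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(X, y)
print(predict_default_probability(weights, bias, (0.90, 0.80)))  # close to 1
print(predict_default_probability(weights, bias, (0.10, 0.00)))  # close to 0
```

Thresholding the probability (for example at 0.5) turns it into the default / no-default decision.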

Decision Trees

Gradient boosting algorithms: XGBoost, LightGBM, CatBoost

Model Evaluation Metrics

When evaluating the models, it is worth considering Precision, Recall and F1 Score as evaluation metrics.
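One reason often given for these metrics (an assumption here, since it is not spelled out in these notes) is class imbalance: defaults are usually rare, so a model that never predicts a default can still score a high accuracy. The sketch below computes the three metrics by hand on a hypothetical portfolio; the data and the helper name are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical portfolio: 2 defaults (label 1) among 20 loans.
y_true = [1, 1] + [0] * 18

# A model that never flags a default still gets 18 of 20 loans right.
always_repaid = [0] * 20
accuracy = sum(t == p for t, p in zip(y_true, always_repaid)) / len(y_true)
print(accuracy)                                    # 0.9
print(precision_recall_f1(y_true, always_repaid))  # (0.0, 0.0, 0.0)

# A model that catches one default and raises one false alarm.
some_flags = [1, 0, 1] + [0] * 17
print(precision_recall_f1(y_true, some_flags))     # (0.5, 0.5, 0.5)
```

Precision answers "of the loans flagged as defaults, how many really defaulted?", recall answers "of the real defaults, how many were caught?", and F1 is the harmonic mean of the two.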



Sankalp Thakur
