ML Engineering | May 2026 | 6 min read

Building a Credit Scoring Model From Scratch

I built a credit scoring system to understand how banks actually approach this problem. Not the academic version. The engineering version. What does it take to go from raw loan data to a system that returns a decision in under 120ms with an explanation attached?

The data pipeline comes first

I worked with 307,000 loan records. The raw data had missing values, inconsistent formats, and the usual mess you get from anything that came out of a real system. Before any modeling, I needed a pipeline that could handle new records the same way it handled training data.

This is where most tutorials skip ahead. They clean the data manually in a notebook and move on. But if your preprocessing is not reproducible and serializable, your model cannot run in production. I built the pipeline with sklearn's Pipeline and ColumnTransformer so that the same transformations applied to training data would apply to inference requests automatically.
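A minimal sketch of what such a pipeline might look like. The column names, imputation strategies, and tiny data frames are hypothetical and only illustrate the pattern: fit once on training data, serialize, then apply the identical transformations to an incoming request.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset's schema is not shown in this post.
NUMERIC_COLS = ["income", "debt_to_income"]
CATEGORICAL_COLS = ["employment_type"]


def build_preprocessor() -> ColumnTransformer:
    """Bundle every cleaning step so training and inference share one code path."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Categories never seen in training are encoded as all zeros.
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        [("num", numeric, NUMERIC_COLS), ("cat", categorical, CATEGORICAL_COLS)],
        sparse_threshold=0,  # always return a dense array
    )


train = pd.DataFrame({
    "income": [52000.0, np.nan, 87000.0, 31000.0],
    "debt_to_income": [0.35, 0.61, np.nan, 0.48],
    "employment_type": ["salaried", "self_employed", np.nan, "salaried"],
})

preprocessor = build_preprocessor()
X_train = preprocessor.fit_transform(train)

# An inference request with a missing value and an unseen category.
new_request = pd.DataFrame({
    "income": [np.nan],
    "debt_to_income": [0.55],
    "employment_type": ["contractor"],
})
X_new = preprocessor.transform(new_request)

# The fitted object survives serialization, which is what production needs.
restored = pickle.loads(pickle.dumps(preprocessor))
assert np.allclose(restored.transform(new_request), X_new)
```

The point of the design is that no cleaning logic lives outside the fitted object, so there is nothing to re-implement by hand at inference time.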

Why XGBoost and why it works here

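Gradient boosted trees handle tabular credit data well for a few reasons. Their splits depend on the ordering of feature values rather than their magnitude, so they are robust to outliers in the inputs. XGBoost handles missing values natively by learning a default direction at each split. And the trees tend to find nonlinear feature interactions that logistic regression misses unless you engineer them by hand. Trained with a log-loss objective, they also output probabilities directly, though those probabilities are not guaranteed to be well calibrated. That matters when you are outputting a default probability rather than a binary decision, so calibration is worth checking on holdout data.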

The model reached a ROC-AUC of 0.79 on holdout data. That is a reasonable number for this type of dataset. It is not impressive in isolation, but the goal was not a number. It was a working system.
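To make the metric concrete: ROC-AUC is the probability that the model scores a randomly chosen defaulter higher than a randomly chosen non-defaulter, so 0.79 means it gets that ordering right about 79% of the time. The sketch below computes it straight from that definition on a made-up handful of scores. It is an illustration only; on real data a library routine such as scikit-learn's roc_auc_score would be the practical choice.

```python
def roc_auc(labels, scores):
    """Fraction of (defaulter, non-defaulter) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy holdout set: 1 = defaulted, 0 = repaid. Scores are predicted default probabilities.
labels = [1, 0, 1, 0, 0]
scores = [0.90, 0.40, 0.35, 0.20, 0.10]

auc = roc_auc(labels, scores)
print(f"ROC-AUC: {auc:.3f}")  # 5 of the 6 pairs are ordered correctly
```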

SHAP for adverse action explanations

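In real lending, you cannot just say "denied." In the US, the Equal Credit Opportunity Act (implemented through Regulation B) requires lenders to give applicants the principal reasons for an adverse action. SHAP gives you those reasons at the individual prediction level.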

SHAP (SHapley Additive exPlanations) assigns each feature a contribution score for a specific prediction, measured relative to a baseline prediction. For a declined application, I could tell you: the biggest factors were a debt-to-income ratio of 0.61 and 3 derogatory marks. Those are the reasons an adverse action notice is built on.
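To show what a per-feature contribution means, here is a brute-force computation of exact Shapley values for a toy scoring function. This is not the shap library and not the project's model: the scoring formula, its coefficients, the baseline applicant, and the years_employed feature are all made up for illustration. For tree ensembles, the shap library's TreeExplainer computes the same kind of attribution far more efficiently.

```python
from itertools import combinations
from math import factorial


def toy_risk_score(a):
    """Hypothetical risk score with an interaction term; not the real model."""
    return (0.10 + 0.40 * a["debt_to_income"] + 0.05 * a["derogatory_marks"]
            - 0.01 * a["years_employed"]
            + 0.05 * a["debt_to_income"] * a["derogatory_marks"])


def shapley_values(model, applicant, baseline):
    """Exact Shapley values by enumerating every coalition (fine for a few features)."""
    names = list(applicant)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the applicant's values, the rest the baseline's.
        mixed = {k: (applicant[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi


applicant = {"debt_to_income": 0.61, "derogatory_marks": 3, "years_employed": 4}
baseline = {"debt_to_income": 0.30, "derogatory_marks": 0, "years_employed": 5}

phi = shapley_values(toy_risk_score, applicant, baseline)

# Rank the features that pushed the score up the most: candidate reasons for a notice.
top_reasons = sorted(phi, key=phi.get, reverse=True)[:2]
for name in top_reasons:
    print(f"{name} = {applicant[name]} raised the score by {phi[name]:.3f}")
```

The contributions sum to the gap between the applicant's score and the baseline score, which is the property that makes them usable as an itemized explanation.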

The deployment part

The model is served via FastAPI. An incoming request hits an endpoint, runs through the same sklearn pipeline, gets scored by XGBoost, and gets SHAP values computed, all in under 120ms on average. The whole thing runs in Docker on AWS Lambda.

The engineering side of this project taught me more than the modeling side. Getting a model to run in a notebook is straightforward. Getting it to run reliably, consistently, with explanations, under a latency constraint. That is a different problem.

Faizan Khan