ML Engineering · Apr 2026 · 4 min read

The False Positive Problem in Fraud Detection

Most fraud detection tutorials optimize for recall: catch as much fraud as possible and minimize false negatives. That makes sense if missing fraud is the only cost. But in a real payment system, there is another cost that rarely gets mentioned: the false positive.

What a false decline actually costs

When your model blocks a legitimate transaction, a few things happen. The customer is frustrated. They might try again and get blocked again. They might call support. They might never come back. Studies on card fraud put the cost of a false decline (in customer lifetime value terms) somewhere between $7 and $40 per incident, depending on the customer.

For high-value customers making legitimate large purchases, a false decline can cost more than the fraud you were trying to prevent.

This changes the problem. You are no longer just minimizing fraud losses. You are minimizing total cost, which includes the cost of blocking real customers.
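
Written out, the objective looks roughly like this. The symbols are my own shorthand for this sketch: c_FN and c_FP are the assumed average costs of one missed fraud case and one false decline, and N_FN and N_FP are how many of each the system produces over some period.

```latex
\text{total cost} = c_{\mathrm{FN}} \cdot N_{\mathrm{FN}} + c_{\mathrm{FP}} \cdot N_{\mathrm{FP}}
```

A recall-only objective is the special case where c_FP is treated as zero.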

How this changes the model

I built my fraud detection system around this tradeoff explicitly. The ensemble (XGBoost plus Isolation Forest) was tuned for precision at a fixed recall level rather than for AUC alone. The decision threshold was then chosen based on the cost ratio: how much does a false positive cost relative to a missed fraud case?
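
To make the cost-ratio idea concrete, here is a minimal sketch of a threshold sweep. It is an illustration under assumptions, not the code from my system: the function name `pick_threshold`, the flat per-incident costs, and the toy scores are all made up for the example.

```python
def pick_threshold(scores, labels, cost_fp, cost_fn):
    """Return (threshold, total_cost) minimizing cost_fp * FP + cost_fn * FN.

    scores: model fraud scores, higher means more suspicious
    labels: 1 for fraud, 0 for legitimate
    cost_fp / cost_fn: assumed flat cost of one false decline / one missed fraud
    """
    # Try every distinct score as a cutoff, plus infinity (flag nothing).
    candidates = sorted(set(scores)) + [float("inf")]
    best_threshold, best_cost = None, float("inf")
    for t in candidates:
        # A transaction is flagged when its score is at or above the cutoff.
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


# Toy validation data, invented for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
labels = [0, 0, 1, 1, 1]

print(pick_threshold(scores, labels, cost_fp=25, cost_fn=100))
print(pick_threshold(scores, labels, cost_fp=200, cost_fn=100))
```

On this toy data the cheaper false decline picks a cutoff of 0.35, and the expensive one pushes it up to 0.8. The model is the same in both cases; only the cost ratio changed.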

The system reached 94.2% precision at the operating threshold. That means about 1 in 17 flagged transactions is a false positive. Whether that is acceptable depends entirely on the cost structure of the business. There is no universally correct number.
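
The arithmetic behind that "1 in 17" is worth spelling out, because it turns a precision number into money. A small sketch: the precision and the $7 to $40 range are the figures quoted above, while the helper name `false_decline_exposure` and the volume of 1,000 flagged transactions are hypothetical.

```python
def false_decline_exposure(precision, flagged, cost_low, cost_high):
    """Estimate false declines among `flagged` transactions and their cost range."""
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be between 0 and 1")
    # Every flagged transaction that is not a true positive is a false decline.
    false_declines = (1.0 - precision) * flagged
    return false_declines, false_declines * cost_low, false_declines * cost_high


# 94.2% precision and the $7 to $40 range come from the post;
# the 1,000 flagged transactions are a made-up volume for illustration.
declines, low, high = false_decline_exposure(0.942, 1000, 7.0, 40.0)
print(f"1 in {1 / (1 - 0.942):.1f} flags is a false positive")
print(f"{declines:.0f} false declines per 1,000 flags, costing ${low:,.0f} to ${high:,.0f}")
```

Under those assumptions, every 1,000 flags include about 58 false declines, worth roughly $406 to $2,320 in lost customer value.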

The streaming part

Fraud scoring has to be real-time. A batch job that scores transactions every hour is useless: by the time it runs, the fraud is long gone. I built the pipeline on Kafka with Redis for feature caching, scoring incoming events in under 50 ms. The latency constraint is as important as the model's accuracy.
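
The shape of the per-event scoring path can be sketched without any infrastructure. This is not the production pipeline: a plain dict stands in for the Redis feature cache, a list stands in for the Kafka stream, and the field names (`card_id`, `avg_amount_30d`) and the toy model are hypothetical.

```python
import time

LATENCY_BUDGET_MS = 50  # the post's target: score each event in under 50 ms

# Hypothetical defaults for cards with no cached features yet.
DEFAULT_FEATURES = {"avg_amount_30d": 0.0}


def toy_model(event, features):
    """Stand-in scorer: amounts far above the card's average look suspicious."""
    baseline = max(features["avg_amount_30d"], 1.0)
    return min(1.0, event["amount"] / (10.0 * baseline))


def score_event(event, feature_cache, model, budget_ms=LATENCY_BUDGET_MS):
    """Look up cached features, score the event, and report elapsed time."""
    start = time.perf_counter()
    # Cache lookup instead of recomputing aggregates on the hot path.
    features = feature_cache.get(event["card_id"], DEFAULT_FEATURES)
    score = model(event, features)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "txn_id": event["txn_id"],
        "score": score,
        "elapsed_ms": elapsed_ms,
        "within_budget": elapsed_ms < budget_ms,
    }


# A dict standing in for the feature cache, a list standing in for the stream.
cache = {"card-1": {"avg_amount_30d": 60.0}}
stream = [
    {"txn_id": "t1", "card_id": "card-1", "amount": 30.0},
    {"txn_id": "t2", "card_id": "card-1", "amount": 900.0},
]
for event in stream:
    print(score_event(event, cache, toy_model))
```

The point of the sketch is the ordering: features are precomputed and fetched by key, so the hot path does a lookup and one model call. Measuring the elapsed time per event makes the latency budget something you can alert on rather than hope for.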

The thing I took away from this project: the interesting problems in ML engineering are not the modeling problems. They are the constraints the business puts on you that force you to make different choices than you would in a notebook.

Faizan Khan