Apache® Spark™ News

Detecting Financial Fraud at Scale with Decision Trees and MLflow on Databricks

Detecting fraudulent patterns at scale is a challenge, no matter the use case. The massive amounts of data to sift through, the complexity of the constantly evolving techniques, and the very small number of actual examples of fraudulent behavior are comparable to finding a needle in a haystack while not knowing what the needle looks like. In the world of finance, the added concerns with security and the importance of explaining how fraudulent behavior was identified further increases the complexity of the task.