Fraud Detection Application

Fraud in digital transactions grows every year and causes heavy losses for financial institutions and e-commerce businesses. Many security systems are still reactive rather than preventive, so fraud is discovered only after the loss has occurred. According to the KPMG Global Banking Fraud Survey, more than 50% of global financial institutions saw a significant increase in digital fraud cases in 2019.

Client

[ Bank / E-Commerce ]

View Website

[ Huggingface ]

Timeline

[ 2025 ]

View Project

[ Github ]

Objective

Develop a machine-learning fraud detection application that identifies suspicious transactions in real time, before financial losses occur. The solution aims to shift the response to rising digital fraud from reactive to preventive.

Data source: Credit Card Fraud Prediction

Process

Data preparation: cleaning the data, splitting it into training and test sets, encoding categorical columns, and scaling numeric features.
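
The preparation steps above can be sketched roughly as follows. This is a minimal illustration using pandas and scikit-learn, not the project's actual code: the column names (`amt`, `category`, `is_fraud`), the tiny synthetic table, and the 75/25 split are all assumptions made for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny synthetic stand-in for the transaction data (hypothetical columns).
df = pd.DataFrame({
    "amt": [12.5, 250.0, 8.9, 990.0, 43.2, 15.0, 720.5, 31.0],
    "category": ["grocery", "shopping", "grocery", "travel",
                 "food", "food", "shopping", "grocery"],
    "is_fraud": [0, 1, 0, 1, 0, 0, 1, 0],
})

# Cleaning: drop rows with missing values and exact duplicates.
df = df.dropna().drop_duplicates()

X = df[["amt", "category"]]
y = df["is_fraud"]

# Splitting: stratify so both sets keep the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scaling numeric features and encoding categorical ones in one step.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["amt"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
])

# Fit on the training set only, then reuse the fitted transformer on the test set.
X_train_prepared = preprocess.fit_transform(X_train)
X_test_prepared = preprocess.transform(X_test)
print(X_train_prepared.shape, X_test_prepared.shape)
```

Fitting the scaler and encoder on the training split only keeps information from the test split out of the preprocessing step.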

EDA (Exploratory Data Analysis): visualizing the distribution of the target variable, the transaction amounts, and the transaction categories that dominate fraud cases, plus a correlation heatmap to identify interrelated features.
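
As a rough illustration of the numbers behind those plots, the sketch below computes the class distribution, the fraud rate per category, and a correlation matrix with pandas. The columns (`amt`, `hour`, `category`, `is_fraud`) and the data are hypothetical and only stand in for the real dataset.

```python
import pandas as pd

# Tiny synthetic stand-in; column names are hypothetical.
df = pd.DataFrame({
    "amt": [12.5, 250.0, 8.9, 990.0, 43.2, 15.0, 720.5, 31.0],
    "hour": [9, 23, 14, 2, 12, 18, 1, 10],
    "category": ["grocery", "shopping", "grocery", "travel",
                 "food", "food", "shopping", "grocery"],
    "is_fraud": [0, 1, 0, 1, 0, 0, 1, 0],
})

# Share of fraud vs. non-fraud rows (the target distribution).
class_share = df["is_fraud"].value_counts(normalize=True)

# Fraud rate per transaction category, highest first.
fraud_rate_by_category = (
    df.groupby("category")["is_fraud"].mean().sort_values(ascending=False)
)

# Pairwise correlations between numeric columns; this is the heatmap's input.
corr = df[["amt", "hour", "is_fraud"]].corr()

print(class_share, fraud_rate_by_category, corr, sep="\n\n")
```

The `corr` matrix can then be passed to a plotting function such as seaborn's `heatmap` to produce the visual.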

Modeling: training two machine learning models (Logistic Regression and Random Forest), optimizing them with hyperparameter tuning, and validating them with recall and cross-validation.
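
A comparison of two candidate models on recall might look like the sketch below. It assumes scikit-learn and uses synthetic, imbalanced data from `make_classification` in place of the real transactions; the model settings are illustrative defaults, not the project's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data: roughly 10% positive ("fraud") rows.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model and record its recall on the held-out test set.
recalls = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    recalls[name] = recall_score(y_test, model.predict(X_test))
    print(f"{name}: recall = {recalls[name]:.2f}")
```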

Results

Random Forest with Hyperparameter Tuning
Random Forest was selected over Logistic Regression as the best model based on the performance evaluation. Hyperparameter tuning further optimized its performance, which reinforced the choice.
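
One common way to tune a Random Forest for recall is a grid search, sketched below with scikit-learn's `GridSearchCV`. The search grid, the 3-fold setting, and the synthetic data are assumptions for illustration; the project's actual tuning method and grid are not stated.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced stand-in data.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Hypothetical search grid; the project's actual grid is not stated.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [5, None],
}

# Score every combination by cross-validated recall and keep the best one.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="recall",
    cv=3,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 2))
```

Setting `scoring="recall"` makes the search pick the combination that misses the fewest positive cases, which matches the project's stated priority.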

Evaluation

Recall score: 0.97
A high recall reduces the chance that the fraud team misses actual fraud cases.
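
For reference, recall is the share of actual fraud cases that the model flags: true positives divided by true positives plus false negatives. The counts in this small sketch are hypothetical and chosen only to show how a score of 0.97 arises.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of actual fraud cases that the model flagged."""
    actual_frauds = true_positives + false_negatives
    if actual_frauds == 0:
        raise ValueError("recall is undefined when there are no fraud cases")
    return true_positives / actual_frauds

# Hypothetical counts: 100 real frauds, 97 caught, 3 missed.
print(recall(true_positives=97, false_negatives=3))  # 0.97
```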

Cross-validation accuracy: 0.95
The model performs consistently across different subsets of the data, not only on the data it was trained on.
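
A cross-validation check of this kind can be sketched with scikit-learn's `cross_val_score`. The 5-fold stratified split, the model settings, and the synthetic data are assumptions; the project does not state how many folds it used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, imbalanced stand-in data.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=1
)

# Stratified folds keep the fraud ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# One accuracy score per fold; each fold is held out once for evaluation.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=1),
    X, y, cv=cv, scoring="accuracy",
)
print(scores.round(2), round(scores.mean(), 2))
```

Fold scores that stay close to each other suggest the model is not tied to one particular slice of the data.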

Model ready to deploy
A notebook for predictions on new transactions has been prepared, together with the trained model saved as an easy-to-load .pkl file.
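
Saving a trained model to a .pkl file and loading it again for predictions could look like the sketch below, which uses Python's standard `pickle` module. The file name `fraud_model.pkl`, the model settings, and the synthetic data are hypothetical.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small stand-in model on synthetic data.
X, y = make_classification(n_samples=300, n_features=6, random_state=7)
model = RandomForestClassifier(n_estimators=20, random_state=7).fit(X, y)

# Hypothetical file name; the project's actual .pkl name is not given.
model_path = os.path.join(tempfile.mkdtemp(), "fraud_model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later, for example in a prediction notebook: load the model and score new rows.
with open(model_path, "rb") as f:
    loaded = pickle.load(f)

new_rows = X[:5]  # stands in for new transactions
predictions = loaded.predict(new_rows)
print(predictions)
```

Pickle files should be loaded only from trusted sources, ideally with the same scikit-learn version that created them.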

Business Impact

Improving Transaction System Security
With a recall of 0.97 on test data, the system detects almost all incoming fraudulent transactions.

Manual Investigation Cost Efficiency
Because the model flags suspicious transactions accurately, only a small portion of transactions needs manual review.

Minimizing False Negatives
By optimizing for recall, the model prioritizes catching every fraud case, even at the cost of a slight increase in false positives.
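
The trade-off between missed frauds and false alarms can be seen by moving the decision threshold on the model's fraud scores. The scores and labels below are made up for illustration, and threshold adjustment is only one possible way to favor recall; the project does not say whether it uses it.

```python
# Hypothetical model scores (probability of fraud) and true labels (1 = fraud).
scores = [0.95, 0.80, 0.62, 0.45, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02]
labels = [1,    1,    0,    1,    0,    1,    0,    0,    0,    0]

def confusion_at(threshold):
    """Return (true positives, false negatives, false positives)."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    return tp, fn, fp

for threshold in (0.5, 0.25):
    tp, fn, fp = confusion_at(threshold)
    print(f"threshold={threshold}: recall={tp / (tp + fn):.2f}, "
          f"missed={fn}, false_alarms={fp}")
```

With these made-up numbers, lowering the threshold from 0.5 to 0.25 raises recall from 0.50 to 1.00 while false alarms go from 1 to 2.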
