Hybrid AI Models Combining Rules and Machine Learning for Financial Fraud Control in the United States

Mahuma Akter, Rejon Kumar Ray, Md Shafiqur Rahman, Tanjina Tuly, Md Shohail Uddin Sarker, Santosh Pant, Al Amin and Md Al Mamun Siddike

doi:10.7492/70082h56

Abstract

Detecting financial fraud is getting harder as digital payments and transaction volumes grow. Fraud typically makes up a tiny fraction of otherwise normal activity, and this class imbalance causes standard detection methods to struggle. Banks often stick to rule-based systems because they are easy to explain to regulators, but fixed rules are poor at catching clever, evolving fraud. Machine learning, by contrast, is good at spotting subtle patterns in large datasets, but it can be a "black box" and often flags too many legitimate transactions. This study examines whether hybrid AI, which combines hand-written rules with machine learning, can detect more fraud without making the process unreliable. Using a widely used credit card dataset of 284,807 transactions in which fraud was extremely rare, the research compared three setups: rule-based systems, standalone machine learning, and hybrid systems. Logistic Regression, Random Forest, and XGBoost served as the machine learning baselines. These were compared against three hybrid designs: using rule outputs as model features, a two-stage sequential pipeline, and a weighted ensemble that blends rule and model scores. Because fraud is so rare, the study looked beyond plain accuracy and focused on the area under the precision-recall curve (PR-AUC), recall, and the financial cost of false alarms versus missed fraud. XGBoost was the strongest individual model, with the best PR-AUC, but the hybrid systems offered greater practical benefits: they were more effective at reducing false positives and lowering the total cost of managing fraud. The two-stage pipeline and the weighted ensemble did the best job of balancing detections against total alerts. Overall, the results suggest that hybrid systems are a practical middle ground, combining interpretability with strong predictive performance.
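The three hybrid designs and the cost-based comparison named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Txn` fields, the two rules, the 0.5 threshold, the 0.3 rule weight, the direction of the two-stage pipeline (rules screen first, model confirms), and the unit costs are all hypothetical values chosen for illustration, and the model score is assumed to come from an already-trained classifier.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Txn:
    amount: float    # transaction amount (illustrative field)
    hour: int        # hour of day, 0-23 (illustrative field)
    ml_score: float  # fraud probability from an already-trained model (assumed given)


def rule_flags(t: Txn) -> List[int]:
    """Two made-up rules; real rule sets are institution-specific."""
    return [int(t.amount > 1000.0), int(t.hour < 5)]


def rule_score(t: Txn) -> float:
    """Fraction of rules that fire, in [0, 1]."""
    flags = rule_flags(t)
    return sum(flags) / len(flags)


def rules_as_features(t: Txn) -> List[float]:
    """Hybrid 1: append rule flags to the raw features before model training."""
    return [t.amount, float(t.hour)] + [float(f) for f in rule_flags(t)]


def two_stage(t: Txn, ml_threshold: float = 0.5) -> bool:
    """Hybrid 2: rules screen first; the model confirms only screened transactions."""
    if rule_score(t) == 0.0:
        return False
    return t.ml_score >= ml_threshold


def weighted_ensemble(t: Txn, w_rules: float = 0.3) -> float:
    """Hybrid 3: convex blend of the rule score and the model score."""
    return w_rules * rule_score(t) + (1.0 - w_rules) * t.ml_score


def total_cost(alerts: Sequence[bool], labels: Sequence[bool],
               cost_fp: float = 5.0, cost_fn: float = 100.0) -> float:
    """Cost of false alarms plus cost of missed fraud (unit costs are placeholders)."""
    fp = sum(a and not y for a, y in zip(alerts, labels))
    fn = sum(y and not a for a, y in zip(alerts, labels))
    return cost_fp * fp + cost_fn * fn


# Usage: one transaction that trips both rules, one that trips none.
risky = Txn(amount=1500.0, hour=3, ml_score=0.8)
quiet = Txn(amount=20.0, hour=14, ml_score=0.9)
alerts = [two_stage(risky), two_stage(quiet)]
print(alerts, total_cost(alerts, labels=[True, True]))
```

In this sketch the two-stage design never alerts on a transaction that trips no rule, however high its model score. That keeps alert volume low but can miss fraud the rules do not anticipate, which is the trade-off the cost function is meant to expose.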
