Empirical Evidence of Posterior Probability Bias Under SMOTE Oversampling in Imbalanced Credit Card Fraud Detection: A Comparative Study of HistGradient Boosting and Logistic Regression | IJCT Volume 13 – Issue 3 | IJCT-V13I3P123

International Journal of Computer Techniques
ISSN 2394-2231
Volume 13, Issue 3  |  Published: May – June 2026

Author

Anthony A. Ogunbor

Abstract

Detecting credit card fraud is a classic imbalanced binary classification problem in which fraudulent transactions form the minority class. A typical example is the Kaggle Credit Card Fraud dataset, in which only 0.173% of transactions are fraudulent. This study is directly inspired by earlier work by Dal Pozzolo et al., which showed that under-sampling distorts posterior class probabilities while preserving ranking order, rendering the default classification threshold suboptimal and inflating false positive rates. This paper extends that finding to over-sampling by applying the Synthetic Minority Oversampling Technique (SMOTE) as the sole imbalance intervention to Histogram-based Gradient Boosting (HGB) and Logistic Regression (LR) on the same 284,807-transaction Kaggle Credit Card Fraud dataset used in the earlier study. With SMOTE as the only imbalance intervention, LR's false alarms increased by a factor of 165, from 35 to 5,775 across the five cross-validation folds, while ROC-AUC remained almost unchanged at 0.9806. This pattern, massive false alarm inflation with stable ranking, is the operational fingerprint of posterior probability bias as described by Dal Pozzolo et al. HGB responded differently: its false alarms decreased and its F1 improved by 28.3%, indicating that SMOTE's effect on probability calibration is model-dependent. On the test set, HGB achieved an F1 of 0.8342 with 18 false alarms, compared to LR's F1 of 0.1131 with 1,415 false alarms. These results constitute direct empirical evidence that SMOTE over-sampling induces posterior probability bias consistent with the earlier theoretical framework, and they therefore extend that framework from under-sampling to an over-sampling context.
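The fingerprint described in the abstract, stable ranking combined with inflated false alarms, can be illustrated with a small synthetic sketch. It assumes the bias acts as the monotone map p_s = p / (p + β(1 − p)), which is the algebraic inverse of the correction formula quoted in the Conclusion. Whether SMOTE distorts Logistic Regression's outputs in exactly this way is an assumption, not a result of the paper. The toy posteriors, the class sizes, and the function names are hypothetical and are not drawn from the Kaggle data.

```python
import random

def distort(p, beta):
    """Assumed bias model: map an unbiased posterior p to a resampled posterior p_s."""
    return p / (p + beta * (1.0 - p))

def roc_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def false_alarms(scores, labels, threshold=0.5):
    """Count legitimate transactions (label 0) flagged at the given threshold."""
    return sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)

# Hypothetical toy data: 20 frauds and 2000 legitimate transactions.
rng = random.Random(0)
labels = [1] * 20 + [0] * 2000
true_p = ([rng.uniform(0.2, 0.9) for _ in range(20)]
          + [rng.uniform(0.0, 0.3) for _ in range(2000)])

BETA = 0.00346  # value quoted in the Conclusion; used here only for illustration
biased_p = [distort(p, BETA) for p in true_p]

auc_before = roc_auc(true_p, labels)
auc_after = roc_auc(biased_p, labels)
fa_before = false_alarms(true_p, labels)
fa_after = false_alarms(biased_p, labels)

print("ROC-AUC before/after:", auc_before, auc_after)
print("False alarms at 0.5 before/after:", fa_before, fa_after)
```

Because the map is strictly increasing, the ordering of scores and hence ROC-AUC is preserved, while almost every legitimate transaction is pushed above the default threshold of 0.5.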

Keywords

Credit card fraud detection, SMOTE, Logistic Regression, Histogram-based gradient boosting, Posterior probability bias, Class imbalance, Over-sampling, Probability calibration, False positive inflation

Conclusion

This study set out to determine whether SMOTE over-sampling induces a posterior probability bias similar to the under-sampling bias demonstrated by Dal Pozzolo, Caelen, Johnson, and Bontempi. The experimental evidence indicates that it does. Four conclusions summarize the findings.

First, SMOTE-induced probability bias is confirmed empirically. When SMOTE was applied as the sole class imbalance intervention to Logistic Regression (LR), false alarms increased by a factor of 165, from 35 to 5,775 across 5 cross-validation (CV) folds, while ROC-AUC remained virtually unchanged at 0.9806. On the held-out test set, LR generated 1,415 false alarms at the default threshold of 0.5. This pattern of sharp false alarm inflation with stable ranking (ROC-AUC) is the operational fingerprint of posterior probability bias as described by the authors of the earlier study. This study demonstrates the fingerprint for SMOTE over-sampling on the same dataset on which Dal Pozzolo et al. demonstrated it for under-sampling. Thus, class imbalance not only hinders the learning of minority patterns but also leads to distorted posterior probabilities when resampling is applied, whether by under-sampling (Dal Pozzolo et al., 2015) or by over-sampling (the empirical findings in this study).

Second, class_weight='balanced' is not the source of the bias. An initial experiment applied SMOTE and class_weight='balanced' simultaneously. When SMOTE was then isolated as the only intervention, the false alarm inflation persisted, showing that class_weight='balanced' is not its source. This indicates that any distortion introduced by class weighting is operationally benign.

Third, model expressiveness and SMOTE response are inseparable. HistGradient Boosting responded to SMOTE with improved performance: false alarms decreased and F1 rose by 28.3%, whereas Logistic Regression's precision collapsed. SMOTE's benefits are therefore model-dependent: a classifier must have sufficient capacity to exploit synthetic samples without being miscalibrated by them. Logistic Regression's linear decision boundary lacks this capacity on this dataset.

Fourth, Randomized Search is the preferred optimization strategy. Randomized Search outperformed Exhaustive Grid Search for HistGradient Boosting (CV F1: 0.8293 vs 0.7036) while taking less time and evaluating fewer combinations. This empirically confirms the work of Bergstra and Bengio on this real-world imbalanced Kaggle Credit Card dataset.

The most important direction for future work is the formal implementation of Dal Pozzolo et al.'s probability correction formula adapted for SMOTE over-sampling. Their formula, p' = (β × p_s) / (β × p_s − p_s + 1), where β = true_fraud_prior / smote_fraud_rate ≈ 0.00346, should correct the upward probability shift that this study has identified as the source of Logistic Regression's false alarm inflation. If this formula can be applied successfully, it would dramatically reduce false alarms while preserving recall. It would thereby provide a practical remedy for the bias demonstrated in this study and, for the first time, formally validate Dal Pozzolo et al.'s correction framework in relation to over-sampling. Future work should also evaluate the approach on a second dataset to assess generalizability.
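As a rough illustration of the proposed correction, the sketch below evaluates the quoted formula in plain Python. It only computes the formula. Whether the correction transfers from under-sampling to SMOTE is the open question the paper leaves to future work. The function names are hypothetical. The SMOTE fraud rate of 0.5 assumes fully balanced resampling, which is consistent with the quoted β ≈ 0.00346 but is not stated explicitly in the text.

```python
def correct_posterior(p_s, beta):
    """Apply p' = beta * p_s / (beta * p_s - p_s + 1) to a biased posterior p_s."""
    return beta * p_s / (beta * p_s - p_s + 1.0)

def equivalent_threshold(beta, tau=0.5):
    """Threshold on the biased p_s that matches thresholding the corrected p' at tau.

    Solving p' >= tau for p_s gives p_s >= tau / (tau + beta * (1 - tau)).
    """
    return tau / (tau + beta * (1.0 - tau))

TRUE_FRAUD_PRIOR = 0.00173  # 0.173% fraud rate reported in the abstract
SMOTE_FRAUD_RATE = 0.5      # assumption: SMOTE resamples to a balanced training set
BETA = TRUE_FRAUD_PRIOR / SMOTE_FRAUD_RATE

for p_s in (0.5, 0.9, 0.99, 0.999):
    print(f"p_s = {p_s:<6} corrected p' = {correct_posterior(p_s, BETA):.4f}")

print("beta =", BETA)
print("equivalent threshold on p_s:", equivalent_threshold(BETA))
```

Under these assumptions, correcting the probabilities and then thresholding at 0.5 is equivalent to leaving the biased scores untouched and raising the threshold to 1 / (1 + β), which is why the correction can change the number of false alarms without changing the ranking.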

References

[1] A. Dal Pozzolo, O. Caelen, R. Johnson, et al., “Calibrating Probability with Undersampling for Unbalanced Classification”, IEEE Symposium Series on Computational Intelligence, pp. 159-166, 2015. 10.1109/SSCI.2015.33
[2] S. Chakraborty, L. Dey, “Multi-objective, Multi-class and Multi-label Data Classification with Class Imbalance: Theory and Practices”, Springer, Singapore; 2024. 10.1007/978-981-97962-29
[3] S. Bhattacharyya, S. Jha, K. Tharakunnel, JC. Westland, “Data Mining for Credit Card Fraud: A Comparative Study”, Decision Support Systems, Vol 50, pp. 602-613, 2011. 10.1016/j.dss.2010.08.008
[4] M. Sopiyan, K. Fauziah, Y.F. Wijaya, “Fraud Detection Using Random Forest, Logistic Regression and HistGradient Boosting on Credit Cards”, JUITA: Jurnal Informatika, Vol 10, pp. 77-87, 2022. 10.30595/juita.v10i1.12050
[5] R. J. Bolton, DJ. Hand, “Unsupervised Profiling Methods for Fraud Detection”, Proc. Credit Scoring and Credit Control, Vol 2, pp. 5-7, 2001.
[6] H. Ali, MNM. Salleh, K. Hussain, et al., “A Review on Data Preprocessing Methods for Class Imbalance Problem”, International Journal of Engineering & Technology, Vol 8, pp. 390-397, 2019. 10.14419/ijet.v8i3.29508
[7] NV. Chawla, KW. Bowyer, LO. Hall, et al., “Synthetic Minority Oversampling Technique”, Journal of Artificial Intelligence Research, Vol 16, pp. 321-357, 2002. 10.1613/jair.953
[8] J.W. Osborne, “Improving Your Data Transformations: Applying the Box-Cox Transformation”, Practical Assessment Research and Evaluation, Vol 15, pp. 12, 2010.
[9] J.H. Friedman, “Greedy Function Approximation: A Gradient Boosting Machine”, Annals of Statistics, Vol 29, pp. 1189-1232, 2001.
[10] G. Ke, Q. Meng, T. Finley, et al., “LightGBM: A Highly Efficient Gradient Boosting Decision Tree”, Advances in Neural Information Processing Systems, Vol 30, pp. 3146-3154, 2017. 10.5555/3294996.3295074
[11] F. Pedregosa, G. Varoquaux, A. Gramfort, et al., “Scikit-learn: Machine Learning in Python”, Journal of Machine Learning Research, Vol 12, pp. 2825-2830, 2011. 10.5555/1953048.2078195
[12] G. Velarde, M. Weichert, A. Deshmunkh, et al., “Tree Boosting Methods for Balanced and Imbalanced Classification and Their Robustness Over Time in Risk Assessment”, Intelligent Systems with Applications, Vol 22, 2024. 10.1016/j.iswa.2024.200354
[13] P. Hajek, MZ. Abedin, U. Sivarajah, “Fraud Detection in Mobile Payment Systems Using an XGBoost-Based Framework”, Information Systems Frontiers, pp. 1-19, 2022. 10.1007/s10796-022-10301-9
[14] T. Hastie, R. Tibshirani, J. Friedman, “The Elements of Statistical Learning”, Springer, New York; 2009. 10.1007/978-0-387-84858-7
[15] J. Bergstra, Y. Bengio, “Random Search for Hyper-parameter Optimization”, Journal of Machine Learning Research, Vol 13, pp. 281-305, 2012. 10.5555/2188385.2188395
[16] “Pipeline and composite estimators”, (2026). Accessed: April 1, 2026. https://scikit-learn.org/stable/modules/compose.html
[17] “Credit Card Fraud Detection”, (2015). Accessed: April 1, 2026. https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

How to Cite This Paper

Anthony A. Ogunbor (2026). Empirical Evidence of Posterior Probability Bias Under SMOTE Oversampling in Imbalanced Credit Card Fraud Detection: A Comparative Study of HistGradient Boosting and Logistic Regression. International Journal of Computer Techniques, 13(3). ISSN: 2394-2231.

© 2026 International Journal of Computer Techniques (IJCT). All rights reserved.
