Abstract
Online auctions attract serious fraud because of the large sums of money involved and the anonymity of users. In auction fraud detection, class imbalance, meaning that fraud instances are far fewer than normal ones in bidding transactions, degrades classification performance because classifiers become biased towards the majority class, i.e., normal bidding behavior. A well-established approach to the imbalanced learning problem is data sampling, which has been found to improve classification efficiency. In this study, we use a hybrid method that combines data over-sampling and under-sampling to address highly imbalanced auction fraud datasets more effectively. We deploy a set of well-known binary classifiers to understand how class imbalance affects the classification results, and we choose the performance metrics most relevant to both imbalanced data and fraud bidding data.
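The hybrid sampling idea can be illustrated with a minimal sketch. It assumes plain random over-sampling (duplicating fraud instances with replacement) and plain random under-sampling (dropping normal instances), with both classes moved to the midpoint of their original sizes. The abstract does not name the specific sampling algorithms or the target class ratio, so the function name `hybrid_resample`, its parameters, and the midpoint target are hypothetical choices made only for illustration.

```python
import random
from collections import Counter


def hybrid_resample(samples, labels, minority_label=1, seed=0):
    """Hypothetical hybrid sampler: random over- plus under-sampling.

    The minority class is grown by random duplication and the majority
    class is shrunk by random removal, both towards the midpoint of the
    two original class sizes (an assumed target, not taken from the paper).
    """
    rng = random.Random(seed)
    minority = [(s, y) for s, y in zip(samples, labels) if y == minority_label]
    majority = [(s, y) for s, y in zip(samples, labels) if y != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")

    target = (len(minority) + len(majority)) // 2

    # Over-sampling: keep every original minority instance and add
    # randomly chosen duplicates until the target size is reached.
    n_extra = max(0, target - len(minority))
    over = minority + [rng.choice(minority) for _ in range(n_extra)]

    # Under-sampling: keep a random subset of the majority instances,
    # drawn without replacement.
    under = rng.sample(majority, min(target, len(majority)))

    combined = over + under
    rng.shuffle(combined)
    new_samples = [s for s, _ in combined]
    new_labels = [y for _, y in combined]
    return new_samples, new_labels


# Toy data: 95 normal bidders (label 0) and 5 fraudulent ones (label 1).
toy_samples = [[i] for i in range(100)]
toy_labels = [1 if i < 5 else 0 for i in range(100)]

resampled_samples, resampled_labels = hybrid_resample(toy_samples, toy_labels)
print(sorted(Counter(resampled_labels).items()))  # [(0, 50), (1, 50)]
```

Resampling of this kind would normally be applied to the training split only, so that the test data keep the original class distribution.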
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Ganguly, S., Sadaoui, S. (2017). Classification of Imbalanced Auction Fraud Data. In: Mouhoub, M., Langlais, P. (eds) Advances in Artificial Intelligence. Canadian AI 2017. Lecture Notes in Computer Science, vol 10233. Springer, Cham. https://doi.org/10.1007/978-3-319-57351-9_11
DOI: https://doi.org/10.1007/978-3-319-57351-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57350-2
Online ISBN: 978-3-319-57351-9
eBook Packages: Computer Science, Computer Science (R0)