Optimizing Instance Selection Strategies in Interactive Machine Learning: An Application to Fraud Detection

  • Conference paper
Hybrid Intelligent Systems (HIS 2020)

Abstract

Machine Learning systems are generally thought of as fully automatic. However, in recent years, interactive systems in which human experts actively contribute to the learning process have shown improved performance when compared to fully automated ones. This may be so in Big Data scenarios, in scenarios in which the input is a data stream, or when there is concept drift. In this paper we present a system for supporting auditors in the task of financial fraud detection. The system is interactive in the sense that the auditors can provide feedback regarding the instances of the data they use, or even suggest new variables. This feedback is incorporated into newly trained Machine Learning models, which improve over time. We show that the order in which instances are evaluated by the auditors, and their feedback incorporated, influences the evolution of the performance of the system over time. The goal of this paper is to study how different instance selection strategies for human evaluation and feedback can improve the learning speed. This information can then be used by the system to determine, at each moment, which instances would improve the system the most, so that these can be suggested to the users for validation.
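The abstract does not name the instance selection strategies that were compared, so the following is only a minimal sketch of what such a selection step might look like, not the authors' method. It assumes the current model outputs a fraud probability for each unlabeled instance, and it contrasts three commonly used strategies: random selection, uncertainty sampling (instances whose probability is closest to 0.5), and highest predicted risk. The function names, the strategy labels, and the `fraud_probs` dictionary are all hypothetical.

```python
import random


def uncertainty(prob):
    """Return 1.0 when prob is 0.5 and 0.0 when prob is 0 or 1."""
    return 1.0 - 2.0 * abs(prob - 0.5)


def select_for_review(fraud_probs, k, strategy="uncertainty", seed=0):
    """Pick up to k unlabeled instances to suggest to an auditor.

    fraud_probs maps an instance id to the fraud probability predicted
    by the current model (an assumed interface, not the paper's).
    """
    ids = sorted(fraud_probs)  # fixed order so that ties break deterministically
    k = min(k, len(ids))
    if strategy == "random":
        # Baseline: every instance is equally likely to be shown.
        return random.Random(seed).sample(ids, k)
    if strategy == "uncertainty":
        # Prefer instances the model is least sure about.
        ranked = sorted(ids, key=lambda i: uncertainty(fraud_probs[i]), reverse=True)
        return ranked[:k]
    if strategy == "highest_risk":
        # Prefer instances the model considers most likely to be fraud.
        ranked = sorted(ids, key=lambda i: fraud_probs[i], reverse=True)
        return ranked[:k]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    probs = {"a": 0.02, "b": 0.48, "c": 0.91, "d": 0.55, "e": 0.30}
    for name in ("random", "uncertainty", "highest_risk"):
        print(name, select_for_review(probs, 2, strategy=name))
```

In an interactive loop, the selected instances would be labeled by the auditor, added to the training set, and the model retrained before the next selection round. Comparing the resulting performance curves across strategies is one way to study the effect of evaluation order on learning speed.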

Acknowledgments

This work was supported by the Northern Regional Operational Program, Portugal 2020 and European Union, through European Regional Development Fund (ERDF) in the scope of project number 39900-31/SI/2017, and by FCT – Fundação para a Ciência e Tecnologia within project UIDB/04728/2020.

Corresponding author

Correspondence to Davide Carneiro.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Carneiro, D., Guimarães, M., Sousa, M. (2021). Optimizing Instance Selection Strategies in Interactive Machine Learning: An Application to Fraud Detection. In: Abraham, A., Hanne, T., Castillo, O., Gandhi, N., Nogueira Rios, T., Hong, TP. (eds) Hybrid Intelligent Systems. HIS 2020. Advances in Intelligent Systems and Computing, vol 1375. Springer, Cham. https://doi.org/10.1007/978-3-030-73050-5_13
