Design of a NLP-empowered finance fraud awareness model: the anti-fraud chatbot for fraud detection and fraud classification as an instance

Chang, Jia-Wei; Yen, Neil; Hung, Jason C.

doi:10.1007/s12652-021-03512-2

Original Research
Published: 21 March 2022

Volume 13, pages 4663–4679, (2022)
Journal of Ambient Intelligence and Humanized Computing

2189 Accesses
19 Citations
Abstract

Advanced technologies, in particular the Internet of Things and the underlying information and communication technology frameworks, facilitate information sharing. A single click on an end device can make every tool accessible to users; however, whether the information they receive is correct remains an open question. Incorrect information, whether fake, malicious, or fraudulent, and whether spread deliberately or not, may worsen misunderstandings. To keep such cases from escalating into crime, this study designs a universal financial fraud-awareness model. The model first targets accurate fraud detection and classification using natural language processing. An anti-fraud chatbot is then implemented as an instance of the model and deployed on LINE, a widely used social network service. The implementation aims to handle finance-fraud cases and to provide anti-fraud suggestions for foreseeable fraud events. A comparison of Word2vec, ELMo, BERT, and DistilBERT combined with five strong conventional machine-learning models and with artificial neural network models indicates that the proposed model achieves an accuracy of over 98% in detecting potential finance-fraud cases. In addition, the more efficient models, which pair DistilBERT with a support vector machine or a random forest, have lower computational resource cost and faster execution time in real applications.
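
The pipeline outlined in the abstract (represent a message as a vector, classify it, then return an anti-fraud suggestion) can be illustrated with a minimal sketch. This is not the authors' implementation, which is not publicly available. Plain token counts stand in for the Word2vec, ELMo, BERT, or DistilBERT representations, and a nearest-centroid rule stands in for the support vector machine or random forest. The category labels, sample messages, suggestion texts, and all function and class names below are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def embed(text):
    """Toy text representation: token counts.
    Stand-in (assumption) for a learned embedding such as DistilBERT."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidClassifier:
    """Nearest-centroid classifier.
    Stand-in (assumption) for the SVM or random forest named in the abstract."""

    def fit(self, texts, labels):
        # Summing the vectors of a class gives its centroid up to scale,
        # which is enough because cosine similarity ignores scale.
        self.centroids = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.centroids[label].update(embed(text))
        return self

    def predict(self, text):
        vec = embed(text)
        return max(self.centroids, key=lambda lab: cosine(vec, self.centroids[lab]))


# Hypothetical training messages and categories, invented for this sketch.
TRAIN = [
    ("transfer money now to unlock your guaranteed investment profit", "investment_fraud"),
    ("guaranteed returns join our investment group and transfer a deposit", "investment_fraud"),
    ("your bank account is locked click the link to verify your password", "phishing"),
    ("verify your account password at this link or it will be suspended", "phishing"),
    ("see you at lunch tomorrow", "legitimate"),
    ("the meeting is moved to friday afternoon", "legitimate"),
]

# Hypothetical suggestion texts a chatbot might return per fraud category.
SUGGESTIONS = {
    "investment_fraud": "Do not transfer money; check the offer with an official source.",
    "phishing": "Do not click the link; contact the institution directly.",
}

model = CentroidClassifier().fit(*zip(*TRAIN))


def respond(message):
    """Classify a message and return (label, suggestion or None)."""
    label = model.predict(message)
    return label, SUGGESTIONS.get(label)


print(respond("click this link to verify your bank password"))
```

Keeping the representation step (`embed`) separate from the classifier mirrors the comparison described in the abstract, where different language representations are paired with different classifiers; in a realistic system each stand-in would be replaced by a trained model.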

Availability of data and material

The data, including the experimental dataset, are available upon reasonable request.

Code Availability

The code will be made available upon reasonable request after the patent is filed.

Acknowledgements

Special thanks to Mr. Hou-Hsun Wang for his assistance with the programming for this study.

Funding

This work was partially supported by the Ministry of Science and Technology, Taiwan, R.O.C. [grant number MOST 108-2218-E-025-002-MY3].

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taichung University of Science and Technology, Taichung City, Taiwan
Jia-Wei Chang & Jason C. Hung
School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
Neil Yen
School of Computer Science and Engineering, University of Aizu, Aizuwakamatsu, Japan
Neil Yen

Corresponding author

Correspondence to Neil Yen.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest or competing interests.

Consent to participate

The authors are aware of everything related to this submitted work.

Consent for publication

The authors are aware of the submission of this work for publication in the Journal of Ambient Intelligence and Humanized Computing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Table 15.

Table 15 Sample Fraud Events and Categories

About this article

Cite this article

Chang, JW., Yen, N. & Hung, J.C. Design of a NLP-empowered finance fraud awareness model: the anti-fraud chatbot for fraud detection and fraud classification as an instance. J Ambient Intell Human Comput 13, 4663–4679 (2022). https://doi.org/10.1007/s12652-021-03512-2

Received: 30 December 2020
Accepted: 09 September 2021
Published: 21 March 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s12652-021-03512-2
