Abstract
Advanced technologies, in particular the Internet of Things and fundamental information and communication technology frameworks, facilitate information sharing. A single click on an end device can make every tool accessible to users; however, whether the information received is correct remains an open question. Incorrect information, which covers fake, malicious, and fraudulent information, may worsen misunderstandings whether or not it is spread deliberately. To keep such cases from escalating to the level of crime, a universal financial fraud-awareness model was designed in this study. The model first targets accurate fraud detection and classification using natural language processing techniques. An anti-fraud chatbot is then implemented as an instance of the model and deployed on a widely used social network service, namely LINE. This implementation aims to manage finance-fraud cases and provide anti-fraud suggestions for dealing with foreseeable fraud events. A comparison of Word2vec, ELMo, BERT, and DistilBERT across five strong conventional machine-learning models and artificial neural network models indicates that the proposed model can achieve an accuracy of over 98% when detecting potential finance-fraud cases. In addition, the more efficient models, which pair DistilBERT with a support vector machine or a random forest, have lower computational resource cost and faster execution time in real applications.
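The embed-then-classify flow summarized in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the bag-of-words embedding stands in for Word2vec, ELMo, BERT, or DistilBERT, the nearest-centroid rule stands in for the support vector machine or random forest, and the sample messages, the suggestion texts, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy messages (assumption: not the study's dataset).
TRAIN = [
    ("Your account is frozen, transfer money to the safe account now", "fraud"),
    ("You won a prize, pay the handling fee to claim it", "fraud"),
    ("Guaranteed investment returns, send your bank password today", "fraud"),
    ("Lunch at noon tomorrow near the office", "normal"),
    ("The meeting notes are attached, see you on Monday", "normal"),
    ("Happy birthday, hope you have a great day", "normal"),
]

# Hypothetical anti-fraud suggestions returned by the chatbot step.
SUGGESTIONS = {
    "fraud": "Possible finance fraud: do not transfer money; verify with your bank.",
    "normal": "No fraud pattern detected.",
}


def embed(text):
    """Stand-in embedding: word counts instead of a learned language model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(samples):
    """Stand-in classifier: one summed word-count vector per label."""
    centroids = {}
    for text, label in samples:
        centroids.setdefault(label, Counter()).update(embed(text))
    return centroids


def classify(centroids, text):
    """Return the label whose centroid is most similar to the message."""
    vec = embed(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def reply(centroids, text):
    """Chatbot step: map the predicted label to a suggestion."""
    return SUGGESTIONS[classify(centroids, text)]


MODEL = train(TRAIN)
```

Calling `reply(MODEL, message)` on an incoming chat message returns the suggestion for its predicted label. Replacing `embed` with a pretrained sentence encoder and `train`/`classify` with a fitted classifier would keep the same three-stage shape.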








Availability of data and material
The data, including the experimental dataset, are available upon reasonable request.
Code Availability
The code will be available upon reasonable request after the patent is filed.
Acknowledgements
Special thanks to Mr. Hou-Hsun Wang for his assistance with the programming for this study.
Funding
This work was partially supported by the Ministry of Science and Technology, Taiwan, R.O.C. [grant number MOST 108-2218-E-025-002-MY3].
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest or competing interests.
Consent to participate
The authors are aware of everything related to this submitted work.
Consent for publication
The authors are aware that this work has been submitted for publication in the Journal of Ambient Intelligence and Humanized Computing.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Chang, JW., Yen, N. & Hung, J.C. Design of a NLP-empowered finance fraud awareness model: the anti-fraud chatbot for fraud detection and fraud classification as an instance. J Ambient Intell Human Comput 13, 4663–4679 (2022). https://doi.org/10.1007/s12652-021-03512-2
DOI: https://doi.org/10.1007/s12652-021-03512-2