Abstract
Social media has dramatically changed both the rate and the nature of crime in our society. Perpetrators cut across age groups, social standing, and beliefs. The anonymity that social media affords, together with the lack of adequate resources to fight cybercrime, has catalysed the rise in criminal activity, especially in South Africa. We propose a system that analyses and detects crime in social media posts and messages. The system can detect attack- and drug-related crime messages, hate speech, and offensive messages. Natural language processing algorithms were used for text tokenisation, stemming, and lemmatisation, and machine learning models, namely support vector machines and random forest classifiers, were used to classify the texts. The support vector machine achieved 86% accuracy in detecting crime in texts, and the random forest achieved 72% accuracy in crime analysis.
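As a rough illustration of the preprocessing stage named in the abstract, the following minimal sketch tokenises a message and reduces tokens to crude stems before counting them. It is not the authors' implementation: the function names, the suffix list, and the three-character minimum are assumptions made for this example, and a production system would more likely use an established stemmer or lemmatiser and feed the resulting features to a classifier such as a support vector machine.

```python
import re
from collections import Counter

# Hypothetical suffix list for a toy stemmer (an assumption for this sketch);
# real systems typically use an established algorithm such as Porter stemming.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text):
    """Map a message to stemmed-token counts, a simple classifier input."""
    return Counter(crude_stem(token) for token in tokenise(text))
```

Calling `bag_of_words` on each message yields sparse count features; these counts are only one possible input representation for the classifiers the abstract mentions.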
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lombo, X., Oyelade, O.N., Ezugwu, A.E. (2022). Crime Detection and Analysis from Social Media Messages Using Machine Learning and Natural Language Processing Technique. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13381. Springer, Cham. https://doi.org/10.1007/978-3-031-10548-7_37
DOI: https://doi.org/10.1007/978-3-031-10548-7_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10547-0
Online ISBN: 978-3-031-10548-7
eBook Packages: Computer Science, Computer Science (R0)