Crime Detection and Analysis from Social Media Messages Using Machine Learning and Natural Language Processing Technique

Lombo, Xolani; Oyelade, Olaide N.; Ezugwu, Absalom E.

doi:10.1007/978-3-031-10548-7_37

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13381)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

Social media has dramatically changed the rate and nature of crime in our society. The perpetrators cut across age groups, social standing, and beliefs. The anonymity that social media affords and the lack of adequate resources to fight cybercrime are catalysts for the rise in criminal activity, especially in South Africa. We propose a system that analyses and detects crime in social media posts and messages. The system can detect attack- and drug-related crime messages, hate speech, and offensive messages. Natural language processing algorithms were used for text tokenisation, stemming, and lemmatisation. Machine learning models such as support vector machines and random forest classifiers were used to classify texts. The support vector machine achieved 86% accuracy in detecting crime in texts, and the random forest achieved 72% accuracy in crime analysis.
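
The pipeline the abstract describes (tokenise, stem, then classify with a support vector machine) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy messages, the stop-word list, the crude suffix-stripping stemmer (a stand-in for a real stemmer or lemmatiser), and the bias-free linear SVM trained by sub-gradient descent on the hinge loss are all assumptions made for illustration, and only the Python standard library is used.

```python
import re
from collections import Counter

# Hypothetical toy messages (1 = crime-related, -1 = benign); not the paper's data.
TRAIN = [
    ("selling drugs behind the station", 1),
    ("we will attack the shop at midnight", 1),
    ("he is dealing drugs near the school", 1),
    ("they planned the attack on the bank", 1),
    ("lovely dinner with my family tonight", -1),
    ("watching the football match with friends", -1),
    ("my family enjoyed the concert", -1),
    ("friends visiting the museum today", -1),
]

# Assumed stop-word list, kept tiny for the example.
STOP_WORDS = {"the", "a", "at", "on", "in", "with", "my", "we", "is",
              "he", "they", "will", "and", "to"}

def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer or lemmatiser."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def features(text):
    """Bag-of-words counts over stemmed, non-stop-word tokens."""
    return Counter(stem(t) for t in tokenise(text) if t not in STOP_WORDS)

def score(weights, x):
    """Dot product between the weight vector and a sparse feature vector."""
    return sum(weights.get(k, 0.0) * v for k, v in x.items())

def train_linear_svm(data, epochs=20, lr=0.1, reg=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss (no bias term)."""
    weights = {}
    for _ in range(epochs):
        for text, label in data:
            x = features(text)
            margin = label * score(weights, x)
            # Regularisation step: shrink every weight slightly.
            for k in weights:
                weights[k] *= 1.0 - lr * reg
            # Hinge-loss step: push the example towards a margin of at least 1.
            if margin < 1:
                for k, v in x.items():
                    weights[k] = weights.get(k, 0.0) + lr * label * v
    return weights

def predict(weights, text):
    """Return 1 (crime-related) for a positive score, otherwise -1 (benign)."""
    return 1 if score(weights, features(text)) > 0 else -1

def accuracy(weights, data):
    """Fraction of messages whose predicted label matches the given label."""
    return sum(predict(weights, t) == y for t, y in data) / len(data)

model = train_linear_svm(TRAIN)
print(accuracy(model, TRAIN))
print(predict(model, "dealing drugs at the bank"))
```

The two classes in this toy set share no vocabulary after stop-word removal, so the sketch separates the training messages trivially. That says nothing about the 86% and 72% accuracies reported above, which come from the authors' own data and models.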

Author information

Authors and Affiliations

School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Road, Pietermaritzburg, 3201, KwaZulu-Natal, South Africa
Xolani Lombo, Olaide N. Oyelade & Absalom E. Ezugwu

Corresponding authors

Correspondence to Olaide N. Oyelade or Absalom E. Ezugwu.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Italy
Beniamino Murgante
Østfold University College, Halden, Norway
Sanjay Misra
University of Minho, Braga, Portugal
Ana Maria A. C. Rocha
University of Cagliari, Cagliari, Italy
Chiara Garau

Cite this paper

Lombo, X., Oyelade, O.N., Ezugwu, A.E. (2022). Crime Detection and Analysis from Social Media Messages Using Machine Learning and Natural Language Processing Technique. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13381. Springer, Cham. https://doi.org/10.1007/978-3-031-10548-7_37

DOI: https://doi.org/10.1007/978-3-031-10548-7_37
Published: 26 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10547-0
Online ISBN: 978-3-031-10548-7
eBook Packages: Computer Science, Computer Science (R0)
