Crime Detection and Analysis from Social Media Messages Using Machine Learning and Natural Language Processing Technique

  • Conference paper
  • In: Computational Science and Its Applications – ICCSA 2022 Workshops (ICCSA 2022)

Abstract

Social media has dramatically changed both the rate and the nature of crime in our society. Perpetrators cut across age groups, social standing, and beliefs. The anonymity that social media affords and the lack of adequate resources to fight cybercrime are catalysts for the rise in criminal activity, especially in South Africa. We propose a system that analyses and detects crime in social media posts and messages. The system can detect attack- and drug-related crime messages, hate speech, and offensive messages. Natural language processing algorithms were used for text tokenisation, stemming, and lemmatisation. Machine learning models such as support vector machines and random forest classifiers were used to classify texts. The support vector machine achieved 86% accuracy in detecting crime in texts, and the random forest achieved 72% accuracy in crime analysis.
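
The pipeline the abstract describes (tokenise, stem, then classify with a support vector machine) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the suffix list, the function names, the hyperparameters, and the six toy messages are all assumptions made up for the example. The stemmer is a crude stand-in for a real one, lemmatisation and the random forest are omitted, and the linear SVM is trained by plain subgradient descent on the hinge loss, without a bias term, to keep the sketch short and dependency-free.

```python
import re
from collections import Counter

# Illustrative suffix list (an assumption); a real system would use a proper stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def featurise(text):
    """Bag-of-words counts over the stemmed tokens of a message."""
    return Counter(stem(t) for t in tokenise(text))


def score(weights, feats):
    """Linear decision value w . x (no bias term in this sketch)."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_linear_svm(texts, labels, epochs=50, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularised hinge loss; labels are +1 or -1."""
    samples = [featurise(t) for t in texts]
    weights = {}
    for _ in range(epochs):
        for feats, y in zip(samples, labels):
            violated = y * score(weights, feats) < 1
            # Gradient step for the L2 penalty: shrink every weight slightly.
            for f in weights:
                weights[f] *= 1 - lr * reg
            # Hinge-loss step: only samples inside the margin push the weights.
            if violated:
                for f, v in feats.items():
                    weights[f] = weights.get(f, 0.0) + lr * y * v
    return weights


def predict(weights, text):
    """Return +1 (crime-related) or -1 (benign) for a message."""
    return 1 if score(weights, featurise(text)) >= 0 else -1


# Hypothetical toy messages, invented for illustration only.
TRAIN_TEXTS = [
    "selling drugs tonight, message me",
    "he threatened to attack the shop",
    "buying stolen phones cheap",
    "lovely weather for a picnic today",
    "watching the football match tonight",
    "happy birthday to my sister",
]
TRAIN_LABELS = [1, 1, 1, -1, -1, -1]

model = train_linear_svm(TRAIN_TEXTS, TRAIN_LABELS)
print(predict(model, "they are selling drugs"))
print(predict(model, "picnic with my sister"))
```

On data this small the classifier only memorises vocabulary; accuracy figures such as those in the abstract can only come from evaluating on a held-out, labelled corpus.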

Author information

Corresponding authors

Correspondence to Olaide N. Oyelade or Absalom E. Ezugwu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lombo, X., Oyelade, O.N., Ezugwu, A.E. (2022). Crime Detection and Analysis from Social Media Messages Using Machine Learning and Natural Language Processing Technique. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13381. Springer, Cham. https://doi.org/10.1007/978-3-031-10548-7_37

  • DOI: https://doi.org/10.1007/978-3-031-10548-7_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10547-0

  • Online ISBN: 978-3-031-10548-7

  • eBook Packages: Computer Science, Computer Science (R0)
