DOI: 10.1145/3501247.3539507
short-paper

Topic modeling approaches to counter online grooming

Published: 26 June 2022

ABSTRACT

This paper describes a topic model-based approach for analyzing a large-scale dataset of chat messages exchanged between the users of LiveMe, a major social live streaming platform, in the context of broadcasts involving sexually explicit content. The analysis reveals the characteristics of predatory behavior targeting minors and its criminal dimension in an accessible and explainable manner.

            Published in

            WebSci '22: Proceedings of the 14th ACM Web Science Conference 2022
            June 2022
            479 pages
            ISBN: 9781450391917
            DOI: 10.1145/3501247

            Copyright © 2022 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • short-paper
            • Research
            • Refereed limited

            Acceptance Rates

            Overall Acceptance Rate: 218 of 875 submissions, 25%
