ABSTRACT
This paper describes a topic-model-based approach to analyzing a large-scale dataset of chat messages exchanged between users of LiveMe, a major social live streaming platform, during broadcasts involving sexually explicit content. The analysis reveals the characteristics of predatory behavior targeting minors, and its criminal dimension, in an accessible and explainable manner.
Index Terms
- Topic modeling approaches to counter online grooming