DOI: 10.1145/3589334.3648142
research-article
Open access

Triage of Messages and Conversations in a Large-Scale Child Victimization Corpus

Published: 13 May 2024

Abstract

Children are among the most vulnerable online populations. Reports of child sexual exploitation on social media and apps have grown annually at an alarming rate and are overwhelming investigators. Even a single case can require examining millions of messages involving hundreds of victims. Triage and prioritization based on victims' experiences is an unfortunate necessity. Using a chat dataset of more than 3 million messages between victims and perpetrators, we evaluate and contribute tools for analyzing the experiences of victims of sexual exploitation. We develop both supervised and unsupervised methods to classify messages into categories of interest to law enforcement, such as age requests, persuasion, and sexual messages. We also introduce a conversation clustering technique to illuminate differences among victims' experiences based on their chat history. Through a qualitative analysis, we demonstrate that the learned clusters are coherent and represent distinct conversation patterns. For example, we can distinguish groups of users who never comply with sexual requests, comply after a few conversations, or comply immediately after being targeted. We expect this approach and associated visualizations will aid law enforcement, industry moderators, and sociologists who need to analyze massive corpora in this domain. Finally, we validate prior models derived from conversations involving adults pretending to be minors and provide statistics that could help undercover adults more accurately portray minor victims.
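
The abstract does not say how conversations are represented or clustered, so the following is only a minimal sketch of one plausible reading: each conversation is summarized as the proportion of its messages in each category, and the resulting vectors are grouped with plain k-means. The category names follow the abstract, except `other`, which is an assumed catch-all. The helper names, the k-means seeding, and the toy label sequences are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist

# Category names follow the abstract; "other" is an assumed catch-all.
CATEGORIES = ["age_request", "persuasion", "sexual", "other"]


def conversation_profile(labels):
    """Turn a conversation's per-message labels into category proportions."""
    counts = Counter(labels)
    total = len(labels) or 1  # avoid dividing by zero for an empty conversation
    return [counts[c] / total for c in CATEGORIES]


def kmeans(points, k, iterations=20):
    """Plain k-means, seeded with the first k points so runs are repeatable."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical, hand-made label sequences (not data from the paper).
conversations = [
    ["other", "other", "age_request", "other"],
    ["persuasion", "sexual", "sexual", "persuasion"],
    ["other", "age_request", "other", "other"],
    ["sexual", "persuasion", "sexual", "sexual"],
]
profiles = [conversation_profile(c) for c in conversations]
clusters = kmeans(profiles, k=2)
```

Proportion vectors discard message order, so a sketch like this could not by itself separate users who comply immediately from those who comply after several conversations; a representation that keeps the sequence of categories over time would be needed for that distinction.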

Supplemental Material

MP4 File
Supplemental video

Published In

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024
4826 pages
ISBN:9798400701719
DOI:10.1145/3589334
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. child victimization
  2. conversation clustering
  3. triage

Qualifiers

  • Research-article

Conference

WWW '24
WWW '24: The ACM Web Conference 2024
May 13 - 17, 2024
Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
