research-article

Detecting cyberbullying: query terms and techniques

Published: 02 May 2013

Abstract

In this paper, we describe a close analysis of the language used in cyberbullying. Our corpus is a collection of posts from Formspring.me, a social networking site where users can ask questions of other users. The site appeals primarily to teens and young adults, and its cyberbullying content is dense: between 7% and 14% of the posts we analyzed contain cyberbullying content.
The results presented in this article are twofold. Our first experiments were designed to develop an understanding of both the specific words used by cyberbullies and the context surrounding those words. We identified the most commonly used cyberbullying terms and developed queries that can be used to detect cyberbullying content. Five of our queries achieve an average precision of 91.25% at rank 100.
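For readers unfamiliar with the metric, precision at rank k is the fraction of the top k retrieved posts that are relevant, which here means labeled as containing cyberbullying. The sketch below is illustrative only. It assumes each query's results are available as a ranked list of boolean labels; the function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked posts that are labeled relevant.

    `relevance` is ordered by rank (best first); True marks a post
    labeled as cyberbullying. Missing results count as non-relevant,
    so the denominator is always k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevance[:k]) / k


def mean_precision_at_k(runs: Sequence[Sequence[bool]], k: int) -> float:
    """Average precision at rank k over several queries."""
    if not runs:
        raise ValueError("at least one query run is required")
    return sum(precision_at_k(run, k) for run in runs) / len(runs)


# Toy data (hypothetical): two queries, four ranked results each.
toy_runs = [
    [True, True, False, True],  # 3 of 4 relevant
    [True, True, True, True],   # 4 of 4 relevant
]
print(mean_precision_at_k(toy_runs, k=4))  # prints 0.875
```

With real data, k would be 100 and each run would hold the labels of one query's top-ranked posts.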
In our second set of experiments, we extended this work with a supervised machine learning approach for detecting cyberbullying. The machine learning experiments identified additional terms that are consistent with cyberbullying content, as well as an additional querying technique that was able to accurately assign scores to posts from Formspring.me. The posts with the highest scores are shown to have a high density of cyberbullying content.
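The abstract does not specify how scores are computed, so the following is only a sketch of the general idea of scoring posts and inspecting the highest-scoring ones. It assumes a simple additive score over per-term weights. The weights, the tokenizer, and the function names are hypothetical stand-ins, not the paper's method.

```python
import re
from typing import Dict, List, Tuple


def score_post(post: str, term_weights: Dict[str, float]) -> float:
    """Sum the weights of all known terms that occur in the post.

    Assumption: lowercase word tokens and an additive score; terms
    without a weight contribute nothing.
    """
    tokens = re.findall(r"[a-z']+", post.lower())
    return sum(term_weights.get(token, 0.0) for token in tokens)


def rank_posts(
    posts: List[str], term_weights: Dict[str, float]
) -> List[Tuple[float, str]]:
    """Return (score, post) pairs sorted from highest to lowest score."""
    scored = [(score_post(post, term_weights), post) for post in posts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical weights, e.g. as might be learned from labeled posts.
example_weights = {"loser": 2.0, "ugly": 1.5}
example_posts = ["you are a loser", "nice photo", "ugly loser"]

for score, post in rank_posts(example_posts, example_weights):
    print(f"{score:.1f}  {post}")
```

Posts at the top of such a ranking are the candidates a human reviewer would inspect first.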


    Published In

    WebSci '13: Proceedings of the 5th Annual ACM Web Science Conference
    May 2013
    481 pages
    ISBN:9781450318891
    DOI:10.1145/2464464

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cyberbullying detection
    2. latent semantic indexing
    3. machine learning
    4. term analysis

    Conference

    WebSci '13: Web Science 2013
    May 2 - 4, 2013
    Paris, France

    Acceptance Rates

    Overall Acceptance Rate 245 of 933 submissions, 26%
