research-article

Detecting cyberbullying: query terms and techniques

Authors:

April Kontostathis,

Kelly Reynolds,

Lynne Edwards

WebSci '13: Proceedings of the 5th Annual ACM Web Science Conference

Pages 195 - 204

https://doi.org/10.1145/2464464.2464499

Published: 02 May 2013

Abstract

In this paper we describe a close analysis of the language used in cyberbullying. Our corpus is a collection of posts from Formspring.me, a social networking site where users can ask questions of other users. The site appeals primarily to teens and young adults, and its cyberbullying content is dense: between 7% and 14% of the posts we analyzed contain cyberbullying content.

The results presented in this article are twofold. Our first experiments were designed to develop an understanding of both the specific words used by cyberbullies and the context surrounding these words. We identified the most commonly used cyberbullying terms and developed queries that can be used to detect cyberbullying content. Five of our queries achieve an average precision of 91.25% at rank 100.
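To make the "precision at rank 100" figure concrete, the following is a minimal sketch of how precision at rank k is conventionally computed and then averaged over several queries. The function names and the toy relevance labels are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code is not shown here.

```python
from typing import Sequence


def precision_at_k(ranked_labels: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked posts that are labeled as cyberbullying.

    ranked_labels[i] is True when the post returned at rank i + 1 was
    judged to contain cyberbullying (hypothetical human labels).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Divide by k rather than by the number of results returned, so a
    # query that returns fewer than k posts is penalized for the gap.
    return sum(1 for is_bullying in ranked_labels[:k] if is_bullying) / k


def mean_precision_at_k(runs: Sequence[Sequence[bool]], k: int) -> float:
    """Average precision-at-k over several queries (one ranked list each)."""
    if not runs:
        raise ValueError("at least one query run is required")
    return sum(precision_at_k(run, k) for run in runs) / len(runs)


# Toy data, invented for illustration: two queries, four results each.
toy_runs = [
    [True, True, False, True],
    [True, False, True, True],
]
toy_average = mean_precision_at_k(toy_runs, k=4)
```

Under this reading, the reported figure would correspond to calling `mean_precision_at_k` with `k=100` on the ranked results of the five queries.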

In our second set of experiments, we extended this work by using a supervised machine learning approach to detect cyberbullying. The machine learning experiments identified additional terms that are consistent with cyberbullying content, as well as an additional querying technique that was able to accurately assign scores to posts from Formspring.me. The posts with the highest scores are shown to have a high density of cyberbullying content.
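The abstract does not specify how scores are assigned, so the sketch below only illustrates the general idea of scoring posts, ranking them, and measuring how dense the labeled content is among the top-ranked posts. It uses a plain weighted term match, which is an assumed stand-in and not the paper's technique; the term weights, example posts, and function names are all hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical term weights, invented for illustration only.
TERM_WEIGHTS: Dict[str, float] = {"loser": 1.0, "stupid": 0.5, "ugly": 0.5}


def score_post(post: str, weights: Dict[str, float]) -> float:
    """Sum the weights of every weighted term that occurs in the post."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return sum(weights.get(token, 0.0) for token in tokens)


def rank_posts(posts: List[str], weights: Dict[str, float]) -> List[Tuple[float, str]]:
    """Return (score, post) pairs ordered from highest to lowest score."""
    scored = [(score_post(post, weights), post) for post in posts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def top_k_density(ranked: List[Tuple[float, str]], labeled: Set[str], k: int) -> float:
    """Fraction of the top-k ranked posts that appear in the labeled set."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for _, post in top if post in labeled) / len(top)


# Toy posts, invented for illustration.
toy_posts = ["how was your day", "you are a stupid loser", "that is ugly"]
toy_ranked = rank_posts(toy_posts, TERM_WEIGHTS)
```

A learned model would replace the hand-set `TERM_WEIGHTS` with weights estimated from labeled posts; the ranking and density steps would stay the same.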

Index Terms

1. Social and professional topics
  1. Computing / technology policy

Published In

WebSci '13: Proceedings of the 5th Annual ACM Web Science Conference

May 2013

481 pages

ISBN:9781450318891

DOI:10.1145/2464464

Conference Chairs: Hugh Davis (University of Southampton), Harry Halpin (World Wide Web Consortium), Alex Pentland

Program Chairs: Mark Bernstein, Lada Adamic, Harith Alani, Alexandre Monnin, Richard Rogers

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

WebSci '13

Sponsor:

SIGWEB

WebSci '13: Web Science 2013

May 2 - 4, 2013

Paris, France

Acceptance Rates

Overall Acceptance Rate 245 of 933 submissions, 26%
