DOI: 10.1145/1871437.1871732
poster

Elusive vandalism detection in wikipedia: a text stability-based approach

Published: 26 October 2010

Abstract

The open, collaborative nature of wikis encourages participation by all users, but it also exposes their content to vandalism. Current vandalism-detection techniques, while effective against relatively obvious vandal edits, prove inadequate for detecting the increasingly prevalent sophisticated (or elusive) vandal edits. We identify a number of vandal edits that can take hours, or even days, to correct, and propose a text stability-based approach for detecting them. Our approach focuses on the likelihood that a given part of an article will be modified by a regular edit. In addition to text stability, our machine learning-based technique also takes edit patterns into account. We evaluate our approach on a corpus of 15,000 manually labeled edits from the Wikipedia Vandalism PAN corpus. The experimental results show that text stability significantly improves the performance of the selected machine-learning algorithms.
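The abstract does not define how text stability is measured, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes stability can be approximated by token age: the number of consecutive revisions a token has survived. An edit that deletes or replaces long-lived text then scores high and could be passed to a classifier as one feature. The function names `token_ages` and `stability_of_edit`, the whitespace tokenization, and the mean-age score are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def token_ages(revisions):
    """Return (tokens, ages) for the latest revision.

    ages[i] counts how many consecutive earlier revisions tokens[i] has
    survived. Assumption: whitespace tokens and difflib alignment stand in
    for whatever text units and diffing a real system would use.
    """
    tokens = revisions[0].split()
    ages = [0] * len(tokens)
    for text in revisions[1:]:
        new_tokens = text.split()
        new_ages = [0] * len(new_tokens)
        matcher = SequenceMatcher(a=tokens, b=new_tokens, autojunk=False)
        for block in matcher.get_matching_blocks():
            for k in range(block.size):
                # A token carried over unchanged gets one revision older.
                new_ages[block.b + k] = ages[block.a + k] + 1
        tokens, ages = new_tokens, new_ages
    return tokens, ages


def stability_of_edit(revisions, candidate):
    """Mean age of the tokens that `candidate` deletes or replaces.

    Returns 0.0 when the candidate edit only inserts text. This score is a
    hypothetical feature, not the measure used in the paper.
    """
    tokens, ages = token_ages(revisions)
    matcher = SequenceMatcher(a=tokens, b=candidate.split(), autojunk=False)
    touched = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            touched.extend(ages[i1:i2])
    return sum(touched) / len(touched) if touched else 0.0


history = [
    "the capital of France is Paris",
    "the capital of France is Paris today",
    "the capital city of France is Paris today",
]
# Replacing a long-lived token scores higher than replacing a recent one.
old_text_score = stability_of_edit(history, "the capital city of France is Berlin today")
new_text_score = stability_of_edit(history, "the capital city of France is Paris now")
```

In this toy history, the edit that swaps out "Paris" touches text that has survived every revision, so it receives a higher score than the edit that swaps out the recently added "today". A real system would presumably combine such a score with the edit-pattern features the abstract mentions.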


Published In

CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
October 2010
2036 pages
ISBN:9781450300995
DOI:10.1145/1871437
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. vandalism detection
  3. wiki

Qualifiers

  • Poster

Conference

CIKM '10

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
