research-article

Understanding and combating link farming in the twitter social network

Authors: Saptarshi Ghosh, Bimal Viswanath, Naveen Kumar Sharma, Fabricio Benevenuto, Krishna Phani Gummadi

WWW '12: Proceedings of the 21st international conference on World Wide Web

Pages 61 - 70

https://doi.org/10.1145/2187836.2187846

Published: 16 April 2012

Abstract

Recently, Twitter has emerged as a popular platform for discovering real-time information on the Web, such as news stories and people's reactions to them. Like the Web, Twitter has become a target for link farming, where users, especially spammers, try to acquire large numbers of follower links in the social network. Acquiring followers not only increases the size of a user's direct audience, but also contributes to the perceived influence of the user, which in turn impacts the ranking of the user's tweets by search engines.

In this paper, we first investigate link farming in the Twitter network and then explore mechanisms to discourage the activity. To this end, we conducted a detailed analysis of links acquired by over 40,000 spammer accounts suspended by Twitter. We find that link farming is widespread and that a majority of spammers' links are farmed from a small fraction of Twitter users, the social capitalists, who are themselves seeking to amass social capital and links by following back anyone who follows them. Our findings shed light on the social dynamics that are at the root of the link farming problem in the Twitter network, and they have important implications for future designs of link spam defenses. In particular, we show that a simple user ranking scheme that penalizes users for connecting to spammers can effectively address the problem by disincentivizing users from linking with other users simply to gain influence.
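The abstract does not spell out the ranking scheme, so the following is only a minimal sketch of one way such a penalty could work, not the authors' algorithm. It assumes a PageRank-style propagation in which a set of known spammers starts with a negative score and each account passes a damped share of its score to the accounts that follow it. The function name `penalty_rank` and the parameters `damping` and `iterations` are illustrative assumptions.

```python
from collections import defaultdict


def penalty_rank(follows, known_spammers, damping=0.85, iterations=50):
    """Hypothetical penalty ranking (an assumption, not the paper's method).

    follows: iterable of (follower, followee) pairs.
    known_spammers: accounts already identified as spammers.
    Returns a dict mapping each account to a score <= 0; a more negative
    score means the account is more closely tied to spammers.
    """
    nodes = set()
    followers_of = defaultdict(set)
    for follower, followee in follows:
        nodes.update((follower, followee))
        followers_of[followee].add(follower)

    seeds = set(known_spammers) & nodes
    if not seeds:
        return {user: 0.0 for user in nodes}

    # Negative mass is placed only on the known spammers.
    seed_score = {
        user: (-1.0 / len(seeds) if user in seeds else 0.0) for user in nodes
    }
    score = dict(seed_score)

    for _ in range(iterations):
        nxt = {user: (1 - damping) * seed_score[user] for user in nodes}
        # Each account splits its damped score equally among its followers,
        # so following a penalized account lowers the follower's score.
        for followee, followers in followers_of.items():
            share = damping * score[followee] / len(followers)
            for follower in followers:
                nxt[follower] += share
        score = nxt
    return score


# Toy graph: "cap" follows a spammer, "fan" follows "cap",
# and "alice" follows "bob" with no tie to any spammer.
edges = [("cap", "spam"), ("fan", "cap"), ("alice", "bob")]
scores = penalty_rank(edges, known_spammers={"spam"})
for user in sorted(scores, key=scores.get):
    print(user, round(scores[user], 4))
```

In this toy graph the account that follows the spammer directly receives a larger penalty than the account one hop further away, while accounts with no path to a spammer keep a score of zero. A real deployment would still need a policy for combining such a penalty with a positive influence score, which the sketch leaves open.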

Index Terms

Understanding and combating link farming in the twitter social network
1. Information systems
  1. World Wide Web
    1. Web applications
    2. Web services

Published In

WWW '12: Proceedings of the 21st international conference on World Wide Web

April 2012

1078 pages

ISBN:9781450312295

DOI:10.1145/2187836

General Chairs: Alain Mille (Université de Lyon, France), Fabien Gandon (INRIA, France), Jacques Misselis (HP, France)

Program Chairs: Michael Rabinovich (Case Western Reserve University, USA), Steffen Staab (University of Koblenz-Landau, Germany)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Univ. de Lyon: Universite de Lyon

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WWW 2012

Sponsor:

Univ. de Lyon

WWW 2012: 21st World Wide Web Conference 2012

April 16 - 20, 2012

Lyon, France

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
