research-article

Who is tweeting on Twitter: human, bot, or cyborg?

Authors:

Steven Gianvecchio,

Sushil Jajodia

ACSAC '10: Proceedings of the 26th Annual Computer Security Applications Conference

Pages 21 - 30

https://doi.org/10.1145/1920261.1920265

Published: 06 December 2010

Abstract

Twitter is a new web application that plays the dual roles of online social network and micro-blogging service. Users communicate with each other by publishing text-based posts. The popularity and open structure of Twitter have attracted a large number of automated programs, known as bots, which are a double-edged sword for Twitter. Legitimate bots generate a large volume of benign tweets that deliver news and update feeds, while malicious bots spread spam or malicious content. More interestingly, between human and bot there has emerged the cyborg, which refers to either a bot-assisted human or a human-assisted bot. To help human users identify who they are interacting with, this paper focuses on classifying human, bot, and cyborg accounts on Twitter. We first conduct a set of large-scale measurements on a collection of over 500,000 accounts. We observe differences among humans, bots, and cyborgs in tweeting behavior, tweet content, and account properties. Based on the measurement results, we propose a classification system with four parts: (1) an entropy-based component, (2) a machine-learning-based component, (3) an account properties component, and (4) a decision maker. It uses a combination of features extracted from an unknown user to determine the likelihood that the user is a human, bot, or cyborg. Our experimental evaluation demonstrates the efficacy of the proposed classification system.
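The intuition behind an entropy-based component can be illustrated with a minimal sketch. The abstract does not say which entropy measure the system uses or how timing data is binned, so the code below is an assumption throughout: it applies plain Shannon entropy to binned inter-tweet intervals, and the function name `interval_entropy`, the `bin_seconds` parameter, and the sample timestamps are all hypothetical. The idea it shows is that a strictly periodic posting schedule yields low entropy, while irregular posting yields higher entropy.

```python
import math
from collections import Counter


def interval_entropy(timestamps, bin_seconds=60):
    """Shannon entropy (in bits) of binned gaps between consecutive posts.

    Illustrative stand-in only: the measure and the bin width are
    assumptions, not the paper's stated method.
    """
    times = sorted(timestamps)
    # Gaps between consecutive posting times, in seconds.
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if not intervals:
        return 0.0
    # Group gaps into fixed-width bins and count how often each bin occurs.
    bins = Counter(int(gap // bin_seconds) for gap in intervals)
    total = len(intervals)
    # H = sum over bins of p * log2(1 / p).
    return sum((n / total) * math.log2(total / n) for n in bins.values())


# Hypothetical timestamps (seconds): one account posts every 10 minutes,
# the other posts at irregular times.
periodic = [i * 600 for i in range(20)]
irregular = [0, 45, 400, 2100, 2160, 5000, 9000, 9100, 15000, 15030]

print(interval_entropy(periodic))
print(interval_entropy(irregular))
```

In a full system, a score like this would be only one of several features passed to the decision maker alongside content-based and account-property signals.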

Index Terms

1. Security and privacy
  1. Network security


Information

Published In

ACSAC '10: Proceedings of the 26th Annual Computer Security Applications Conference

December 2010

419 pages

ISBN:9781450301336

DOI:10.1145/1920261

Conference Chair: Carrie Gates (CA Labs)

Program Chairs: Michael Franz (University of California, Irvine), John McDermott (Naval Research Lab)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

ACSA: Applied Computing Security Assoc

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ACSAC '10

Sponsor:

ACSA

ACSAC '10: 2010 Annual Computer Security Applications Conference

December 6 - 10, 2010

Austin, Texas, USA

Acceptance Rates

Overall Acceptance Rate 104 of 497 submissions, 21%
