DOI: 10.1145/3148055.3148071
research-article

What Ignites a Reply?: Characterizing Conversations in Microblogs

Authors:
Johnny Torres, Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador
Carmen Vaca, Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador
Cristina L. Abad, Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador
Published: 05 December 2017

Abstract

Microblog platforms provide a medium for sharing content and interacting with other users. Given the large-scale data generated on these platforms, the origins of and reasons for user engagement in conversations have attracted the attention of the research community. In this paper, we analyze the factors that might spark conversations on Twitter, for the English and Spanish languages. Using a corpus of 2.7 million tweets, we reconstruct existing conversations and then extract several contextual and content features. Based on the extracted features, we train and evaluate several predictive models to identify tweets that will spark a conversation. Our findings show that conversations are more likely to be initiated by users with a high activity level and popularity. For less popular users, the type of content generated is a more important factor. Experimental results show that the best predictive model obtains an average score of $F_1 = 0.80$. We have made the dataset scripts and code used in this paper available to the research community via GitHub.
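The abstract does not say how conversations are reconstructed or how the F1 score is averaged, so the following is only a minimal sketch under stated assumptions. It assumes each tweet is a dictionary with an `id` field and an `in_reply_to_status_id` field (as in Twitter's tweet objects), walks reply links up to a root tweet, and labels a root as having "sparked a conversation" when at least one reply in the corpus points back to it. The function names and the labeling rule are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def reconstruct_conversations(tweets):
    """Group tweet ids into reply threads keyed by the id of the root tweet.

    Assumption: each tweet is a dict with "id" and an optional
    "in_reply_to_status_id" (None or missing for non-replies).
    """
    parent = {t["id"]: t.get("in_reply_to_status_id") for t in tweets}

    def find_root(tweet_id):
        seen = set()
        # Walk up the reply chain while the parent is present in the corpus.
        while parent.get(tweet_id) is not None and parent[tweet_id] in parent:
            if tweet_id in seen:  # guard against malformed cyclic data
                break
            seen.add(tweet_id)
            tweet_id = parent[tweet_id]
        return tweet_id

    conversations = defaultdict(list)
    for t in tweets:
        conversations[find_root(t["id"])].append(t["id"])
    return dict(conversations)


def label_roots(conversations):
    """Hypothetical labeling rule: a root sparked a conversation
    if its thread contains more than the root itself."""
    return {root: len(members) > 1 for root, members in conversations.items()}


def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall, from raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    sample = [
        {"id": 1, "in_reply_to_status_id": None},
        {"id": 2, "in_reply_to_status_id": 1},
        {"id": 3, "in_reply_to_status_id": 2},
        {"id": 4, "in_reply_to_status_id": None},
        {"id": 5, "in_reply_to_status_id": 99},  # parent not in the corpus
    ]
    threads = reconstruct_conversations(sample)
    print(threads)
    print(label_roots(threads))
    print(f1_score(tp=8, fp=2, fn=2))
```

A reply whose parent is missing from the corpus is treated here as the root of its own thread. That is a design choice of this sketch, and a real pipeline might instead discard such tweets or fetch the missing parent.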


Cited By

  • (2019) Cross-lingual Perspectives about Crisis-Related Conversations on Twitter. Companion Proceedings of The 2019 World Wide Web Conference, 255-261. DOI: 10.1145/3308560.3316799. Online publication date: 13-May-2019.
  • (2019) The Advent of Speech Based NLP QA Systems: A Refined Usability Testing Model. Design, User Experience, and Usability. Practice and Case Studies, 152-163. DOI: 10.1007/978-3-030-23535-2_11. Online publication date: 4-Jul-2019.

      Published In

      BDCAT '17: Proceedings of the Fourth IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
      December 2017
      288 pages
      ISBN:9781450355490
      DOI:10.1145/3148055
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. big data
      2. machine learning
      3. social computing

      Qualifiers

      • Research-article

      Conference

      UCC '17

      Acceptance Rates

      BDCAT '17 Paper Acceptance Rate 27 of 93 submissions, 29%;
      Overall Acceptance Rate 27 of 93 submissions, 29%
