Estimating effectiveness of twitter messages with a personalized machine learning approach

  • Regular Paper
  • Knowledge and Information Systems

Abstract

To improve a tweet on Twitter, we would like to estimate the effectiveness of a draft before it is sent. We treat the total number of retweets a tweet receives as a measure of its effectiveness. To estimate the number of retweets for an author, we propose a procedure that learns a personalized model from the author's past tweets. We propose three new types of features based on the content of the tweets: Entity, Pair, and Topic. Empirical results from seven authors indicate that the Pair and Topic features yield statistically significant improvements in the correlation coefficient between the estimated and actual numbers of retweets. We also study different combinations of the three feature types, and many of the combinations significantly improve the results further.
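The evaluation measure named in the abstract, the correlation coefficient between estimated and actual retweet counts, can be illustrated with a minimal sketch. It assumes Pearson's r, which the abstract does not specify; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(estimates, actuals):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: the paper's "correlation coefficient" is Pearson's r.
    """
    if len(estimates) != len(actuals) or len(estimates) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(estimates)
    mean_e = sum(estimates) / n
    mean_a = sum(actuals) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimates, actuals))
    # Sums of squared deviations (unnormalized variances).
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_a = sum((a - mean_a) ** 2 for a in actuals)
    if var_e == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_e * var_a)


# Hypothetical values for illustration only, not results from the paper.
estimated_retweets = [12.0, 40.0, 7.5, 22.0, 3.0]
actual_retweets = [10, 55, 5, 30, 2]
r = pearson_correlation(estimated_retweets, actual_retweets)
print(f"correlation between estimates and actual retweets: {r:.3f}")
```

A value near 1 would mean that the personalized model ranks and scales an author's tweets much as the actual retweet counts do; a value near 0 would mean the estimates carry little information about effectiveness.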

Notes

  1. http://hootsuite.com.

  2. http://mobile.twitter.com/trends.

  3. We define Domain Stop Words as words that belong to the web site rather than to the article. Across pages from the same web site (domain), the words in the menu, and even in the advertisements, are usually the same. For this reason, we generate a separate list of stop words for each domain. A stop word for a domain is a word that appears in more than 80% of the pages we crawled from that domain. When we extract features from a web page, we first remove the words listed in its domain's stop words.

  4. http://mallet.cs.umass.edu/index.php.

  5. http://twitter.com/who_to_follow/interests/social-good.

  6. http://twitter4j.org/en/index.html.

  7. http://dev.twitter.com/streaming/overview.

  8. http://dev.twitter.com/rest/public.

  9. http://www.cs.waikato.ac.nz/ml/weka/.

  10. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
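The Domain Stop Words rule in note 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the lowercase whitespace tokenization, and the use of the URL's host name as the domain are all hypothetical choices. Only the "more than 80% of crawled pages" threshold comes from the note.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_stop_words(pages, threshold=0.8):
    """Build one stop-word set per domain.

    pages: iterable of (url, text) pairs for crawled pages.
    A word is a stop word for a domain if it appears in MORE than
    `threshold` of that domain's pages (0.8 follows note 3).
    Assumption: words are lowercase whitespace-separated tokens.
    """
    page_counts = Counter()  # number of crawled pages per domain
    doc_freq = {}            # domain -> Counter of pages containing each word
    for url, text in pages:
        domain = urlparse(url).netloc.lower()
        page_counts[domain] += 1
        words = set(text.lower().split())  # count each word once per page
        doc_freq.setdefault(domain, Counter()).update(words)
    return {
        domain: {
            word
            for word, pages_with_word in doc_freq[domain].items()
            if pages_with_word / page_counts[domain] > threshold
        }
        for domain in page_counts
    }


def remove_domain_stop_words(url, text, stop_words):
    """Return the page's tokens with its domain's stop words removed."""
    domain = urlparse(url).netloc.lower()
    blocked = stop_words.get(domain, set())
    return [word for word in text.lower().split() if word not in blocked]


# Hypothetical crawl: "home" and "login" stand in for menu words.
crawled = [
    ("http://example.com/a", "Home Login solar energy"),
    ("http://example.com/b", "Home Login wind energy"),
    ("http://example.com/c", "Home Login tidal energy"),
    ("http://example.com/d", "Home Login ocean energy"),
    ("http://example.com/e", "Home Login clean water"),
]
stop_words = domain_stop_words(crawled)
tokens = remove_domain_stop_words(crawled[0][0], crawled[0][1], stop_words)
```

In this hypothetical crawl, "energy" appears in exactly 80% of the pages and is therefore kept, because the rule requires strictly more than 80%.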

Author information

Corresponding author

Correspondence to Xunhu Sun.

About this article

Cite this article

Sun, X., Chan, P.K. Estimating effectiveness of twitter messages with a personalized machine learning approach. Knowl Inf Syst 56, 27–53 (2018). https://doi.org/10.1007/s10115-017-1088-3

  • DOI: https://doi.org/10.1007/s10115-017-1088-3
