Measuring Discourse Scale of Tweet Sequences: A Case Study of Japanese Twitter Accounts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10647)

Abstract

In this study, we measure the discourse scale of tweet sequences and observe their characteristics across 80 × 3 Japanese Twitter accounts that deal with books, films, and other interests. For each account, a sequence of 3,000 tweets is treated as the overall textual unit over which the discourse scale is evaluated. To measure the discourse scale, we first selected 50 words, which we call “discourse keywords”, and observed how they occur in each account. The results show that the discourse scale is about 15 tweets, regardless of the accounts' interests.
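
As a rough illustration of what observing keyword occurrences over a tweet sequence can look like, the following Python sketch records the tweet indices at which each keyword appears and the gaps between successive appearances. The abstract does not state how occurrences are turned into the reported scale of about 15 tweets, so the gap statistic, the function names `keyword_positions` and `gaps`, and the toy English tokens are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def keyword_positions(tweets, keywords):
    """Map each keyword to the indices of the tweets that contain it.

    `tweets` is a chronologically ordered list of token lists
    (hypothetical input format, one list of tokens per tweet).
    """
    positions = defaultdict(list)
    for index, tokens in enumerate(tweets):
        present = set(tokens)
        for keyword in keywords:
            if keyword in present:
                positions[keyword].append(index)
    return dict(positions)


def gaps(indices):
    """Distances, in tweets, between successive occurrences."""
    return [later - earlier for earlier, later in zip(indices, indices[1:])]


# Toy data; a real run would use 3,000 tokenised tweets per account.
tweets = [["film", "review"], ["lunch"], ["film", "ticket"], ["book"], ["film"]]
positions = keyword_positions(tweets, ["film", "book"])
film_gaps = gaps(positions["film"])
```

Keywords that never occur are simply absent from the returned mapping, and a keyword that occurs once yields an empty gap list.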

Notes

  1. To infer the part-of-speech tags of Japanese words, we used the Japanese morphological analyser MeCab (https://github.com/taku910/MeCab) with a dictionary enhanced for neologisms that frequently appear online [29]. The version we used was released on 24 April 2017.

  2. From every set of content words, we removed Japanese stop-words suitable for content analysis [14], which allows us to exclude delexical words among nouns, verbs, and adjectives.
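
The two preprocessing steps in the notes can be sketched in Python as follows, assuming the tweets have already been tagged by a morphological analyser. The sketch does not call MeCab itself. The English part-of-speech labels, the function name `content_words`, and the three stop-word entries are hypothetical placeholders; they are neither MeCab's actual tag set nor the stop-word list from [14].

```python
# Hypothetical POS labels; a real analyser would emit its own tag names.
CONTENT_POS = {"noun", "verb", "adjective"}

# Placeholder stop-words for illustration only, not the list from [14].
STOPWORDS = {"する", "こと", "ある"}


def content_words(tagged_tokens, content_pos=CONTENT_POS, stopwords=STOPWORDS):
    """Keep nouns, verbs, and adjectives that are not stop-words.

    `tagged_tokens` is a list of (surface, pos) pairs.
    """
    return [
        surface
        for surface, pos in tagged_tokens
        if pos in content_pos and surface not in stopwords
    ]


tagged = [
    ("本", "noun"),
    ("を", "particle"),
    ("読む", "verb"),
    ("こと", "noun"),
    ("楽しい", "adjective"),
]
kept = content_words(tagged)
```

Here the particle is dropped by the part-of-speech filter and the delexical noun "こと" is dropped by the stop-word filter, so only the three content words remain.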

References

  1. Adams, P.H., Martell, C.H.: Topic detection and extraction in chat. In: 30th International Conference on Software Engineering, pp. 581–588 (2008)

  2. Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. In: ACL Workshop on Intelligent Scalable Text Summarisation, pp. 111–121 (1997)

  3. de Beaugrande, R., Dressler, W.U.: Introduction to Text Linguistics. Longman, London (1981)

  4. Benevenuto, F., Haddadi, H., Gummadi, K.: The world of connections and information flow in Twitter. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 42(4), 991–998 (2012)

  5. Bi, B., Cho, J.: Modeling a retweet network via an adaptive Bayesian approach. In: 25th International World Wide Web Conference, pp. 459–469 (2016)

  6. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(4–5), 993–1022 (2003)

  7. Blei, D.M., Lafferty, J.: Topic models. In: Srivastava, A., Sahami, M. (eds.) Text Mining: Classification, Clustering, and Applications, pp. 71–93. CRC, London (2009)

  8. Bollen, J., Mao, H., Pepe, A.: Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In: Fifth International AAAI Conference on Weblogs and Social Media, pp. 450–453 (2011)

  9. Brown, G., Yule, G.: Discourse Analysis. Cambridge University Press, Cambridge (1983)

  10. Can, E.F., Oktay, H., Manmatha, R.: Predicting retweet count using visual cues. In: 22nd ACM International Conference on Information and Knowledge Management, pp. 1481–1484 (2013)

  11. Dascalu, M.: Analyzing Discourse and Text Complexity for Learning and Collaborating. Springer, Heidelberg (2014)

  12. Guzman, J., Poblete, B.: On-line relevant anomaly detection in Twitter stream: an efficient bursty keyword detection model. In: 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 31–39 (2013)

  13. Halliday, M.A.K., Hasan, R.: Language, Context and Text. Deakin University Press, Geelong (1985)

  14. Kokubu, H., Yamazaki, H., Nosaka, M.: Japanese stopword list making for keyword extraction suitable for semantic interpretation. Trans. Japan Soc. Kansei Eng. 12, 511–518 (2013). [in Japanese]

  15. Luo, Z., Osborne, M., Tang, J., Wang, T.: Who will retweet me? Finding retweeters in Twitter. In: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 869–872 (2013)

  16. Mani, I., Bloedorn, E., Gates, B.: Using cohesion and coherence models for text summarisation. AAAI technical report (1998)

  17. Marcu, D.: The theory and practice of discourse parsing and summarization. MIT Press, Cambridge, Mass (2000)

  18. Mathioudakis, M., Koudas, N.: TwitterMonitor: trend detection over the Twitter stream. In: 2010 ACM SIGMOD International Conference on Management of Data, pp. 1155–1158 (2010)

  19. Montemurro, M.A., Zanette, D.: Towards the quantification of the semantic information encoded in written language. Adv. Complex Syst. 13(2), 135–153 (2009)

  20. Montemurro, M.A., Zanette, D.: The statistics of meaning: Darwin, Gibbon and Moby Dick. Significance 6(4), 165–169 (2014)

  21. Montemurro, M.A.: Quantifying the information in the long-range order of words: semantic structures and universal linguistic constraints. Cortex 55, 5–16 (2014)

  22. Neubig, G., Duh, K.: How much is said in a Tweet? A multilingual, information-theoretic perspective. In: AAAI Spring Symposium: Analyzing Microtext, pp. 32–39 (2013)

  23. Paris, C., Wan, S.: Listening to the community: social media monitoring tasks for improving government services. In: The ACM CHI Conference on Human Factors in Computing Systems, pp. 2095–2100 (2011)

  24. Paris, C., Thomas, P., Wan, S.: Differences in language and style between two social media communities. In: 6th International AAAI Conference on Weblogs and Social Media (2012)

  25. Pezzoni, F., An, J., Passarella, A., Crowcroft, J., Conti, M.: Why do I retweet it? An information propagation model for microblogs. In: 5th International Conference on Social Informatics, pp. 360–369 (2013)

  26. Roberts, K., Roach, M.A., Johnson, J., Guthrie, J., Harabagiu, S.M.: EmpaTweet: annotating and detecting emotions on Twitter. In: 8th International Conference on Language Resources and Evaluation, pp. 3806–3813 (2012)

  27. Sakaki, T., Toriumi, F., Matsuo, Y.: Tweet trend analysis in an emergency situation. In: ACM the Special Workshop on Internet and Disasters, no. 3 (2011)

  28. Silber, G., McCoy, K.: Efficiently computed lexical chains as an intermediate representation for automatic text summarization. Comput. Linguist. 28(4), 487–496 (2003)

  29. Toshinori, S.: Neologism dictionary based on the language resources on the web for MeCab (2015)

  30. Wan, S., Paris, C.: Understanding public emotional reactions on Twitter. In: 9th International AAAI Conference on Weblogs and Social Media (2015)

  31. Yada, S.: Development of a book recommendation system to inspire “infrequent readers”. In: 16th International Conference on Asia-Pacific Digital Libraries, pp. 399–404 (2014)

  32. Yang, Z., Guo, J., Cai, K., Tang, J., Li, J., Zhang, L., Su, Z.: Understanding retweeting behaviors in social networks. In: 19th ACM International Conference on Information and Knowledge Management, pp. 1633–1636 (2010)

  33. Zhao, D., Rosson, M.B.: How and why people Twitter: the role that micro-blogging plays in informal communication at work. In: ACM International Conference on Supporting Group Work, pp. 243–252 (2009)

Acknowledgement

This work was supported by JSPS KAKENHI Grant Number JP 16K12542.

Author information

Corresponding author

Correspondence to Shuntaro Yada.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Yada, S., Kageura, K. (2017). Measuring Discourse Scale of Tweet Sequences: A Case Study of Japanese Twitter Accounts. In: Choemprayong, S., Crestani, F., Cunningham, S. (eds) Digital Libraries: Data, Information, and Knowledge for Digital Lives. ICADL 2017. Lecture Notes in Computer Science, vol 10647. Springer, Cham. https://doi.org/10.1007/978-3-319-70232-2_13

  • DOI: https://doi.org/10.1007/978-3-319-70232-2_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70231-5

  • Online ISBN: 978-3-319-70232-2

  • eBook Packages: Computer Science (R0)
