Measuring Discourse Scale of Tweet Sequences: A Case Study of Japanese Twitter Accounts

Yada, Shuntaro; Kageura, Kyo

doi:10.1007/978-3-319-70232-2_13

Conference paper
First Online: 03 November 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10647)

Abstract

In this study, we measure the discourse scale of tweet sequences and observe their characteristics, for 80 × 3 Japanese Twitter accounts that deal with books, films, and other interests. For each account, a sequence of 3,000 tweets is regarded as the overall textual unit for which the discourse scale is evaluated. To measure the discourse scale, we first selected 50 words that we call “discourse keywords” and observed how they occur in each of the Twitter accounts. The results showed that the discourse scale is about 15 tweets, regardless of the accounts' interests.

Notes

1. To infer the part-of-speech tags of Japanese words, we adopted the Japanese morphological analyser MeCab (https://github.com/taku910/MeCab), with a dictionary enhanced for neologisms that frequently appear online [29]. The version we used was released on 24th April 2017.
2. From every set of content words, we removed Japanese stop-words suitable for content analysis [14], which enables us to exclude delexical words among nouns, verbs, and adjectives.
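
The preprocessing described in the two notes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the tweets have already been tagged (for example by MeCab) into (surface, part-of-speech) pairs, the function name `content_words` is hypothetical, and the two stop-words shown are illustrative placeholders rather than entries taken from the list in [14].

```python
from typing import Iterable, List, Set, Tuple

# Coarse part-of-speech labels for nouns, verbs, and adjectives, as they
# appear in the first feature field of IPADIC-style MeCab dictionaries.
CONTENT_POS: Set[str] = {"名詞", "動詞", "形容詞"}


def content_words(
    tagged: Iterable[Tuple[str, str]], stopwords: Set[str]
) -> List[str]:
    """Keep nouns, verbs, and adjectives that are not stop-words.

    `tagged` is a sequence of (surface, pos) pairs; in practice these
    would come from a morphological analyser (assumption: pre-tagged here).
    """
    return [
        surface
        for surface, pos in tagged
        if pos in CONTENT_POS and surface not in stopwords
    ]


# Hand-tagged example tweet: 本を読むことは楽しい ("Reading books is fun").
tagged_tweet = [
    ("本", "名詞"),
    ("を", "助詞"),
    ("読む", "動詞"),
    ("こと", "名詞"),
    ("は", "助詞"),
    ("楽しい", "形容詞"),
]

# Illustrative stop-words only; not taken from the published list.
stopwords = {"こと", "する"}

result = content_words(tagged_tweet, stopwords)
print(result)
```

Here the particles are dropped by the part-of-speech filter, while the delexical noun こと is dropped by the stop-word filter even though it is tagged as a noun.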

References

1. Adams, P.H., Martell, C.H.: Topic detection and extraction in chat. In: 30th International Conference on Software Engineering, pp. 581–588 (2008)
2. Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. In: ACL Workshop on Intelligent Scalable Text Summarisation, pp. 111–121 (1997)
3. de Beaugrande, R., Dressler, W.U.: Introduction to Text Linguistics. Longman, London (1981)
4. Benevenuto, F., Haddadi, H., Gummadi, K.: The world of connections and information flow in Twitter. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 42(4), 991–998 (2012)
5. Bi, B., Cho, J.: Modeling a retweet network via an adaptive Bayesian approach. In: 25th International World Wide Web Conference, pp. 459–469 (2016)
6. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(4–5), 993–1022 (2003)
7. Blei, D.M., Lafferty, J.: Topic models. In: Srivastava, A., Sahami, M. (eds.) Text Mining: Classification, Clustering, and Applications, pp. 71–93. CRC, London (2009)
8. Bollen, J., Mao, H., Pepe, A.: Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In: Fifth International AAAI Conference on Weblogs and Social Media, pp. 450–453 (2011)
9. Brown, G., Yule, G.: Discourse Analysis. Cambridge University Press, Cambridge (1983)
10. Can, E.F., Oktay, H., Manmatha, R.: Predicting retweet count using visual cues. In: 22nd ACM International Conference on Information and Knowledge Management, pp. 1481–1484 (2013)
11. Dascalu, M.: Analyzing Discourse and Text Complexity for Learning and Collaborating. Springer, Heidelberg (2014)
12. Guzman, J., Poblete, B.: On-line relevant anomaly detection in Twitter stream: an efficient bursty keyword detection model. In: 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 31–39 (2013)
13. Halliday, M.A.K., Hasan, R.: Language, Context and Text. Deakin University Press, Geelong (1985)
14. Kokubu, H., Yamazaki, H., Nosaka, M.: Japanese stopword list making for keyword extraction suitable for semantic interpretation. Trans. Japan Soc. Kansei Eng. 12, 511–518 (2013) [in Japanese]
15. Luo, Z., Osborne, M., Tang, J., Wang, T.: Who will retweet me? Finding retweeters in Twitter. In: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 869–872 (2013)
16. Mani, I., Bloedorn, E., Gates, B.: Using cohesion and coherence models for text summarisation. AAAI technical report (1998)
17. Marcu, D.: The theory and practice of discourse parsing and summarization. MIT Press, Cambridge, Mass (2000)
18. Mathioudakis, M., Koudas, N.: TwitterMonitor: trend detection over the Twitter stream. In: 2010 ACM SIGMOD International Conference on Management of Data, pp. 1155–1158 (2010)
19. Montemurro, M.A., Zanette, D.: Towards the quantification of the semantic information encoded in written language. Adv. Complex Syst. 13(2), 135–153 (2009)
20. Montemurro, M.A., Zanette, D.: The statistics of meaning: Darwin Gibbon Moby Dick. Significance 6(4), 165–169 (2014)
21. Montemurro, M.A.: Quantifying the information in the long-range order of words: semantic structures and universal linguistic constraints. Cortex 55, 5–16 (2014)
22. Neubig, G., Duh, K.: How much is said in a Tweet? A multilingual, information-theoretic perspective. In: AAAI Spring Symposium: Analyzing Microtext, pp. 32–39 (2013)
23. Paris, C., Wan, S.: Listening to the community: social media monitoring tasks for improving government services. In: The ACM CHI Conference on Human Factors in Computing Systems, pp. 2095–2100 (2011)
24. Paris, C., Thomas, P., Wan, S.: Differences in language and style between two social media communities. In: 6th International AAAI Conference on Weblogs and Social Media (2012)
25. Pezzoni, F., An, J., Passarella, A., Crowcroft, J., Conti, M.: Why do I retweet it? An information propagation model for microblogs. In: 5th International Conference on Social Informatics, pp. 360–369 (2013)
26. Roberts, K., Roach, M.A., Johnson, J., Guthrie, J., Harabagiu, S.M.: EmpaTweet: annotating and detecting emotions on Twitter. In: 8th International Conference on Language Resources and Evaluation, pp. 3806–3813 (2012)
27. Sakaki, T., Toriumi, F., Matsuo, Y.: Tweet trend analysis in an emergency situation. In: ACM the Special Workshop on Internet and Disasters, no. 3 (2011)
28. Silber, G., McCoy, K.: Efficiently computed lexical chains as an intermediate representation for automatic text summarization. Comput. Linguist. 28(4), 487–496 (2003)
29. Toshinori, S.: Neologism dictionary based on the language resources on the web for MeCab (2015)
30. Wan, S., Paris, C.: Understanding public emotional reactions on Twitter. In: 9th International AAAI Conference on Weblogs and Social Media (2015)
31. Yada, S.: Development of a book recommendation system to inspire “infrequent readers”. In: 16th International Conference on Asia-Pacific Digital Libraries, pp. 399–404 (2014)
32. Yang, Z., Guo, J., Cai, K., Tang, J., Li, J., Zhang, L., Su, Z.: Understanding retweeting behaviors in social networks. In: 19th ACM International Conference on Information and Knowledge Management, pp. 1633–1636 (2010)
33. Zhao, D., Rosson, M.B.: How and why people Twitter: the role that micro-blogging plays in informal communication at work. In: ACM International Conference on Supporting Group Work, pp. 243–252 (2009)

Acknowledgement

This work was supported by JSPS KAKENHI Grant Number JP 16K12542.

Author information

Authors and Affiliations

Graduate School of Education, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
Shuntaro Yada & Kyo Kageura

Corresponding author

Correspondence to Shuntaro Yada.

Editor information

Editors and Affiliations

Chulalongkorn University, Bangkok, Thailand
Songphan Choemprayong
University of Lugano, Lugano, Switzerland
Fabio Crestani
Waikato University, Hamilton, New Zealand
Sally Jo Cunningham

About this paper

Cite this paper

Yada, S., Kageura, K. (2017). Measuring Discourse Scale of Tweet Sequences: A Case Study of Japanese Twitter Accounts. In: Choemprayong, S., Crestani, F., Cunningham, S. (eds) Digital Libraries: Data, Information, and Knowledge for Digital Lives. ICADL 2017. Lecture Notes in Computer Science, vol 10647. Springer, Cham. https://doi.org/10.1007/978-3-319-70232-2_13

DOI: https://doi.org/10.1007/978-3-319-70232-2_13
Published: 03 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70231-5
Online ISBN: 978-3-319-70232-2
eBook Packages: Computer Science, Computer Science (R0)
