Abstract
In this study, we measure the discourse scale of tweet sequences and observe their characteristics for 80 × 3 Japanese Twitter accounts that deal with books, films, and other interests. For each account, a sequence of 3,000 tweets is treated as the overall textual unit for which the discourse scale is evaluated. To measure the discourse scale, we first selected 50 words that we call “discourse keywords” and observed how they occur in each of the Twitter accounts. The results show that the discourse scale is about 15 tweets, regardless of the accounts' interests.
Notes
- 1. To infer the part-of-speech tags of Japanese words, we used the Japanese morphological analyser MeCab (https://github.com/taku910/MeCab) with a dictionary enhanced for neologisms that frequently appear online [29]. The version we used was released on 24 April 2017.
- 2. From every set of content words, we removed Japanese stop-words using a list designed for content analysis [14], which excludes delexical words among nouns, verbs, and adjectives.
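The filtering described in the two notes can be illustrated with a minimal sketch. It assumes the tweets have already been tagged (for example by MeCab) and reduced to `(surface, pos)` pairs with coarse English labels; the function name `content_words`, the label set `CONTENT_POS`, and the tiny stop-word set are hypothetical stand-ins, not the authors' code or the actual stop-word list [14].

```python
from typing import Iterable, List, Set, Tuple

# Coarse part-of-speech labels treated as content words (nouns, verbs,
# adjectives, as in Note 2). The label strings themselves are an assumption.
CONTENT_POS: Set[str] = {"noun", "verb", "adjective"}


def content_words(tagged: Iterable[Tuple[str, str]],
                  stop_words: Set[str]) -> List[str]:
    """Keep nouns, verbs, and adjectives that are not in the stop-word list."""
    return [surface for surface, pos in tagged
            if pos in CONTENT_POS and surface not in stop_words]


# Hypothetical pre-tagged tweet and a hypothetical two-word stop list.
tagged_tweet = [("映画", "noun"), ("を", "particle"), ("見る", "verb"),
                ("こと", "noun"), ("する", "verb")]
stop_words = {"こと", "する"}

kept = content_words(tagged_tweet, stop_words)
print(kept)
```

Separating the tagging step from the filtering step keeps the sketch independent of any particular analyser or dictionary version.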
References
Adams, P.H., Martell, C.H.: Topic detection and extraction in chat. In: 30th International Conference on Software Engineering, pp. 581–588 (2008)
Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. In: ACL Workshop on Intelligent Scalable Text Summarisation, pp. 111–121 (1997)
de Beaugrande, R., Dressler, W.U.: Introduction to Text Linguistics. Longman, London (1981)
Benevenuto, F., Haddadi, H., Gummadi, K.: The world of connections and information flow in Twitter. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 42(4), 991–998 (2012)
Bi, B., Cho, J.: Modeling a retweet network via an adaptive Bayesian approach. In: 25th International World Wide Web Conference, pp. 459–469 (2016)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(4–5), 993–1022 (2003)
Blei, D.M., Lafferty, J.: Topic models. In: Srivastava, A., Sahami, M. (eds.) Text Mining: Classification, Clustering, and Applications, pp. 71–93. CRC, London (2009)
Bollen, J., Mao, H., Pepe, A.: Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In: Fifth International AAAI Conference on Weblogs and Social Media, pp. 450–453 (2011)
Brown, G., Yule, G.: Discourse Analysis. Cambridge University Press, Cambridge (1983)
Can, E.F., Oktay, H., Manmatha, R.: Predicting retweet count using visual cues. In: 22nd ACM International Conference on Information and Knowledge Management, pp. 1481–1484 (2013)
Dascalu, M.: Analyzing Discourse and Text Complexity for Learning and Collaborating. Springer, Heidelberg (2014)
Guzman, J., Poblete, B.: On-line relevant anomaly detection in Twitter stream: an efficient bursty keyword detection model. In: 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 31–39 (2013)
Halliday, M.A.K., Hasan, R.: Language, Context and Text. Deakin University Press, Geelong (1985)
Kokubu, H., Yamazaki, H., Nosaka, M.: Japanese stopword list making for keyword extraction suitable for semantic interpretation. Trans. Japan Soc. Kansei Eng. 12, 511–518 (2013). [in Japanese]
Luo, Z., Osborne, M., Tang, J., Wang, T.: Who will retweet me? Finding retweeters in Twitter. In: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 869–872 (2013)
Mani, I., Bloedorn, E., Gates, B.: Using cohesion and coherence models for text summarisation. AAAI technical report (1998)
Marcu, D.: The theory and practice of discourse parsing and summarization. MIT Press, Cambridge, Mass (2000)
Mathioudakis, M., Koudas, N.: TwitterMonitor: trend detection over the Twitter stream. In: 2010 ACM SIGMOD International Conference on Management of Data, pp. 1155–1158 (2010)
Montemurro, M.A., Zanette, D.: Towards the quantification of the semantic information encoded in written language. Adv. Complex Syst. 13(2), 135–153 (2009)
Montemurro, M.A., Zanette, D.: The statistics of meaning: Darwin, Gibbon and Moby Dick. Significance 6(4), 165–169 (2014)
Montemurro, M.A.: Quantifying the information in the long-range order of words: semantic structures and universal linguistic constraints. Cortex 55, 5–16 (2014)
Neubig, G., Duh, K.: How much is said in a Tweet? A multilingual, information-theoretic perspective. In: AAAI Spring Symposium: Analyzing Microtext, pp. 32–39 (2013)
Paris, C., Wan, S.: Listening to the community: social media monitoring tasks for improving government services. In: The ACM CHI Conference on Human Factors in Computing Systems, pp. 2095–2100 (2011)
Paris, C., Thomas, P., Wan, S.: Differences in language and style between two social media communities. In: 6th International AAAI Conference on Weblogs and Social Media (2012)
Pezzoni, F., An, J., Passarella, A., Crowcroft, J., Conti, M.: Why do I retweet it? An information propagation model for microblogs. In: 5th International Conference on Social Informatics, pp. 360–369 (2013)
Roberts, K., Roach, M.A., Johnson, J., Guthrie, J., Harabagiu, S.M.: EmpaTweet: annotating and detecting emotions on Twitter. In: 8th International Conference on Language Resources and Evaluation, pp. 3806–3813 (2012)
Sakaki, T., Toriumi, F., Matsuo, Y.: Tweet trend analysis in an emergency situation. In: ACM the Special Workshop on Internet and Disasters, no. 3 (2011)
Silber, G., McCoy, K.: Efficiently computed lexical chains as an intermediate representation for automatic text summarization. Comput. Linguist. 28(4), 487–496 (2003)
Toshinori, S.: Neologism dictionary based on the language resources on the web for MeCab (2015)
Wan, S., Paris, C.: Understanding public emotional reactions on Twitter. In: 9th International AAAI Conference on Weblogs and Social Media (2015)
Yada, S.: Development of a book recommendation system to inspire “infrequent readers”. In: 16th International Conference on Asia-Pacific Digital Libraries, pp. 399–404 (2014)
Yang, Z., Guo, J., Cai, K., Tang, J., Li, J., Zhang, L., Su, Z.: Understanding retweeting behaviors in social networks. In: 19th ACM International Conference on Information and Knowledge Management, pp. 1633–1636 (2010)
Zhao, D., Rosson, M.B.: How and why people Twitter: the role that micro-blogging plays in informal communication at work. In: ACM International Conference on Supporting Group Work, pp. 243–252 (2009)
Acknowledgement
This work was supported by JSPS KAKENHI Grant Number JP 16K12542.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Yada, S., Kageura, K. (2017). Measuring Discourse Scale of Tweet Sequences: A Case Study of Japanese Twitter Accounts. In: Choemprayong, S., Crestani, F., Cunningham, S. (eds) Digital Libraries: Data, Information, and Knowledge for Digital Lives. ICADL 2017. Lecture Notes in Computer Science, vol 10647. Springer, Cham. https://doi.org/10.1007/978-3-319-70232-2_13
DOI: https://doi.org/10.1007/978-3-319-70232-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70231-5
Online ISBN: 978-3-319-70232-2
eBook Packages: Computer Science, Computer Science (R0)