Abstract
Social networks and online communities play a primary role in enabling communication and content sharing (e.g., posts, documents, photos, videos) among Web users. Knowledge discovery from user-generated content is therefore an increasingly appealing research context, and many different approaches have been devoted to it. This chapter proposes the TweCoM (Tweet Context Miner) framework, which mines relevant recurrences from the content of Twitter messages (i.e., tweets) and from the context in which they are posted. The framework combines two main efforts: (i) the automatic generation of taxonomies from both post content and contextual features, and (ii) the extraction of hidden correlations by means of generalized association rule mining. Since generalized association rule mining is commonly driven by user-provided taxonomies, discovered recurrences are often unsatisfactory. To overcome this issue, two different taxonomy inference procedures are applied, depending on the kind of information: relationships holding in the context data provided by Twitter are exploited to automatically construct aggregation hierarchies over contextual features, while a hierarchical clustering algorithm builds a taxonomy over the most relevant tweet content keywords. To counteract the excessive level of detail of the extracted information, conceptual aggregations (i.e., generalizations) of concepts hidden in the analyzed data are exploited in the association rule mining process. Extracting generalized association rules makes it possible to discover high-level recurrences by evaluating the extracted taxonomies. Experiments performed on real Twitter posts show the effectiveness and efficiency of the proposed framework in analyzing tweet content and the related context, as well as in highlighting relevant trends in tweet propagation.
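The idea of mining generalized association rules over a taxonomy can be illustrated with a minimal sketch. It is not the TweCoM implementation: the taxonomy, the item names (`place:…`, `word:…`, `topic:…`), the toy tweets, and the support threshold are all hypothetical, and the mining step is a brute-force enumeration that extends each transaction with the ancestors of its items, a common textbook strategy for generalized itemsets.

```python
from itertools import combinations

# Hypothetical taxonomy: each item maps to its parent (its generalization).
TAXONOMY = {
    "place:Turin": "place:Italy",
    "place:Rome": "place:Italy",
    "place:Paris": "place:France",
    "word:goal": "topic:sport",
    "word:match": "topic:sport",
}


def ancestors(item, taxonomy):
    """Return all generalizations of an item, from parent up to the root."""
    found = []
    while item in taxonomy:
        item = taxonomy[item]
        found.append(item)
    return found


def extend(transaction, taxonomy):
    """Add every ancestor of every item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def frequent_itemsets(transactions, taxonomy, min_support, max_size=2):
    """Brute-force support counting over taxonomy-extended transactions."""
    counts = {}
    for transaction in transactions:
        items = sorted(extend(transaction, taxonomy))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                # Skip itemsets that pair an item with its own ancestor:
                # they carry no extra information.
                if any(a in combo for i in combo for a in ancestors(i, taxonomy)):
                    continue
                counts[combo] = counts.get(combo, 0) + 1
    total = len(transactions)
    return {
        frozenset(combo): count / total
        for combo, count in counts.items()
        if count / total >= min_support
    }


def confidence(supports, antecedent, consequent):
    """Confidence of antecedent -> consequent, or None if not frequent."""
    both = supports.get(frozenset(antecedent) | frozenset(consequent))
    base = supports.get(frozenset(antecedent))
    if both is None or base is None:
        return None
    return both / base


# Toy tweets: one context feature (place) and one content keyword each.
tweets = [
    {"place:Turin", "word:goal"},
    {"place:Rome", "word:match"},
    {"place:Rome", "word:goal"},
    {"place:Paris", "word:match"},
]

supports = frequent_itemsets(tweets, TAXONOMY, min_support=0.5)
for itemset, support in sorted(supports.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
print(confidence(supports, {"place:Italy"}, {"topic:sport"}))
```

In this toy data the detailed itemset {place:Turin, word:goal} falls below the support threshold, while its generalization {place:Italy, topic:sport} is frequent. This is the effect the abstract describes as counteracting an excessive level of detail.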
Copyright information
© 2013 Springer-Verlag Wien
About this chapter
Cite this chapter
Cagliero, L., Fiori, A. (2013). TweCoM: Topic and Context Mining from Twitter. In: Özyer, T., Rokne, J., Wagner, G., Reuser, A. (eds) The Influence of Technology on Social Network Analysis and Mining. Lecture Notes in Social Networks, vol 6. Springer, Vienna. https://doi.org/10.1007/978-3-7091-1346-2_4
DOI: https://doi.org/10.1007/978-3-7091-1346-2_4
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-1345-5
Online ISBN: 978-3-7091-1346-2
eBook Packages: Computer Science, Computer Science (R0)