TweCoM: Topic and Context Mining from Twitter

Part of the book series: Lecture Notes in Social Networks (LNSN, volume 6)

Abstract

Social networks and online communities are taking a primary role in enabling communication and content sharing (e.g., posts, documents, photos, videos) among Web users. Knowledge discovery from user-generated content is becoming an increasingly appealing research context, and many different approaches have been devoted to it. This chapter proposes the TweCoM (Tweet Context Miner) framework, which mines relevant recurrences from the content of Twitter messages (i.e., tweets) and from the context in which they are posted. The framework combines two main efforts: (i) the automatic generation of taxonomies from both post content and contextual features, and (ii) the extraction of hidden correlations by means of generalized association rule mining. Since generalized association rule mining is commonly driven by user-provided taxonomies, the discovered recurrences are often unsatisfactory. To overcome this issue, two different taxonomy inference procedures are applied, depending on the kind of information. In particular, relationships holding in the context data provided by Twitter are exploited to automatically construct aggregation hierarchies over contextual features, while a hierarchical clustering algorithm is used to build a taxonomy over the most relevant tweet content keywords. To counteract the excessive level of detail of the extracted information, conceptual aggregations (i.e., generalizations) of concepts hidden in the analyzed data are exploited in the association rule mining process. The extraction of generalized association rules allows high-level recurrences to be discovered by evaluating the extracted taxonomies. Experiments performed on real Twitter posts show the effectiveness and the efficiency of the proposed framework in analyzing tweet content and related context, as well as in highlighting relevant trends in tweet propagation.
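To make the idea of generalized association rules more concrete, the following is a minimal Python sketch of how a taxonomy can raise the support of a pattern by aggregating items. It is not the TweCoM implementation: the taxonomy, the toy tweets, and the helper names (`ancestors`, `extend`, `support`, `confidence`) are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy taxonomy (child -> parent); not taken from the chapter.
TAXONOMY = {
    "Turin": "Italy",
    "Milan": "Italy",
    "Paris": "France",
    "Italy": "Europe",
    "France": "Europe",
}


def ancestors(item, taxonomy):
    """Return every generalization of `item`, walking up the taxonomy."""
    found = []
    while item in taxonomy:
        item = taxonomy[item]
        found.append(item)
    return found


def extend(transaction, taxonomy):
    """Return the transaction enriched with the ancestors of each item."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical tweets, each reduced to a (location, keyword) pair.
tweets = [
    {"Turin", "football"},
    {"Milan", "football"},
    {"Paris", "fashion"},
    {"Milan", "fashion"},
]
extended = [extend(t, TAXONOMY) for t in tweets]

# The city-level pattern is rare; its country-level generalization is not.
low_level = support({"Turin", "football"}, extended)
high_level = support({"Italy", "football"}, extended)
rule_conf = confidence({"Italy"}, {"football"}, extended)
print(low_level, high_level, rule_conf)
```

In this toy data the pattern {Turin, football} occurs in one tweet out of four, while its generalization {Italy, football} occurs in two, so a support threshold that discards the detailed pattern can still retain the higher-level one.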

Author information

Corresponding author

Correspondence to Luca Cagliero.

Copyright information

© 2013 Springer-Verlag Wien

About this chapter

Cite this chapter

Cagliero, L., Fiori, A. (2013). TweCoM: Topic and Context Mining from Twitter. In: Özyer, T., Rokne, J., Wagner, G., Reuser, A. (eds) The Influence of Technology on Social Network Analysis and Mining. Lecture Notes in Social Networks, vol 6. Springer, Vienna. https://doi.org/10.1007/978-3-7091-1346-2_4

  • DOI: https://doi.org/10.1007/978-3-7091-1346-2_4

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-1345-5

  • Online ISBN: 978-3-7091-1346-2

  • eBook Packages: Computer Science, Computer Science (R0)
