DOI: 10.1145/2481492.2481497
research-article

Harnessing linked knowledge sources for topic classification in social media

Published: 01 May 2013

ABSTRACT

Topic classification (TC) of short text messages offers a fast and effective way to reveal events happening around the world, ranging from those related to Disaster (e.g. Hurricane Sandy) to those related to Violence (e.g. the Egyptian revolution). Previous approaches to TC have mostly exploited individual knowledge sources (KSs), such as DBpedia or Freebase, without considering the graph structures that surround the concepts in those KSs when detecting the topics of Tweets. In this paper we introduce a novel approach for harnessing such graph structures from multiple linked KSs by: (i) building a conceptual representation of the KSs, (ii) leveraging contextual information about concepts by exploiting semantic concept graphs, and (iii) providing a principled way to combine KSs. Experiments evaluating our TC classifier in the context of Violence Detection (VD) and Emergency Responses (ER) show promising results that significantly outperform various baseline models, including an approach using a single KS without linked data and an approach using only Tweets.
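To make the three steps in the abstract more concrete, the following is a minimal sketch and not the authors' classifier. It assumes two toy dictionaries standing in for DBpedia and Freebase lookups, a hypothetical per-source weight as the combination scheme, and hand-written topic profiles in place of a trained model. All names (`DBPEDIA_TOY`, `FREEBASE_TOY`, `enrich`, `score`, `classify`) are illustrative.

```python
from collections import Counter

# Hypothetical toy knowledge sources: surface word -> neighbouring concepts.
# Real DBpedia/Freebase graph lookups would replace these dictionaries.
DBPEDIA_TOY = {
    "sandy": ["Hurricane", "Natural_disaster"],
    "egypt": ["Country", "Revolution"],
}
FREEBASE_TOY = {
    "sandy": ["tropical_cyclone"],
    "egypt": ["location", "protest"],
}

# Assumed combination scheme: one fixed weight per knowledge source.
SOURCES = {
    "dbpedia": (DBPEDIA_TOY, 0.5),
    "freebase": (FREEBASE_TOY, 0.5),
}

# Hand-written topic profiles; a trained classifier would learn these weights.
TOPIC_PROFILES = {
    "Disaster": {
        "dbpedia:Hurricane": 1.0,
        "dbpedia:Natural_disaster": 1.0,
        "freebase:tropical_cyclone": 1.0,
        "word:flooding": 1.0,
    },
    "Violence": {
        "dbpedia:Revolution": 1.0,
        "freebase:protest": 1.0,
        "word:clashes": 1.0,
    },
}


def enrich(tweet, sources):
    """Build a feature bag: the tweet's words plus weighted concepts per source."""
    features = Counter()
    for token in tweet.lower().split():
        word = token.strip("#.,!?")
        if not word:
            continue
        features["word:" + word] += 1.0
        for name, (graph, weight) in sources.items():
            for concept in graph.get(word, []):
                features[name + ":" + concept] += weight
    return features


def score(features, profile):
    """Dot product between the tweet's features and a topic profile."""
    return sum(value * profile.get(key, 0.0) for key, value in features.items())


def classify(tweet):
    """Return the topic whose profile best matches the enriched tweet."""
    features = enrich(tweet, SOURCES)
    return max(TOPIC_PROFILES, key=lambda topic: score(features, TOPIC_PROFILES[topic]))
```

In this sketch, a tweet such as "Sandy flooding downtown" gains Disaster-related concept features from both toy sources even though the word "hurricane" never appears, which is the intuition behind enriching short texts with linked knowledge.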


Published in

HT '13: Proceedings of the 24th ACM Conference on Hypertext and Social Media
May 2013
275 pages
ISBN: 9781450319676
DOI: 10.1145/2481492

Copyright © 2013 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

HT '13 Paper Acceptance Rate: 16 of 96 submissions, 17%
Overall Acceptance Rate: 378 of 1,158 submissions, 33%
