ABSTRACT
Topic classification (TC) of short text messages offers a fast and effective way to detect events happening around the world, ranging from those related to Disaster (e.g. Hurricane Sandy) to those related to Violence (e.g. the Egyptian revolution). Previous approaches to TC have mostly focused on exploiting individual knowledge sources (KSs) (e.g. DBpedia or Freebase), without considering the graph structures that surround the concepts in those KSs when detecting the topics of Tweets. In this paper we introduce a novel approach for harnessing such graph structures from multiple linked KSs by: (i) building a conceptual representation of the KSs, (ii) leveraging contextual information about concepts by exploiting semantic concept graphs, and (iii) providing a principled way to combine KSs. Experiments evaluating our TC classifier in the context of Violence Detection (VD) and Emergency Responses (ER) show promising results: it significantly outperforms various baseline models, including an approach using a single KS without linked data and an approach using only Tweets.
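The three steps named in the abstract can be illustrated with a toy sketch. The abstract does not specify how features are actually built, so everything below is an assumption made for illustration: the two tiny dictionaries stand in for linked KSs (they are not real DBpedia or Freebase data), and the helpers `concept_features` and `tweet_features` are hypothetical names. The sketch maps tokens to concepts, adds each concept's graph neighbours as context, and merges the per-KS features with the Tweet's own words.

```python
from collections import Counter

# Hypothetical toy knowledge sources (illustrative only, not real KS data).
# Each maps a surface form to a concept plus that concept's graph neighbours.
KS_A = {
    "sandy": {"concept": "Hurricane_Sandy",
              "neighbours": ["Atlantic_hurricane", "Natural_disaster"]},
}
KS_B = {
    "sandy": {"concept": "hurricane_sandy", "neighbours": ["tropical_cyclone"]},
    "cairo": {"concept": "cairo", "neighbours": ["egypt"]},
}


def concept_features(tokens, ks, name):
    """Return concept and graph-neighbour features found in one KS."""
    feats = Counter()
    for tok in tokens:
        entry = ks.get(tok)
        if entry is None:
            continue  # token is not linked to any concept in this KS
        feats[f"{name}:concept={entry['concept']}"] += 1
        for nb in entry["neighbours"]:
            feats[f"{name}:graph={nb}"] += 1
    return feats


def tweet_features(text, sources):
    """Combine lexical features with features from every KS, namespaced by KS."""
    tokens = text.lower().split()
    feats = Counter(f"word={t}" for t in tokens)
    for name, ks in sources.items():
        feats.update(concept_features(tokens, ks, name))
    return feats


features = tweet_features("Sandy floods Cairo streets", {"A": KS_A, "B": KS_B})
```

The resulting bag of features could be passed to any standard text classifier. Namespacing each feature by its KS is one simple way to keep the sources separable; it is not necessarily the combination scheme the paper uses.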
Index Terms
- Harnessing linked knowledge sources for topic classification in social media