Abstract
This paper presents an initial exploration into the feasibility of automatic topic detection in instant messaging applications. We have developed a prototype system that employs a set of preprocessing techniques, a weighting scheme for word combinations, word classification, and the removal of non-relevant words. Topic detection is based on the descriptive metadata that an existing public knowledge base provides for the selected keywords. In addition, the selected keywords are linked with popular tweets to provide supplemental information. An exploratory user study was conducted to gather insights into the performance and usability of the proposed approach.
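The keyword-selection stage summarized above can be illustrated with a minimal sketch. It is not the authors' prototype: the stop-word list, the recency-based weighting, and the function names `tokenize` and `extract_keywords` are all assumptions made for illustration, and the knowledge-base lookup and tweet linking are left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify which words
# the prototype treats as non-relevant.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "is", "it", "i",
    "you", "we", "that", "this", "for", "on", "do", "did", "was", "so",
    "me", "my", "have", "are",
}


def tokenize(message):
    """Lowercase a chat message and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", message.lower())


def extract_keywords(messages, top_n=3):
    """Return the top_n candidate keywords from a list of chat messages.

    The weighting is an assumption for illustration: every occurrence
    counts once, scaled by a recency factor so that words from later
    messages weigh slightly more.
    """
    weights = Counter()
    total = len(messages)
    for index, message in enumerate(messages):
        recency = 1.0 + index / total  # 1.0 for the first message, below 2.0 for the last
        for token in tokenize(message):
            # Drop non-relevant words: stop words and very short tokens.
            if token in STOP_WORDS or len(token) < 3:
                continue
            weights[token] += recency
    return [word for word, _ in weights.most_common(top_n)]


chat = [
    "Did you watch the football match?",
    "Yes, the football final was great",
    "I love football",
]
top_keyword = extract_keywords(chat, top_n=1)
```

In this sketch the selected keywords would then be passed to a knowledge-base lookup to obtain topic metadata; that step depends on the particular knowledge base and is not shown.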
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this paper
Cite this paper
Packova, K., Gievska, S. (2012). On the Feasibility of Automatic Topic Detection in IM Chats. In: Kocarev, L. (ed.) ICT Innovations 2011. Advances in Intelligent and Soft Computing, vol 150. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28664-3_16
DOI: https://doi.org/10.1007/978-3-642-28664-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28663-6
Online ISBN: 978-3-642-28664-3
eBook Packages: Engineering, Engineering (R0)