DOI: 10.1145/1871437.1871752
poster

Topic detection and organization of mobile text messages

Published: 26 October 2010

ABSTRACT

Organizing and visualizing the large number of text messages stored on a mobile phone is a challenging problem: unlike emails, they cannot easily be organized into threads because they lack the necessary metadata, such as "subject" and "reply-to". In this paper, we propose an innovative approach based on clustering algorithms and natural language processing methods. We first cluster the text messages into candidate conversations based on their temporal attributes, and then analyze them further using a semantic model based on Latent Dirichlet Allocation (LDA). Because text messages are usually short and sparse, we train the model on large-scale external data collected from Twitter-like web sites and then apply it to the text messages. Finally, the text messages are organized into conversations based on their topics. We evaluate our approach on 122,359 text messages collected from 50 university students over 6 months.
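The first stage described in the abstract, grouping messages into candidate conversations by their timestamps, can be illustrated with a minimal sketch. This is not the authors' algorithm: it substitutes a simple fixed time-gap rule for whatever temporal clustering the paper uses, and the function name `split_by_time_gap`, the 30-minute `max_gap`, and the sample messages are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def split_by_time_gap(messages, max_gap=timedelta(minutes=30)):
    """Group (timestamp, text) pairs into candidate conversations.

    Assumption: a new candidate conversation starts whenever the gap to
    the previous message exceeds max_gap. The threshold is illustrative.
    """
    groups = []
    for ts, text in sorted(messages, key=lambda m: m[0]):
        # Compare against the timestamp of the last message in the last group.
        if groups and ts - groups[-1][-1][0] <= max_gap:
            groups[-1].append((ts, text))
        else:
            groups.append([(ts, text)])
    return groups


# Hypothetical sample data, not taken from the paper's corpus.
messages = [
    (datetime(2010, 3, 1, 9, 0), "lunch today?"),
    (datetime(2010, 3, 1, 9, 5), "sure, noon"),
    (datetime(2010, 3, 1, 18, 30), "did you finish the report"),
    (datetime(2010, 3, 1, 18, 40), "almost, sending tonight"),
]
conversations = split_by_time_gap(messages)
print([len(c) for c in conversations])  # [2, 2]
```

A fixed gap is the simplest possible choice; it will split a slow conversation and merge two unrelated ones that happen to be close in time, which is the kind of error the topic-based second stage is meant to correct.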

        Reviews

        David Parry

        Tian et al. deal with an interesting application of clustering, using both temporal distance and semantic similarity, to identify short message service (SMS) messages between a pair of users that relate to a particular topic. The authors claim that SMS and other short messages, such as tweets and other online postings, need to be classified into groups, much like threaded discussions. However, current SMS systems do not normally support this. The paper uses a two-stage process: first cluster the messages using hierarchical temporal clustering, and then use cluster quality tools to select candidate clusters. The paper then describes the use of a modified form of latent Dirichlet allocation (LDA) to link fragments of related conversations that may have been placed in separate clusters. This modified LDA combines temporal measures with text analysis to identify texts (in this case, clusters of messages) that are related to each other. Using a test corpus from a Chinese-language micro-blogging Web site to train the LDA, Tian et al. show that their algorithm is superior to other methods, achieving higher precision and recall. The paper demonstrates an interesting approach to a real problem that may be useful in practice. My only complaint is that it is quite short.

        Online Computing Reviews Service
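The linking step mentioned in the review, joining conversation fragments that ended up in separate temporal clusters, can be sketched as follows. This is only an illustration under stated assumptions: each cluster is assumed to already have a topic distribution (for example, from a trained LDA model), the vectors below are made up, and plain cosine similarity with a hypothetical `threshold` of 0.8 stands in for the paper's modified LDA, which also uses temporal measures.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def link_clusters(topic_vectors, threshold=0.8):
    """Merge clusters whose topic vectors are similar (union-find).

    Returns a sorted list of groups, each a list of cluster indices.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    n = len(topic_vectors)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(topic_vectors[i], topic_vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical topic distributions for four temporal clusters.
topic_vectors = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.85, 0.10, 0.05],
    [0.05, 0.10, 0.85],
]
linked = link_clusters(topic_vectors)
print(linked)  # [[0, 2], [1], [3]]
```

Here clusters 0 and 2 share a dominant topic and are linked into one conversation, while clusters 1 and 3 stay separate.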

        • Published in

          CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
          October 2010
          2036 pages
          ISBN:9781450300995
          DOI:10.1145/1871437

          Copyright © 2010 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • poster

          Acceptance Rates

          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
