DOI: 10.1145/1645953.1646210
poster

Mining and ranking streams of news stories using cross-stream sequential patterns

Published: 02 November 2009

ABSTRACT

We present a new method for mining and ranking streams of news stories using cross-stream sequential patterns and content similarity. In particular, we focus on stories that report the same event across the streams within a given time window, where an event is defined as a specific thing that happens at a specific time and place. For every discovered cluster of stories reporting the same event, we create an itemset-sequence consisting of the stream identifiers of the stories in the cluster, ordered according to the timestamps of the stories. Furthermore, we record the exact timestamps of, and the content similarities between, the respective stories. We use the resulting collection of itemset-sequences for two tasks: (I) to discover recurrent temporal publishing patterns between the news streams in terms of frequent sequential patterns and content similarity, and (II) to rank the streams of news stories with respect to timeliness of reporting important events and content authority. We demonstrate the applicability of the presented method on a multi-stream of news stories gathered from RSS feeds of major world news agencies.
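
To make the itemset-sequence construction concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a cluster is already given as a list of stories, that stories with identical timestamps are placed in the same itemset, and that a pattern is supported by a sequence when its itemsets appear, in order, as subsets of the sequence's itemsets. The names `Story`, `itemset_sequence`, and `support` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Story(NamedTuple):
    stream_id: str   # identifier of the news stream (hypothetical field name)
    timestamp: int   # publication time; the unit is an assumption


def itemset_sequence(cluster):
    """Turn one cluster of same-event stories into an itemset-sequence.

    Stream identifiers are ordered by timestamp; stories that share a
    timestamp are grouped into one itemset (an assumption of this sketch).
    """
    by_time = defaultdict(set)
    for story in cluster:
        by_time[story.timestamp].add(story.stream_id)
    return [frozenset(by_time[t]) for t in sorted(by_time)]


def support(sequences, pattern):
    """Count the sequences that contain `pattern` as a subsequence."""
    def contains(seq):
        matched = 0
        for itemset in seq:
            # Greedily match the next pattern itemset against this itemset.
            if matched < len(pattern) and pattern[matched] <= itemset:
                matched += 1
        return matched == len(pattern)

    return sum(contains(seq) for seq in sequences)


# Two toy clusters: in the first, stream A reports before B and C.
cluster_1 = [Story("B", 5), Story("A", 1), Story("C", 5)]
cluster_2 = [Story("B", 1), Story("A", 2)]
sequences = [itemset_sequence(cluster_1), itemset_sequence(cluster_2)]
a_before_b = support(sequences, [{"A"}, {"B"}])
```

In this toy input the pattern "A publishes before B" is supported by the first cluster only. A frequent-pattern miner would enumerate such patterns and keep those whose support exceeds a chosen threshold; the content similarities mentioned in the abstract are not modeled here.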

    • Published in

      CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
      November 2009
      2162 pages
      ISBN: 9781605585123
      DOI: 10.1145/1645953

      Copyright © 2009 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall acceptance rate: 1,861 of 8,427 submissions, 22%
