DOI:10.1145/1277741.1277780
Article

New event detection based on indexing-tree and named entity

Published: 23 July 2007

Abstract

New Event Detection (NED) aims to detect, from one or more streams of news stories, which stories report on a new event (i.e., one not reported previously). With the overwhelming volume of news available today, there is an increasing need for an NED system that can detect new events more efficiently and accurately. In this paper we propose a new NED model that speeds up the NED task by dynamically using a news indexing-tree. Moreover, based on the observation that terms of different types have different effects on the NED task, we propose two term reweighting approaches to improve NED accuracy. The first approach adjusts term weights dynamically based on previous story clusters. The second approach uses statistics on training data to learn a named entity reweighting model for each class of stories. Experimental results on two Linguistic Data Consortium (LDC) datasets, TDT2 and TDT3, show that the proposed model significantly improves both the efficiency and the accuracy of the NED task compared to the baseline system and other existing systems.
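
As a rough illustration of the task itself (not of the indexing-tree model or the reweighting approaches proposed in the paper), the following minimal sketch flags a story as a new event when its highest cosine similarity to all previously seen stories falls below a threshold. The whitespace tokenization, raw term-frequency weighting, function names, and the threshold value 0.3 are all assumptions made for this example and do not come from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a story to a sparse term-frequency vector (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_new_events(stories, threshold=0.3):
    """Return one boolean per story: True if it is flagged as a new event.

    A story is flagged when no earlier story is at least `threshold`
    similar to it. The threshold is a hypothetical value for illustration.
    """
    seen = []   # vectors of all previously processed stories
    flags = []
    for text in stories:
        vec = tf_vector(text)
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags


stories = [
    "earthquake hits city center",
    "earthquake hits city suburbs",
    "election results announced today",
]
print(detect_new_events(stories))  # [True, False, True]
```

In this sketch every incoming story is compared against every earlier one, so the cost per story grows with the length of the stream. That exhaustive comparison is the kind of cost an index over previous stories would aim to reduce.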

Published In

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN:9781595935977
DOI:10.1145/1277741
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. named entity
  2. new event detection
  3. real-time indexing
  4. topic detection and tracking information systems

Qualifiers

  • Article

Conference

SIGIR07
SIGIR07: The 30th Annual International SIGIR Conference
July 23 - 27, 2007
Amsterdam, The Netherlands

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)14
  • Downloads (Last 6 weeks)4
Reflects downloads up to 20 Jan 2025

Cited By

  • (2024) Toward Cross-Lingual Social Event Detection with Hybrid Knowledge Distillation. ACM Transactions on Knowledge Discovery from Data 18:9 (1-36). DOI:10.1145/3689948. Online publication date: 12-Nov-2024
  • (2023) Reinforced, Incremental and Cross-Lingual Event Detection From Social Messages. IEEE Transactions on Pattern Analysis and Machine Intelligence 45:1 (980-998). DOI:10.1109/TPAMI.2022.3144993. Online publication date: 1-Jan-2023
  • (2023) Iteratively Tracking Hot Topics on Public Opinion Based on Parallel Intelligence. IEEE Journal of Radio Frequency Identification 7 (158-162). DOI:10.1109/JRFID.2022.3214346. Online publication date: 2023
  • (2023) Predicting document novelty: an unsupervised learning approach. Knowledge and Information Systems 66:3 (1709-1728). DOI:10.1007/s10115-023-01989-1. Online publication date: 12-Oct-2023
  • (2022) From Text Representation to Financial Market Prediction: A Literature Review. Information 13:10 (466). DOI:10.3390/info13100466. Online publication date: 29-Sep-2022
  • (2022) A Survey of Data Representation for Multi-Modality Event Detection and Evolution. Applied Sciences 12:4 (2204). DOI:10.3390/app12042204. Online publication date: 20-Feb-2022
  • (2021) Detecting and Classifying Typhoon Information from Chinese News Based on a Neural Network Model. Sustainability 13:13 (7332). DOI:10.3390/su13137332. Online publication date: 30-Jun-2021
  • (2021) Bursty Events Detection with the Field of Mobile Customer Service. Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence (310-316). DOI:10.1145/3507548.3507622. Online publication date: 4-Dec-2021
  • (2021) Knowledge-Preserving Incremental Social Event Detection via Heterogeneous GNNs. Proceedings of the Web Conference 2021 (3383-3395). DOI:10.1145/3442381.3449834. Online publication date: 19-Apr-2021
  • (2021) A General Framework for First Story Detection Utilizing Entities and Their Relations. IEEE Transactions on Knowledge and Data Engineering 33:11 (3482-3493). DOI:10.1109/TKDE.2020.2970051. Online publication date: 1-Nov-2021
