Abstract
Given two stories, a story link detection system determines whether they discuss the same event. The standard approach uses the cosine similarity measure to decide whether two documents are linked. Many researchers have successfully applied query expansion to link detection, building models from the relevant documents retrieved from the collection. With this approach, success depends on the quality of the information retrieval (IR) system. In this work, we propose a new information retrieval system for query expansion that uses the intra-cluster similarity of the retrieved documents in addition to their similarity to the query document. Our technique improves the quality of the retrieval system and thereby the performance of the link detection system. Combining this improved IR with our Cohesion Model gives excellent results in link detection. Experimental results confirm the effect of the improved retrieval system on the query expansion technique.
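The two ingredients named in the abstract, cosine similarity between stories and intra-cluster similarity among retrieved documents, can be illustrated with a small sketch. This is not the authors' system: the raw term-frequency vectors, the function names `tf_vector`, `cosine`, and `rerank`, and the mixing weight `alpha` are all assumptions made for illustration, and the abstract does not say how the two similarities are actually combined.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query, candidates, alpha=0.5):
    """Score each candidate by a weighted sum of (a) its similarity to the
    query story and (b) its mean similarity to the other candidates.

    The linear combination and alpha are hypothetical choices, not taken
    from the paper.
    """
    q = tf_vector(query)
    vecs = [tf_vector(c) for c in candidates]
    scored = []
    for i, v in enumerate(vecs):
        others = [cosine(v, w) for j, w in enumerate(vecs) if j != i]
        intra = sum(others) / len(others) if others else 0.0
        score = alpha * cosine(q, v) + (1 - alpha) * intra
        scored.append((score, candidates[i]))
    # Highest combined score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    docs = [
        "earthquake japan tsunami",
        "earthquake japan coast",
        "football match result",
    ]
    for score, doc in rerank("earthquake japan", docs):
        print(f"{score:.3f}  {doc}")
```

In this toy setup, a retrieved document that resembles neither the query nor the other retrieved documents falls to the bottom of the ranking, which is the intuition behind using intra-cluster similarity to filter documents before query expansion.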
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lakshmi, K., Mukherjee, S. (2007). Improved IR in Cohesion Model for Link Detection System. In: Perner, P. (eds) Advances in Data Mining. Theoretical Aspects and Applications. ICDM 2007. Lecture Notes in Computer Science, vol 4597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73435-2_12
DOI: https://doi.org/10.1007/978-3-540-73435-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73434-5
Online ISBN: 978-3-540-73435-2
eBook Packages: Computer Science, Computer Science (R0)