Abstract
In a previous paper we proved that Named Entity Recognition (NER) plays an important role in improving Question Answering, both by increasing the quality of the data and by reducing its quantity. Here we present a more in-depth discussion, studying several ways in which NER can be applied to achieve maximum data reduction. We achieve a 60% reduction without significant data loss and a 92.5% reduction with a reasonable impact on data quality.
This research has been partially funded by the Spanish Government under project CICyT number TIC2003-07158-C04-01 and under project PROFIT number FIT-340100-2004-14 and by the Valencia Government under project number GV04B-268.
References
Lancaster, F.W.: Information Retrieval Systems: Characteristics, Testing and Evaluation. John Wiley and Sons, New York (1979)
Kaszkiel, M., Zobel, J.: Passage retrieval revisited. In: Proceedings of the 20th Annual International ACM SIGIR Conference, Philadelphia, pp. 178–185 (1997)
Callan, J.: Passage-level evidence in document retrieval. In: Salton, G., Schneider, H.-J. (eds.) SIGIR 1982. LNCS, vol. 146, pp. 302–310. Springer, Heidelberg (1983)
Salton, G.: Automatic text processing: The transformation, analysis, and retrieval of information by computer (1989)
Singhal, A., Buckley, C., Mitra, M.: Pivoted document length normalization. In: Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Experimental Studies, pp. 21–29 (1996)
Robertson, S., Walker, S., Beaulieu, M.: Okapi at TREC-7. In: Seventh Text REtrieval Conference, Gaithersburg, USA. National Institute of Standards and Technology, vol. 500-242, pp. 253–264 (1998)
Amati, G., Van Rijsbergen, C.J.: Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM TOIS 20, 357–389 (2002)
TREC-10: Tenth Text REtrieval Conference. NIST Special Publication 500-250. National Institute of Standards and Technology, Gaithersburg, USA (2002)
Braschler, M., Peters, C.: CLEF 2003 methodology and metrics. In: Peters, C., Gonzalo, J., Braschler, M., Kluck, M. (eds.) CLEF 2003. LNCS, vol. 3237, pp. 7–20. Springer, Heidelberg (2004)
Chinchor, N.: Overview of MUC-7. In: Proceedings of the Seventh Message Understanding Conference, MUC-7 (1998)
Borthwick, A.: A Maximum Entropy Approach to Named Entity Recognition. PhD thesis, New York University (1999)
Toral, A., Noguera, E., Llopis, F., Muñoz, R.: Improving question answering using named entity recognition. In: Montoyo, A., Muñoz, R., Métais, E. (eds.) NLDB 2005. LNCS, vol. 3513, pp. 181–191. Springer, Heidelberg (2005)
Llopis, F.: IR-n: Un Sistema de Recuperación de Información Basado en Pasajes. PhD thesis, University of Alicante (2003)
Toral, A.: DRAMNERI: a free knowledge based tool to Named Entity Recognition. In: Proceedings of the 1st Free Software Technologies Conference (2005) (accepted)
© 2005 Springer-Verlag Berlin Heidelberg
Noguera, E., Toral, A., Llopis, F., Muñoz, R. (2005). Reducing Question Answering Input Data Using Named Entity Recognition. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_55
DOI: https://doi.org/10.1007/11551874_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0