Abstract
This paper describes Esfinge, a general-domain Portuguese question answering system that uses the redundancy available on the Web as an important resource for finding answers. It also presents the strategies employed to participate in CLEF-2004 and discusses the results obtained. Three strategies were tested: searching for answers only in the CLEF document collection; searching for answers on the Web and using the CLEF document collection to confirm them; and searching for answers only on the Web. The paper discusses the intriguing question of why the system performed better when combining the two information sources, even though it was designed for the Web; in this connection, different language varieties and some problems with Google are mentioned. The paper concludes by describing some of the work planned for the near future.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Costa, L. (2005). First Evaluation of Esfinge – A Question Answering System for Portuguese. In: Peters, C., Clough, P., Gonzalo, J., Jones, G.J.F., Kluck, M., Magnini, B. (eds) Multilingual Information Access for Text, Speech and Images. CLEF 2004. Lecture Notes in Computer Science, vol 3491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11519645_51
DOI: https://doi.org/10.1007/11519645_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27420-9
Online ISBN: 978-3-540-32051-7
eBook Packages: Computer Science, Computer Science (R0)