Abstract
Database processing techniques such as free-text search and proximity search are among the key factors that influence the efficiency of query answering. Since most users prefer to query systems in natural language, formulating a correct answer from the content of electronic documents remains a real challenge. Processing queries in a multilingual environment usually degrades system responsiveness even further. This paper proposes an approach to overcoming these obstacles by implementing syntactic information extraction. Evaluation methodologies commonly used by TREC, NTCIR, SIGIR, and others are studied in order to suggest that system performance is determined not only by the system architecture, the translation model, or the document format, but also by other factors. The shallow syntactic information extraction technique used appears to be a robust component of the system described. In this light, it is possible to achieve comparable results when processing monolingual and cross-lingual collections.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mizera-Pietraszko, J. (2009). Syntactic Extraction Approach to Processing Local Document Collections. In: Andreasen, T., Yager, R.R., Bulskov, H., Christiansen, H., Larsen, H.L. (eds) Flexible Query Answering Systems. FQAS 2009. Lecture Notes in Computer Science, vol 5822. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04957-6_27
DOI: https://doi.org/10.1007/978-3-642-04957-6_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04956-9
Online ISBN: 978-3-642-04957-6
eBook Packages: Computer Science, Computer Science (R0)