Abstract
Information retrieval (IR) over metadata tends to achieve high precision when the user expresses the information need accurately, but it may suffer from low recall because queries specify the metadata fields too exactly. Full-text retrieval, on the other hand, tends to suffer from low precision, especially when queries are simple and the number of documents is large. Although structured queries targeted at metadata can be precise and yield accurate results, constructing an effective structured query is difficult without understanding the characteristics of the metadata. Casual users, however, are usually not interested in spending time to understand the meaning of various metadata. In this paper, we propose a hybrid IR model that searches both the metadata and the text fields of documents. User queries are analyzed and converted into hybrid queries automatically. Experiments show that the hybrid approach outperforms both alternatives, i.e., searching text only or metadata only.
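To make the idea of an automatically constructed hybrid query concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes a toy in-memory collection, treats a query term as metadata evidence whenever it equals a known metadata value, and combines the two kinds of evidence with a simple weighted sum. The names `Document`, `build_hybrid_query`, `score`, `hybrid_search`, and the parameter `meta_weight` are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Metadata as field name -> string value, e.g. {"author": "turtle"}.
    metadata: dict = field(default_factory=dict)


def build_hybrid_query(query, docs):
    """Turn a plain keyword query into (metadata constraints, text terms).

    Assumption: a term counts as metadata evidence if it equals a value
    that occurs in some metadata field of the collection. Every term is
    also kept as a text term, so text matching is never lost.
    """
    known_values = {}
    for doc in docs:
        for name, value in doc.metadata.items():
            known_values.setdefault(value.lower(), set()).add(name)

    text_terms = query.lower().split()
    meta_terms = {t: known_values[t] for t in text_terms if t in known_values}
    return meta_terms, text_terms


def score(doc, meta_terms, text_terms, meta_weight=0.5):
    """Weighted sum of the metadata match fraction and the text match fraction."""
    words = set(doc.text.lower().split())
    text_score = (
        sum(t in words for t in text_terms) / len(text_terms) if text_terms else 0.0
    )
    meta_hits = sum(
        any(doc.metadata.get(f, "").lower() == term for f in fields)
        for term, fields in meta_terms.items()
    )
    meta_score = meta_hits / len(meta_terms) if meta_terms else 0.0
    return meta_weight * meta_score + (1 - meta_weight) * text_score


def hybrid_search(query, docs, meta_weight=0.5):
    """Rank documents by hybrid score; drop documents with no evidence at all."""
    meta_terms, text_terms = build_hybrid_query(query, docs)
    scored = [(score(d, meta_terms, text_terms, meta_weight), d.doc_id) for d in docs]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in ranked if s > 0]
```

In this sketch, setting `meta_weight` to 1.0 reduces the ranking to metadata-only search and setting it to 0.0 reduces it to text-only search, which mirrors the two baselines the abstract compares against. The weighted sum is an illustrative choice only; the abstract does not state how the paper combines the two sources of evidence.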
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, S.S., Myaeng, S.H., Yoo, JM. (2005). A Hybrid Information Retrieval Model Using Metadata and Text. In: Fox, E.A., Neuhold, E.J., Premsmit, P., Wuwongse, V. (eds) Digital Libraries: Implementing Strategies and Sharing Experiences. ICADL 2005. Lecture Notes in Computer Science, vol 3815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599517_27
DOI: https://doi.org/10.1007/11599517_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30850-8
Online ISBN: 978-3-540-32291-7
eBook Packages: Computer Science (R0)