Abstract
In this paper, we present two new list fusion strategies for the collection fusion problem in web metasearch. Both approaches fall into the category of isolated methods, since they take as input only the lists returned by the different search engines, plus, in one of the methods, a fuzzy degree of relevance. In the latter case, we employ a recently proposed approach to the representation of fuzzy preferences, namely RL-preference relations. A notable contribution of both approaches is that the result of the fusion is an ordered list of groups of documents, where the documents within each group are treated as indistinguishable. This kind of output offers a good compromise between the understandability and the accuracy of the result. We illustrate the performance of the methods by means of several experiments and a comparison with other web metasearchers.
Notes
The numbering of the top-20 and top-10 results differs. In the top-20 case, the Google documents are numbered from \(x_1\) to \(x_{20}\), so the remaining documents from the other engines are numbered from \(x_{21}\) onward. In the top-10 case, the Google documents are numbered from \(x_1\) to \(x_{10}\), so the remaining documents are numbered from \(x_{11}\) onward.
Cite this article
Martín-Bautista, M.J., Sánchez, D., Vila, M.A. et al. A new fusion strategy for web metasearch. Soft Comput 14, 847–855 (2010). https://doi.org/10.1007/s00500-009-0467-4
DOI: https://doi.org/10.1007/s00500-009-0467-4