Abstract
Combining multiple sources of evidence (multiple query formulations, multiple retrieval schemes or systems) has been shown, mostly experimentally, to be effective for data fusion in information retrieval. However, the question of why and how combination should be done remains largely unanswered. In this paper, we provide a model for simulation and a framework for analysis in the study of data fusion in the information retrieval domain. A rank/score function is defined, and the concept of a Cayley graph is used in the design and analysis of our framework. The model and framework have led us to a better understanding of the data fusion phenomena in information retrieval. In particular, by exploiting the graphical properties of the rank/score function, we show analytically and by simulation that combination using rank performs better than combination using score under certain conditions. Moreover, we demonstrate that the rank/score function might be used as a predictive variable for the effectiveness of combining multiple sources of evidence.
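To make the distinction between the two combination methods concrete, the following is a minimal sketch, not the authors' implementation. It assumes that both systems score the same set of documents, that combination is a simple average (of scores or of ranks), and that ties are broken by document identifier. The abstract does not define the rank/score function, so the function below assumes it maps each rank position to the score of the document at that rank. All names and the toy data are hypothetical.

```python
from typing import Dict, List

Scores = Dict[str, float]  # document id -> score assigned by one system


def ranks_from_scores(scores: Scores) -> Dict[str, int]:
    """Rank 1 goes to the highest score; ties are broken by document id."""
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return {doc: i + 1 for i, doc in enumerate(ordered)}


def rank_score_function(scores: Scores) -> List[float]:
    """Assumed form: entry i-1 holds the score of the document at rank i."""
    return sorted(scores.values(), reverse=True)


def score_combination(a: Scores, b: Scores) -> List[str]:
    """Order documents by average score, highest first."""
    avg = {d: (a[d] + b[d]) / 2 for d in a}
    return sorted(avg, key=lambda d: (-avg[d], d))


def rank_combination(a: Scores, b: Scores) -> List[str]:
    """Order documents by average rank, lowest (best) first."""
    ra, rb = ranks_from_scores(a), ranks_from_scores(b)
    avg = {d: (ra[d] + rb[d]) / 2 for d in a}
    return sorted(avg, key=lambda d: (avg[d], d))


# Hypothetical toy data: two systems scoring the same four documents.
system_a = {"d1": 0.9, "d2": 0.5, "d3": 0.4, "d4": 0.1}
system_b = {"d1": 0.1, "d2": 0.2, "d3": 0.9, "d4": 0.8}

if __name__ == "__main__":
    print("score combination:", score_combination(system_a, system_b))
    print("rank combination: ", rank_combination(system_a, system_b))
    print("rank/score function of A:", rank_score_function(system_a))
```

On this toy data the two methods agree on the top two documents but order the remaining two differently, which illustrates how the shape of each system's rank/score function can change the outcome of a combination.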
Additional information
The authors wish to dedicate this paper to the memory of their friend and colleague Professor Jacob Shapiro, who passed away in September 2003.
Supported in part by the DIMACS NSF grant STC-91-19999 and by NJ Commission.
Supported in part by a grant from The City University of New York PSC-CUNY Research Award.
About this article
Cite this article
Hsu, D.F., Taksa, I. Comparing Rank and Score Combination Methods for Data Fusion in Information Retrieval. Inf Retrieval 8, 449–480 (2005). https://doi.org/10.1007/s10791-005-6994-4
DOI: https://doi.org/10.1007/s10791-005-6994-4