Abstract
A well-known problem faced by many data sources today is duplicate data. In general, a data source can be represented as a collection of objects. Deduplication then consists of two main problems: (a) finding duplicate objects and (b) processing those duplicate objects. This paper contributes to the study of the latter problem by investigating functions that map a multiset of objects to a single object. Such functions are called merge functions. We investigate the specific case where an object is itself a multiset. An interesting application of this case is multi-document summarization. In addition to the basic definition of such merge functions, we focus on an important property borrowed from the more general field of information fusion: the majority rule.
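To make the idea of a merge function over multisets concrete, the following is a minimal Python sketch. It is not the definition or the method given in the chapter: the function name `merge_median`, the use of `collections.Counter` to model a multiset, and the choice of the element-wise lower median are all assumptions made for illustration. The sketch also assumes one informal reading of the majority rule, namely that a source occurring in a strict majority of the inputs should be returned as the result.

```python
from collections import Counter
from statistics import median_low


def merge_median(sources):
    """Map a collection of multisets to a single multiset.

    Hypothetical merge function (an assumption, not the chapter's
    definition): each element receives the lower median of its
    multiplicities across all sources.
    """
    if not sources:
        return Counter()
    # Every element that occurs in at least one source.
    elements = set().union(*sources)
    merged = Counter()
    for element in elements:
        # A Counter returns 0 for elements it does not contain.
        multiplicity = median_low([s[element] for s in sources])
        if multiplicity > 0:
            merged[element] = multiplicity
    return merged


# Two toy "documents" modelled as multisets of terms.
a = Counter({"merge": 2, "multiset": 1})
b = Counter({"merge": 1, "fusion": 1})

# Source a occurs three times out of four, so it forms a strict majority.
result = merge_median([a, a, a, b])
print(result == a)  # True
```

The median is used here because, whenever one source makes up more than half of the inputs, the median of each element's multiplicities equals the multiplicity in that source. Other merge functions, such as an element-wise maximum or minimum, would not behave this way in general.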
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bronselaer, A., De Tré, G., Van Britsom, D. (2011). Multiset Merging: The Majority Rule. In: Melo-Pinto, P., Couto, P., Serôdio, C., Fodor, J., De Baets, B. (eds) Eurofuse 2011. Advances in Intelligent and Soft Computing, vol 107. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24001-0_26
DOI: https://doi.org/10.1007/978-3-642-24001-0_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24000-3
Online ISBN: 978-3-642-24001-0
eBook Packages: Engineering, Engineering (R0)