Utopia in the solution of the Bucket Order Problem
Introduction
This paper falls in the field of group decision making (GDM), a problem in which several agents (individuals, experts, software agents, organizations, etc.) express their opinions regarding a decision making problem, and it is necessary to reach a consensus among them [1]. Although a GDM problem may be solved by selecting one of the proposed alternatives, doing so would not properly take into account the particular preferences of all the agents. Because of this, many GDM methodologies treat consensus reaching as an additional phase of the GDM process.
Different approaches can be followed to deal with the GDM problem [2] and, in particular, with consensus reaching [1], many of them based on the use of fuzzy sets theory [3]. A simple taxonomy of consensus reaching approaches is provided in [1] based on two dimensions: allowing or not a feedback mechanism [4], [5] and evaluating alternatives by the distance between experts or the distance to the collective preference [6], [7].
In this paper we approach the GDM from the perspective of social choice and voting theories. In particular, our contribution is located in rank aggregation, a typical preference learning [8], [9] problem with many applications to decision making. The goal of rank aggregation is to combine a set of individual preferences or precedences, expressed by different agents in the form of rankings over (some of) the provided items or alternatives, into a consensus ranking which represents the collective opinion of the agents involved. Regarding the taxonomy in [1], this problem mainly falls in the category of consensus models without a feedback mechanism and with a consensus measure based on computing pairwise similarities.
Rank aggregation methods have traditionally been applied in marketing, advertisement research and applied psychology, and, as pointed out in [10], “more recently they have emerged as an important tool to combine information coming from different internet search engines or from different omics-scale biological studies” [11], [12], [13], [14], [15]. In the field of information and decision support systems, rank aggregation also has broad applicability, including selecting the right information system in the context of a business application [16]; assisting in the process of discovering the cloud service candidates that have the highest customer satisfaction [17]; estimating the effort and cost of developing an information system [18]; and automating the process of data integration by matching concepts which describe the meaning of data in various data sources (database schemata, XML, DTDs, etc.) [19]. Beyond being an end in itself, rank aggregation is also used as a building block in problems that require estimating the consensus permutation many times, e.g. optimization [20] and machine learning [21].
However, not all the previous applications solve the same rank aggregation problem, as this is a general term which embraces several problems. When all the agents give a complete and strict precedence ranking of the items, that is, a permutation, the problem is known as the Kemeny ranking problem (KRP) [22], [23]. The term rank aggregation problem (RAP) usually denotes a generalization of the KRP in which the agents may provide complete or incomplete rankings, with or without ties [24]. Both problems, KRP and RAP, have in common that the solution is a permutation (i.e. a complete ranking without ties) defined over all the items. KRP and RAP are NP-complete [24], [25], so heuristic greedy algorithms are usually employed to tackle them [23], [26], [27], [28], [29], [30], [31].
In this paper we focus on a more general, or flexible, problem, which allows us to obtain a ranking with ties as consensus. The use of ties in the solution arises as a more natural option when no strict preferences are individually or collectively given by the agents. For example, let us consider a set of rankings in which none of the agents individually expresses any preference between items 1 and 2, that is, they are tied in all the rankings given by the agents. So, why must this tie be broken in the consensus ranking? In other cases, the ties may arise from the collective opinion. For example, if we have the rankings {1|2|3|4, 2|1|3|4, 1|2|4|3, 2|1|4|3} (where | separates consecutive positions and a comma separates tied items), then it is obvious that the four agents agree that i is better than j for i ∈ {1,2} and j ∈ {3,4}, but there is no consensus with respect to the preference between 1 and 2, or between 3 and 4. Hence, the most reasonable solution in this case would be 1,2|3,4.
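The four-ranking example can be checked numerically. The sketch below is only illustrative: it assumes that the agents' opinions are summarized in a pairwise precedence matrix whose entry (u, v) is the fraction of rankings placing u before v (a tie counting 0.5), and that a candidate consensus is scored by the sum of absolute entrywise differences to that matrix. The helper names (`parse_ranking`, `pair_order_matrix`, `distance`) are hypothetical and it handles complete rankings only.

```python
def parse_ranking(text):
    """Parse '1,2|3|4' into a list of buckets: [[1, 2], [3], [4]]."""
    return [[int(x) for x in bucket.split(",")] for bucket in text.split("|")]


def pair_order_matrix(rankings, items):
    """C[u][v] = fraction of rankings placing u before v (ties count 0.5).

    Assumption: every ranking contains every item (complete rankings).
    """
    C = {u: {v: 0.0 for v in items} for u in items}
    for buckets in rankings:
        # Position of each item = index of the bucket that contains it.
        pos = {x: i for i, bucket in enumerate(buckets) for x in bucket}
        for u in items:
            for v in items:
                if pos[u] == pos[v]:      # same bucket (or u == v): tie
                    C[u][v] += 0.5
                elif pos[u] < pos[v]:     # u strictly precedes v
                    C[u][v] += 1.0
    for u in items:
        for v in items:
            C[u][v] /= len(rankings)
    return C


def distance(A, B, items):
    """Sum of absolute entrywise differences between two matrices."""
    return sum(abs(A[u][v] - B[u][v]) for u in items for v in items)


items = [1, 2, 3, 4]
votes = [parse_ranking(r) for r in ["1|2|3|4", "2|1|3|4", "1|2|4|3", "2|1|4|3"]]
C = pair_order_matrix(votes, items)

# Matrices induced by two candidate consensus rankings.
with_ties = pair_order_matrix([parse_ranking("1,2|3,4")], items)
strict = pair_order_matrix([parse_ranking("1|2|3|4")], items)

d_ties = distance(C, with_ties, items)
d_strict = distance(C, strict, items)
print(d_ties, d_strict)  # prints: 0.0 2.0
```

Under these assumptions the ranking with ties 1,2|3,4 reproduces the collective matrix exactly, whereas any strict ranking has to break the two ties and pays for each of them.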
Dealing with rank aggregation while allowing ties in the solution or consensus ranking is known as the Optimal Bucket Order Problem (OBOP) [32], [33]. In addition to the real-world applications inherited from the rank aggregation problem, as reported in [34], the OBOP has also been applied “in the context of seriation problems in scientific disciplines, such as Paleontology, Archaeology and Ecology”. In this paper we propose several substantial improvements to the Bucket Pivot Algorithm (BPA), the greedy algorithm which currently constitutes the standard approach to solve the OBOP. Since these improvements lead to different BPA-based algorithms, we obtain decision rules to support the decision maker in the process of selecting the best method according to their preferences and/or the problem instance features.
The rest of the paper is organized as follows. In Section 2 we motivate our work by highlighting the weaknesses of the standard algorithm used to solve the OBOP, and state our research goal. Next, in Section 3 we describe the OBOP and the BPA algorithm, introducing the notation to be used throughout this work. Section 4 presents the concept of the Utopian Matrix and some other derived notions. In Section 5 we introduce the modifications proposed for the BPA in the case where only one item is used as pivot, which involves changing the way of selecting it. Section 6 is devoted to presenting an experimental study that confirms that the proposed modifications outperform the original BPA. In Section 7 we extend the previous results to the multi-pivot case. Then, in Section 8 we perform an experimental study for all the proposed algorithms. Finally, in Section 9 we discuss our results.
Section snippets
Motivation and research goal
As might be expected, the OBOP is NP-complete [32], so several heuristic greedy approaches have been proposed to tackle it. In [35] a heuristic algorithm is designed to obtain the consensus bucket order from a set of full rankings (permutations). A more general and flexible approach, which does not limit the kind of input rankings, is the Bucket Pivot Algorithm (BPA) [32], [33]. This algorithm has a clear resemblance to quicksort. It starts with the random selection of an item as pivot and
The Optimal Bucket Order Problem (OBOP)
In this section we introduce the notation and formalize the OBOP. Given a set of items [[n]] = {1,...,n}, a bucket order ℬ is an ordered partition of [[n]] [32], [33], [37]. More precisely, it is a linear ordering of disjoint subsets (buckets) B1, B2, ..., Bk of [[n]], 1 ≤ k ≤ n, with B1 ∪ B2 ∪ ... ∪ Bk = [[n]]. Thus, given two buckets Bi, Bj in ℬ, we will write Bi ≺ℬ Bj to indicate that Bi precedes Bj according to the bucket order ℬ. Analogously, given two items u ∈ Bi, v ∈ Bj, we will write u ≺ℬ v if Bi ≺ℬ Bj. All the
Utopian matrix and its implications for pivot selection
In this section we introduce the utopian matrix and related concepts.
Definition 1. Given a pair order matrix C, the utopian matrix UC associated with C is the n × n matrix defined as where Then, the utopia value uC associated with C is uC = D(UC, C).
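As a rough illustration of Definition 1, the sketch below assumes that the utopian matrix snaps every entry of C to the nearest value in {0, 0.5, 1}, which is consistent with the maximum per-entry distance of 0.25 mentioned in this section, and that D is the sum of absolute entrywise differences. Both are assumptions made for the example, as are the function names and the treatment of the boundary entries 0.25 and 0.75 (sent to 0.5 here; the choice does not affect the resulting distance).

```python
def utopian_matrix(C):
    """Snap each entry of C to the closest value in {0, 0.5, 1}.

    Assumption: entries equal to 0.25 or 0.75 are mapped to 0.5.
    """
    def snap(x):
        if x > 0.75:
            return 1.0
        if x < 0.25:
            return 0.0
        return 0.5

    return [[snap(x) for x in row] for row in C]


def utopia_value(C):
    """Distance between C and its utopian matrix (sum of absolute differences)."""
    U = utopian_matrix(C)
    return sum(abs(u - c) for row_u, row_c in zip(U, C) for u, c in zip(row_u, row_c))


# Toy 3 x 3 pair order matrix (values chosen only for illustration).
C = [
    [0.5, 0.9, 0.6],
    [0.1, 0.5, 0.25],
    [0.4, 0.75, 0.5],
]

for row in utopian_matrix(C):
    print(row)
print(round(utopia_value(C), 6))  # approximately 0.9 for this toy matrix
```

In this reading the utopia value measures how far C is from a matrix containing only the values 0, 0.5 and 1, so a small value indicates that little indecision has to be absorbed by any bucket order.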
Note that for any pair order matrix C, the maximum distance between a particular entry and the corresponding one in the associated utopian matrix UC is 0.25, and this happens only when the value of the entry is
BPA with least indecision assumption
Now, we show how the information provided by the utopian matrix can be used to select the pivot in an informed way. First we define an index to measure the goodness of selecting an item as pivot, and then we propose two different schemes to integrate its use in BPA.
Experimental study of BPALIA algorithm(s)
In this section we carry out an experimental comparison between the original BPA and the proposed BPALIA algorithms, namely LIAG and LIAL. All the experiments have been run on a personal computer with an Intel i7-6700 processor (3.40 GHz, 8 cores) and 16 GB of RAM. All the algorithms have been coded in Prolog.
As a benchmark we use 50 real-world datasets of rankings available at PrefLib [39]. In particular, we downloaded the pwg files
Using multiple pivots
The BPA and BPALIA algorithms use a single item as pivot to decide in which list (L,S or R) the remaining items are placed (see Fig. 1). However, it seems plausible to progressively use the information provided by the items placed in the list containing the pivot (S), since this list will remain as a bucket itself in the resulting bucket order. From now on, we call this approach multi-pivot (MP).
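The single-pivot scheme with the three lists L, S and R can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that C[p][u] is the fraction of rankings in which p precedes u, that an item joins the pivot's bucket S when that value lies within beta = 0.25 of 0.5, and that L and R are then processed recursively in a quicksort-like fashion. The function name, the threshold and the boundary handling are assumptions.

```python
import random


def bucket_pivot(items, C, beta=0.25, rng=None):
    """Quicksort-like sketch of a single-pivot bucket order construction.

    Returns a list of buckets, ordered from most to least preferred.
    """
    if rng is None:
        rng = random.Random(0)
    if not items:
        return []
    pivot = rng.choice(items)            # random pivot, as in the original BPA
    L, S, R = [], [pivot], []
    for u in items:
        if u == pivot:
            continue
        if C[pivot][u] < 0.5 - beta:     # pivot rarely precedes u: u goes before
            L.append(u)
        elif C[pivot][u] > 0.5 + beta:   # pivot mostly precedes u: u goes after
            R.append(u)
        else:                            # no clear preference: tie with the pivot
            S.append(u)
    return bucket_pivot(L, C, beta, rng) + [S] + bucket_pivot(R, C, beta, rng)


# Pair order matrix of the rankings {1|2|3|4, 2|1|3|4, 1|2|4|3, 2|1|4|3}.
C = {
    1: {1: 0.5, 2: 0.5, 3: 1.0, 4: 1.0},
    2: {1: 0.5, 2: 0.5, 3: 1.0, 4: 1.0},
    3: {1: 0.0, 2: 0.0, 3: 0.5, 4: 0.5},
    4: {1: 0.0, 2: 0.0, 3: 0.5, 4: 0.5},
}

order = bucket_pivot([1, 2, 3, 4], C, rng=random.Random(0))
print([sorted(bucket) for bucket in order])  # prints: [[1, 2], [3, 4]]
```

For this particular matrix every pivot choice leads to the same two buckets, but in general the result depends on the pivot drawn. Note also that only the pivot is consulted when placing an item, even though the other members of S will end up in the same bucket.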
In order to let all the items included in (S) intervene in the process of placing each new item, we
Experimental analysis
In order to explore the advantages of the multi-pivot approach we carry out a new set of experiments using the same benchmark as in Section 6. Regarding the algorithms, we consider the combination of the three BPA approaches discussed in the previous sections with the two multi-pivot strategies (MP and MP2). Consequently, we introduce six new algorithms called: BPAMP, BPAMP2, LIAGMP, LIAGMP2, LIALMP and LIALMP2. Furthermore, in our experimental study we also include the three single-pivot
Improving BPA
In Section 2 we identified the main weaknesses of the BPA algorithm and outlined our ideas to overcome them. Next, we summarize how our proposals have succeeded in doing so.
First, we pointed out the use of a random pivot as the most critical decision in BPA. To overcome this drawback, we proposed to select the pivot in an informed way. To do this, we introduced the theoretical concept of Utopian Matrix and showed how it may be used to evaluate the precedence matrices that are the input for the OBOP.
Acknowledgements
This work was partially financed by the Junta de Comunidades de Castilla-La Mancha, Universidad de Castilla-La Mancha and FEDER funds by means of the projects PEII-2014-049 and CCI-2014ES16RFOP010.
References (42)
- et al., Consensus under a fuzzy context: taxonomy, analysis framework AFRYCA and experimental case of study, Inf. Fusion (2014)
- et al., A consistency and consensus based decision support model for group decision making with multiplicative preference relations, Decis. Support. Syst. (2012)
- et al., Consensus reaching in committees, Eur. J. Oper. Res. (2007)
- et al., Group consensus algorithms based on preference relations, Inform. Sci. (2011)
- et al., Effective rank aggregation for metasearching, J. Syst. Softw. (2011)
- et al., Utilizing customer satisfaction in ranking prediction for personalized cloud service selection, Decis. Support. Syst. (2017)
- et al., Experiments with Kemeny ranking: what works when?, Math. Soc. Sci. (2012)
- Emphasizing the rank positions in a distance-based aggregation procedure, Decis. Support. Syst. (2011)
- et al., Using extension sets to aggregate partial rankings in a flexible setting, Appl. Math. Comput. (2016)
- et al., Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach, Eur. J. Oper. Res. (2016)
- Prototypes construction from partial rankings to characterize the attractiveness of companies in Belgium, Appl. Soft Comput.
- An approach to group ranking decisions in a dynamic environment, Decis. Support. Syst.
- A randomized approximation algorithm for computing bucket orders, Inf. Process. Lett.
- A Taxonomy for Multiple Attribute Group Decision Making Literature
- On Group Decision Making, Consensus Reaching, Voting, and Voting Paradoxes under Fuzzy Preferences and a Fuzzy Majority: A Survey and a Granulation Perspective
- A consensus model for multiperson decision making with different preference structures, Trans. Syst. Man Cybern. Part A
- Implementing an Empirical Study of Rank Aggregation Approaches Based on Real World Instances, CoRR abs/1402.5259
- Rank Aggregation Methods, Wiley Interdiscip. Rev. Computat. Stat.
- Web Metasearch: Rank Vs. Score Based Rank Aggregation Methods
- Robust rank aggregation for gene list integration and meta-analysis, Bioinformatics
Juan A. Aledo received the M.S. degree in Mathematics in 1997 and the Ph.D. degree in Mathematics in 2000, both from the University of Murcia, Spain. He joined the Department of Mathematics at the University of Castilla-La Mancha (UCLM) in 1997, where he is currently a Full Professor. His main research interests include differential geometry, discrete mathematics, decision making and machine learning. In these topics Dr. Aledo has (co)authored more than sixty papers in journals, books and refereed international conferences.
Jose A. Gámez received the M.S. degree in Computer Science in 1991, and the Ph.D. degree in Computer Science in 1998, both from the University of Granada, Spain. He joined the Department of Computer Science at the University of Castilla-La Mancha (UCLM) in 1991, where he is currently a Full Professor. His main research interests include probabilistic reasoning, Bayesian networks, metaheuristic algorithms, decision making, machine learning and data mining. In these topics Dr. Gámez has edited six books and six special issues of international journals. He is the (co)author of more than one hundred papers in journals, books and refereed international conferences.
Alejandro Rosete received the M.Sc. degree in applied informatics and the Ph.D. degree in Informatics from Higher Polytechnic Institute Jose Antonio Echeverría (CUJAE), La Habana, Cuba, in 1995 and 2000, respectively. He has been the Head of the Department of Artificial Intelligence and Infrastructure of Informatics Systems, CUJAE. He has published over 40 papers. He is a co-author of the book Lógica y Algoritmos (Editorial Felix Varela, Habana, 2004). His research interests include metaheuristics, agent-oriented software engineering, decision making, data mining, fuzzy systems, and knowledge extraction based on metaheuristics.