Elsevier

Decision Support Systems

Volume 97, May 2017, Pages 69-80
Utopia in the solution of the Bucket Order Problem

https://doi.org/10.1016/j.dss.2017.03.006

Highlights

  • The standard greedy algorithm for the OBOP, the Bucket Pivot Algorithm (BPA), is significantly improved.

  • The concept of utopian matrix is introduced and used to carry out an informed selection of the pivot.

  • Several items are allowed to “give their opinion” during the ordering process, which yields a multi-pivot instead of single-pivot approach.

  • Decision rules are provided to select the right BPA-based algorithm according to the decision maker's preferences.

Abstract

This paper deals with group decision making and, in particular, with rank aggregation, which is the problem of aggregating individual preferences (rankings) in order to obtain a consensus ranking. Although this consensus ranking is usually a permutation of all the ranked items, in this paper we tackle the situation in which some items can be tied, that is, the consensus shows that there is no preference among them. This problem has arisen recently and is known as the Optimal Bucket Order Problem (OBOP).

In this paper we propose two improvements to the standard greedy algorithm usually considered to approach the bucket order problem: the Bucket Pivot Algorithm (BPA). The first improvement is based on the introduction of the Utopian Matrix, a matrix associated with a pair order matrix that represents the precedences in a collection of rankings. This idealization constitutes a superoptimal solution to the OBOP, which can be used as an extreme (sometimes feasible) best value. The second improvement is based on the use of several items as pivots to generate the bucket order, in contrast to BPA, which uses only a single pivot. The set of items playing the role of decision-maker is dynamically created. We analyze separately the contribution of each improvement and also their joint effect. The statistical analysis of the experiments carried out shows that the combined use of both techniques is the best choice, showing a significant improvement in accuracy (17%) with respect to the original BPA and providing an important reduction in the variance of the output. Moreover, we provide decision rules to help the decision maker select the right algorithm according to the problem instance.

Introduction

This paper falls in the field of group decision making (GDM), a problem in which several agents (individuals, experts, software agents, organizations, etc.) express their opinions regarding a decision making problem, and it is necessary to reach a consensus among them [1]. Although a GDM problem may be solved by selecting one of the proposed alternatives, in this way not all the agents' particular preferences would be considered properly. Because of this, in many methodologies for GDM the operation of reaching a consensus is even considered as an additional phase of the GDM process.

Different approaches can be followed to deal with the GDM problem [2] and, in particular, with consensus reaching [1], many of them based on the use of fuzzy sets theory [3]. A simple taxonomy of consensus reaching approaches is provided in [1] based on two dimensions: allowing or not a feedback mechanism [4], [5] and evaluating alternatives by the distance between experts or the distance to the collective preference [6], [7].

In this paper we approach the GDM from the perspective of social choice and voting theories. In particular, our contribution is located in rank aggregation, a typical preference learning [8], [9] problem with many applications to decision making. The goal of rank aggregation is to combine a set of individual preferences or precedences, expressed by different agents in the form of rankings over (some of) the provided items or alternatives, into a consensus ranking which represents the collective opinion of the agents involved. Regarding the taxonomy in [1], this problem mainly falls in the category of consensus models without a feedback mechanism and with a consensus measure based on computing pairwise similarities.

Rank aggregation methods have traditionally been applied in marketing, advertisement research and applied psychology, and, as pointed out in [10], “more recently they have emerged as an important tool to combine information coming from different internet search engines or from different omics-scale biological studies” [11], [12], [13], [14], [15]. In the field of information and decision support systems, rank aggregation also has broad applicability, including: selecting the right information system in the context of a business application [16]; assisting in the process of discovering the cloud service candidates that have the highest customer satisfaction [17]; estimating the effort and cost for developing an information system [18]; and automating the process of data integration by matching concepts which describe the meaning of data in various data sources (database schemata, XML, DTDs, etc.) [19]. Apart from its application as an end in itself, solving the rank aggregation problem is also used as a building block in problems that involve estimating the consensus permutation many times, e.g. optimization [20] and machine learning [21].

However, not all the previous applications solve the same rank aggregation problem, as this is a general term which embraces several problems. Thus, when all the agents give a complete and strict precedence ranking of the items, that is, a permutation, the problem is known as the Kemeny ranking problem (KRP) [22], [23]. The term rank aggregation problem (RAP) is usually considered as a generalization of the KRP, allowing the agents to produce (in)complete rankings with or without ties [24]. Both problems, KRP and RAP, have in common that the solution is a permutation (i.e. a complete ranking without ties) defined over all the items. KRP and RAP are NP-complete [24], [25], so heuristic greedy algorithms are usually employed to tackle them [23], [26], [27], [28], [29], [30], [31].

In this paper we focus on a more general, or flexible, problem, which allows us to obtain a ranking with ties as consensus. The use of ties in the solution arises as a more natural option when no strict preferences are individually or collectively given by the agents. For example, let us consider a set of rankings in which none of the agents individually expresses any preference between items 1 and 2, that is, they are tied in all the rankings given by the agents. So, why must this tie be broken in the consensus ranking? In other cases, the ties may arise from the collective opinion. For example, if we have the rankings {1|2|3|4, 2|1|3|4, 1|2|4|3, 2|1|4|3}, then it is obvious that the four agents agree that i is better than j for i ∈ {1,2} and j ∈ {3,4}, but there is no consensus with respect to the preference between 1 and 2, and between 3 and 4. Hence, the most reasonable solution in this case would be 1,2|3,4.
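The collective tie in this example can be checked numerically. The following Python sketch counts, for every pair of items, the fraction of rankings in which the first item precedes the second, which is the kind of precedence information a pair order matrix records. The helper names `parse_ranking` and `pair_order_matrix` are hypothetical, and the sketch assumes complete rankings and that a tie between two items contributes 0.5 to both entries.

```python
def parse_ranking(text):
    """Parse a string such as '1,2|3|4' into a list of buckets: [[1, 2], [3], [4]]."""
    return [[int(item) for item in bucket.split(",")] for bucket in text.split("|")]


def pair_order_matrix(rankings, items):
    """Return C with C[u][v] = fraction of rankings in which u precedes v.

    Assumption of this sketch: every ranking contains every item, and items
    placed in the same bucket (including u == v) count 0.5.
    """
    counts = {u: {v: 0.0 for v in items} for u in items}
    for ranking in rankings:
        # Map each item to the index of the bucket that contains it.
        position = {item: idx for idx, bucket in enumerate(ranking) for item in bucket}
        for u in items:
            for v in items:
                if position[u] < position[v]:
                    counts[u][v] += 1.0
                elif position[u] == position[v]:
                    counts[u][v] += 0.5
    total = len(rankings)
    return {u: {v: counts[u][v] / total for v in items} for u in items}


rankings = [parse_ranking(r) for r in ["1|2|3|4", "2|1|3|4", "1|2|4|3", "2|1|4|3"]]
C = pair_order_matrix(rankings, [1, 2, 3, 4])

print(C[1][2], C[3][4])  # 0.5 0.5 -> no collective preference inside {1,2} or {3,4}
print(C[1][3], C[2][4])  # 1.0 1.0 -> unanimous preference of {1,2} over {3,4}
```

The entries equal to 0.5 correspond to the pairs that the consensus 1,2|3,4 leaves tied, and the entries equal to 1.0 to the pairs on which all four agents agree.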

Dealing with rank aggregation while allowing ties in the solution or consensus ranking is known as the Optimal Bucket Order Problem (OBOP) [32], [33]. In addition to the real-world applications inherited from the rank aggregation problem, as reported in [34], the OBOP has also been applied “in the context of seriation problems in scientific disciplines, such as Paleontology, Archaeology and Ecology”. In this paper we propose several substantial improvements to the greedy algorithm which currently constitutes the standard approach to solve the OBOP. Since these improvements lead to different BPA-based algorithms, we obtain decision rules to support the decision maker in the process of selecting the best method according to their preferences and/or the problem instance features.

The rest of the paper is organized as follows. In Section 2 we motivate our work by highlighting the weaknesses of the standard algorithm used to solve the OBOP, and state our research goal. Next, in Section 3 we describe the OBOP and the BPA algorithm, introducing the notation to be used throughout this work. Section 4 presents the concept of the Utopian Matrix and some other derived notions. In Section 5 we introduce the modifications proposed for the BPA in the case where only one item is used as pivot, which involves changing the way of selecting it. Section 6 is devoted to presenting an experimental study that confirms that the proposed modifications outperform the original BPA. In Section 7 we extend the previous results to the multi-pivot case. Then, in Section 8 we perform an experimental study for all the proposed algorithms. Finally, in Section 9 we discuss our results.

Section snippets

Motivation and research goal

As might be expected, the OBOP is NP-complete [32] and so several heuristic greedy approaches have been contemplated to tackle it. In [35] a heuristic algorithm is designed to obtain the consensus bucket order from a set of full rankings (permutations). A more general/flexible approach, which does not limit the kind of input rankings, is the Bucket Pivot Algorithm (BPA) [32], [33]. This algorithm has a clear resemblance to quicksort. It starts with the random selection of an item as pivot and

The Optimal Bucket Order Problem (OBOP)

In this section we introduce the notation and formalize the OBOP. Given a set of items $[[n]] = \{1, \dots, n\}$, a bucket order $\mathcal{B}$ is an ordered partition of $[[n]]$ [32], [33], [37]. More precisely, it is a linear ordering of disjoint subsets (buckets) $B_1, B_2, \dots, B_k$ of $[[n]]$, $1 \le k \le n$, with $\bigcup_{i=1}^{k} B_i = [[n]]$. Thus, given two buckets $B_i, B_j$ in $\mathcal{B}$, we will write $B_i \prec_{\mathcal{B}} B_j$ to indicate that $B_i$ precedes $B_j$ according to the bucket order $\mathcal{B}$. Analogously, given two items $u \in B_i$, $v \in B_j$, we will write $u \prec_{\mathcal{B}} v$ if $B_i \prec_{\mathcal{B}} B_j$. All the

Utopian matrix and its implications for pivot selection

In this section we introduce the utopian matrix and related concepts.

Definition 1

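Given a pair order matrix $C$, the utopian matrix $U_C$ associated with $C$ is the $n \times n$ matrix defined as $U_C(u,v) = \Upsilon(C(u,v))$, where

$$\Upsilon(x) = \begin{cases} 1 & \text{if } x > 0.75 \\ 0.5 & \text{if } 0.25 \le x \le 0.75 \\ 0 & \text{if } x < 0.25 \end{cases}$$

Then, the utopia value $u_C$ associated with $C$ is $u_C = D(U_C, C)$.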
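As an illustration of Definition 1, the following Python sketch builds a utopian matrix and its utopia value for a small, made-up pair order matrix. Two points are assumptions of the sketch rather than statements taken from the text: the threshold function is applied entry by entry, so that each entry of the utopian matrix is the image of the corresponding entry of C, and the distance D is taken to be the sum of absolute entry-wise differences. The function names are hypothetical.

```python
def upsilon(x):
    """Threshold function of Definition 1: snap x to the nearest of 0, 0.5, 1."""
    if x > 0.75:
        return 1.0
    if x < 0.25:
        return 0.0
    return 0.5  # covers 0.25 <= x <= 0.75


def utopian_matrix(C):
    """Apply upsilon to every entry of C (entry-wise reading, an assumption)."""
    return [[upsilon(x) for x in row] for row in C]


def distance(A, B):
    """Assumed distance: sum of absolute differences between matching entries."""
    return sum(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


# Made-up pair order matrix over three items (C[u][v] + C[v][u] == 1).
C = [
    [0.5, 0.9, 0.6],
    [0.1, 0.5, 0.3],
    [0.4, 0.7, 0.5],
]

U = utopian_matrix(C)
u_C = distance(U, C)

print(U)              # [[0.5, 1.0, 0.5], [0.0, 0.5, 0.5], [0.5, 0.5, 0.5]]
print(round(u_C, 6))  # 0.8
```

In this toy instance the utopian matrix ties item 0 with item 2 and item 1 with item 2, yet places item 0 strictly before item 1. No bucket order has this shape, which shows why the utopia value acts as an idealized bound that is only sometimes attained by a feasible solution.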

Note that for any pair order matrix $C$, the maximum distance between a particular entry and the corresponding one in the associated utopian matrix $U_C$ is 0.25, and this happens only when the value of the entry is

BPA with least indecision assumption

Now, we show how the information provided by the utopian matrix can be used to select the pivot in an informed way. First we define an index to measure the goodness of selecting an item as pivot, and then we propose two different schemes to integrate its use in BPA.

Experimental study of BPALIA algorithm(s)

In this section we carry out an experimental comparison between the original BPA and the proposed BPALIA algorithms, namely LIAG and LIAL. All the experiments have been run on a personal computer with an Intel i7-6700 processor (3.40 GHz, 8 cores) and 16 GB of RAM. All the algorithms have been coded in Prolog.

As a benchmark we use 50 real-world datasets of rankings available at PrefLib [39]. In particular, we downloaded the pwg files

Using multiple pivots

The BPA and BPALIA algorithms use a single item as pivot to decide in which list (L,S or R) the remaining items are placed (see Fig. 1). However, it seems plausible to progressively use the information provided by the items placed in the list containing the pivot (S), since this list will remain as a bucket itself in the resulting bucket order. From now on, we call this approach multi-pivot (MP).
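The single-pivot scheme that this paragraph takes as its starting point can be sketched in Python as follows. This is a minimal illustration of a quicksort-like bucket pivot procedure and not the authors' implementation: the function name `bucket_pivot`, the parameter `beta`, and the exact thresholds (an item joins the pivot's list S when its precedence value against the pivot lies within 0.5 ± beta, with beta = 0.25) are assumptions. The input matrix is the pair order matrix of the four rankings used as an example in the Introduction, with items renumbered 0 to 3.

```python
import random


def bucket_pivot(items, C, beta=0.25, rng=random):
    """Return a bucket order (list of buckets) for `items` using one random pivot.

    C[p][u] is read as the fraction of rankings in which p precedes u.
    Thresholds are an assumption of this sketch.
    """
    if not items:
        return []
    pivot = rng.choice(items)
    left, same, right = [], [pivot], []   # the lists L, S and R
    for u in items:
        if u == pivot:
            continue
        if C[pivot][u] < 0.5 - beta:
            left.append(u)    # pivot rarely precedes u, so u goes before the pivot
        elif C[pivot][u] > 0.5 + beta:
            right.append(u)   # pivot almost always precedes u
        else:
            same.append(u)    # no clear preference: u shares the pivot's bucket
    # S is never split again; only L and R are processed recursively.
    return bucket_pivot(left, C, beta, rng) + [sorted(same)] + bucket_pivot(right, C, beta, rng)


# Pair order matrix of {1|2|3|4, 2|1|3|4, 1|2|4|3, 2|1|4|3}, items renumbered 0..3.
C = [
    [0.5, 0.5, 1.0, 1.0],
    [0.5, 0.5, 1.0, 1.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]

print(bucket_pivot([0, 1, 2, 3], C))  # [[0, 1], [2, 3]]
```

On this instance every choice of pivot yields the same bucket order. In general the result depends on the pivot drawn, and only the pivot itself decides where each remaining item is placed, which is the limitation the multi-pivot approach addresses.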

In order to let all the items included in (S) intervene in the process of placing each new item, we

Experimental analysis

In order to explore the advantages of the multi-pivot approach we carry out a new set of experiments using the same benchmark as in Section 6. Regarding the algorithms, we consider the combination of the three BPA approaches discussed in the previous sections with the two multi-pivot strategies (MP and MP2). Consequently, we introduce six new algorithms called: BPAMP, BPAMP2, LIAGMP, LIAGMP2, LIALMP and LIALMP2. Furthermore, in our experimental study we also include the three single-pivot

Improving BPA

In Section 2 we identified the main weaknesses of the BPA algorithm and outlined our ideas to overcome them. Next, we summarize how our proposals have succeeded.

First, we pointed out the use of a random pivot as the most critical decision in BPA. To overcome this drawback, we proposed to select the pivot in an informed way. To do this, we introduced the theoretical concept of Utopian Matrix and showed how it may be used to evaluate the precedence matrices that are the input for the OBOP.

Acknowledgements

This work was partially financed by the Junta de Comunidades de Castilla-La Mancha, Universidad de Castilla-La Mancha and FEDER funds by means of the projects PEII-2014-049 and CCI-2014ES16RFOP010.

References (42)

  • G. Napoles et al., Prototypes construction from partial rankings to characterize the attractiveness of companies in Belgium, Appl. Soft Comput. (2016)
  • Y.L. Chen et al., An approach to group ranking decisions in a dynamic environment, Decis. Support Syst. (2010)
  • A. Ukkonen et al., A randomized approximation algorithm for computing bucket orders, Inf. Process. Lett. (2009)
  • B. Ervural et al., A Taxonomy for Multiple Attribute Group Decision Making Literature
  • J. Kacprzyk et al., On Group Decision Making, Consensus Reaching, Voting, and Voting Paradoxes under Fuzzy Preferences and a Fuzzy Majority: A Survey and a Granulation Perspective
  • E. Herrera-Viedma et al., A consensus model for multiperson decision making with different preference structures, Trans. Syst. Man Cybern. Part A (2002)
  • J. Fürnkranz et al.
  • Y. Lu, Implementing an Empirical Study of Rank Aggregation Approaches Based on Real World Instances, CoRR abs/1402.5259 (2014)
  • S. Lin, Rank Aggregation Methods, Wiley Interdiscip. Rev. Computat. Stat. (2010)
  • M.E. Renda et al., Web Metasearch: Rank Vs. Score Based Rank Aggregation Methods
  • R. Kolde et al., Robust rank aggregation for gene list integration and meta-analysis, Bioinformatics (2012)

Juan A. Aledo received the M.S. degree in Mathematics in 1997 and the Ph.D. degree in Mathematics in 2000, both from the University of Murcia, Spain. He joined the Department of Mathematics at the University of Castilla-La Mancha (UCLM) in 1997, where he is currently a Full Professor. His main research interests include differential geometry, discrete mathematics, decision making and machine learning. In these topics Dr. Aledo has (co)authored more than sixty papers in journals, books and refereed international conferences.

Jose A. Gámez received the M.S. degree in Computer Science in 1991, and the Ph.D. degree in Computer Science in 1998, both from the University of Granada, Spain. He joined the Department of Computer Science at the University of Castilla-La Mancha (UCLM) in 1991, where he is currently a Full Professor. His main research interests include probabilistic reasoning, Bayesian networks, metaheuristic algorithms, decision making, machine learning and data mining. In these topics Dr. Gámez has edited six books and six special issues of international journals. He is the (co)author of more than one hundred papers in journals, books and refereed international conferences.

Alejandro Rosete received the M.Sc. degree in Applied Informatics and the Ph.D. degree in Informatics from Higher Polytechnic Institute Jose Antonio Echeverría (CUJAE), La Habana, Cuba, in 1995 and 2000, respectively. He has been the Head of the Department of Artificial Intelligence and Infrastructure of Informatics Systems, CUJAE. He has published over 40 papers. He is a co-author of the book Lógica y Algoritmos (Editorial Felix Varela, Habana, 2004). His research interests include metaheuristics, agent-oriented software engineering, decision making, data mining, fuzzy systems, and knowledge extraction based on metaheuristics.