Elsevier

Information Sciences

Volume 501, October 2019, Pages 377-387
Aggregation on ordinal scales with the Sugeno integral for biomedical applications

https://doi.org/10.1016/j.ins.2019.06.023

Highlights

  • Sugeno integrals are well suited to aggregation of ordinal biomedical data.

  • Learning the parameters of a Sugeno integral from data is a difficult task.

  • The optimisation problem is formulated as the difference of two convex functions.

  • This allows specialised techniques to be used to minimise the objective.

  • Experiments are carried out to compare different approaches.

Abstract

The Sugeno integral is a function particularly suited to the aggregation of ordinal inputs. Defined with respect to a fuzzy measure, its ability to account for complementary and redundant relationships between variables brings much potential to the field of biomedicine, where it is common for measurements and patient information to be expressed qualitatively. However, practical applications require well-developed methods for identifying the Sugeno integral’s parameters, and this task is not easily expressed using the standard optimisation approaches. Here we formulate the objective function as the difference of two convex functions, which enables the use of specialised numerical methods. Such techniques are compared with other global optimisation frameworks through a number of numerical experiments.

Introduction

Biomedicine offers significant challenges to data science and computing, from data capturing to its analysis and interpretation. With the widespread implementation of electronic medical records, as well as computerised access to laboratory test results, we have access to more data than ever, yet its usage in clinical decision support is limited [33], [38]. Signs and symptoms as observed by clinicians are usually represented on ordinal or nominal scales, which gives rise to the need for mathematical and computational models geared towards these data types. Even when the data is captured in numeric form, as it may be through laboratory tests, its apparent precision is misleading, as (a) there is natural variability in the results for a particular patient – the readings depend on the time, condition, position and prior activities; and (b) there is variability within the population, which leads to wide ranges of values considered to be “normal”.

For these reasons, clinicians operate with rather inexact (from the perspective of computational algorithms) concepts, such as “slightly enlarged”, “higher than normal”, “unremarkable”, “significantly low”, and so on. In combination, however, such inexact constructs allow clinicians to make accurate diagnoses in a broad range of situations. Granular computing technologies are well suited to provide formal modelling tools for dealing with inexact concepts. In particular, fuzzy systems provide a natural way of representing the above-mentioned concepts as fuzzy sets, and then performing operations on these sets, such as aggregation, implication and logical deduction.

The concept of a linguistic variable [20], [50] further captures this idea that human reasoning can be conducted over linguistic terms along with semantic rules and modifiers. Quantifiable traits like being aged 22 can be mapped to linguistic variables like young, or very young and not old, by degrees of membership, while the linguistic values may be mapped to a range or estimate over a numeric scale.

Rules (of a decision support system, for example) can be formulated in terms of a set of linguistic variables in a way that resembles natural language. Unlike black-box methods (e.g., those based on deep learning and convolutional neural networks), such rules are easier to understand and, eventually, to accept.

Linguistic variables are most often specified over an ordinal scale, that is, a scale on which some variables can be compared and ranked, but where the differences between sequential values are meaningless. An example of the domain of a linguistic variable is the set {“very low”, “low”, “normal”, “high”, “very high”, “extreme”}.

There is a natural order in these values but the difference between “normal” and “low” need not be the same as between “high” and “normal”. The ordinal scales can be represented by a set of integers but without the implied connotation that the distances between the labels are the same.
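
As a minimal sketch of this integer representation, the snippet below encodes the example label set as ranks and, as one possible convention, as equally spaced rationals in [0, 1]. The helper names `to_rank` and `to_unit_interval` are hypothetical, and the equal spacing is an assumption made only for illustration; it carries no claim that the labels are equidistant.

```python
from fractions import Fraction

# Label set taken from the text; the list order defines the ordinal scale.
LABELS = ["very low", "low", "normal", "high", "very high", "extreme"]

def to_rank(label):
    """Map a label to its integer rank (0 for the lowest label)."""
    return LABELS.index(label)

def to_unit_interval(label):
    """Map a label to a rational in [0, 1]; equal spacing is an assumption."""
    return Fraction(to_rank(label), len(LABELS) - 1)

readings = ["high", "low", "normal"]
ranks = [to_rank(r) for r in readings]

# Order-based operations (min, max, sorting) are meaningful on an ordinal scale,
# whereas sums, differences and arithmetic means of the ranks are not.
print(sorted(readings, key=to_rank))
print(LABELS[min(ranks)], LABELS[max(ranks)])
print([to_unit_interval(r) for r in readings])
```

Only comparisons between the encoded values should be interpreted; any aggregation applied later has to rely on order alone, which is what motivates the max-min structure of the Sugeno integral.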

Aggregation of linguistic variables has been studied from different perspectives [15], [17], [29]. In this work we look at the aggregation in the framework of fuzzy measures, also called capacities [27]. Our motivation for this is that fuzzy measures explicitly model the input dependencies associated with mutual correlations, redundancies and complementarity that abound in the medical domain. For example, some clinical indicators can be unimportant on their own but crucial in the presence of other signs and symptoms, and on the contrary, certain signs or laboratory tests measure related processes (e.g., amylase and lipase tests, ESR (Erythrocyte Sedimentation Rate) and CRP (C-Reactive Protein)) and hence are partially redundant. Therefore, simple aggregation with weighted means is unsuitable here as input correlation is not accounted for.

Non-additive integrals are defined with respect to fuzzy measures (also called capacities) and aggregate the inputs by accounting for their mutual dependencies. The most representative examples of such integrals are the Choquet and Sugeno integrals which, although defined for continuous sets, will be considered here only in the discrete setting. Furthermore, we are most interested in the Sugeno integral as it is more suitable for aggregation on ordinal scales [21]. If the capacity is known, then it is a routine exercise to calculate the Sugeno integral [9]. In practice however, specifying the integral may be difficult due to the number of parameters and consistency conditions required, and it is more useful to learn the capacity from representative examples or collected datasets. This contribution is hence geared toward automatic decision model acquisition from data given on ordinal scales based on the Sugeno integral.
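
To illustrate the calculation for a known capacity, here is a minimal sketch of the discrete Sugeno integral in its standard form, the maximum over i of min(x_(i), μ(H_i)), where the inputs are sorted in ascending order and H_i collects the indices of the i-th smallest input and all larger ones. The function name, the dictionary representation of the capacity, the 0-based indices and the example values are all assumptions made for this sketch.

```python
def sugeno_integral(x, mu):
    """Discrete Sugeno integral of x (values in [0, 1]) w.r.t. a capacity mu.

    mu maps frozensets of 0-based input indices to values in [0, 1].
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # indices by ascending input
    best = 0.0
    for pos, i in enumerate(order):
        upper = frozenset(order[pos:])  # indices whose inputs are >= x[i]
        best = max(best, min(x[i], mu[upper]))
    return best

# Hypothetical capacity on two complementary inputs: each one alone carries
# little weight, but together they carry full weight.
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.2,
    frozenset({1}): 0.2,
    frozenset({0, 1}): 1.0,
}

print(sugeno_integral([0.8, 0.6], mu))  # both inputs fairly high
print(sugeno_integral([0.8, 0.0], mu))  # one input high, the other absent
```

With this hypothetical capacity the output stays low unless both inputs are high, which is the kind of complementarity a weighted mean cannot express. Only `min`, `max` and comparisons are used, so the computation remains valid on an ordinal scale.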

Previous applications of fuzzy integrals to biomedical data include [42], which used the Choquet integral to extend the method of logistic regression and performed comparative machine learning experiments on several real datasets including the breast cancer and mammographic data. Also of note is the study in [2], where the Sugeno integral was applied to estimate age-at-death of skeletal remains from a number of indicators. Other applications include landmine detection [35] and various multiple criteria decision problems [47]. We also mention a number of previous studies that have developed methods for learning fuzzy measures from empirical data by using the Choquet or Sugeno integrals [1], [8], [10], [28], [31], [32], [34], [39].

Our own recent study [24] examined learning symmetric fuzzy measures by fitting the Sugeno integral to numerical data based on an l1 fitness function. There are significant computational challenges when eliciting fuzzy measures from observational data which need to be addressed, in particular their complexity, which is exponential in the number of input parameters. Therefore, techniques based on suitable simplifications, efficient data mining and numerical optimisation are needed.

The models presented here are different in three aspects. Firstly, our data fitting objective is geared to ordinal scales and hence involves the concept of non-additive robust ordinal regression (NAROR) [39], [44]. Secondly, we do not assume the specific parametric form of the fuzzy measure (in particular Sugeno λ-measures), although we use some simplifying assumptions such as symmetry at this stage. Thirdly, we formulate the fitting problem with the help of max-min functions, which are piecewise linear and DC (difference of convex) functions, which hence provide an opportunity to use specialised DC optimisation techniques to our advantage.
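
The DC property of max-min functions can be seen from two standard identities, sketched below for convex functions f, g, f_1, f_2, g_1, g_2 (the symbols are introduced here for illustration and are not the paper's notation). Sums and pointwise maxima of convex functions are convex, so each right-hand side is a difference of two convex functions.

```latex
\begin{align*}
\min(f, g) &= (f + g) - \max(f, g), \\
\max(f_1 - g_1,\; f_2 - g_2) &= \max(f_1 + g_2,\; f_2 + g_1) - (g_1 + g_2).
\end{align*}
```

Applying these identities repeatedly shows that any finite composition of max and min over affine functions is DC. For a fixed input vector, the discrete Sugeno integral is such a composition in the fuzzy measure values, since each term is the minimum of a constant and a single measure value.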

The main contributions of this work are: (a) the formulation of ordinal regression models for learning Sugeno integrals from linguistic data, (b) formulation of the appropriate non-smooth and DC optimisation problems, and (c) finding and benchmarking suitable computational tools to solve the associated optimisation problems.

The paper is organised as follows. Section 2 provides some preliminary definitions and notation. Section 3 provides several formulations of the fuzzy measure learning problem in the context of the Sugeno integral. In Section 4 we discuss various simplification strategies and methods of numerical solution. Section 5 is devoted to numerical experiments and benchmarking of suitable optimisation methods. Section 6 concludes.

Section snippets

Preliminary definitions

We consider Sugeno integrals defined with respect to discrete fuzzy measures. In Table 1 we provide a list of symbols and notation for ease of reference throughout the article.

Definition 1

Let $N=\{1,2,\ldots,n\}$. A discrete fuzzy measure (normalised capacity) is a set function $\mu: 2^N \to [0,1]$ which is monotonic (i.e., $\mu(A) \le \mu(B)$ whenever $A \subseteq B$) and satisfies $\mu(\emptyset)=0$ and $\mu(N)=1$.
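
The conditions of Definition 1 can be checked mechanically. The sketch below is only illustrative: the function names, the dictionary representation and the 0-based indices (the definition uses 1, …, n) are assumptions. It also builds a symmetric measure, whose value depends only on the cardinality of the subset.

```python
from itertools import combinations

def is_fuzzy_measure(mu, n, tol=1e-12):
    """Check the boundary conditions and monotonicity on subsets of {0, ..., n-1}."""
    universe = frozenset(range(n))
    if abs(mu[frozenset()]) > tol or abs(mu[universe] - 1.0) > tol:
        return False
    # Monotonicity follows from checking every single-element extension.
    for size in range(n):
        for subset in combinations(range(n), size):
            a = frozenset(subset)
            for i in universe - a:
                if mu[a] > mu[a | {i}] + tol:
                    return False
    return True

def symmetric_measure(q):
    """Build a set function with mu(A) = q[|A|], where len(q) == n + 1."""
    n = len(q) - 1
    return {frozenset(c): q[k]
            for k in range(n + 1)
            for c in combinations(range(n), k)}

mu = symmetric_measure([0.0, 0.3, 0.5, 1.0])
print(len(mu), is_fuzzy_measure(mu, 3))
```

A general measure on n inputs has 2^n values, while the symmetric form needs only n + 1, which is one way to tame the exponential number of parameters.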

We assume that the fuzzy measure values range over [0,1], however in general, the Sugeno integral can be calculated with respect to any set function

Learning problem formulations

We now assume that the data is provided in the form of linguistic variable labels, which are mapped to integers and then to the respective rationals from [0,1]. In the context of ordinal regression, fitting the values y(k) exactly is unnecessary. What is important is that the order of the observed values is matched by the prediction of the model. Thus if y(j) ≤ y(k) we require Sμ(x(j)) ≤ Sμ(x(k)). As the data may contain inaccuracies or may not correspond to the Sugeno integral with respect to

Methods of solution

As we mentioned, unlike fitting the Choquet integral to data, which involves a convex optimisation problem with a unique optimum, fitting the Sugeno integral is a non-convex multiextremal problem. As such it requires the use of global optimisation techniques. Some approximate methods based on evolutionary computing or neural networks were used in [1], [49]. A multistart local optimisation method was used in [24] for a somewhat different objective function and under the assumption of symmetry.
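
The multistart idea can be sketched as follows. This is not the method of [24] nor the objective studied in this paper: the hinge-type order penalty, the random-perturbation local search, the toy data and all function names are assumptions chosen to keep the example short. A symmetric measure is parametrised by a non-decreasing vector q with q[0] = 0 and q[n] = 1.

```python
import random

def sugeno_symmetric(x, q):
    """Sugeno integral w.r.t. a symmetric capacity, mu(A) = q[|A|]."""
    xs = sorted(x)
    n = len(xs)
    return max(min(xs[pos], q[n - pos]) for pos in range(n))

def order_violation(q, X, y):
    """Illustrative penalty: total amount by which predictions invert the observed order."""
    s = [sugeno_symmetric(x, q) for x in X]
    return sum(max(0.0, s[j] - s[k])
               for j in range(len(y)) for k in range(len(y)) if y[j] < y[k])

def project(inner):
    """Restore feasibility: clip to [0, 1], sort, and attach q[0] = 0 and q[n] = 1."""
    return [0.0] + sorted(min(1.0, max(0.0, v)) for v in inner) + [1.0]

def local_search(q, X, y, rng, steps=200, radius=0.2):
    """Accept random feasible perturbations that reduce the penalty."""
    best, best_f = q, order_violation(q, X, y)
    for _ in range(steps):
        cand = project([v + rng.uniform(-radius, radius) for v in best[1:-1]])
        f = order_violation(cand, X, y)
        if f < best_f:
            best, best_f = cand, f
    return best, best_f

def multistart(X, y, n, starts=20, seed=0):
    """Run the local search from several random starts and keep the best result."""
    rng = random.Random(seed)
    results = []
    for _ in range(starts):
        q0 = project([rng.random() for _ in range(n - 1)])
        results.append(local_search(q0, X, y, rng))
    return min(results, key=lambda r: r[1])

# Hypothetical toy data: three ordinal inputs per case, ordinal target y.
X = [[0.2, 0.8, 0.4], [0.6, 0.6, 1.0], [0.0, 0.4, 0.2], [1.0, 0.8, 0.6]]
y = [1, 3, 0, 4]
q, f = multistart(X, y, n=3)
print(q, f)
```

Because the penalty is piecewise linear with many flat regions and local minima, a single local run can stall; restarting from different points is the simplest hedge against that, at the cost of extra evaluations. A penalty of this form also admits degenerate minimisers that produce many ties, which a practical formulation would need to address.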

Computational experiments

In this section we compare the performance of various numerical optimisation methods when solving problem (8). We selected the following approaches. We chose the simplex Nelder-Mead (NM) method [40], as well as Powell’s methods newuoa and bobyqa [41]. These methods are derivative-free, which means that while they do not use derivatives of F in calculations, they do not make explicit assumptions on the non-differentiability of F. It is useful to include these numerical techniques as a basis for

Conclusion

In this paper we have approached the problem of learning fuzzy measure values in an ordinal framework toward clinical applications. To this end, we formulated a number of learning problems and showed that the resulting piecewise linear objective functions can be decomposed into the difference of two convex functions, making them suitable for DC optimisation approaches.

We then considered a range of algorithms for solving the ordinal optimisation problem posed in the paper. A specially crafted

Declaration of Interest Statement

We declare that this manuscript is original research not submitted elsewhere, and that the authors do not have conflicts of interest.

Acknowledgements

Marek Gagolewski wishes to acknowledge the support by the Czech Science Foundation through the project No.18-06915S.

References (50)

  • R. Mesiar

    Generalizations of k-order additive discrete fuzzy measures

    Fuzzy Sets Syst.

    (1999)
  • J. Murillo et al.

    K-maxitive fuzzy measures: a scalable approach to model interactions

    Fuzzy Sets Syst.

    (2017)
  • J.-Z. Wu et al.

    Nonadditivity index and capacity identification method in the context of multicriteria decision making

    Inf. Sci.

    (2018)
  • J.-Z. Wu et al.

    Nonmodularity index for capacity identifying with multiple criteria preference information

    Inf. Sci.

    (2019)
  • J.-Z. Wu et al.

    Compromise principle based methods of identifying capacities in the framework of multicriteria decision analysis

    Fuzzy Sets Syst.

    (2014)
  • L. Zadeh

    The concept of a linguistic variable and its application to approximate reasoning. Part I

    Inf. Sci.

    (1975)
  • D. Anderson et al.

    Learning fuzzy-valued fuzzy measures for the fuzzy-valued Sugeno fuzzy integral

    Lecture Notes Artif. Intell.

    (2010)
  • M. Anderson et al.

    Estimation of adult skeletal age-at-death using the Sugeno fuzzy integral

    Am. J. Phys. Anthropol.

    (2010)
  • A. Bagirov et al.

    Sub-gradient method for non-convex non-smooth optimization

    J. Optim. Theory Appl.

    (2013)
  • A. Bagirov et al.

    Introduction to Non-smooth Optimization: Theory, Practice and Software

    (2014)
  • A. Bagirov et al.

    Codifferential method for minimizing nonsmooth DC functions

    J. Global Optim.

    (2011)
  • A. Bagirov et al.

    Nonsmooth optimization algorithm for solving cluster-wise linear regression problems

    J. Optim. Theory Appl.

    (2015)
  • G. Beliakov et al.

    A Practical Guide to Averaging Functions

    (2016)
  • G. Beliakov et al.

    Learning Choquet-integral-based metrics for semisupervised clustering

    IEEE Trans. Fuzzy Syst.

    (2011)
  • G. Beliakov et al.

    Appropriate choice of aggregation operators in fuzzy decision support systems

    IEEE Trans. Fuzzy Syst.

    (2001)

Cited by (10)

    • Ranked hesitant fuzzy sets for multi-criteria multi-agent decisions

      2022, Expert Systems with Applications
      Citation Excerpt :

      Inspirational studies exist in related settings (Su et al., 2019; Wang et al., 2019). Lastly, it is likely that novel procedures for the aggregation of ranked hesitant fuzzy elements may be defined by resort to the Sugeno integral, precisely because this is an ordinal aggregator (Beliakov et al., 2019). Also the managerial and industrial applications of ranked hesitant fuzzy sets still remain to be explored.

    • Robust fitting for the Sugeno integral with respect to general fuzzy measures

      2020, Information Sciences
      Citation Excerpt :

      Depending on the granularity of the scale used, methods based on LS and LAD can still provide a good proxy for well performing measures. In [16,17] we have explored techniques for ordinal fitting, where the aim is to preserve the relative ranking of the outputs. In this contribution we focus on minimizing the maximum absolute error (i.e., the l∞ norm) and the median absolute error.

    • On representation of fuzzy measures for learning Choquet and Sugeno integrals

      2020, Knowledge-Based Systems
      Citation Excerpt :

      The Sugeno integral fitting problem is multiextremal, therefore global optimization approach is needed. Multistart local nonsmooth and DC optimization were successfully applied in [18,19]. The proposed change of variables has eliminated the monotonicity constraints only partially.
