A fast heuristic attribute reduction approach to ordered decision systems

https://doi.org/10.1016/j.ejor.2017.03.029

Highlights

  • The time efficiency of reduct-finding algorithms that use the core and of those that do not is analyzed and compared.

  • The universe is reduced by removing uninformative objects.

  • The insignificant criteria are dropped to lessen the number of features.

  • The reducts obtained by both the accelerated and original versions are the same.

  • Experimental analysis shows the validity and efficiency of accelerated methods.

Abstract

Rough set theory has proven successful as a filter-based feature selection approach for analyzing information systems. One of its main aims is to find a feature subset, called a reduct, that preserves the classification ability of the original system. In this paper, we consider ordered decision systems, where the preference order, a fundamental concept in the dominance-based rough set approach, plays a critical role. In the recent literature, many heuristic attribute reduction algorithms based on the greedy hill-climbing method have been proposed using significance measures of attributes, and they have been extended to deal with ordered decision systems. Unfortunately, they are often time-consuming, especially when applied to large-scale, high-dimensional data sets. To reduce the complexity, a novel accelerator is introduced into heuristic algorithms from the perspectives of objects and criteria. With the new accelerator, both the number of objects and the dimension of criteria are reduced, making the accelerated algorithms faster than their original counterparts while yielding the same reducts. Experimental analysis shows the validity and efficiency of the proposed methods.

Section snippets

Introduction and related work

Feature (subset) selection, known as attribute reduction in the rough set community, has naturally become an important but difficult problem in many practical areas such as machine learning, pattern recognition and data mining, especially in this era of information explosion (Blum & Langley, 1997; Gunal & Edizkan, 2008; Piramuthu, 2004). The main objective of this technique is to find the relevant features in the original feature set without incurring much loss of information.

Dominance-based rough set approach

In this section, to make this paper self-contained, we give a brief introduction to several basic topics, such as lower and upper approximations, and the quality of classification of ordered information systems.
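For orientation, the notions named above are usually written as follows in the DRSA literature (for example Greco et al., 2002). This is a sketch under assumptions rather than the paper's own statement: the symbols f(x, q) (value of object x on criterion q), P ⊆ C (a subset of criteria), and Cl_t (the t-th decision class, with higher t preferred) are not defined in this excerpt, and all criteria are assumed to be of gain type.

```latex
% Assumed notation: U = universe, P \subseteq C = criteria subset,
% f(x,q) = value of x on q, Cl_1, ..., Cl_n = ordered decision classes.
\begin{align*}
D_P^{+}(x) &= \{\, y \in U : f(y,q) \ge f(x,q)\ \ \forall q \in P \,\}, &
D_P^{-}(x) &= \{\, y \in U : f(x,q) \ge f(y,q)\ \ \forall q \in P \,\}, \\
Cl_t^{\ge} &= \bigcup_{s \ge t} Cl_s, &
Cl_t^{\le} &= \bigcup_{s \le t} Cl_s, \\
\underline{P}(Cl_t^{\ge}) &= \{\, x \in U : D_P^{+}(x) \subseteq Cl_t^{\ge} \,\}, &
\overline{P}(Cl_t^{\ge}) &= \{\, x \in U : D_P^{-}(x) \cap Cl_t^{\ge} \neq \emptyset \,\}, \\
\underline{P}(Cl_t^{\le}) &= \{\, x \in U : D_P^{-}(x) \subseteq Cl_t^{\le} \,\}, &
\overline{P}(Cl_t^{\le}) &= \{\, x \in U : D_P^{+}(x) \cap Cl_t^{\le} \neq \emptyset \,\}, \\
Bn_P(Cl_t^{\ge}) &= \overline{P}(Cl_t^{\ge}) \setminus \underline{P}(Cl_t^{\ge}), &
\gamma_P &= \frac{\bigl|\, U \setminus \bigcup_{t=2}^{n} Bn_P(Cl_t^{\ge}) \,\bigr|}{|U|}.
\end{align*}
```

Under these definitions, the quality of classification γ_P is the fraction of objects that lie in no boundary region, that is, objects whose class assignment is consistent with the dominance relation induced by P.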

Two common algorithms for acquisition of a reduct of an ordered decision system

Given an information system, a reduct is a minimal subset of attributes which has the same discriminating or sorting ability as the full set of available attributes in the system. In this section, the DRSQR algorithm and the HARCC algorithm are provided to derive a (super-)reduct of a given ODS.
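The general idea of greedy, quality-preserving selection can be sketched as below. This is a minimal illustration and not the paper's DRSQR or HARCC procedure: the function names `dominates`, `consistent_count` and `greedy_super_reduct` are hypothetical, criteria are assumed to be of gain type, and a larger label is assumed to mean a better class. Because no backward elimination step is included, the result is in general only a super-reduct.

```python
from typing import List, Sequence


def dominates(x: Sequence[float], y: Sequence[float], criteria: Sequence[int]) -> bool:
    """True if x is at least as good as y on every selected criterion."""
    return all(x[q] >= y[q] for q in criteria)


def consistent_count(data: Sequence[Sequence[float]], labels: Sequence[int],
                     criteria: Sequence[int]) -> int:
    """Number of objects whose label does not conflict with the dominance relation."""
    count = 0
    for i, xi in enumerate(data):
        ok = True
        for j, xj in enumerate(data):
            # Conflict: xj dominates xi but is placed in a worse class.
            if dominates(xj, xi, criteria) and labels[j] < labels[i]:
                ok = False
                break
            # Conflict: xi dominates xj but xj is placed in a better class.
            if dominates(xi, xj, criteria) and labels[j] > labels[i]:
                ok = False
                break
        count += ok
    return count


def quality(data, labels, criteria) -> float:
    """Quality of classification: fraction of consistent objects."""
    return consistent_count(data, labels, criteria) / len(data)


def greedy_super_reduct(data, labels) -> List[int]:
    """Add the criterion with the largest quality gain until the full-set quality is reached."""
    all_criteria = list(range(len(data[0])))
    target = consistent_count(data, labels, all_criteria)
    selected: List[int] = []
    remaining = set(all_criteria)
    while consistent_count(data, labels, selected) < target:
        # Ties are broken in favour of the lowest criterion index.
        best = max(sorted(remaining),
                   key=lambda q: consistent_count(data, labels, selected + [q]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each call to `consistent_count` compares all pairs of objects, so the loop costs on the order of |U|² · |C|² comparisons of criterion vectors in the worst case, which illustrates why such heuristics become slow on large, high-dimensional data.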

An accelerator for attribute reduction in ordered decision systems

Although the DRSQR/HARNC algorithm and the HARCC algorithm have polynomial time complexity with respect to |U| and |C|, they still take a rather long time when applied to high-dimensional data sets. To improve the performance of these algorithms, we seek ways of reducing the number of objects and the dimension of criteria.

Experiments and analysis

In this section, several real-world tasks are gathered for an empirical study whose objective is to test the feasibility and efficiency of the proposed methods.

Ordinal classification with monotonicity constraints (monotonicity classification for short) is a special class of ordinal regression problems. In this task, object x is described by a k-dimensional vector (x1, x2, …, xk) and is assigned the ordered class label λ(x). The monotonicity constraints can be expressed as (Kotłowski, Dembczyński,

Conclusions

The dominance-based rough set approach takes users’ preferences into consideration when reasoning about ordinal data, which distinguishes it from other extensions of rough set theory. The classical reduct, which preserves the quality of classification, is usually compared with other kinds of reducts in aspects such as length, stability, computational time and classification accuracy. Thus, an efficient acquisition scheme for a reduct is a necessity for further study from an empirical

Acknowledgments

The authors sincerely thank the three anonymous reviewers for their constructive comments and valuable suggestions which helped improve this paper significantly. This research was supported by the National Natural Science Foundation of China (Grant nos. 11571010, 61179038) and the Fundamental Research Funds for the Central Universities (Grant no. 2015201020201).

References (65)

  • S. Greco et al.

    Rough sets methodology for sorting problems in presence of multiple attributes and criteria

    European Journal of Operational Research

    (2002)
  • S. Greco et al.

    Putting Dominance-based Rough Set Approach and robust ordinal regression together

    Decision Support Systems

    (2013)
  • S. Gunal et al.

    Subspace based feature selection for pattern recognition

    Information Sciences

    (2008)
  • Q. Hu et al.

    Mixed feature selection based on granulation and approximation

    Knowledge-Based Systems

    (2008)
  • Q. Hu et al.

    Information-preserving hybrid data reduction based on fuzzy-rough techniques

    Pattern Recognition Letters

    (2006)
  • M. Kadziński et al.

    Robust Ordinal Regression for Dominance-based Rough Set Approach to multiple criteria sorting

    Information Sciences

    (2014)
  • R. Kohavi et al.

    Wrappers for feature subset selection

    Artificial Intelligence

    (1997)
  • W. Kotłowski et al.

    Stochastic dominance-based rough set model for ordinal classification

    Information Sciences

    (2008)
  • J. Liang et al.

    An accelerator for attribute reduction based on perspective of objects and attributes

    Knowledge-Based Systems

    (2013)
  • Z. Pawlak

    Rough sets

    International Journal of Computer and Information Sciences

    (1982)
  • R. Potharst et al.

    Classification trees for problems with monotonicity constraints

    ACM SIGKDD Explorations Newsletter

    (2002)
  • Y. Qian et al.

    An efficient accelerator for attribute reduction from incomplete data in rough set framework

    Pattern Recognition

    (2011)
  • Y. Qian et al.

    Fuzzy-rough feature selection accelerator

    Fuzzy Sets and Systems

    (2015)
  • J.R. Quinlan

    Induction of decision trees

    Machine Learning

    (1986)
  • Q. Shen et al.

    A rough-fuzzy approach for generating classification rules

    Pattern Recognition

    (2002)
  • W. Shu et al.

    A fast approach to attribute reduction from perspective of attribute measures in incomplete decision systems

    Knowledge-Based Systems

    (2014)
  • W. Shu et al.

    An incremental approach to attribute reduction from dynamic incomplete decision systems in rough set theory

    Data and Knowledge Engineering

    (2015)
  • A. Skowron et al.

    The discernibility matrices and functions in information systems

  • R. Susmaga et al.

    Generation of rough sets reducts and constructs based on inter-class and intra-class information

    Fuzzy Sets and Systems

    (2015)
  • R. Susmaga et al.

    Generation of reducts and rules in multi-attribute and multi-criteria classification

    Control and Cybernetics

    (2000)
  • M. Szelag et al.

    Variable consistency dominance-based rough set approach to preference learning in multicriteria ranking

    Information Sciences

    (2014)
  • Y. Yao

    The two sides of the theory of rough sets

    Knowledge-Based Systems

    (2015)

Cited by (36)

    • A novel incremental attribute reduction by using quantitative dominance-based neighborhood self-information

      2023, Knowledge-Based Systems
      Citation Excerpt :

      Therefore, attribute reduction methods based on DRSA and its extended models are also one of the mainstreams of current research [22–24]. Du and Hu introduced a QuickReduct method and a heuristic attribute reduction method based on the DRSA model [25]. To deal with numerical ordered data, Hu et al. explored an extended DRSA, namely the fuzzy preference based rough set model, and proposed the corresponding feature selection algorithm [26].

    • Self-adaptive weighted interaction feature selection based on robust fuzzy dominance rough sets for monotonic classification

      2022, Knowledge-Based Systems
      Citation Excerpt :

      Qian et al. introduced an attribute reduction method with rank-preservation based on the variable dominance rough set model [49]. Du et al. presented a heuristic attribute reduction method based on dominance-based rough fuzzy set model [21], and then they successively introduced QuickReduct algorithm and heuristic attribute reduction algorithm based on DRSA [50]. Based on FDRS model, Wang et al. designed an ensemble learning strategy based on the discernibility matrix for feature selection [30].
