
Rough set methods in feature selection via submodular function

  • Methodologies and Application

Abstract

Attribute reduction is an important problem in data mining and machine learning: it highlights informative features and reduces the risk of over-fitting, thereby improving learning performance. In this regard, rough sets offer an attractive framework. A reduct in rough set theory is a subset of attributes/features that is jointly sufficient and individually necessary to satisfy a given criterion. Excessive attributes may reduce diversity and increase correlation among features, while a smaller attribute set can achieve nearly equal or even higher classification accuracy with some classifiers. This motivates us to address dimensionality reduction through attribute reduction from the joint viewpoint of learning performance and reduct size. In this paper, we propose a new attribute reduction criterion that selects as few attributes as possible while largely preserving the performance of the corresponding learning algorithms. The main contributions of this work are twofold. First, we define the concept of a k-approximate-reduct, relaxing the restriction to a minimum reduct, which offers a useful perspective on the connection between the size of an attribute reduct and learning performance. Second, we develop a greedy algorithm for attribute reduction based on mutual information and use submodular functions to analyze its convergence. The diminishing-returns property of submodularity provides a solid guarantee of the soundness of the k-approximate-reduct. Notably, rough sets serve as an effective tool to estimate the marginal and joint probability distributions over attributes required by mutual information. Extensive experiments on six real-world public datasets from the UCI machine learning repository demonstrate that the subset selected by the mutual information reduct achieves higher accuracy with fewer attributes when building naive Bayes and radial basis function network classifiers.
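The guarantee mentioned in the abstract rests on submodularity. A set function f over attributes has diminishing returns, i.e., is submodular, when f(A ∪ {a}) - f(A) ≥ f(B ∪ {a}) - f(B) for all A ⊆ B and every attribute a ∉ B. For a monotone submodular objective, the classic greedy analysis shows that selecting k elements by largest marginal gain attains at least a (1 - 1/e) fraction of the optimal value; a bound of this kind is what justifies stopping at a k-approximate-reduct instead of insisting on a minimum reduct.

The paper's exact procedure is not reproduced on this page, so the following is only a minimal sketch of such a greedy mutual-information reduct for discrete attributes; the names greedy_k_reduct and mutual_information are illustrative, not taken from the paper. Probabilities are estimated by counting over equivalence classes, which is the role the abstract assigns to rough sets.

    import numpy as np
    from collections import Counter

    def entropy(labels):
        # Shannon entropy (bits) of a discrete label sequence.
        n = len(labels)
        return -sum((c / n) * np.log2(c / n) for c in Counter(labels).values())

    def mutual_information(X, subset, y):
        # I(S; y) = H(y) - H(y | S). H(y | S) is estimated from the
        # equivalence classes induced by the attribute subset S:
        # rows that agree on every attribute in S fall into one class.
        if not subset:
            return 0.0
        groups = {}
        for i, row in enumerate(X[:, subset]):
            groups.setdefault(tuple(row), []).append(i)
        n = len(y)
        h_cond = sum(len(idx) / n * entropy(y[idx]) for idx in groups.values())
        return entropy(y) - h_cond

    def greedy_k_reduct(X, y, k, eps=1e-6):
        # Greedily add the attribute with the largest marginal MI gain;
        # stop after k attributes or when the gain drops below eps,
        # mirroring the k-approximate-reduct trade-off between reduct
        # size and learning performance.
        X, y = np.asarray(X), np.asarray(y)
        selected, remaining = [], list(range(X.shape[1]))
        current = 0.0
        while remaining and len(selected) < k:
            gain, best = max((mutual_information(X, selected + [a], y) - current, a)
                             for a in remaining)
            if gain < eps:
                break
            selected.append(best)
            remaining.remove(best)
            current += gain
        return selected

For instance, greedy_k_reduct(X, y, k=10) returns at most ten attribute indices; the eps threshold implements the early stop that keeps the subset small, while the diminishing-returns bound above controls how much objective value is given up.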


Notes

  1. http://archive.ics.uci.edu/ml/.



Acknowledgments

This study was funded by the National Natural Science Foundation of China (Grant No. 61379049).

Author information


Corresponding author

Correspondence to William Zhu.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Human and animal studies

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Communicated by V. Loia.


About this article


Cite this article

Zhu, XZ., Zhu, W. & Fan, XN. Rough set methods in feature selection via submodular function. Soft Comput 21, 3699–3711 (2017). https://doi.org/10.1007/s00500-015-2024-7
