Abstract
Feature selection is crucial for dimension reduction. Dozens of approaches employ the area under the ROC curve (AUC) to evaluate features and have proven attractive for identifying discriminative targets. However, these approaches generally mishandle feature complementarity, i.e., the ability of features to jointly discriminate classes. A recent approach addressed this issue by evaluating feature complementarity through the difference between the neighbors of each instance along different feature dimensions. This local-learning-based approach introduces a distinctive way to determine how a feature is complementarily discriminative given another. Nevertheless, neighbor information is usually sensitive to noise. Furthermore, evaluating only one-sided information from nearest misses inevitably neglects the impact of nearest hits on feature complementarity. In this paper, we propose to integrate all-sided local-learning-based complementarity into an AUC-based approach, dubbed ANNC, which evaluates pairwise features by scrutinizing their comprehensive misclassification information in terms of both k-nearest misses and k-nearest hits. This strategy helps capture complementary features that collaborate with each other to achieve remarkable recognition performance. Extensive experiments on openly available benchmarks demonstrate the effectiveness of the new approach under various metrics.
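The paper's formulas are not reproduced on this page, so the following Python sketch only illustrates the general idea described in the abstract: a rank-based AUC estimate per feature, combined with a ReliefF-style margin over k-nearest hits and misses to reward a pair of features for complementing each other. The function names (`feature_auc`, `local_margins`, `pairwise_score`), the L1 neighborhood, and the weight `alpha` are illustrative assumptions, not the authors' actual ANNC definitions.

```python
import numpy as np
from scipy.stats import rankdata

def feature_auc(x, y):
    """AUC of a single feature, estimated via the Mann-Whitney U statistic."""
    pos, neg = x[y == 1], x[y == 0]
    ranks = rankdata(np.concatenate([pos, neg]))  # average ranks handle ties
    u = ranks[: len(pos)].sum() - len(pos) * (len(pos) + 1) / 2
    return u / (len(pos) * len(neg))

def local_margins(X, y, k=3):
    """Per-instance, per-feature margin: mean distance to the k nearest
    misses minus mean distance to the k nearest hits (ReliefF-style)."""
    n, d = X.shape
    margins = np.zeros((n, d))
    for i in range(n):
        diff = np.abs(X - X[i])          # per-feature distances to every instance
        dist = diff.sum(axis=1)          # overall L1 distance (an assumption)
        dist[i] = np.inf                 # exclude the instance itself
        hits = np.flatnonzero(y == y[i])
        misses = np.flatnonzero(y != y[i])
        nh = hits[np.argsort(dist[hits])[:k]]      # k-nearest hits
        nm = misses[np.argsort(dist[misses])[:k]]  # k-nearest misses
        margins[i] = diff[nm].mean(axis=0) - diff[nh].mean(axis=0)
    return margins

def pairwise_score(X, y, f, g, k=3, alpha=0.5):
    """Illustrative pairwise score: the two individual AUCs plus a bonus
    where feature g separates the neighbors that feature f does not."""
    m = local_margins(X, y, k)
    complementarity = np.maximum(m[:, g] - m[:, f], 0).mean()
    return feature_auc(X[:, f], y) + feature_auc(X[:, g], y) + alpha * complementarity

# Toy usage: feature 0 is informative, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = np.column_stack([y + rng.normal(0, 0.5, 100), rng.normal(0, 1, 100)])
print(pairwise_score(X, y, 0, 1))
```

A greedy forward search over such pairwise scores would then select features whose neighborhoods complement those already chosen, rather than ranking each feature's AUC in isolation.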
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant No. 61772288 and by the Science Foundation of Tianjin, China, under Grant No. 18JCZDJC30900.
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Jiang, X., Wang, J., Wei, J., Ruan, J., Yu, G. (2018). ANNC: AUC-based feature selection by maximizing nearest neighbor complementarity. In: Geng, X., Kang, B.H. (eds.) PRICAI 2018: Trends in Artificial Intelligence. Lecture Notes in Computer Science, vol. 11012. Springer, Cham. https://doi.org/10.1007/978-3-319-97304-3_59