Abstract
Feature engineering sets the upper limit on the performance of a machine learning algorithm, and feature selection is its most critical step. However, high-dimensional, multi-granularity feature data give rise to the curse of dimensionality, which makes effective feature selection very difficult. To address this, we propose a feature selection method based on Convolutional Neural Networks and Random Forest (FSCNNRF). The model has two parts: a Feature Selection Convolutional Neural Network (FSCNN) and a Random Forest (RF). It selects a more effective feature set by using FSCNN for dimensionality reduction and RF for feature selection. First, FSCNN reduces the dimensionality of the high-dimensional, multi-granularity feature data so that each feature becomes a single-granularity feature. Then RF selects the valid features. Experiments show that the model selects features more effectively on high-dimensional, multi-granularity datasets and improves the performance of machine learning algorithms.
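The two-stage idea described above (reduce each multi-granularity feature to a single value, then rank the reduced features with a tree ensemble) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' FSCNNRF implementation. A fixed 1-D convolution kernel with average pooling stands in for the learned FSCNN, and bagged decision stumps with random feature subsets stand in for a full random forest. All function names, the kernel values, and the restriction to binary 0/1 labels are assumptions made for illustration.

```python
import random

def conv_pool(values, kernel=(0.25, 0.5, 0.25)):
    """Collapse one multi-granularity feature (a list of numbers) to a scalar.
    Assumption: a fixed kernel replaces the learned FSCNN filters."""
    k = len(kernel)
    if len(values) < k:
        return sum(values) / len(values)
    # 1-D "valid" convolution followed by global average pooling
    conv = [sum(w * v for w, v in zip(kernel, values[i:i + k]))
            for i in range(len(values) - k + 1)]
    return sum(conv) / len(conv)

def gini(labels):
    """Gini impurity for binary labels in {0, 1}."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(X, y, feature_ids):
    """Return (feature index, impurity decrease) of the best single split."""
    n, parent = len(y), gini(y)
    best = (None, 0.0)
    for j in feature_ids:
        for t in sorted(set(row[j] for row in X)):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best[1]:
                best = (j, gain)
    return best

def stump_forest_importance(X, y, n_trees=50, seed=0):
    """Simplified stand-in for random forest importance: each 'tree' is a
    stump fit on a bootstrap sample using a random subset of features."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = max(1, int(d ** 0.5))          # features tried per stump
    importance = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        j, gain = best_stump(Xb, yb, rng.sample(range(d), m))
        if j is not None:
            importance[j] += gain
    total = sum(importance) or 1.0
    return [v / total for v in importance]

def select_features(raw_samples, y, top_k):
    """Stage 1: reduce every feature to one value. Stage 2: rank and keep top_k."""
    X = [[conv_pool(f) for f in sample] for sample in raw_samples]
    imp = stump_forest_importance(X, y)
    ranked = sorted(range(len(imp)), key=lambda j: imp[j], reverse=True)
    return ranked[:top_k], imp
```

Here `raw_samples` is a list of samples, each a list of features, and each feature is itself a list of numbers at some granularity. `select_features(raw_samples, y, top_k=2)` returns the indices of the two highest-ranked features together with the normalized importance scores. In practice the reduction stage would be trained and the ranking stage would use full-depth trees.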
Acknowledgment
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61105040, 61203284), the Beijing Natural Science Foundation (Grant No. 4133085), the General Program of the Science and Technology Development Project of the Beijing Municipal Education Commission (Grant No. KM201810005005), the Beijing Municipal Commission of Education Young Top-Notch Personnel Plan, and the Beijing University of Technology Science Foundation (Grant No. 006000543115502).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Sun, Y., Liu, L., Chen, S., Hou, L. (2020). A High-Dimensional and Multi-granularity Feature Selection Method Based on CNN and RF. In: Liu, Y., Wang, L., Zhao, L., Yu, Z. (eds) Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2019. Advances in Intelligent Systems and Computing, vol 1074. Springer, Cham. https://doi.org/10.1007/978-3-030-32456-8_34
DOI: https://doi.org/10.1007/978-3-030-32456-8_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32455-1
Online ISBN: 978-3-030-32456-8
eBook Packages: Intelligent Technologies and Robotics (R0)