Abstract
The deep forest model, a random forest (RF) ensemble approach and an alternative to deep neural networks (DNNs), achieves performance highly competitive with DNNs in many classification tasks. However, the deep forest model may encounter overfitting and characteristic dispersion issues when processing small-scale, class-imbalanced or high-dimensional data. Therefore, this paper proposes a Weighted Cascade Deep Forest framework, called WCDForest. In WCDForest, an equal multi-grained scanning module is used to scan each feature equally. Meanwhile, the framework adopts a class vector weighting module to emphasize the performance of each forest and each sliding window through weights. Furthermore, this study proposes a feature enhancement module that reduces the information loss in the first few cascade layers to improve classification accuracy. Systematic comparison experiments on 18 widely used public datasets demonstrate that the proposed model outperforms the state-of-the-art model. In particular, WCDForest improves the accuracy, precision, recall and F1-score by an average of 5.47%, 7.04%, 8.23% and 8.94%, respectively.
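To make the idea of class vector weighting concrete, the following is a minimal sketch and not the WCDForest implementation. The abstract does not state the weighting formula, so the sketch assumes that each forest's weight is its validation accuracy normalised over all forests in a layer. The function names `accuracy_weights` and `weighted_class_vectors` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def accuracy_weights(accuracies: Sequence[float]) -> List[float]:
    """Normalise per-forest validation accuracies so the weights sum to 1.

    Assumption: weight proportional to accuracy; the paper may use a
    different formula.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one forest must have positive accuracy")
    return [acc / total for acc in accuracies]


def weighted_class_vectors(
    class_vectors: Sequence[Sequence[float]],
    accuracies: Sequence[float],
) -> List[float]:
    """Scale each forest's class vector by its weight and concatenate them."""
    if len(class_vectors) != len(accuracies):
        raise ValueError("one accuracy per forest is required")
    weights = accuracy_weights(accuracies)
    augmented: List[float] = []
    for weight, vector in zip(weights, class_vectors):
        augmented.extend(weight * p for p in vector)
    return augmented


# Two hypothetical forests on a three-class problem (illustrative values).
vectors = [[0.7, 0.2, 0.1], [0.4, 0.4, 0.2]]
accs = [0.9, 0.6]
features = weighted_class_vectors(vectors, accs)
```

Under this assumption, the class vector of the more accurate forest contributes more strongly to the features passed to the next cascade layer. The same scheme could be applied per sliding window in the scanning stage.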
Data availability
The UCI datasets generated during and/or analysed during the current study are available in the UCI Machine Learning repository, https://archive.ics.uci.edu/ml/index.php. Please contact the corresponding author for all other data supporting the findings of this study.
Acknowledgements
This work is sponsored by the National Key R&D Program of China under Grant No. 2022YFC3303200, the National Natural Science Foundation of China under Grant No. 62072214, the Science and Technology Planning Project of Guangzhou under Grant No. 202103000036, the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2021B1515120048, and the Industry-University-Research Collaboration Project of Zhuhai under Grant No. ZH22017001210048PWC. The corresponding authors of this paper are Yuhui Deng and Lijuan Lu.
Ethics declarations
Conflict of interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Ethics approval
No studies with human participants or animals were performed for this article.
Informed Consent
This manuscript does not require a statement of informed consent.
About this article
Cite this article
Huang, J., Chen, P., Lu, L. et al. WCDForest: a weighted cascade deep forest model toward the classification tasks. Appl Intell 53, 29169–29182 (2023). https://doi.org/10.1007/s10489-023-04794-z
DOI: https://doi.org/10.1007/s10489-023-04794-z