Abstract
The deep forest (DF) was recently proposed as a deep learning alternative to deep neural networks. Each cascade layer of the DF contains a set of random forests (RFs), each with a large number of decision trees, some of which are highly redundant and perform poorly. To avoid the negative impact of such trees, this paper optimizes the RFs in each cascade layer of the DF, yielding a pruned deep forest (PDF) with higher accuracy and a smaller ensemble size. To this end, a new ordering-based ensemble pruning method is proposed based on feature vectorization and quantum walks. The method considers the accuracy and the diversity of base classifiers simultaneously, providing an integrated criterion for ordering the base classifiers in the ensemble system. The effectiveness of the proposed method is verified by experiments and discussion.
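The general idea of ordering-based ensemble pruning as described in the abstract — ranking base classifiers by a criterion that combines accuracy with diversity, then keeping only a prefix of the ranking — can be sketched as follows. This is a minimal illustrative sketch, not the paper's method: the greedy ordering, the simple pairwise-disagreement diversity measure, and the `alpha` weight are all assumptions introduced here for illustration (the paper's actual criterion is built on feature vectorization and quantum walks).

```python
def accuracy(preds, labels):
    """Fraction of samples a classifier predicts correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def disagreement(a, b):
    """A simple pairwise diversity measure: fraction of samples on
    which two classifiers' predictions differ (illustrative choice)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def order_classifiers(all_preds, labels, alpha=0.7):
    """Greedily order base classifiers by a combined score:
    alpha * accuracy + (1 - alpha) * mean disagreement with the
    classifiers already selected. Returns the ordering as indices."""
    remaining = list(range(len(all_preds)))
    # Seed the ordering with the single most accurate classifier.
    first = max(remaining, key=lambda i: accuracy(all_preds[i], labels))
    order = [first]
    remaining.remove(first)
    while remaining:
        def score(i):
            div = sum(disagreement(all_preds[i], all_preds[j])
                      for j in order) / len(order)
            return alpha * accuracy(all_preds[i], labels) + (1 - alpha) * div
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

# Toy example: four base classifiers' predictions on six binary-labeled samples.
labels = [1, 0, 1, 1, 0, 0]
preds = [
    [1, 0, 1, 1, 0, 0],  # perfect accuracy
    [1, 0, 1, 1, 0, 1],  # accurate, but similar to the first
    [0, 1, 0, 1, 0, 0],  # less accurate, more diverse
    [1, 1, 1, 0, 1, 1],  # poor
]
order = order_classifiers(preds, labels)
pruned = order[:2]  # pruning = keeping only the top-ranked sub-ensemble
```

In the PDF setting, such an ordering would be applied to the trees (or forests) inside each cascade layer, so that redundant or poorly performing members are dropped before the layer's output is passed on.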
References
Aharonov Y, Davidovich L, Zagury N (1993) Quantum random walks. Phys Rev A 48:1687–1690. https://doi.org/10.1103/PhysRevA.48.1687
Aharonov D, Ambainis A, Kempe J (2001) Quantum walks on graphs. In: The thirty-third annual ACM symposium on theory of computing, pp 50–59. https://doi.org/10.1145/380752.380758
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140. https://doi.org/10.1007/BF00058655
Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
Cao X, Wen L, Ge Y, Zhao J, Jiao L (2019) Rotation-based deep forest for hyperspectral imagery classification. IEEE Geosci Remote S 16:1–5. https://doi.org/10.1109/LGRS.2019.2892117
Chen ZH, Li LP, He Z, Zhou JR, Li Y, Wong L (2019) An improved deep forest model for predicting self-interacting proteins from protein sequence using wavelet transformation. Front Genet 10:1–10. https://doi.org/10.3389/fgene.2019.00090
Dai Q, Ye R, Liu Z (2017) Considering diversity and accuracy simultaneously for ensemble pruning. Appl Soft Comput 58:75–91. https://doi.org/10.1016/j.asoc.2017.04.058
Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Drucker H (1997) Improving regressors using boosting techniques. In: The fourteenth international conference on machine learning, pp 107–115
Emms D, Wilson RC, Hancock ER (2009) Graph matching using the interference of continuous-time quantum walks. Pattern Recogn 42:985–1002. https://doi.org/10.1016/j.patcog.2008.09.001
Gao W, Zhou Z-H (2013) On the doubt about margin explanation of boosting. Artif Intell 203:1–18. https://doi.org/10.1016/j.artint.2013.07.002
Guo H, Liu H, Li R, Wu C, Guo Y, Xu M (2018) Margin and diversity based ordering ensemble pruning. Neurocomputing 275:237–246. https://doi.org/10.1016/j.neucom.2017.06.052
Loyola-Gonzalez O, Medina-Perez MA, Martinez-Trinidad JF, Carrasco-Ochoa JA, Monroy R, Garcia-Borroto M (2017) PBC4cip: a new contrast pattern-based classifier for class imbalance problems. Knowl-Based Syst 115:100–109. https://doi.org/10.1016/j.knosys.2016.10.018
Nguyen HV, Bai L (2011) Cosine similarity metric learning for face verification. In: The 10th Asian conference on computer vision, pp 709–720. https://doi.org/10.1007/978-3-642-19309-5_55
Pang M, Ting K-M, Zhao P, Zhou Z-H (2018) Improving deep forest by confidence screening. In: The 18th IEEE international conference on data mining, pp 1194–1199. https://doi.org/10.1109/ICDM.2018.00158
Schapire R, Freund Y, Bartlett P, Lee W (1998) Boosting the margin: a new explanation for the effectiveness of voting methods. Ann Stat 26:1651–1686
Schouten TE, Broek EL (2014) Fast exact Euclidean distance (FEED): a new class of adaptable distance transforms. IEEE Trans Pattern Anal Mach Intell 36:2159–2172. https://doi.org/10.1109/TPAMI.2014.25
Utkin L, Ryabinin MA (2018) A Siamese deep forest. Knowl Based Syst 139:13–22. https://doi.org/10.1016/j.knosys.2017.10.006
Utkin L, Meldo A, Konstantinov A (2018) Deep Forest as a framework for a new class of machine-learning models. Natl Sci Rev 6:186–187. https://doi.org/10.1093/nsr/nwy151
Utkin LV, Kovalev MS, Meldo AA (2019) A deep forest classifier with weights of class probability distribution subsets. Knowl Based Syst 173:15–27. https://doi.org/10.1016/j.knosys.2019.02.022
Venegas-Andraca SE (2012) Quantum walks: a comprehensive review. Quant Inf Process 11:1015–1106. https://doi.org/10.1007/s11128-012-0432-5
Wilcoxon F (1945) Individual comparisons by ranking methods. Biometrics 1:80–83. https://doi.org/10.2307/3001968
Wu C, Berry M, Shivakumar S, McLarty J (1995) Neural networks for full-scale protein sequence classification: sequence encoding with singular value decomposition. Mach Learn 21:177–193. https://doi.org/10.1007/BF00993384
Xu J, Wu P, Chen Y, Meng Q, Dawood H, Khan MM (2019) A novel deep flexible neural forest model for classification of cancer subtypes based on gene expression data. IEEE Access 7:22086–22095. https://doi.org/10.1109/access.2019.2898723
Yang F, Xu Q, Li B, Ji Y (2018) Ship detection from thermal remote sensing imagery through region-based deep forest. IEEE Geosci Remote S 15:449–453. https://doi.org/10.1109/lgrs.2018.2793960
Zhang Y-L, Zhou J, Feng J, Zhou Z-H (2019a) Distributed deep forest and its application to automatic detection of cash-out fraud. ACM Trans Intell Syst Technol. https://doi.org/10.1145/3342241
Zhang Z, Chen D, Wang J, Bai L (2019b) Quantum-based subgraph convolutional neural networks. Pattern Recogn 88:38–49. https://doi.org/10.1016/j.patcog.2018.11.002
Zhou Z-H, Feng J (2017) Deep forest: towards an alternative to deep neural networks. In: The 26th international joint conference on artificial intelligence, pp 3553–3559. https://doi.org/10.24963/ijcai.2017/497
Zhou Z-H, Feng J (2019) Deep forest. Natl Sci Rev 6:74–86. https://doi.org/10.1093/nsr/nwy108
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61772023) and the Natural Science Foundation of Fujian Province (No. 2016J01320).
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Gao, J., Liu, K., Wang, B. et al. Improving deep forest by ensemble pruning based on feature vectorization and quantum walks. Soft Comput 25, 2057–2068 (2021). https://doi.org/10.1007/s00500-020-05274-z