Abstract
Machine learning algorithms are compared in terms of "generalization," that is, how accurately they perform on unseen data. In many applications the test data are available at training time, though obviously without targets. This raises a central question about generalization: could the test feature vectors themselves be deployed during the training phase? The paradox is that test data carry no targets. Transductive learning is a form of learning that uses the test data, in addition to the training data, during training. With a decision tree as the base model, we observe that many errors occur on samples whose feature values lie very close to the splitting values at the tree's nodes. This observation is the key to resolving the above paradox in this study. This paper proposes a combination of a decision tree with a novel Fisher discriminant analysis (FDA), in which the modified FDA embeds a mechanism that uses the test data as a shortcut to generalization. The combination is further arranged in a boosting system that maintains weights on both the training and the test data. Experiments on 21 datasets from various fields decisively demonstrate the promising performance of the proposed method.
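The motivating observation above can be checked empirically: for a fitted decision tree, misclassified test points tend to sit closer to the tree's split thresholds than correctly classified ones. The sketch below illustrates this with scikit-learn and a standard benchmark dataset; both are assumptions for illustration only, as the paper does not specify an implementation, and the per-feature normalized distance is a hypothetical measure, not the paper's method.

```python
# Sketch of the motivating observation: errors concentrate on samples whose
# feature values lie near the decision tree's splitting values.
# Assumes scikit-learn; dataset and distance measure are illustrative choices.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

# Collect (feature, threshold) pairs from the tree's internal nodes;
# in scikit-learn, leaves are marked by children_left == -1.
t = tree.tree_
internal = t.children_left != -1
splits = list(zip(t.feature[internal], t.threshold[internal]))

# Normalize per feature so distances on different scales are comparable.
scale = X_tr.std(axis=0)

def min_split_distance(x):
    """Normalized distance from sample x to its nearest split threshold."""
    return min(abs(x[f] - thr) / scale[f] for f, thr in splits)

dists = np.array([min_split_distance(x) for x in X_te])
errors = tree.predict(X_te) != y_te

# The observation predicts misclassified points sit closer to the splits,
# i.e. a smaller mean distance for the error group.
print("mean distance, correct:", dists[~errors].mean())
print("mean distance, errors: ", dists[errors].mean())
```

On typical splits of this dataset the mean distance for the error group comes out smaller, matching the observation, though the gap depends on the dataset and tree depth.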
Sheikholharam Mashhadi, P. Boosted Test-FDA: a transductive boosting method. Pattern Anal Applic 22, 115–131 (2019). https://doi.org/10.1007/s10044-018-0710-7