Abstract
Existing classification algorithms focus on vectorial data in Euclidean space or on representations by means of positive semi-definite kernel matrices. Many real-world data, such as biological sequences, are not vectorial, are often non-Euclidean, and are given only in the form of (dis-)similarities between examples, calling for efficient and interpretable models. Vectorial embeddings or transformations to obtain a valid kernel are limited, and current dissimilarity classifiers often lead to dense, complex models that are hard for domain experts to interpret. They also fail to provide information about the confidence of a classification. In this paper we propose a prototype-based conformal classifier for dissimilarity data. It builds on a prototype dissimilarity learner and is extended by the conformal prediction methodology. It (i) can deal with dissimilarity data characterized by an arbitrary symmetric dissimilarity matrix, (ii) offers intuitive classification in terms of sparse prototypical class representatives, (iii) achieves state-of-the-art classification results supported by a confidence measure, and (iv) adjusts its model complexity automatically. In experiments on dissimilarity data we investigate its effectiveness with respect to accuracy and model complexity in comparison to different state-of-the-art classifiers.
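To make the conformal prediction side of the abstract concrete, the following is a minimal sketch of inductive (split) conformal classification with a nearest-prototype nonconformity score over precomputed dissimilarities. The ratio-of-distances score and all function names are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def nonconformity(dissims_to_protos, proto_labels, y):
    # Illustrative score (assumption): dissimilarity to the closest
    # prototype of class y, divided by the dissimilarity to the
    # closest prototype of any other class. Small = conforms to y.
    d_same = dissims_to_protos[proto_labels == y].min()
    d_other = dissims_to_protos[proto_labels != y].min()
    return d_same / d_other

def conformal_predict(cal_scores, test_dissims, proto_labels, classes, eps):
    # cal_scores: nonconformity scores of a held-out calibration set.
    # Returns the prediction region: all labels whose p-value exceeds
    # the significance level eps (valid with probability >= 1 - eps).
    region = []
    for y in classes:
        a = nonconformity(test_dissims, proto_labels, y)
        # p-value: fraction of calibration scores at least as large,
        # counting the test example itself (the "+1" terms).
        p = (np.sum(cal_scores >= a) + 1) / (len(cal_scores) + 1)
        if p > eps:
            region.append(y)
    return region
```

A singleton region signals a confident prediction; an empty or multi-label region flags the example as uncertain at the chosen significance level.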
Cite this article
Schleif, FM., Zhu, X. & Hammer, B. Sparse conformal prediction for dissimilarity data. Ann Math Artif Intell 74, 95–116 (2015). https://doi.org/10.1007/s10472-014-9402-1