Abstract
Nonparametric learning methods such as Parzen Windows have been applied to a variety of density estimation and classification problems. In this chapter we derive a “simplest” regularization algorithm and establish its close relationship with Parzen Windows. We also derive a finite-sample error bound for the “simplest” regularization algorithm. Because of the close relationship between the “simplest” algorithm and Parzen Windows, this analysis provides interesting insights into Parzen Windows from the viewpoint of learning theory. Our work is a realization of the design principle of dynamic data-driven applications systems (DDDAS) introduced in Chapter 1. Finally, we report empirical results on a number of real data sets comparing the performance of the “simplest” regularization algorithm (Parzen Windows) with that of other methods, such as nearest neighbor classifiers and the regularization algorithm. These results corroborate our theoretical analysis well.
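The chapter’s derivation is in the full text, but the classical Parzen-window rule it builds on is standard: the density estimate is p̂(x) = (1/(n hᵈ)) Σᵢ K((x − xᵢ)/h), and the corresponding binary classifier takes the sign of the kernel-weighted label average f(x) = (1/n) Σᵢ yᵢ K((x − xᵢ)/h). The Python sketch below is a minimal illustration of that standard rule, assuming ±1 labels and a Gaussian window; it is not the authors’ exact “simplest” regularization algorithm, only the textbook classifier the abstract relates it to.

```python
import numpy as np

def parzen_classify(X_train, y_train, X_test, h=1.0):
    """Binary Parzen-window classifier with a Gaussian window.

    Assumes labels in {+1, -1}; the decision rule is the sign of
    f(x) = (1/n) * sum_i y_i * K((x - x_i) / h).
    """
    # Pairwise squared Euclidean distances, shape (n_test, n_train)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * h ** 2))          # Gaussian window of width h
    f = (K * y_train[None, :]).mean(axis=1)   # kernel-weighted label average
    return np.sign(f)

# Toy usage: two Gaussian blobs labeled +1 and -1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])
print(parzen_classify(X, y, np.array([[-2.0, -2.0], [2.0, 2.0]]), h=0.5))
# Expected: [ 1. -1.]  (each test point follows its nearby blob)
```

Shrinking the window width h makes the rule behave more like a nearest neighbor classifier, while a large h averages over many training points; this is the kind of bias-variance trade-off that a finite-sample analysis must account for.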
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this chapter
Peng, J., Zhang, P. (2018). Parzen Windows: Simplest Regularization Algorithm. In: Blasch, E., Ravela, S., Aved, A. (eds) Handbook of Dynamic Data Driven Applications Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-95504-9_29
Print ISBN: 978-3-319-95503-2
Online ISBN: 978-3-319-95504-9