Abstract
In this manuscript we derive a non-parametric version of the Fisher kernel. The result follows from Non-negative Matrix Factorization (NMF) with the Kullback-Leibler divergence: by imposing suitable normalization conditions, the obtained factorization can be interpreted as a mixture of densities, with no assumptions on the distribution of the parameters. The equivalence between the Kullback-Leibler divergence and the log-likelihood then leads to kernelization by simply taking partial derivatives. The estimates provided by this kernel retain the consistency of the Fisher kernel.
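The construction in the abstract can be illustrated with a short sketch: fit an NMF under the generalized KL divergence with the classic Lee-Seung multiplicative updates, then take the gradient of the log-likelihood with respect to the mixing weights as the Fisher-score map. This is a minimal illustration under our own naming (`nmf_kl`, `fisher_score` are not the paper's identifiers), not the authors' implementation.

```python
import numpy as np

def nmf_kl(X, k, n_iter=300, seed=0):
    # NMF under the generalized KL (I-) divergence via the classic
    # Lee-Seung multiplicative updates. Illustrative sketch only.
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + 0.1
    H = rng.random((k, n)) + 0.1
    for _ in range(n_iter):
        WH = W @ H
        W *= ((X / WH) @ H.T) / H.sum(axis=1)
        WH = W @ H
        H *= (W.T @ (X / WH)) / W.sum(axis=0)[:, None]
    return W, H

def fisher_score(x, W, h):
    # Gradient of the log-likelihood sum_i x_i * log((W h)_i) with
    # respect to the mixing weights h: (W^T (x / (W h)))_a.
    return W.T @ (x / (W @ h))
```

A Fisher kernel between two samples would then be an inner product of such score vectors, possibly weighted by the inverse Fisher information; that refinement is omitted here.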
Notes
- 1.
In the NMF framework it is more convenient to write the matrix \(\mathbf {X}=(x_{ij})\) (with subscripts i and j as defined in (1)) as
$$\begin{aligned} [\mathbf {X}]_{ij}=\mathbf {X} \end{aligned}$$This notation is convenient when some index is allowed to vary (typically k in (4)). If the matrix operations do not depend on the varying index, eliding the subscripts simplifies the notation (as happens for diagonal matrices). For the remaining objects we use the usual conventions: matrices are denoted by bold capital letters and vectors by bold lowercase letters.
- 2.
Several authors refer to the KL divergence as
$$\begin{aligned} D_{KL}(\mathbf {Y}\,\Vert \,\mathbf {W\,H}) = \sum _{ij}\Big ([\mathbf {Y}]_{ij}\log \frac{[\mathbf {Y}]_{ij}}{[\mathbf {W\,H}]_{ij}}-[\mathbf {Y}]_{ij}+[\mathbf {W\,H}]_{ij}\Big ) \end{aligned}$$which we prefer to call the I-divergence or generalized KL divergence, following [6, p. 105], reserving the name KL divergence for the mean information (given by formula (6)), in keeping with the original definition of S. Kullback and R.A. Leibler [20].
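In code, the I-divergence of this note is the entrywise KL term plus the linear correction terms; when both arguments have equal total mass the corrections cancel and it reduces to the mean-information form. A small sketch (assuming dense NumPy arrays; the function name is ours):

```python
import numpy as np

def i_divergence(Y, Z):
    # Generalized KL (I-) divergence D(Y || Z) between nonnegative
    # matrices, matching the displayed formula term by term.
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    mask = Y > 0  # the convention 0 * log(0/z) = 0
    return float(np.sum(Y[mask] * np.log(Y[mask] / Z[mask]))
                 - Y.sum() + Z.sum())
```

As a Bregman divergence it is nonnegative and vanishes exactly when the two arguments coincide.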
References
Amari, S.: Information Geometry and its Applications. Springer, Tokyo (2016). https://doi.org/10.1007/978-4-431-55978-8
Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 68(3), 337–404 (1950)
Bredensteiner, E.J., Bennett, K.P.: Multicategory classification by support vector machines. In: Pang, J.S. (ed.) Computational Optimization, pp. 53–79. Springer, Boston (1999). https://doi.org/10.1007/978-1-4615-5197-3_5
Chappelier, J.-C., Eckard, E.: PLSI: the true Fisher kernel and beyond. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds.) ECML PKDD 2009. LNCS (LNAI), vol. 5781, pp. 195–210. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04180-8_30
Chen, J.C.: The nonnegative rank factorizations of nonnegative matrices. Linear Algebra Appl. 62, 207–217 (1984)
Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations. Wiley, Hoboken (2009)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
Ding, C., Li, T., Peng, W.: On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. Comput. Statist. Data Anal. 52(8), 3913–3927 (2008)
Ding, C., He, X., Simon, H.D.: On the equivalence of nonnegative matrix factorization and spectral clustering. In: Proceedings of the 2005 SIAM International Conference on Data Mining, pp. 606–610 (2005). https://doi.org/10.1137/1.9781611972757.70
Dua, D., Graff, C.: UCI machine learning repository (2017). School of Information and Computer Sciences, University of California, Irvine. http://archive.ics.uci.edu/ml
Durgesh, K.S., Lekha, B.: Data classification using support vector machine. J. Theor. Appl. Inf. Technol. 12(1), 1–7 (2010)
Elkan, C.: Deriving TF-IDF as a Fisher kernel. In: Consens, M., Navarro, G. (eds.) SPIRE 2005. LNCS, vol. 3772, pp. 295–300. Springer, Heidelberg (2005). https://doi.org/10.1007/11575832_33
Figuera, P., García Bringas, P.: On the probabilistic latent semantic analysis generalization as the singular value decomposition probabilistic image. J. Stat. Theory Appl. 19, 286–296 (2020). https://doi.org/10.2991/jsta.d.200605.001
Franke, B., et al.: Statistical inference, learning and models in big data. Int. Stat. Rev. 84(3), 371–389 (2016)
Hofmann, T.: Learning the similarity of documents: an information-geometric approach to document retrieval and categorization. In: Advances in Neural Information Processing Systems, pp. 914–920 (2000)
Hofmann, T., Schölkopf, B., Smola, A.J.: Kernel methods in machine learning. Ann. Stat. 36, 1171–1220 (2008)
Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42(1–2), 177–196 (2001)
Hsu, C.W., Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002)
Jaakkola, T.S., Haussler, D., et al.: Exploiting generative models in discriminative classifiers. In: Advances in Neural Information Processing Systems, pp. 487–493 (1999)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Latecki, L.J., Sobel, M., Lakaemper, R.: New EM derived from Kullback-Leibler divergence. In: KDD '06: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 267–276 (2006)
Lee, H., Cichocki, A., Choi, S.: Kernel nonnegative matrix factorization for spectral EEG feature extraction. Neurocomputing 72(13–15), 3182–3190 (2009)
Martens, J.: New insights and perspectives on the natural gradient method. J. Mach. Learn. Res. 21, 1–76 (2020)
Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., Leisch, F.: e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien (2021). R package version 1.7-6. https://CRAN.R-project.org/package=e1071
Naik, G.R.: Non-negative Matrix Factorization Techniques. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-48331-2
Salcedo-Sanz, S., Rojo-Álvarez, J.L., Martínez-Ramón, M., Camps-Valls, G.: Support vector machines in engineering: an overview. Wiley Interdiscip. Rev. Data Mining Knowl. Discov. 4(3), 234–267 (2014)
Tsuda, K., Akaho, S., Kawanabe, M., Müller, K.R.: Asymptotic properties of the Fisher kernel. Neural Comput. 16(1), 115–137 (2004)
Zhang, D., Zhou, Z.-H., Chen, S.: Non-negative matrix factorization on kernels. In: Yang, Q., Webb, G. (eds.) PRICAI 2006. LNCS (LNAI), vol. 4099, pp. 404–412. Springer, Heidelberg (2006). https://doi.org/10.1007/978-3-540-36668-3_44
Zhang, X.D.: Matrix Analysis and Applications. Cambridge University Press, Cambridge (2017)
© 2021 Springer Nature Switzerland AG
Cite this paper
Figuera, P., Bringas, P.G. (2021). A Non-parametric Fisher Kernel. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science(), vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_38
Print ISBN: 978-3-030-86270-1
Online ISBN: 978-3-030-86271-8