A logistic non-negative matrix factorization approach to binary data sets

Multidimensional Systems and Signal Processing

Abstract

We present an analysis of binary data sets that employs Bernoulli statistics and a partially non-negative factorization of the associated matrix of log-odds. The model places several constraints on the factorization process, rendering the estimated basis system strictly non-negative or even binary. The proposed model thereby sits between logistic PCA and binary NMF. Using suitable toy data sets, we show that different variants of the proposed model yield reasonable results and estimate with good precision the underlying basis system, which forms a new and often more compact representation of the observations. An application of the method to the USPS data set illustrates the performance of the model variants and shows good reconstruction quality even with a low-rank binary basis set.
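
The general idea of factorizing a log-odds matrix under a non-negativity constraint can be sketched as follows. This is an illustrative sketch only and not the authors' algorithm, which this page does not describe. It assumes the log-odds are modeled as a product W H, with unconstrained weights W and a basis H clipped to be non-negative, fitted by alternating projected gradient ascent on the Bernoulli log-likelihood. The function names, the initialization, and the step-size rule are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    # Logistic function written via tanh, which avoids overflow for large |z|.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def bernoulli_loglik(X, theta):
    # Bernoulli log-likelihood for log-odds theta:
    # sum_ij [ x_ij * theta_ij - log(1 + exp(theta_ij)) ]
    return float(np.sum(X * theta - np.logaddexp(0.0, theta)))


def logistic_nmf(X, rank, n_iter=200, seed=0, eps=1e-12):
    """Hypothetical helper: fit log-odds = W @ H with H >= 0 (assumed setup)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = 0.1 * rng.standard_normal((n, rank))  # weights: unconstrained
    H = 0.1 * rng.random((rank, m))           # basis: kept non-negative
    history = [bernoulli_loglik(X, W @ H)]
    for _ in range(n_iter):
        # Gradient ascent in W; the step is 1/L, where L = ||H||_2^2 / 4
        # bounds the curvature of the log-likelihood in W.
        step_w = 4.0 / (np.linalg.norm(H, 2) ** 2 + eps)
        W = W + step_w * (X - sigmoid(W @ H)) @ H.T
        # Gradient ascent in H, followed by projection onto H >= 0.
        step_h = 4.0 / (np.linalg.norm(W, 2) ** 2 + eps)
        H = np.maximum(H + step_h * W.T @ (X - sigmoid(W @ H)), 0.0)
        history.append(bernoulli_loglik(X, W @ H))
    return W, H, history


# Toy data drawn from a random binary basis (synthetic, for illustration only).
rng = np.random.default_rng(1)
H_true = (rng.random((3, 20)) < 0.3).astype(float)
W_true = 2.0 * rng.standard_normal((60, 3))
X = (rng.random((60, 20)) < sigmoid(W_true @ H_true)).astype(float)

W, H, history = logistic_nmf(X, rank=3)
print("log-likelihood at start and end:", history[0], history[-1])
print("smallest basis entry:", H.min())
```

Only the basis is constrained here, which is one way to read "partially non-negative". A binary basis, as mentioned in the abstract, would need an additional constraint or thresholding step that this sketch does not attempt.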

Acknowledgments

Support by the DAAD-FCT, the BFHZ-CCUFB and the GENIL-SPR project at CITIC-UGR is gratefully acknowledged.

Author information

Corresponding author

Correspondence to A. M. Tomé.

About this article

Cite this article

Tomé, A.M., Schachtner, R., Vigneron, V. et al. A logistic non-negative matrix factorization approach to binary data sets. Multidim Syst Sign Process 26, 125–143 (2015). https://doi.org/10.1007/s11045-013-0240-9

  • DOI: https://doi.org/10.1007/s11045-013-0240-9
