Fuzzy DA Clustering-Based Improvement of Probabilistic Latent Semantic Analysis

  • Conference paper
Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9978)

Abstract

Probabilistic latent semantic analysis (pLSA) can be interpreted as a soft co-clustering model with an intrinsic fuzzification penalty. Partition quality was previously shown to improve when the degree of intrinsic partition fuzziness is tuned, although the tuned model is then no longer supported by probabilistic constraints. In this paper, the mechanism of intrinsic fuzziness tuning is used to improve the partition quality of pLSA under the strict probabilistic constraints. The proposed deterministic annealing approach first initializes a co-cluster partition with a slightly fuzzier penalty weight and then gradually reduces the intrinsic fuzziness until the model reaches the strict probabilistic constraints. Because fuzzier models are more robust to random initialization, the derived pLSA partition is demonstrated to be more stable in several numerical experiments.
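
The annealing schedule described in the abstract might be sketched as follows. This preview does not give the paper's update rules, so the sketch substitutes a tempered EM for pLSA in which the responsibilities are raised to the power 1/lam: lam > 1 gives a fuzzier partition, and lam = 1 is the ordinary pLSA E-step. The function name `annealed_plsa`, the parameters `lam_start`, `n_stages`, and `iters_per_stage`, and the linear schedule are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def annealed_plsa(counts, n_topics, lam_start=1.5, n_stages=6,
                  iters_per_stage=30, seed=0):
    """Tempered-EM sketch of pLSA with an annealed fuzziness weight.

    counts: (n_docs, n_words) non-negative co-occurrence matrix.
    lam_start: assumed initial fuzziness weight (> 1 means fuzzier).
    Returns P(z), P(d|z), P(w|z) and the final responsibilities P(z|d,w).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialization, each row normalized to a distribution.
    p_z = np.full(n_topics, 1.0 / n_topics)
    p_d_z = rng.random((n_topics, n_docs))
    p_d_z /= p_d_z.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    # Assumed schedule: lam decreases linearly and ends exactly at 1.0,
    # so the last stage runs the standard (untempered) pLSA updates.
    for lam in np.linspace(lam_start, 1.0, n_stages):
        for _ in range(iters_per_stage):
            # E-step: tempered responsibilities, shape (topics, docs, words).
            joint = p_z[:, None, None] * p_d_z[:, :, None] * p_w_z[:, None, :]
            resp = (joint + eps) ** (1.0 / lam)
            resp /= resp.sum(axis=0, keepdims=True)

            # M-step: re-estimate parameters from count-weighted responsibilities.
            weighted = resp * counts[None, :, :]
            p_d_z = weighted.sum(axis=2) + eps
            p_d_z /= p_d_z.sum(axis=1, keepdims=True)
            p_w_z = weighted.sum(axis=1) + eps
            p_w_z /= p_w_z.sum(axis=1, keepdims=True)
            p_z = weighted.sum(axis=(1, 2)) + eps
            p_z /= p_z.sum()

    return p_z, p_d_z, p_w_z, resp
```

Calling `annealed_plsa(counts, n_topics=2)` on a small document-by-word count matrix returns parameters that satisfy the usual pLSA normalization constraints, because the final stage uses lam = 1. Running it with several values of `seed` is one way to probe how sensitive the resulting partition is to initialization.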

Acknowledgment

This work was supported in part by JSPS KAKENHI Grant Number JP26330281.

Author information

Corresponding author

Correspondence to Katsuhiro Honda.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Goshima, T., Honda, K., Ubukata, S., Notsu, A. (2016). Fuzzy DA Clustering-Based Improvement of Probabilistic Latent Semantic Analysis. In: Huynh, VN., Inuiguchi, M., Le, B., Le, B., Denoeux, T. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2016. Lecture Notes in Computer Science, vol 9978. Springer, Cham. https://doi.org/10.1007/978-3-319-49046-5_15

  • DOI: https://doi.org/10.1007/978-3-319-49046-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49045-8

  • Online ISBN: 978-3-319-49046-5

  • eBook Packages: Computer Science (R0)
