Abstract
Probabilistic latent semantic analysis (pLSA) can be interpreted as a soft co-clustering model with an intrinsic fuzzification penalty. Tuning the degree of this intrinsic partition fuzziness has been shown to improve partition quality, although the tuned model is then no longer supported by the probabilistic constraints. In this paper, the fuzziness-tuning mechanism is used to improve the partition quality of pLSA under the strict probabilistic constraints. The proposed deterministic annealing approach first initializes a co-cluster partition with a slightly fuzzier penalty weight and then gradually reduces the intrinsic fuzziness until the strict probabilistic constraints are reached. Because fuzzier models are more robust to random initialization, the resulting pLSA partition is demonstrated to be more stable in several numerical experiments.
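The annealing idea in the abstract can be illustrated with a minimal sketch. It is not the update rule derived in the paper: as a stand-in for the fuzzification penalty weight, it assumes a tempered EM scheme in which the E-step posterior is raised to the power 1/lam, so that lam > 1 gives a fuzzier partition and lam = 1 recovers standard pLSA. The function name `annealed_plsa`, its parameters, the linear schedule, and the toy count matrix are all hypothetical.

```python
import numpy as np


def annealed_plsa(counts, n_clusters, lam_start=1.5, n_stages=6,
                  iters_per_stage=30, seed=0, eps=1e-12):
    """Annealed EM sketch for pLSA on a (documents x words) count matrix.

    lam > 1 flattens the E-step posterior (a fuzzier partition); lam == 1
    gives the standard pLSA E-step. All parameter names are hypothetical.
    Assumes every document (row) has at least one nonzero count.
    """
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape

    # Random initialization: P(z|d) is (docs x clusters),
    # P(w|z) is (clusters x words); every row sums to 1.
    p_z_d = rng.dirichlet(np.ones(n_clusters), size=n_docs)
    p_w_z = rng.dirichlet(np.ones(n_words), size=n_clusters)

    # Annealing schedule: start fuzzier than pLSA and end exactly at lam = 1.
    for lam in np.linspace(lam_start, 1.0, n_stages):
        for _ in range(iters_per_stage):
            # E-step: tempered posterior P(z|d,w) proportional to
            # (P(z|d) * P(w|z)) ** (1 / lam), shape (docs, clusters, words).
            joint = p_z_d[:, :, None] * p_w_z[None, :, :]
            tempered = (joint + eps) ** (1.0 / lam)
            post = tempered / tempered.sum(axis=1, keepdims=True)

            # M-step: re-estimate both distributions from expected counts.
            weighted = counts[:, None, :] * post
            p_w_z = weighted.sum(axis=0)
            p_w_z /= p_w_z.sum(axis=1, keepdims=True)
            p_z_d = weighted.sum(axis=2)
            p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z


# Toy co-occurrence matrix with two document/word blocks (made-up data).
toy_counts = np.array([[5, 4, 0, 0],
                       [4, 5, 0, 0],
                       [0, 0, 5, 4],
                       [0, 0, 4, 5]])
p_z_d, p_w_z = annealed_plsa(toy_counts, n_clusters=2)
print(np.round(p_z_d, 3))
```

Because the last stage of the schedule runs at lam = 1, the returned `p_z_d` and `p_w_z` come from ordinary pLSA updates and satisfy the usual sum-to-one constraints. The earlier, fuzzier stages only influence where those final iterations start.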
Acknowledgment
This work was supported in part by JSPS KAKENHI Grant Number JP26330281.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Goshima, T., Honda, K., Ubukata, S., Notsu, A. (2016). Fuzzy DA Clustering-Based Improvement of Probabilistic Latent Semantic Analysis. In: Huynh, VN., Inuiguchi, M., Le, B., Le, B., Denoeux, T. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2016. Lecture Notes in Computer Science, vol 9978. Springer, Cham. https://doi.org/10.1007/978-3-319-49046-5_15
DOI: https://doi.org/10.1007/978-3-319-49046-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49045-8
Online ISBN: 978-3-319-49046-5
eBook Packages: Computer Science (R0)