Abstract
This paper investigates supervised latent modeling for extracting topic hierarchies from data. Supervision takes the form of expert information about document-topic correspondence. To exploit this information, we add to the log-likelihood function a regularization term that penalizes the difference between the predicted model and the expert-given model, and we estimate the parameters with a stochastic EM-based algorithm. The proposed method is used to construct a topic hierarchy over the proceedings of the European Conference on Operational Research and helps to automate the abstract submission system.
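As a rough illustration of the idea of a penalized log-likelihood, the sketch below evaluates such an objective for a PLSA-style factorization. The abstract does not give the form of the model or of the penalty, so everything here is an assumption: the factorization p(w|d) = sum_t theta[d,t] * phi[t,w], the squared-difference penalty, the weight `lam`, and all function and variable names are hypothetical. The stochastic EM estimation step is not shown.

```python
import numpy as np

def regularized_log_likelihood(counts, phi, theta, expert, lam):
    """Penalized log-likelihood of a PLSA-style topic model (illustrative only).

    counts : (D, W) array of word counts per document
    phi    : (T, W) array, each row a topic's distribution over words
    theta  : (D, T) array, each row a document's distribution over topics
    expert : (D, T) array, expert-given document-topic correspondence
    lam    : non-negative penalty weight (hypothetical parameter)
    """
    # Predicted word distribution for every document: p(w|d) = sum_t theta[d,t] * phi[t,w]
    p_dw = theta @ phi
    # Small constant guards against log(0); an assumption, not part of any stated method
    log_lik = np.sum(counts * np.log(p_dw + 1e-12))
    # Assumed penalty: squared difference between predicted and expert assignments
    penalty = np.sum((theta - expert) ** 2)
    return log_lik - lam * penalty

# Toy data: 2 documents, 3 words, 2 topics
counts = np.array([[2, 0, 1], [0, 3, 1]], dtype=float)
phi = np.array([[0.6, 0.1, 0.3], [0.1, 0.7, 0.2]])
expert = np.array([[1.0, 0.0], [0.0, 1.0]])
theta_uniform = np.array([[0.5, 0.5], [0.5, 0.5]])

# Agreeing with the expert costs nothing; disagreeing lowers the objective
agree = regularized_log_likelihood(counts, phi, expert, expert, lam=2.0)
disagree = regularized_log_likelihood(counts, phi, theta_uniform, expert, lam=2.0)
```

In this toy setup the penalty vanishes when the predicted document-topic matrix equals the expert matrix, and grows with `lam` otherwise, which is the trade-off the regularization term is meant to control.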
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kuznetsov, M., Clausel, M., Amini, MR., Gaussier, E., Strijov, V. (2015). Supervised Topic Classification for Modeling a Hierarchical Conference Structure. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_11
DOI: https://doi.org/10.1007/978-3-319-26532-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26531-5
Online ISBN: 978-3-319-26532-2
eBook Packages: Computer Science, Computer Science (R0)