DOI: 10.1145/3583780.3614809
Research article

Class-Specific Word Sense Aware Topic Modeling via Soft Orthogonalized Topics

Published: 21 October 2023

Abstract

We propose a word sense aware topic model for document classification based on soft orthogonalized topics. An essential problem for this task is to capture word senses related to classes, i.e., class-specific word senses. Traditional models mainly introduce semantic information from knowledge libraries for word sense discovery. However, this information may not align with the classification targets, because these targets are often subjective and task-related. We instead aim to model class-specific word senses in topic space. The challenge is to optimize the class separability of the senses, i.e., to obtain sense vectors with (a) high intra-class and (b) low inter-class similarities. Most existing models predefine specific topics for each class to specify the class-specific sense vectors; we call these hard orthogonalization based methods. Such methods can hardly achieve both (a) and (b), since they assume the conditional independence of topics from classes and inevitably lose topic information. To address this problem, we propose soft orthogonalization of topics. Specifically, we retain all the topics and introduce a group of class-specific weights for each word to control how much each topic dimension contributes to class separability. In addition, we detect highly class-specific words in each document and use them to guide sense estimation. Experiments on two standard datasets show that our proposal outperforms state-of-the-art models in terms of accuracy of sense estimation, document classification, and topic modeling. Furthermore, joint learning experiments with the pre-trained language model BERT show that our model complements BERT better than competing topic models in most cases.
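To make the contrast between hard and soft orthogonalization concrete, below is a minimal NumPy sketch, not the authors' implementation: all arrays are random stand-ins for parameters that the actual model learns jointly with the topics. It shows how hard orthogonalization discards topic mass by restricting each class to its own topic block, while soft orthogonalization keeps all topic dimensions and only reweights them per class.

```python
import numpy as np

rng = np.random.default_rng(0)
K, C = 20, 4  # number of topics, number of classes (illustrative sizes)

# Topic-space sense vector of one word occurrence, e.g. its posterior
# topic distribution under an LDA-style model (random stand-in here).
sense = rng.dirichlet(np.ones(K))

# Hard orthogonalization: each class owns a disjoint block of topics,
# so a class-specific sense vector keeps only "its" topic dimensions.
blocks = np.array_split(np.arange(K), C)
hard = np.zeros((C, K))
for c, idx in enumerate(blocks):
    hard[c, idx] = sense[idx]  # topic mass outside the block is discarded

# Soft orthogonalization: every class sees all K topics, but reweights
# them with class-specific, per-word weights (random stand-ins here).
W = rng.random((C, K))                   # class-specific topic weights
soft = W * sense                         # Hadamard product, shape (C, K)
soft /= soft.sum(axis=1, keepdims=True)  # renormalize to distributions

# Hard orthogonalization throws away most of each sense vector ...
print("topic mass kept per class (hard):", hard.sum(axis=1).round(3))
# ... while the soft variant retains every topic dimension, merely
# emphasizing the ones that matter for separating the classes.
print("nonzero dimensions per class (soft):", (soft > 0).sum(axis=1))
```

In the actual model the class-specific weights are inferred so that the resulting sense vectors have high intra-class and low inter-class similarity; the sketch only illustrates the shape of the computation.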

Supplementary Material

MP4 File (full0124-video.mp4)
In our research, we introduce a word sense aware topic model tailored for document classification, emphasizing class-specific word senses. Traditional methods often rely on general knowledge libraries for word sense discovery, which may not always align with specific classification goals. Our solution? Soft orthogonalization for topics. Unlike existing models that predefine topics for each class, our approach retains all topics and assigns class-specific weights to each word, optimizing for high intra-class and low inter-class similarity of the resulting sense vectors. We also leverage highly class-specific words in documents for better sense estimation. Our tests on two benchmark datasets reveal superior performance in sense estimation, document classification, and topic modeling. When combined with the well-known BERT model, our method consistently shows enhanced complementarity. Watch this video to learn more about our proposed approach.
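As a companion to the weighting sketch above, here is one plausible way to score how class-specific a word is, assuming access to per-word class posteriors P(class | word) estimated from labeled training documents. The entropy-based score below is an illustrative stand-in, not necessarily the paper's exact detection criterion: words whose occurrences concentrate in one class score near 1 and could then be used to guide sense estimation for the surrounding document.

```python
import numpy as np

def class_specificity(p_class_given_word: np.ndarray) -> float:
    """Map a word's class posterior to a specificity score in [0, 1].

    A word concentrated in one class has a peaked, low-entropy
    distribution; we normalize its entropy by the maximum (uniform)
    entropy and invert, so 1 means fully class-specific.
    """
    p = p_class_given_word / p_class_given_word.sum()
    entropy = -np.sum(p * np.log(p + 1e-12))  # epsilon avoids log(0)
    return float(1.0 - entropy / np.log(len(p)))

# Hypothetical class posteriors over 4 classes:
print(class_specificity(np.array([0.91, 0.03, 0.03, 0.03])))  # high: ~0.71
print(class_specificity(np.array([0.25, 0.25, 0.25, 0.25])))  # ~0: uninformative
```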


Cited By

  • (2025) Reducing Data Volume in News Topic Classification: Deep Learning Framework and Dataset. IEEE Open Journal of the Computer Society 6, 153--164. DOI: 10.1109/OJCS.2024.3519747

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023, 5508 pages
ISBN: 9798400701245
DOI: 10.1145/3583780

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. document classification
  2. document representation
  3. graphical model
  4. topic model
  5. word sense disambiguation

Funding Sources

  • Ministry of Science and Technology of the People's Republic of China
  • Zhejiang Laboratory

Conference

CIKM '23

Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

