
LJST: A Semi-supervised Joint Sentiment-Topic Model for Short Texts

  • Original Research
  • Published in SN Computer Science

Abstract

Several methods for the simultaneous detection of sentiment and topics have been proposed to obtain subjective information such as opinions, attitudes and feelings expressed in texts. Most of these techniques fail to produce the desired results for short texts. In this paper, we propose LJST, a labeled joint sentiment-topic model designed particularly for short texts. It uses a probabilistic framework based on latent Dirichlet allocation. LJST is semi-supervised: it predicts the sentiment values for unlabeled texts in the presence of partially labeled texts with sentiment values. To address the sparsity problem in short texts, we modify LJST and introduce Bi-LJST, which uses bi-terms (all possible pairs of words in a document) in place of unigrams for learning the topics, by directly generating word co-occurrence patterns in each text and expressing the topics in terms of these patterns. In short, we propose a semi-supervised approach for learning a joint sentiment-topic model for short texts by incorporating bi-terms. Extensive experiments on three real-world datasets show that our methods perform consistently better than three baselines in terms of document-level and topic-level sentiment prediction and topic discovery: LJST with bi-terms outperforms the best baseline, producing 12% lower RMSE for document-level sentiment prediction and a 6% higher F1 score for topic-sentiment prediction.


Notes

  1. The parameter \(\epsilon \) is used to forcibly assign non-zero values to labels.

  2. Note that we do not deal with neutral sentiment in this work, mainly because non-subjective feedback is seldom of any importance in opinion mining.

  3. Similar results are obtained for other datasets which we omit due to lack of space.


Funding

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Author information


Corresponding author

Correspondence to Ayan Sengupta.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Detailed Formulation of Model Inference for LJST

The joint probability of the words, topics and sentiment labels under the model can be decomposed as follows:

$$\begin{aligned} p\left( \mathbf {w}, \mathbf {z}, \mathbf {l}, \varphi , \pi , \theta , \pmb {\alpha }, \beta , \pmb {\gamma }\right) =&\prod \limits _{j=1}^{T} \prod \limits _{k=1}^{S} p\left( \varphi _{j,k};\beta \right) \times \prod \limits _{d=1}^{D} p\left( \pi _{d,j};\pmb {\gamma }\right) p\left( \theta _{d};\pmb {\alpha }\right) \times \nonumber \\&\prod \limits _{i=1}^{N_{d}} p\left( z_{d,i}|\theta _{d}\right) p\left( l_{d,j,i}|\pi _{d,j}\right) p\left( w_{d,j,k,i}|\varphi _{z_{d,i},l_{d,j,i}}\right) . \end{aligned}$$
(14)
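Read generatively, Eq. (14) corresponds to the following process. The Python sketch below uses illustrative names and shapes and omits the semi-supervised label prior (e.g., the \(\epsilon \) adjustment mentioned in Note 1): it draws a word distribution \(\varphi \) per (topic, sentiment) pair, a topic distribution \(\theta \) per document, a sentiment distribution \(\pi \) per document and topic, and then samples each word.

```python
import numpy as np

def generate_corpus(D, T, S, V, N_d, alpha, beta, gamma, rng):
    """Schematic generative process behind Eq. (14).
    alpha: length-T prior, beta: scalar word prior, gamma: D x S per-document sentiment prior."""
    phi = rng.dirichlet([beta] * V, size=(T, S))          # word dist. per (topic, sentiment)
    docs = []
    for d in range(D):
        theta = rng.dirichlet(alpha)                      # topic dist. of document d
        pi = np.stack([rng.dirichlet(gamma[d]) for _ in range(T)])  # sentiment dist. per topic
        words, topics, sents = [], [], []
        for _ in range(N_d[d]):
            j = rng.choice(T, p=theta)                    # topic z_{d,i}
            k = rng.choice(S, p=pi[j])                    # sentiment l_{d,j,i}
            w = rng.choice(V, p=phi[j, k])                # word w_{d,j,k,i}
            words.append(w); topics.append(j); sents.append(k)
        docs.append((words, topics, sents))
    return phi, docs
```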

To use collapsed Gibbs sampling, we integrate out \(\varphi \), \(\pi \) and \(\theta \) in Eq. (14):

$$\begin{aligned} p\left( \mathbf {w}, \mathbf {z}, \mathbf {l}, \pmb {\alpha }, \beta , \pmb {\gamma }\right) =&\int _{\varphi }\int _{\pi }\int _{\theta } p\left( \mathbf {w}, \mathbf {z}, \mathbf {l}, \varphi , \pi , \theta , \pmb {\alpha }, \beta , \pmb {\gamma }\right) \,d\varphi \,d\pi \,d\theta \nonumber \\ =&\int _{\varphi } \prod \limits _{j} \prod \limits _{k} p\left( \varphi _{j,k};\beta \right) \prod \limits _{d} \prod \limits _{i} p\left( w_{d,j,k,i}|\varphi _{z_{d,i},l_{d,j,i}}\right) \,d\varphi \ \times \nonumber \\&\int _{\pi } \prod \limits _{d} \prod \limits _{j} p\left( \pi _{d,j};\pmb {\gamma }\right) \prod \limits _{i} p\left( l_{d,j,i}|\pi _{d,j}\right) \,d\pi \ \times \nonumber \\&\int _{\theta } \prod \limits _{d} \prod \limits _{i} p\left( z_{d,i}|\theta _{d}\right) p\left( \theta _{d};\pmb {\alpha }\right) \,d\theta . \end{aligned}$$
(15)

Since \(\varphi \), \(\pi \) and \(\theta \) are independent variables, we can split the three terms on the RHS of Eq. (15) and calculate them separately. Further, the individual terms \(\varphi _{j,k}\), \(\pi _{d,j}\) and \(\theta _{d}\) are independent, so we can interchange the product and integration in each of the three terms.

For each document d, we replace the term \(p(\theta _{d};\pmb {\alpha })\) with the corresponding Dirichlet distribution and \(p(z_{d,i}|\theta _{d})\) with the multinomial distribution to get:

$$\begin{aligned} \int _{\theta _{d}} p\left( \theta _{d};\pmb {\alpha }\right) \prod \limits _{i} p\left( z_{d,i}|\theta _{d}\right) \,d\theta _{d} = \int _{\theta _{d}} {\left( \frac{\varGamma \left( \sum \nolimits _{j=1}^{T} \alpha _j\right) }{\prod \nolimits _{j=1}^{T}\varGamma \left( \alpha _j\right) } \right) } \prod \limits _{j} \theta _{d,j}^{N_{d,j}} \prod \limits _{j} \theta _{d,j}^{\alpha _{j}-1} \,d\theta _{d}. \end{aligned}$$
(16)

Using the normalization property of the Dirichlet distribution, we get:

$$\begin{aligned} \int _{\theta _{d}} {\left( \frac{\varGamma \left( \sum \nolimits _{j=1}^{T} \alpha _j + \sum \nolimits _{j=1}^{T} N_{d,j}\right) }{\prod \nolimits _{j=1}^{T}\varGamma \left( \alpha _j+N_{d,j}\right) } \right) } \prod \limits _{j} \theta _{d,j}^{N_{d,j}+\alpha _{j}-1} \,d\theta _{d}&= 1. \end{aligned}$$

This leads to:

$$\begin{aligned} p(z) =&\prod \limits _{d} p\left( z_{d}\right) \nonumber \\ =&\prod \limits _{d} \int _{\theta _{d}} p\left( \theta _{d};\pmb {\alpha }\right) \prod \limits _{i} p\left( z_{d,i}|\theta _{d}\right) \,d\theta _{d} \nonumber \\ =&{\left( \frac{\varGamma \left( \sum \nolimits _{j=1}^{T} \alpha _j\right) }{\prod \nolimits _{j=1}^{T}\varGamma \left( \alpha _j\right) } \right) }^D \cdot \prod \limits _{d} \frac{\prod \nolimits _{j}\varGamma (N_{d,j} + \alpha _{j})}{\varGamma \left( N_{d} + \sum \nolimits _{j} \alpha _{j}\right) }. \end{aligned}$$
(17)

Similarly, we calculate \(p(l) = \prod \nolimits _{d} \prod \nolimits _{j} p(l_{d,j})\) using the formula below:

$$\begin{aligned}&\prod \limits _{d} \prod \limits _{j} \int _{\pi _{d,j}} p\left( \pi _{d,j};\pmb {\gamma _{d}}\right) \prod \limits _{i} p\left( l_{d,j,i}|\pi _{d,j}\right) \,d\pi _{d,j} \nonumber \\&\quad = \prod \limits _{d} \prod \limits _{j} \int _{\pi _{d,j}} \prod \limits _{k} \pi _{d,j,k}^{\gamma _{d,k}-1} {\left( \frac{\varGamma \left( \sum \nolimits _{k=1}^{S} \gamma _{d,k}\right) }{\prod \nolimits _{k=1}^{S}\varGamma (\gamma _{d,k})} \right) } \prod \limits _{k} \pi _{d,j,k}^{N_{d,j,k}} \,d\pi _{d,j} \nonumber \\&\quad = {\left( \frac{\varGamma \left( \sum \nolimits _{k=1}^{S}\gamma _{d,k}\right) }{\prod \nolimits _{k=1}^{S}\varGamma \left( \gamma _{d,k}\right) }\right) }^{D \times T} \cdot \prod \limits _{d} \prod \limits _{j} \frac{\prod \limits _{k} \varGamma \left( N_{d,j,k}+\gamma _{d,k}\right) }{\varGamma \left( N_{d,j} + \sum \nolimits _{k}\gamma _{d,k}\right) }. \end{aligned}$$
(18)

Finally, we obtain \(p(w) = \prod \nolimits _{j} \prod \nolimits _{k} p(w_{j,k})\) by replacing \(\prod \nolimits _{d=1}^{D} \prod \nolimits _{i=1}^{N_{d}}\) with \(\prod \nolimits _{i=1}^{V}\) in the formula

$$\begin{aligned}&\prod \limits _{j} \prod \limits _{k} \int _{\varphi _{j,k}} p\left( \varphi _{j,k};\beta \right) \prod \limits _{d} \prod \limits _{i} p\left( w_{d,j,k,i}|\varphi _{j,k}\right) \,d\varphi _{j,k} \nonumber \\&\quad = \prod \limits _{j} \prod \limits _{k} \int _{\varphi _{j,k}} p\left( \varphi _{j,k};\beta \right) \prod \limits _{i} p\left( w_{j,k,i}|\varphi _{j,k}\right) \,d\varphi _{j,k} \nonumber \\&\quad = \prod \limits _{j} \prod \limits _{k} \int _{\varphi _{j,k}} \prod \limits _{i} \varphi _{j,k,i}^{\beta -1} {\left( \frac{\varGamma (\sum \nolimits _{i=1}^{V} \beta )}{\prod \nolimits _{i=1}^{V}\varGamma (\beta )} \right) } \prod \limits _{i} \varphi _{j,k,i}^{N_{j,k,i}} \,d\varphi _{j,k} \nonumber \\&\quad = {\left( \frac{\varGamma (V\beta )}{{\varGamma (\beta )}^{V}} \right) }^{T \times S} \cdot \prod \limits _{j} \prod \limits _{k} \frac{\prod \nolimits _{i} \varGamma \left( N_{j,k,i}+\beta \right) }{\varGamma \left( N_{j,k}+V \beta \right) }. \end{aligned}$$
(19)

The goal of Gibbs sampling is to calculate \(p(\mathbf {z}, \mathbf {l} \,|\, w_t, \pmb {\alpha }, \beta , \pmb {\gamma })\) instead of Eq. (15), using the approximation \(p(z_t, l_t \,|\, w_t,\mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}}, \pmb {\alpha }, \beta , \pmb {\gamma })\) for each word \(w_t\) in document d, where t is the index of the word in document d.

$$\begin{aligned}&p\left( z_{t}, l_{t} \,|\, w_t,\mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}}, \pmb {\alpha }, \beta , \pmb {\gamma }\right) \nonumber \\&\quad \propto \ p\left( z_{t}, l_{t} , \mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}} \,|\, w_t, \pmb {\alpha }, \beta , \pmb {\gamma }\right) \nonumber \\&\quad = p\left( z_{t}, l_{t}\,|\, w_t, \pmb {\alpha }, \beta , \pmb {\gamma }\right) \cdot p\left( \mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}} \,|\, w_t, \pmb {\alpha }, \beta , \pmb {\gamma }\right) . \end{aligned}$$
(20)

Substituting the values from Eqs. (17), (18) and (19) into the corresponding terms of Eq. (20), we get

$$\begin{aligned}&p\left( z_{t}, l_{t} \,|\, w_t,\mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}}, \pmb {\alpha }, \beta , \pmb {\gamma }\right) \propto \nonumber \\&\quad \prod \limits _{j} \prod \limits _{k} \frac{\varGamma \left( N_{j,k,w_t}+\beta \right) }{\varGamma \left( N_{j,k}+V \beta \right) } \cdot \prod \limits _{j} \frac{\prod \nolimits _{k} \varGamma \left( N_{d,j,k}+\gamma _{d,k}\right) }{\varGamma \left( N_{d,j} + \sum \limits _{k}\gamma _{d,k}\right) } \cdot \frac{\prod \nolimits _{j}\varGamma \left( N_{d,j} + \alpha _{j}\right) }{\varGamma \left( N_{d} + \sum \nolimits _{j} \alpha _{j}\right) }, \end{aligned}$$
(21)

and

$$\begin{aligned}&p\left( z_{t} = j, l_{t} = k \,|\, w_t,\mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}}, \pmb {\alpha }, \beta , \pmb {\gamma }\right) \propto \nonumber \\&\quad \frac{\varGamma \left( N_{j,k,w_t}+\beta \right) }{\varGamma \left( N_{j,k}+V \beta \right) } \cdot \frac{\varGamma \left( N_{d,j,k}+\gamma _{d,k}\right) }{\varGamma \left( N_{d,j} + \sum \nolimits _{k}\gamma _{d,k}\right) } \cdot \frac{\varGamma \left( N_{d,j} + \alpha _{j}\right) }{\varGamma \left( N_{d} + \sum \nolimits _{j} \alpha _{j}\right) }. \end{aligned}$$
(22)

For further simplification, we use the identity \(\varGamma (x+1) = x\cdot \varGamma (x)\) to get

$$\begin{aligned} \varGamma \left( N_{j,k,w_t}+\beta \right) = \varGamma \left( N_{j,k,w_t}^{-t}+\beta +1\right) \propto N_{j,k,w_t}^{-t}+\beta . \end{aligned}$$

Applying the same simplification to the other terms \(N_{j,k}, N_{d,j,k}, N_{d,j}\) and \(N_{d}\), we get

$$\begin{aligned}&p\left( z_t = j, l_t = k \,|\, \mathbf {w},\mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}}, \pmb {\alpha }, \beta , \gamma \right) \propto \nonumber \\&\quad \frac{N_{j,k,w_t}^{-t} + \beta }{N_{j,k}^{-t} + V \beta } \cdot \frac{N_{d,j,k}^{-t} + \gamma _{d,k}}{N_{d,j}^{-t} + \sum \nolimits _{k}\gamma _{d,k}} \cdot \frac{N_{d,j}^{-t} + \alpha _j}{N_{d}^{-t} + \sum \nolimits _{j} \alpha _j}. \end{aligned}$$
(23)
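In an implementation, Eq. (23) is one step of a collapsed Gibbs sweep. The following is a minimal Python sketch; the count arrays, their names, and the `counts` container are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def sample_topic_sentiment(t, d, w, z, l, counts, alpha, beta, gamma, V, T, S, rng):
    """One collapsed Gibbs step for word w at position t of document d (Eq. 23).

    Assumed count arrays held in `counts`:
      N_jkw[j, k, v]  - occurrences of vocabulary word v with topic j, sentiment k
      N_jk[j, k]      - total words with topic j and sentiment k
      N_djk[d, j, k]  - words in document d with topic j and sentiment k
      N_dj[d, j]      - words in document d with topic j
      N_d[d]          - words in document d
    alpha: length-T array, gamma[d]: length-S array.
    """
    j_old, k_old = z[d][t], l[d][t]
    # Exclude the current assignment (the "-t" counts in Eq. 23).
    counts.N_jkw[j_old, k_old, w] -= 1
    counts.N_jk[j_old, k_old] -= 1
    counts.N_djk[d, j_old, k_old] -= 1
    counts.N_dj[d, j_old] -= 1
    counts.N_d[d] -= 1

    # Unnormalized posterior over all (topic, sentiment) pairs, Eq. (23).
    word_term = (counts.N_jkw[:, :, w] + beta) / (counts.N_jk + V * beta)                    # T x S
    sent_term = (counts.N_djk[d] + gamma[d]) / (counts.N_dj[d][:, None] + gamma[d].sum())    # T x S
    topic_term = (counts.N_dj[d] + alpha) / (counts.N_d[d] + alpha.sum())                    # T
    p = word_term * sent_term * topic_term[:, None]
    p /= p.sum()

    # Draw the new (topic, sentiment) pair and restore the counts.
    idx = rng.choice(T * S, p=p.ravel())
    j_new, k_new = divmod(idx, S)
    counts.N_jkw[j_new, k_new, w] += 1
    counts.N_jk[j_new, k_new] += 1
    counts.N_djk[d, j_new, k_new] += 1
    counts.N_dj[d, j_new] += 1
    counts.N_d[d] += 1
    z[d][t], l[d][t] = j_new, k_new
```

A full sweep applies this step to every word position of every document; after burn-in, the accumulated counts yield the estimates of \(\theta \), \(\pi \) and \(\varphi \).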

Detailed Formulation of Model Inference for Bi-LJST

There are some changes in the probability distribution for Bi-LJST. Let \(\mathcal{B}\) and \(B_d\) denote the vocabulary of bi-terms and the number of bi-terms in document d, respectively. Further, assume that a bi-term is represented as \(b_t = (w_p,w_q)\). The joint probability of bi-terms, topics and sentiment labels can be written as:

$$\begin{aligned} p\left( b_t, \mathbf {z}, \mathbf {l}\right)&= p\left( b_t \,|\, \mathbf {l}, \mathbf {z}\right) p(\mathbf {l}| \mathbf {z}) p(\mathbf {z}). \end{aligned}$$
(24)

As the terms \(p(\mathbf {l}| \mathbf {z})\) and \(p(\mathbf {z})\) do not depend on the bi-term \(b_{t}\), we can directly make use of Eqs. (17) and (18). Further, in the generative process of Bi-LJST, the two words within a bi-term are conditionally independent, i.e., \(p(b_t \,|\, \mathbf {l}, \mathbf {z}) = p(w_p \,|\, \mathbf {l}, \mathbf {z}) \cdot p(w_q \,|\, \mathbf {l}, \mathbf {z})\). This leads to the topic-sentiment assignment corresponding to bi-term \(b_t\) as follows:

$$\begin{aligned}&p\left( z_t = j, l_t = k \,|\, b_t,\mathbf {z}^{-\mathbf{t}}, \mathbf {l}^{-\mathbf{t}}, \pmb {\alpha }, \beta , \pmb {\gamma }\right) \propto \nonumber \\&\quad \frac{\left( N_{j,k,w_p}^{-t} + \beta \right) \cdot \left( N_{j,k,w_q}^{-t} + \beta \right) }{\left( N_{j,k}^{-t} + V \beta \right) ^2} \cdot \frac{N_{d,j,k}^{-t} + \gamma _{d,k}}{N_{d,j}^{-t} + \sum \nolimits _{k}\gamma _{d,k}} \cdot \frac{N_{d,j}^{-t} + \alpha _j}{N_{d}^{-t} + \sum \nolimits _{j} \alpha _j}, \end{aligned}$$
(25)

where \(p_1\) and \(p_2\) are the vocabulary indices of \(w_p\) and \(w_q\), respectively.
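Compared with the unigram sampler sketched above, only the word factor of the update changes: a bi-term contributes two word counts over a squared denominator, as in Eq. (25). A minimal sketch of bi-term extraction and this modified factor, reusing the assumed count-array names from the previous sketch:

```python
from itertools import combinations

def biterms(doc_word_ids):
    """All unordered word pairs of a (short) document, the bi-terms used by Bi-LJST."""
    return list(combinations(doc_word_ids, 2))

def biterm_word_factor(counts, wp, wq, beta, V):
    """Word factor of Eq. (25) for bi-term (w_p, w_q); returns a T x S array."""
    return ((counts.N_jkw[:, :, wp] + beta) * (counts.N_jkw[:, :, wq] + beta)
            / (counts.N_jk + V * beta) ** 2)
```

The sentiment and topic factors and the categorical draw over the \(T \times S\) pairs remain as in the unigram sketch; a bi-term simply replaces the single word \(w_t\) as the sampling unit.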


About this article


Cite this article

Sengupta, A., Roy, S. & Ranjan, G. LJST: A Semi-supervised Joint Sentiment-Topic Model for Short Texts. SN COMPUT. SCI. 2, 256 (2021). https://doi.org/10.1007/s42979-021-00649-x

