Unsupervised model for aspect categorization and implicit aspect extraction

  • Regular Paper
Knowledge and Information Systems

Abstract

As Web 2.0 has evolved, people have become able to share their thoughts and opinions on services and products quickly, and these reviews offer a window into public perception. Aspect-based sentiment analysis (ABSA) takes a set of texts (e.g., product reviews or online reviews) and identifies the opinion target (aspect) within each review. Contemporary ABSA systems for aspect categorization rely predominantly on lexicons or manually labelled seeds that are incorporated into topic models, and they use either handcrafted rules or pre-labelled clues to detect implicit aspects. These requirements tie such systems to a particular domain or language. In this work, we first propose a novel unsupervised probabilistic model, Topic-seeds Latent Dirichlet Allocation (TSLDA), that leverages semantic regularities to articulate explicit aspect categories. Then, based on the articulated categories, a distributed vector representation is used to identify implicit aspects. The experimental results show that our approach outperforms baseline methods on data from different domains with minimal configuration. Specifically, on the RI measure, TSLDA outperformed multiple clustering and topic models with an average score of 0.83 across diverse domain data, and reached roughly 0.89 on the Precision metric for implicit aspect detection.
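
For readers unfamiliar with the RI measure, the sketch below assumes RI denotes the Rand index: the fraction of item pairs on which a predicted clustering and a gold categorization agree. The function name, the toy labels, and the cluster ids are illustrative assumptions and are not taken from the paper or its data sets.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two partitions agree (Rand index)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # zero or one item: the partitions trivially agree
    agree = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as an agreement when both partitions group it
        # together, or both keep it apart.
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy data: gold aspect categories vs. predicted cluster ids
# for six aspect terms.
gold = ["food", "food", "price", "price", "service", "service"]
pred = [0, 0, 1, 1, 1, 2]
score = rand_index(gold, pred)
```

A score of 1.0 means the two groupings are identical up to relabelling of the clusters, so the measure does not require cluster ids to match category names.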

Notes

  1. CBOW is a trainable word-embedding model, chosen to preserve the semantics of the words in LDA and to retrieve the topic seeds using a semantic similarity measure.

  2. https://www.cs.cmu.edu/~jiweil/html/hotel-review.html.

  3. https://alt.qcri.org/semeval2014/task4/.

  4. https://alt.qcri.org/semeval2015/task12/.

  5. https://alt.qcri.org/semeval2016/task5/.

  6. Topic-seeds: a set of terms (seeds) for each category (e.g., ‘price’, ‘money’, and ‘pay’ are the topic-seeds for the aspect category ‘price’).

  7. S-TS: short for Single Topic-seed, meaning a single seed (term) for each category.

  8. avg%: the sum of the averaged percentage values of the evaluation techniques used (e.g., 0.83 in the last row of Table 8 is TSLDA’s RI score over all four data sets).

  9. https://sdgs.un.org/goals.

  10. https://sdgs.un.org/goals/goal1.

  11. https://sdgs.un.org/goals/goal8.

  12. https://sdgs.un.org/goals/goal11.

  13. https://www.nytimes.com/2021/10/16/world/middleeast/iraq-sadr-election.html.
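
Notes 1 and 6 describe topic-seeds as terms retrieved for each category by a semantic similarity measure over word embeddings. The following is a minimal sketch of that idea only, not the TSLDA model itself. It assumes cosine similarity as the measure and uses hand-written three-dimensional toy vectors in place of trained CBOW embeddings; the names `toy_vectors`, `topic_seeds`, `nearest_category`, and `clue` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Assumption: toy embeddings written by hand for illustration. A real
# pipeline would load vectors from a trained embedding model instead.
toy_vectors = {
    "price": [0.9, 0.1, 0.0],
    "money": [0.8, 0.2, 0.1],
    "pay": [0.7, 0.3, 0.0],
    "food": [0.1, 0.9, 0.1],
    "meal": [0.2, 0.8, 0.0],
}


def topic_seeds(category, vectors, k=2):
    """Return the k vocabulary words most similar to the category term."""
    target = vectors[category]
    scored = [(cosine(target, vec), word)
              for word, vec in vectors.items() if word != category]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]


def nearest_category(vector, categories, vectors):
    """Assign a vector to the category whose embedding is closest."""
    return max(categories, key=lambda c: cosine(vector, vectors[c]))


seeds = topic_seeds("price", toy_vectors, k=2)

# Hypothetical vector for an implicit-aspect clue word such as "expensive".
clue = [0.85, 0.15, 0.05]
assigned = nearest_category(clue, ["price", "food"], toy_vectors)
```

With these toy vectors the two nearest neighbours of ‘price’ are ‘money’ and ‘pay’, matching the example in note 6, and the clue vector is assigned to ‘price’. The same nearest-category lookup is one plausible way a distributed vector could link an implicit clue to an articulated category.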

Author information

Corresponding author

Correspondence to Nurul Hashimah Ahamed Hassain Malim.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

AL-Janabi, O.M., Ahamed Hassain Malim, N.H. & Cheah, YN. Unsupervised model for aspect categorization and implicit aspect extraction. Knowl Inf Syst 64, 1625–1651 (2022). https://doi.org/10.1007/s10115-022-01678-5

  • DOI: https://doi.org/10.1007/s10115-022-01678-5

Keywords