Abstract
As Web 2.0 has evolved, people have become able to quickly convey their thoughts and opinions on a wide range of services and items. Analyzing the resulting reviews makes it possible to examine the public perceptions they express. Aspect-based sentiment analysis (ABSA) takes a set of texts (e.g., product reviews or online reviews) and identifies the opinion target (aspect) within each review. Contemporary ABSA systems, such as those for aspect categorization, rely predominantly on lexicons or manually labelled seeds that are incorporated into topic models, and they use either handcrafted rules or pre-labelled clues to perform implicit aspect detection. These resources are tied to a particular domain or language, which makes the resulting systems domain-dependent. In this work, we first propose a novel unsupervised probabilistic model, Topic-seeds Latent Dirichlet Allocation (TSLDA), that leverages semantic regularities for the articulation of explicit aspect categories. Then, based on the articulated categories, a distributed vector is used to identify implicit aspects. The experimental results show that our approach outperforms baseline methods on data from different domains with minimal configuration. Specifically, using the RI measure, the proposed TSLDA outperformed multiple clustering and topic models by an average of 0.83% across diverse domain data, and by roughly 0.89% using the Precision metric for implicit aspect detection.







Notes
CBoW (continuous bag-of-words) is a trainable word embedding model, chosen to preserve the semantics of the words in LDA and to retrieve the topic seeds based on a semantic similarity measure.
Note: Topic-seeds are the set of terms/seeds for each category (e.g., ‘price’, ‘money’, and ‘pay’ are the topic-seeds for the aspect category ‘price’).
Note: S-TS is short for Single Topic-seed, meaning a single seed or term for each category.
Note: avg% is the sum of the averaged percentage values of the evaluation techniques used (e.g., 0.83 in the last row of Table 8 is the performance of TSLDA’s RI score on all four data sets).
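Note: The RI score mentioned above is the Rand index, which compares a predicted grouping with a gold grouping by counting the pairs of items on which the two agree (both place the pair in the same group, or both place it in different groups) and dividing by the total number of pairs. The sketch below is only an illustration of that definition, not the evaluation code used in the article; the function name `rand_index` and the toy labels are assumptions made for this example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Rand index: fraction of item pairs on which two groupings agree.

    Hypothetical helper written for illustration only.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label sequences must have the same length")
    if len(labels_true) < 2:
        raise ValueError("at least two items are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as an agreement when both groupings treat it alike.
        if same_in_true == same_in_pred:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


# Toy data (made up): gold aspect categories vs. predicted cluster ids.
gold = ["price", "price", "food", "food", "service"]
pred = [0, 0, 1, 2, 2]
score = rand_index(gold, pred)
print(f"RI = {score:.2f}")
```

In the toy data, 8 of the 10 pairs are treated alike by both groupings, so the score is 0.8. The measure depends only on which items share a label, not on the label names themselves, which is why cluster ids can be compared directly with category names.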
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
AL-Janabi, O.M., Ahamed Hassain Malim, N.H. & Cheah, YN. Unsupervised model for aspect categorization and implicit aspect extraction. Knowl Inf Syst 64, 1625–1651 (2022). https://doi.org/10.1007/s10115-022-01678-5
DOI: https://doi.org/10.1007/s10115-022-01678-5