Unsupervised model for aspect categorization and implicit aspect extraction

AL-Janabi, Omar Mustafa; Ahamed Hassain Malim, Nurul Hashimah; Cheah, Yu-N

doi:10.1007/s10115-022-01678-5

Regular Paper
Published: 25 April 2022

Knowledge and Information Systems, volume 64, pages 1625–1651 (2022)


Abstract

As Web 2.0 has evolved, people have become able to share their thoughts and opinions on services and products quickly, and the resulting reviews make it possible to study public perception. Aspect-based sentiment analysis (ABSA) takes a set of texts (e.g., product reviews or other online reviews) and identifies the opinion target (aspect) within each review. Contemporary ABSA systems for aspect categorization rely predominantly on lexicons or manually labelled seeds that are incorporated into topic models, and they perform implicit aspect detection with either handcrafted rules or pre-labelled clues. These dependencies tie such systems to a particular domain or language. In this work, we first propose a novel unsupervised probabilistic model, Topic-seeds Latent Dirichlet Allocation (TSLDA), that leverages semantic regularities to articulate explicit aspect categories. Then, based on the articulated categories, a distributed vector is used to identify implicit aspects. The experimental results show that our approach outperforms baseline methods on data from different domains with minimal configuration. Specifically, on the RI measure, our proposed TSLDA outperformed multiple clustering and topic models by an average of 0.83% across diverse domain data, and by roughly 0.89% on the Precision metric for implicit aspect detection.
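The RI measure mentioned in the abstract can be illustrated with a small sketch, assuming RI refers to the Rand index, which scores a clustering by the fraction of item pairs on which it agrees with a gold-standard grouping. The function name `rand_index` and the toy labels below are hypothetical and are not taken from the paper's experiments.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two groupings agree.

    A pair counts as an agreement when both groupings put the two
    items together, or both groupings keep them apart.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label lists must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # fewer than two items: nothing to disagree on
    agreements = 0
    for i, j in pairs:
        same_in_gold = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_gold == same_in_pred:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical toy data: gold aspect categories for five aspect terms
# and the cluster ids that some model assigned to the same terms.
gold = ["price", "price", "food", "food", "service"]
pred = [0, 0, 1, 2, 2]
print(rand_index(gold, pred))  # 0.8 (8 of the 10 pairs agree)
```

Cluster ids do not need to match category names, because the measure only looks at whether each pair of items is grouped together or apart.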

Notes

CBoW is a trainable word embedding model chosen to preserve the semantics of the words in LDA and to retrieve the topic-seeds based on a semantic similarity measure.
https://www.cs.cmu.edu/~jiweil/html/hotel-review.html.
https://alt.qcri.org/semeval2014/task4/.
https://alt.qcri.org/semeval2015/task12/.
https://alt.qcri.org/semeval2016/task5/.
Note: Topic-seeds are the set of terms (seeds) for each category (e.g., ‘price’, ‘money’, and ‘pay’ are the topic-seeds for the aspect category ‘price’).
Note: S-TS is short for Single Topic-seed, meaning a single seed (term) for each category.
Note: avg% is the sum of the averaged percentage values of the evaluation measures used (e.g., 0.83 in the last row of Table 8 is TSLDA’s RI score across all four data sets).
https://sdgs.un.org/goals.
https://sdgs.un.org/goals/goal1.
https://sdgs.un.org/goals/goal8.
https://sdgs.un.org/goals/goal11.
https://www.nytimes.com/2021/10/16/world/middleeast/iraq-sadr-election.html.
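
The notes above describe topic-seeds as terms retrieved for each category through a semantic similarity measure over word embeddings. The following is a minimal sketch of that idea, not the authors' implementation: it assumes cosine similarity as the measure, and the tiny hand-written vectors in `TOY_VECTORS` stand in for trained CBoW embeddings. It also sketches one plausible way to assign an implicit aspect, by comparing a sentence's averaged vector with the centroid of each category's seeds. All names and numbers are hypothetical.

```python
import math

# Hypothetical 3-dimensional "embeddings"; a real system would load
# vectors from a trained CBoW model instead.
TOY_VECTORS = {
    "price": [0.9, 0.1, 0.0],
    "money": [0.8, 0.2, 0.1],
    "pay": [0.85, 0.05, 0.1],
    "expensive": [0.6, 0.3, 0.2],
    "food": [0.1, 0.9, 0.0],
    "pizza": [0.0, 0.95, 0.1],
    "waiter": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def topic_seeds(category, vectors, k=2):
    """Return the k vocabulary words most similar to the category word."""
    anchor = vectors[category]
    scored = [(cosine(anchor, vec), word)
              for word, vec in vectors.items() if word != category]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def mean_vector(words, vectors):
    """Average the vectors of the words that are in the vocabulary."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    return [sum(vec[i] for vec in known) / len(known)
            for i in range(len(known[0]))]


def implicit_aspect(sentence_words, seeds_by_category, vectors):
    """Pick the category whose seed centroid is closest to the sentence."""
    sentence_vec = mean_vector(sentence_words, vectors)
    if sentence_vec is None:
        return None  # no known words, so no basis for a guess
    best_category, best_score = None, -1.0
    for category, seeds in seeds_by_category.items():
        centroid = mean_vector(seeds, vectors)
        if centroid is None:
            continue
        score = cosine(sentence_vec, centroid)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


print(topic_seeds("price", TOY_VECTORS, k=2))  # ['pay', 'money']

seeds_by_category = {
    "price": ["price", "money", "pay"],
    "food": ["food", "pizza"],
    "service": ["waiter"],
}
# "too expensive" names no aspect explicitly; the sketch maps it to 'price'.
print(implicit_aspect(["too", "expensive"], seeds_by_category, TOY_VECTORS))
```

Averaging word vectors is only one simple way to build a sentence-level vector; the abstract does not say how the distributed vector is constructed.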

Author information

Authors and Affiliations

School of Computer Sciences, Universiti Sains, 11800, George Town, Pulau Pinang, Malaysia
Omar Mustafa AL-Janabi, Nurul Hashimah Ahamed Hassain Malim & Yu-N Cheah

Corresponding author

Correspondence to Nurul Hashimah Ahamed Hassain Malim.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

AL-Janabi, O.M., Ahamed Hassain Malim, N.H. & Cheah, YN. Unsupervised model for aspect categorization and implicit aspect extraction. Knowl Inf Syst 64, 1625–1651 (2022). https://doi.org/10.1007/s10115-022-01678-5

Received: 26 May 2021
Revised: 23 February 2022
Accepted: 26 February 2022
Published: 25 April 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s10115-022-01678-5
