ABSTRACT
Ontologies and knowledge models have gained recognition through their extensive use in recommender systems. The lack of automatic approaches to ontology engineering, however, makes it difficult to meet the growing demand for such knowledge models in tourism. This study proposes a system for building tourism knowledge models from online reviews. Its main contribution is the application of topic modeling to build a knowledge model that, in turn, enables automated labeling of training data for classifiers. Given a collection of unlabeled online tourism reviews, Latent Dirichlet Allocation (LDA) is applied to label each document automatically. Each topic discovered by LDA is assigned one specific category that represents its semantic meaning, using an existing general ontology as a reference. The automatically labeled documents are then used to train classifiers, and the results are compared with those obtained from manual annotation. Experiments on Indonesian tourism datasets showed that automatic labeling with LDA achieves a precision of 70%. In classification tasks, this approach achieves performance comparable to, or even better than, manual labeling. The results suggest that the developed system can build a tourism knowledge model and provide training data of acceptable quality for developing tourism recommender systems.
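The labeling pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small collapsed Gibbs sampler as the LDA inference method, a toy English corpus, and a hypothetical two-category `ontology` dictionary of seed terms standing in for the general reference ontology. The function names (`fit_lda`, `label_topics`, `label_documents`) and all hyperparameters are illustrative only.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized docs with collapsed Gibbs sampling (assumed method).

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic from the
                # conditional distribution given all other assignments.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def label_topics(topic_word, ontology, top_n=5):
    """Give each topic the ontology category sharing the most top words."""
    labels = []
    for counts in topic_word:
        top = {w for w, c in counts.most_common(top_n) if c > 0}
        labels.append(max(ontology, key=lambda cat: len(top & ontology[cat])))
    return labels


def label_documents(doc_topic, topic_labels):
    """Label each document with the category of its dominant topic."""
    return [
        topic_labels[max(range(len(row)), key=row.__getitem__)]
        for row in doc_topic
    ]


# Toy reviews and seed terms: hypothetical, for illustration only.
docs = [
    "beach sunset waves sand beach".split(),
    "waterfall hike forest waves sand".split(),
    "noodle soup spicy rice noodle".split(),
    "rice satay spicy soup market".split(),
]
ontology = {
    "nature": {"beach", "waves", "sand", "forest", "waterfall", "hike"},
    "culinary": {"noodle", "soup", "spicy", "rice", "satay"},
}

doc_topic, topic_word = fit_lda(docs, n_topics=2)
topic_labels = label_topics(topic_word, ontology)
doc_labels = label_documents(doc_topic, topic_labels)
print(list(zip(doc_labels, (" ".join(d) for d in docs))))
```

The resulting `doc_labels` would then serve as training targets for any text classifier, whose predictions could be compared against a manually annotated subset, as the abstract describes. On a real corpus, an established LDA implementation and proper preprocessing (tokenization, stop-word removal) would normally replace this toy sampler.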