Customer preference identification from hotel online reviews: A neural network based fine-grained sentiment analysis
Introduction
In past decades, the hotel industry has witnessed substantial growth worldwide, driven by ever-increasing international and domestic tourism and business activities. A recent report shows that the market size of the global hotel and resort industry increased year by year during 2011–2019 and reached 1.47 trillion U.S. dollars in 2019 (Lock, 2021). Coupled with the rapid development and wide application of information technologies, more and more giant online travel agencies (OTAs), e.g., Airbnb, Booking.com, TripAdvisor, Ctrip.com and Qunar.com, have emerged and boomed in recent years. Customers are increasingly booking hotel rooms via these OTAs. In particular, about 1,550,000 stays are booked via Booking.com every day (Anjana, 2021). Nonetheless, customers inevitably face some uncertainty about hotel room attributes such as facilities and service quality, because they are physically separated from the listed products or services (e.g., hotel rooms) at the time of ordering online (Zhang et al., 2021a). In this case, customers may first read online reviews on OTA platforms before making a decision. A recent survey shows that more than two-thirds of travelers usually read online reviews before booking a hotel room, and 93% of those say online reviews influence their booking decisions (Nilashi et al., 2021b). The underlying intuition is that online reviews are posted by customers who have experienced the products or services, and thus the information conveyed in the reviews is more convincing to potential customers than hotels’ advertisements (Gavilan et al., 2018). Online reviews have thereby been regarded as an important mechanism for mitigating the uncertainty in online transactions (Kwark et al., 2014), and they can significantly affect a company's profitability by influencing customers’ behaviors.
However, in the face of the “information overload” of online reviews, it is difficult to manually extract valuable customer information comprehensively and quickly from massive reviews. Therefore, it is crucial to develop a methodology to automatically identify customer preferences from hotels’ online reviews.
In recent years, customer preference mining from online reviews has attracted increasing attention. However, current research has mainly focused on extracting the sentiment orientation of each online review at the overall level, which cannot distinguish customer preferences for specific hotel attributes, e.g., price, sanitation and facilities. As observed in online reviews, a user-generated comment may contain several opinions, even diametrically opposite sentiment orientations, on different hotel attributes (Feldman, 2013). For example, the comment “The service quality of the hotel is super great, and the price is relatively low, whereas some facilities need to be replaced” expresses a positive sentiment towards the service quality and the price, but a negative one towards the facilities. That is, the customer is generally satisfied with the hotel, but a little dissatisfied with its facilities. In this case, to help improve the hotel’s performance, it is necessary to identify customers’ particular preference for each attribute from online reviews via fine-grained sentiment analysis rather than overall sentiment analysis. As such, we focus on fine-grained sentiment analysis of online reviews in this work.
Fine-grained sentiment analysis, namely aspect-based sentiment analysis, generally involves multiple fundamental tasks such as sentiment element extraction (e.g., aspect term extraction, opinion term extraction), aspect-opinion pair (i.e., AOP) identification and sentiment orientation analysis (Çalı and Balaman, 2019; Mao et al., 2021). The first task is generally treated as a sequence labeling task that assigns a label to each token (i.e., English word or Chinese character) in a review. Considerable efforts have been made to improve sentiment element extraction (Ji et al., 2020; Alshammari and Alanazi, 2021). In this study, to obtain the exact sentiments of customers towards each attribute of a hotel, we integrate the Bi-LSTM (i.e., bi-directional long short-term memory) model with the CRF (i.e., conditional random fields) model to identify evaluated aspects, sentiment words, and affective modifiers of sentiment words.
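To make the sequence-labeling formulation concrete, the following minimal sketch turns a per-token label sequence into typed spans. The BIO scheme and the tag names ASP, OPI and MOD (evaluated aspect, sentiment word, affective modifier) are assumptions made for illustration and may differ from the label set actually used in the paper; the Bi-LSTM-CRF model that would predict the labels is not shown, and the labels below are written by hand.

```python
from typing import List, Tuple

def decode_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str, int, int]]:
    """Group BIO labels into (type, text, start, end) spans; end is exclusive."""
    spans = []
    start, kind = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel "O" closes a trailing span
        inside = label.startswith("I-") and kind == label[2:]
        if start is not None and not inside:
            # Tokens are joined with spaces (English); use "" for Chinese characters.
            spans.append((kind, " ".join(tokens[start:i]), start, i))
            start, kind = None, None
        if label.startswith("B-"):
            start, kind = i, label[2:]
    return spans

# Hand-labelled toy input (hypothetical tags, not model output).
tokens = ["The", "service", "quality", "is", "super", "great"]
labels = ["O", "B-ASP", "I-ASP", "O", "B-MOD", "B-OPI"]
spans = decode_spans(tokens, labels)
# spans == [("ASP", "service quality", 1, 3), ("MOD", "super", 4, 5), ("OPI", "great", 5, 6)]
```

An "I-" label that does not continue a span of the same type is ignored here; other conventions (e.g., treating it as a new span) are equally possible.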
As the second task, AOP identification aims to recognize the sentiment words that correspond to each evaluated aspect. Researchers have continued to explore how to enhance the performance of AOP identification, but the results are still not very satisfactory. Related approaches can generally be grouped into two categories, i.e., knowledge-based methods (Qiu et al., 2011; Cambria et al., 2014; Chang et al., 2021) and learning-based methods. Knowledge-based methods usually draw on extensive linguistic knowledge to build rules for discovering the relationships between sentiment words and evaluated aspects in a review, but the resulting rules are generally specific to a particular product or service and do not transfer to others (Wang et al., 2018). In learning-based methods, identifying AOPs is usually treated as a binary classification task (Mao et al., 2021); that is, the task is accomplished by judging whether each possible combination of an evaluated aspect and a sentiment word in the review has a matching relationship. Learning-based methods mainly include statistical machine learning methods and deep learning methods. For statistical machine learning methods, the key task is to extract and filter features, and the performance depends heavily on how carefully the feature set is manually selected. In extant studies, a number of structured features, e.g., the position feature and the part-of-speech sequence feature, have been demonstrated to be suitable for classification or aspect-opinion pair recognition (Su et al., 2011; Chen and Manning, 2014). Deep learning methods show a distinct advantage in automatically obtaining word vectors (i.e., numerical representations of text content) from textual information and applying them to classification or prediction (Kim, 2014; Liu et al., 2020).
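The binary-classification framing can be illustrated with a short sketch that enumerates every aspect-opinion combination in a review and attaches a simple position feature to each candidate. The function name, the dictionary layout and the use of signed token distance as the position feature are assumptions made for illustration; the classifier that would label each candidate as matching or not is omitted.

```python
from itertools import product
from typing import Dict, List, Tuple

Span = Tuple[str, int]  # (text, index of the span's first token); hypothetical layout

def candidate_pairs(aspects: List[Span], opinions: List[Span]) -> List[Dict]:
    """Enumerate every aspect-opinion combination with a signed distance feature."""
    pairs = []
    for (a_text, a_pos), (o_text, o_pos) in product(aspects, opinions):
        pairs.append({
            "aspect": a_text,
            "opinion": o_text,
            "distance": o_pos - a_pos,  # > 0 when the opinion follows the aspect
        })
    return pairs

# Toy spans with illustrative token positions.
aspects = [("service quality", 1), ("price", 9)]
opinions = [("great", 5), ("low", 12)]
pairs = candidate_pairs(aspects, opinions)
# Four candidates are produced; a classifier would keep
# ("service quality", "great") and ("price", "low") and reject the other two.
```

The number of candidates grows as the product of the aspect and opinion counts, which is why each candidate needs informative features for the classifier to separate true pairs from accidental co-occurrences.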
Unfortunately, few efforts have been made to introduce structured features into deep learning models along with unstructured textual content representation, especially for AOP identification. In this study, we propose an improved convolutional neural network (CNN) model that fuses unstructured text content representation with structured features (i.e., the dependency parse feature, part-of-speech sequence feature, sentiment element types and position feature). Moreover, we combine these structured features with the textual content representation to construct a new feature (i.e., the comprehensive feature), which is also fed into the CNN model to identify AOPs.
Regarding the third task, sentiment orientation analysis generally aims to distinguish customers' positive or negative sentiment towards a specific aspect following AOP identification (Zhang et al., 2021a). Since our approach performs well on both sentiment element extraction and AOP identification, we further refine the aspect-level sentiment analysis framework, i.e., we calculate sentiment values after sentiment element extraction and AOP identification, to accurately identify the sentiment intensity value that customers express for each evaluated aspect. Specifically, we first use the proximity principle to match sentiment words and affective modifiers to form opinion phrases, and then construct three types of dictionaries (i.e., a sentiment dictionary, a degree adverb dictionary and a negative adverb dictionary) to calculate the sentiment values of the opinion phrases corresponding to each evaluated aspect.
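A minimal sketch of this dictionary-based calculation is given below. The dictionary entries, their numeric weights and the multiplicative scoring rule are all illustrative assumptions, since the exact formula is not stated in this section; only the overall idea (attach each modifier to its nearest sentiment word, then combine polarity, degree and negation) follows the description above.

```python
# Toy dictionaries; every entry and weight here is hypothetical.
SENTIMENT = {"great": 1.0, "low": 0.5, "old": -0.5}   # polarity ("low" refers to price)
DEGREE = {"super": 2.0, "relatively": 0.8, "slightly": 0.5}
NEGATION = {"not", "never"}

def attach_modifiers(sentiment_words, modifiers):
    """Proximity principle: each modifier joins the nearest sentiment word."""
    phrases = {pos: [] for _, pos in sentiment_words}
    for m_text, m_pos in modifiers:
        nearest = min(sentiment_words, key=lambda sw: abs(sw[1] - m_pos))
        phrases[nearest[1]].append(m_text)
    return phrases

def phrase_score(modifiers, sentiment_word):
    """Assumed rule: polarity times degree weights, sign flipped per negation."""
    score = SENTIMENT.get(sentiment_word, 0.0)
    for m in modifiers:
        if m in NEGATION:
            score = -score
        else:
            score *= DEGREE.get(m, 1.0)
    return score

# (text, token position) pairs for a toy review.
words = [("great", 5), ("low", 12)]
mods = [("super", 4), ("relatively", 11)]
phrases = attach_modifiers(words, mods)          # {5: ["super"], 12: ["relatively"]}
scores = {w: phrase_score(phrases[pos], w) for w, pos in words}
```

With these toy weights, "super great" scores higher than plain "great", and a negation such as "not great" flips the sign, which is the kind of intensity information a binary positive/negative label cannot express.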
Furthermore, to identify customers’ specific preferences for different hotel attributes, we incorporate an aspect term clustering algorithm: we apply the K-means algorithm to cluster the evaluated aspects based on the word vectors obtained by word2vec (i.e., a widely used word embedding tool), so as to recognize the attribute categories that customers care about most. Finally, we apply our proposed aspect-level sentiment analysis framework to empirically examine customers’ preferences for Ji Hotel in China, using hotels’ online reviews crawled from Ctrip.com.
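The clustering step can be sketched with a plain implementation of Lloyd's K-means algorithm. The two-dimensional vectors below are hand-made stand-ins for word2vec embeddings, and the aspect terms, the number of clusters and the fixed initial centroids are assumptions chosen only to keep the example small and deterministic.

```python
from math import dist
from typing import List, Sequence

def kmeans(points: List[Sequence[float]], centroids: List[Sequence[float]],
           iters: int = 10) -> List[int]:
    """Plain Lloyd's algorithm; returns the cluster index of each point."""
    centroids = [list(c) for c in centroids]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(xs) / len(members) for xs in zip(*members)]
    return assign

# Hand-made 2-D vectors standing in for word2vec embeddings of aspect terms.
vectors = {"staff": (0.9, 0.1), "reception": (0.8, 0.2),
           "subway": (0.1, 0.9), "bus": (0.2, 0.8)}
terms = list(vectors)
labels = kmeans([vectors[t] for t in terms],
                centroids=[vectors["staff"], vectors["subway"]])
clusters = dict(zip(terms, labels))
# "staff" and "reception" fall into one cluster (service-like terms),
# "subway" and "bus" into the other (transportation-like terms).
```

In practice the initial centroids would be chosen randomly or by a seeding heuristic, and the number of clusters would be tuned; here both are fixed so that the result is reproducible.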
Our empirical analysis leads to the following important findings. First, our proposed method, which jointly considers structured and unstructured features, can indeed improve the performance of AOP identification. Second, we find that hotel customers mainly focus on 11 hotel attributes (i.e., service quality, room type, surrounding, price, transportation, facilities, catering, sanitation, parking, environment and epidemic prevention), and customers show different preferences for different hotel attributes. Notably, we find that service quality is the most critical factor for hotels. Third, customers’ preferences are closely related to their travel purposes. Customers of different types differ in how much attention they pay to each attribute category of the hotel, and they exhibit different positive and negative sentiment polarities for each attribute category. Finally, we find that both the geographical distribution of the Ji hotels and the COVID-19 pandemic have a slight influence on customers’ preferences.
The rest of this paper is organized as follows. Section 2 reviews the related literature. Section 3 presents the developed methodology. Section 4 applies the proposed methodology to identify customer preferences for the Ji Hotel from online reviews. Some interesting and important findings and insights are obtained from the empirical analysis in Section 4. Section 5 concludes this paper.
Literature review
Our work is closely related to customer preference analysis of hotels’ online reviews and to sentiment analysis of online reviews. In this section, we review the most relevant studies.
Methodology
In this section, we develop an improved fine-grained sentiment analysis methodology, combined with an aspect term clustering algorithm, to identify customers’ preferences for hotels’ attributes (e.g., service quality, prices and facilities) from online reviews. The framework of the proposed approach is depicted in Fig. 1.
As shown in Fig. 1, our approach includes four sequential steps: sentiment element extraction (i.e., aspect term extraction and opinion term extraction), aspect-opinion pair
Experiment results and customer preferences analysis
In this section, we first illustrate the proposed approach with an experimental study, and then use it to extract customer preferences for Ji Hotel in China. In particular, we design two groups of experiments to examine the performance of sentiment element extraction and aspect-opinion pair identification. The dataset description, experimental results and analysis, and customer preference analysis are presented below.
Conclusions
Online reviews, as a kind of user-generated information, contain a wealth of useful and valuable feedback from customers who have used the products or experienced the service. In recent years, customer preference mining from online reviews, which can help improve a company’s performance, has attracted increasing attention in the literature. In this work, we develop a fine-grained sentiment analysis methodology consisting of three successive steps, i.e., sentiment element extraction, AOP
CRediT authorship contribution statement
Yiwen Bian: Conceptualization, Validation, Visualization, Supervision, Methodology. Rongsheng Ye: Methodology, Investigation, Data curation, Writing – original draft. Jing Zhang: Methodology. Xin Yan: Methodology, Data curation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research was partly supported by programs granted by the National Natural Science Foundation of China (NSFC) (Nos. 72031004 and 71901053). The authors would like to thank the editor and the anonymous reviewers for their helpful comments and suggestions on earlier versions of the manuscript.
References (73)
- et al. The impact of using different annotation schemes on named entity recognition. Egyptian Informatics Journal (2021)
- et al. Exploring asymmetric effects of attribute performance on customer satisfaction in the hotel industry. Tourism Management (2020)
- et al. Improved decisions for marketing, supply and purchasing: Mining big data through an integration of sentiment analysis and intuitionistic fuzzy multi criteria assessment. Computers & Industrial Engineering (2019)
- et al. Learning bilingual sentiment lexicon for online reviews. Electronic Commerce Research and Applications (2021)
- et al. Using deep learning and visual analytics to explore hotel reviews and responses. Tourism Management (2020)
- Drivers of helpfulness of online hotel reviews: A sentiment and emotion mining approach. International Journal of Hospitality Management (2020)
- et al. A novel approach to evaluating the business potential of intellectual properties: A machine learning-based predictive analysis of patent lifetime. Computers & Industrial Engineering (2020)
- et al. Cross-country analysis of perception and emphasis of hotel attributes. Tourism Management (2019)
- et al. The influence of online ratings and reviews on hotel booking consideration. Tourism Management (2018)
- et al. Relationship between customer sentiment and online customer ratings for hotels-an empirical analysis. Tourism Management (2017)
- Power entity recognition based on bidirectional long short-term memory and conditional random fields. Global Energy Interconnection
- Assessing the helpfulness of online hotel reviews: A classification-based approach. Telematics and Informatics
- Combining attention-based bidirectional gated recurrent neural network and two-dimensional convolutional neural network for document-level sentiment classification. Neurocomputing
- Travelers decision making through preferences learning: A case on Malaysian spa hotels in TripAdvisor. Computers & Industrial Engineering
- Understanding the dynamics of the quality of airline service attributes: Satisfiers and dissatisfiers. Tourism Management
- The effects of traveling for business on customer satisfaction with hotel services. Tourism Management
- A segmentation of online reviews by language groups: How English and non-English speakers rate hotels differently. International Journal of Hospitality Management
- Online consumer review and group-buying participation: The mediating effects of consumer beliefs. Telematics and Informatics
- Does hotel customer satisfaction change during the COVID-19? A perspective from online reviews. Journal of Hospitality and Tourism Management
- Improving text summarization of online hotel reviews with review helpfulness and sentiment. Tourism Management
- An optimal support vector machine based classification model for sentimental analysis of online product reviews. Future Generation Computer Systems
- Using a stacked residual LSTM model for sentiment intensity prediction. Neurocomputing
- Research on the role of influencing factors on hotel customer satisfaction based on BP neural network and text mining. Information
- What can big data and text analytics tell us about hotel guest experience and satisfaction? International Journal of Hospitality Management
- Utilizing the platform economy effect through EWOM: Does the platform matter. International Journal of Production Economics
- The antecedents of customer satisfaction and dissatisfaction toward various types of hotels: A text mining approach. International Journal of Hospitality Management
- Deriving customer preferences for hotels based on aspect-level sentiment analysis of online reviews. Electronic Commerce Research and Applications
- Customer preferences extraction for air purifiers based on fine-grained sentiment analysis of online reviews. Knowledge-Based Systems
- Weakness finder: Find product weakness from Chinese reviews by using aspects-based sentiment analysis. Expert Systems with Applications
- Hotel attributes and consumer satisfaction: A cross-country and cross-hotel study. Journal of Travel & Tourism Marketing
- Toward a better understanding of backpackers’ motivations. Review of Applied Management Studies
- Exploring backpackers’ perceptions of the hostel service quality. International Journal of Contemporary Hospitality Management
- Health and well-being factors associated with international business travel. Journal of Travel Medicine
- SenticNet 3: A common and common-sense knowledge base for cognition-driven sentiment analysis. The 28th AAAI Conference on Artificial Intelligence. AAAI
- A fast and accurate dependency parser using neural networks. The 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). ACL
Cited by (22)
- Quantifying risk of service failure in customer complaints: A textual analysis-based approach. Advanced Engineering Informatics (2024)
- Navigating the new normal: Redefining N95 respirator design with an integrated text mining and quality function deployment-based optimization model. Computers and Industrial Engineering (2024)
- A novel self-supervised contrastive learning based sentence-level attribute induction method for online satisfaction evaluation. Computers and Industrial Engineering (2024)
- Personalized tourism product design focused on tourist expectations and online reviews: An integrated MCDM method. Computers and Industrial Engineering (2024)
- Optimal channel strategy for an e-seller: Whether and when to introduce live streaming? Electronic Commerce Research and Applications (2024)
- Live streaming selling strategies of online retailers with spillover effects. Electronic Commerce Research and Applications (2024)