Opinion mining based on fuzzy domain ontology and Support Vector Machine: A proposal to automate online review classification
Introduction
The use of social network websites is increasing rapidly, and the internet has become a mammoth market where people publish their views on various products, for example, movies, books, and hotels. New users read the reviews of others and respond to them regarding the same product. However, the continuous increase in reviews can make it difficult for users to decide whether to watch a movie, buy a book, or stay in a hotel. In many cases, users give their opinions about a product in terms of its features, such as “Chapter 2 is very detailed and interesting”, “The actors are incredible in this movie”, or “The room has a lot of facility but a bit dusty”. In other cases, opinions are hidden in blogs and forums, which makes it difficult for users to extract meaningful information from them. Opinion mining is the process of extracting useful information from users’ reviews using natural language processing or text analysis methods [1].
Presently, researchers use Naïve Bayes, Maximum Entropy, and SVM techniques to classify social network reviews [20], [22], [23], [47]. Most of these classification approaches are unable to automatically identify feature reviews and filter out irrelevant reviews (noise). These systems compute polarity using only two terms (positive and negative) and consider nouns, verbs, adjectives, and adverbs as opinion words, which decreases the precision rate. Further, it is important to decide whether a review conveys strong positive, positive, neutral, negative, or strong negative polarity. Information extraction systems are mostly based on crisp ontology. A crisp ontology addresses only crisp data and is unable to retrieve desirable results from the hazy source of social network data. In the existing social network architecture, various systems share information, and the archive of raw facts is rapidly increasing. The available crisp ontology-based systems are inadequate and extract information from social networks only to a limited extent.
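The contrast between a two-term crisp polarity and the five polarity terms named above can be illustrated with a minimal sketch. It assumes a polarity score in [-1, 1] and triangular membership functions. The breakpoints and the names `TERMS`, `triangular`, `crisp_polarity`, `fuzzy_polarity`, and `strongest_term` are illustrative assumptions and are not taken from the paper.

```python
# Illustrative breakpoints (left, peak, right) on a polarity score in [-1, 1].
# The five term names come from the text; the numbers are assumptions.
TERMS = {
    "strong negative": (-1.5, -1.0, -0.5),
    "negative": (-1.0, -0.5, 0.0),
    "neutral": (-0.5, 0.0, 0.5),
    "positive": (0.0, 0.5, 1.0),
    "strong positive": (0.5, 1.0, 1.5),
}


def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def crisp_polarity(score):
    """Two-term crisp decision: every score is either positive or negative."""
    return "positive" if score >= 0 else "negative"


def fuzzy_polarity(score):
    """Membership degree of the score in each of the five polarity terms."""
    return {term: triangular(score, *abc) for term, abc in TERMS.items()}


def strongest_term(score):
    """Polarity term with the highest membership degree."""
    degrees = fuzzy_polarity(score)
    return max(degrees, key=degrees.get)
```

Under these assumed breakpoints, a weakly positive score such as 0.05 is forced into the positive class by the crisp rule, whereas the fuzzy version assigns it mostly to the neutral term.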
To solve these problems, this paper proposes a system called opinion mining based on Fuzzy Domain Ontology (FDO) and Support Vector Machine (SVM) to automate online review classification. The overall process contains three phases, and each phase contains different parts as follows:
- Raw review sentences database.
- Morphological and semantic analysis.
- Tokenization process and word-tagging process.
- Support Vector Machine.
- Feature extraction.
- Fuzzy domain ontology.
- Feature review classification.
- Identify feature polarity and hotel polarity result.
In the first phase, the system downloads and tokenizes the hotel reviews. The second phase contains feature extraction, SVM, and FDO. After the reviews are retrieved, the hotel features are extracted along with their reviews. The SVM is employed to identify feature reviews and sentences whose opinion words are adjectives and adverbs, with the aim of filtering out irrelevant reviews (noise) and sentences that express opinion words as nouns and verbs. The feature reviews and sentences with adjective and adverb opinion words are labeled as positive, and the other reviews and sentences are labeled as negative. The SVM then finds the boundary that separates the reviews, with the positive class on one side of the hyperplane and the negative class on the other. The FDO is then used to extract the features along with their opinion words from the positive (filtered) reviews and to find the polarity term of each individual feature and of the hotel. The primary objective of the proposed system is to demonstrate that the integration of FDO with SVM provides better results than a crisp opinion mining ontology and a simple SVM technique. The rest of the paper is organized as follows. Section 2 illustrates related work. The FDO is explained in Section 3. Section 4 briefly illustrates the overall scenario and internal process of the proposed system, and the experiments and their results are presented in Section 5.
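The SVM filtering step described above can be sketched in a few lines, under strong simplifying assumptions. The two small lexicons stand in for the paper's hotel feature list and word tagger, and the trainer is plain sub-gradient descent on the hinge loss with the regularization term omitted, which is a simplified stand-in for a full SVM solver rather than the authors' implementation. All names (`HOTEL_FEATURES`, `ADJ_ADV_OPINIONS`, `featurize`, `train_linear`, `predict`) and the training sentences are hypothetical.

```python
import re

# Hypothetical lexicons standing in for the feature list and the word tagger.
HOTEL_FEATURES = {"room", "staff", "location", "breakfast"}
ADJ_ADV_OPINIONS = {"clean", "dusty", "friendly", "noisy", "very", "really"}


def featurize(sentence):
    """Map a sentence to [mentions a hotel feature, has adj/adv opinion word]."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return [
        float(bool(tokens & HOTEL_FEATURES)),
        float(bool(tokens & ADJ_ADV_OPINIONS)),
    ]


def train_linear(samples, labels, lr=0.1, epochs=1000):
    """Sub-gradient descent on the hinge loss max(0, 1 - y * (w.x + b))."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # inside the margin or misclassified
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy training set: +1 = feature review with adj/adv opinion word, -1 = noise.
train_sentences = [
    ("The room is very clean", 1),
    ("The staff was friendly", 1),
    ("We booked a room", -1),
    ("It was really noisy outside", -1),
    ("I arrived on Monday", -1),
]
X = [featurize(s) for s, _ in train_sentences]
y = [label for _, label in train_sentences]
w, b = train_linear(X, y)

# Keep only the sentences that fall on the positive side of the hyperplane.
reviews = ["The room has a lot of facility but a bit dusty", "We left early"]
kept = [r for r in reviews if predict(w, b, featurize(r)) == 1]
```

In this sketch only the first review is kept, because it mentions a hotel feature and contains an adjective opinion word. The paper's reference list includes LIBSVM, and a real system would more plausibly rely on such a library and on a proper part-of-speech tagger.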
Section snippets
Related work
Sentiment classification and opinion extraction from online reviews have received considerable attention in natural language processing and information engineering research. The aim of sentiment classification and opinion extraction is to obtain the reviewers’ feelings from positive and negative comments. The increase in online reviews has made sentiment classification more challenging. Presently, researchers use different techniques to classify the reviews. Most of these techniques are unable to
Development of fuzzy domain ontology
In this section, we first define the concepts and terminology of ontology before moving on to FDO. An ontology is a shared conceptualization of a specific domain in a machine-readable and human-understandable format [5], [6]. An ontology has four main components: concepts, axioms, instances, and relationships. In the proposed ontology, the main focus is on the domain concepts, properties, values, and their relationships. The classes of ontology are in the form of a hierarchical
Opinion mining based on FDO and SVM
In this section, the architecture and internal process of the proposed system are explained. The proposed system architecture, shown in Fig. 2, is based on FDO and SVM. The system architecture is divided into three phases for simplicity. Each phase consists of different parts as follows.
- Phase 1: Raw review sentences database, morphological and semantic analysis, tokenization process, and word-tagging process.
- Phase 2: Feature extraction, SVM, and FDO.
- Phase 3: Feature reviews
Experiments and results
To evaluate the effectiveness of the proposed system, a fuzzy web crawler was developed in Java to retrieve reviews from hotel distributors (hotel.com, tripadvisor.com, and booking.com) and store them in a database for further processing. The system downloaded 5639 review sentences. The average length of the reviews is 80 words. The total length of the opinion words is 7631, and the average number of opinion words for each feature is 635. The stored reviews contained irrelevant reviews and
Conclusion
This paper proposes an opinion mining system based on fuzzy domain ontology and Support Vector Machine to automate online review classification. Since a number of realistic issues, for example, morphological analysis, tokenization, word tagging, feature extraction, filtering of irrelevant reviews using SVM, the declaration of feature polarity values in the ontology, and polarity computation using FDO, are effectively considered, the proposed system enhances the performance of opinion mining. Even, the
Acknowledgments
This work was supported by a Korean National Research Foundation (NRF) Grant funded by the Korean Government (No. 2014R1A1A2053339).
References (48)
- et al., Type-2 fuzzy ontology-based semantic knowledge for collision avoidance of autonomous underwater vehicles, Inf. Sci. (2015)
- et al., Fuzzy ontology representation using OWL 2, Approx. Reason. (2011)
- et al., Integration of a secure type-2 fuzzy ontology with a multi-agent platform: a proposal to automate the personalized flight ticket booking domain, Inf. Sci. (2012)
- Visualization and dynamic evaluation model of corporate financial structure with self-organizing map and support vector regression, Appl. Soft. Comput. (2012)
- et al., Social analytics: learning fuzzy product ontologies for aspect-oriented sentiment analysis, Decis. Support Syst. (2014)
- et al., An ontology-based Web mining method for unemployment rate prediction, Decis. Support Syst. (2014)
- Fuzzy sets, Inf. Control (1965)
- et al., Type-2 fuzzy ontology-based opinion mining and information extraction: a proposal to automate the hotel reservation system, Appl. Intell. (2015)
- et al., DeLorean: a reasoner for fuzzy OWL 1.1, International Workshop on Uncertainty Reasoning for the Semantic Web (2008)
- et al., Exploiting the heavyweight ontology with multi-agent system using vocal command system: a case study on e-mall, IJACT (2011)
- A research on an intelligent multipurpose fuzzy semantic enhanced 3D virtual reality simulator for complex maritime missions, Appl. Intell.
- Hontology: a multilingual ontology for the accommodation sector in the tourism industry, International Conference on Knowledge Engineering and Ontology Development
- A hybrid model for business failure prediction-utilization of particle swarm optimization and support vector machines, Neural Netw. World
- LIBSVM: a library for Support Vector Machines, ACM Trans. Intell. Syst. Technol.
- A general architecture for text engineering (GATE), Comput. Hum.
- Semi supervised learning based opinion summarization and classification for online product reviews, Appl. Comput. Intell. Soft. Comput.
- The utility of linguistic rules in opinion mining, International Conference on Research and Development in Information Retrieval
- Text message categorization of collaborative learning skills in online discussion using Support Vector Machine, International Conference on Computer Control Information and Its Applications
- Mining customer opinions from free text, International Symposium on Intelligent Data Analysis
- Data Mining: Concepts and Techniques
- Utilizing Support Vector Machines in mining online customer reviews, ICCTA
- FEROM: feature extraction and refinement for opinion mining, ETRI
- Automatic construction of polarity-tagged corpus from HTML documents, International Conference on Computational Linguistics Morriston