Elsevier

Applied Soft Computing

Volume 47, October 2016, Pages 235-250
Opinion mining based on fuzzy domain ontology and Support Vector Machine: A proposal to automate online review classification

https://doi.org/10.1016/j.asoc.2016.06.003

Highlights

  • Existing classical ontology-based systems are inadequate and limit information extraction from the internet.

  • An ontology with fuzzy logic is an effective technology for precise information extraction from a blurred data environment.

  • We propose a fuzzy domain ontology with SVM to extract feature opinions from reviews and to compute polarity.

  • For a large online data set, opinion mining using SVM with FDO gives better results than existing SVM-based systems.

  • The proposed system thoroughly explains the feature extraction and polarity computation.

Abstract

With the explosion of social media, opinion mining has been increasingly used in recent years. However, only a few studies have focused on the precision of feature review and opinion word extraction, and these studies do not offer an effective mechanism for delivering the precision required for effective opinion mining. Most of them are based on Naïve Bayes, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and classical ontology. Such systems remain unable to classify feature reviews into finer degrees of polarity (strong negative, negative, neutral, positive, and strong positive). Further, the existing classical ontology-based systems cannot extract blurred information from reviews and therefore provide poor results. In this regard, this paper proposes a robust classification technique for feature review identification and semantic knowledge for opinion mining based on SVM and Fuzzy Domain Ontology (FDO). The proposed system retrieves a collection of reviews about hotels and hotel features. The SVM identifies hotel feature reviews and filters out irrelevant reviews (noise), and the FDO is then used to compute the polarity term of each feature. The combination of FDO and SVM significantly increases the precision of review and opinion word extraction and the accuracy of opinion mining. The FDO and the intelligent prototype are developed using the Protégé OWL-2 (Web Ontology Language) tool and Java, respectively. The experimental results show considerable performance improvement in feature review classification and opinion mining.

Introduction

The use of social network websites is rapidly increasing; as a result, the internet has become a mammoth market where people publish their views on various products, for example, movies, books, and hotels. New users read the reviews of others and respond to them regarding the same product. However, the continuous increase in reviews can make it confusing for users to decide whether to watch a movie, buy a book, or stay in a hotel. In many cases, users give their opinions about a product in terms of its features, such as “Chapter 2 is very detailed and interesting”, “The actors are incredible in this movie”, or “The room has a lot of facility but a bit dusty”. In other cases, opinion reviews are hidden in blogs and forums, which makes it difficult for users to extract meaningful information from them. Opinion mining is the process of extracting useful information from users’ reviews using natural language processing or text analysis methods [1].

Presently, researchers use Naïve Bayes, Maximum Entropy, and SVM techniques to classify social network reviews [20], [22], [23], [47]. Most of these classification approaches are unable to automatically identify feature reviews and filter out irrelevant reviews (noise). These systems compute polarity in only two terms (positive and negative) and treat nouns, verbs, adjectives, and adverbs alike as opinion words, which decreases the precision rate. Further, it is important to decide whether a review conveys strong positive, positive, neutral, negative, or strong negative polarity. Information extraction systems are mostly based on crisp ontology, which addresses only crisp data and is unable to retrieve desirable results from the hazy source of social network data. In the existing social network architecture, various systems share information and the archive of raw facts is rapidly increasing. The available crisp ontology-based systems are inadequate and extract information from the social network only to a limited extent.
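The difference between a two-term crisp label and the five polarity terms named above can be illustrated with a small sketch. It assumes a polarity score in [-1, 1] and triangular membership functions; the peak positions, the width, and the function names are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical five-term fuzzy partition of a polarity score in [-1, 1].
# Peak positions and width are illustrative assumptions, not the paper's values.
TERMS = {
    "strong negative": -1.0,
    "negative": -0.5,
    "neutral": 0.0,
    "positive": 0.5,
    "strong positive": 1.0,
}
WIDTH = 0.5  # half-width of each triangular membership function (assumed)


def memberships(score):
    """Return the degree in [0, 1] to which `score` belongs to each term."""
    score = max(-1.0, min(1.0, score))  # clamp to the assumed score range
    return {
        term: max(0.0, 1.0 - abs(score - peak) / WIDTH)
        for term, peak in TERMS.items()
    }


def crisp_label(score):
    """Two-term labelling: every score is either positive or negative."""
    return "positive" if score >= 0 else "negative"


def fuzzy_label(score):
    """Pick the linguistic term with the highest membership degree."""
    degrees = memberships(score)
    return max(degrees, key=degrees.get)
```

Under these assumptions, a score of 0.9 is simply "positive" for the crisp labeller, while the fuzzy partition assigns it mostly to "strong positive" and partly to "positive", which is the kind of graded distinction a two-term scheme cannot express.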

To solve these problems, this paper proposes a system called opinion mining based on Fuzzy Domain Ontology (FDO) and Support Vector Machine (SVM) to automate online review classification. The overall process contains three phases, and each phase contains different parts as follows:

  • Raw review sentences database.

  • Morphological and semantic analysis.

  • Tokenization process and word-tagging process.

  • Support Vector Machine.

  • Feature extraction.

  • Fuzzy domain ontology.

  • Feature review classification.

  • Feature polarity identification and hotel polarity result.

In the first phase, the system downloads and tokenizes the hotel reviews. The second phase contains feature extraction, SVM, and FDO. After a collection of reviews is retrieved, the hotel features are extracted along with their reviews. The SVM is employed to identify feature reviews and sentences whose opinion words are adjectives or adverbs, with the aim of filtering out irrelevant reviews (noise) and sentences that express opinion words as nouns or verbs. Feature reviews and sentences with adjective or adverb opinion words are labeled positive, and all other reviews and sentences are labeled negative. The SVM thus finds the boundary that places the positive class on one side of the hyperplane and the negative class on the other. The FDO is then used to extract each feature along with its opinion words from the positive (filtered) reviews and to find the polarity term of each individual feature and of the hotel. The primary objective of the proposed system is to demonstrate that integrating FDO with SVM provides better results than crisp opinion mining ontology and a simple SVM technique. The rest of the paper is organized as follows. Section 2 illustrates related work. The FDO is explained in Section 3. Section 4 briefly illustrates the overall scenario and internal process of the proposed system, and the experiments and their results are presented in Section 5.
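The filter-then-score flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature set, the opinion-word lexicon and its values, the two-indicator encoding, and the hand-set weights `W` and `B` (standing in for a trained linear SVM) are all hypothetical.

```python
# Hypothetical lexicons; in the paper these would come from the fuzzy domain ontology.
FEATURES = {"room", "staff", "location", "breakfast"}
OPINION_WORDS = {  # adjective/adverb opinion words with assumed polarity values
    "excellent": 1.0, "clean": 0.5, "friendly": 0.5,
    "dusty": -0.5, "noisy": -0.5, "terrible": -1.0,
}

# Hand-set weights standing in for a trained linear SVM (assumption).
W = (1.0, 1.0)  # weights for (mentions_feature, has_opinion_word)
B = -1.5        # bias: both indicators must be present to reach the positive side


def to_vector(tokens):
    """Encode a tokenized sentence as two binary indicators."""
    words = set(tokens)
    return (
        1.0 if FEATURES & words else 0.0,
        1.0 if OPINION_WORDS.keys() & words else 0.0,
    )


def is_feature_review(tokens):
    """Linear decision rule w.x + b > 0: True means the positive class."""
    x = to_vector(tokens)
    return sum(w * xi for w, xi in zip(W, x)) + B > 0


def feature_polarity(sentences):
    """Filter out noise, then average opinion-word values per feature."""
    totals = {}
    for sentence in sentences:
        tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
        if not is_feature_review(tokens):
            continue  # negative class: treated as an irrelevant review
        values = [OPINION_WORDS[t] for t in tokens if t in OPINION_WORDS]
        # Simplification: every opinion word in the sentence is credited
        # to every feature mentioned in that sentence.
        for feature in FEATURES & set(tokens):
            totals.setdefault(feature, []).extend(values)
    return {f: sum(v) / len(v) for f, v in totals.items()}
```

For example, `feature_polarity(["The room was clean.", "We arrived on Monday."])` would drop the second sentence as noise and score only the first. A real system would learn the hyperplane from labeled reviews and use part-of-speech tags rather than a fixed word list.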

Section snippets

Related work

Sentiment classification and opinion extraction from online reviews have received considerable attention in natural language processing and information engineering research. The aim of sentiment classification and opinion extraction is to obtain the reviewers’ feelings from positive and negative comments. The increase in online reviews has made sentiment classification more challenging. Presently, researchers use different techniques to classify the reviews. Most of these techniques are unable to

Development of fuzzy domain ontology

In this section, we first define the concepts and terminology of ontology before moving on to FDO. An ontology is a shared conceptualization of a specific domain in a machine-readable and human-understandable format [5], [6]. An ontology has four main components: concepts, axioms, instances, and relationships. In the proposed ontology, the main focus is on the domain concepts, properties, values, and their relationships. The classes of ontology are in the form of a hierarchical

Opinion mining based on FDO and SVM

In this section, the architecture and internal process of the proposed system are explained. The proposed system architecture, shown in Fig. 2, is based on FDO and SVM. For simplicity, the architecture is divided into three phases, each consisting of different parts as follows.

  • Phase 1: Raw review sentences database, morphological and semantic analysis, tokenization process, and word-tagging process.

  • Phase 2: Feature extraction, SVM, and FDO.

  • Phase 3: Feature reviews

Experiments and results

To evaluate the effectiveness of the proposed system, a fuzzy web crawler was developed in Java to retrieve reviews from hotel distributors (hotel.com, tripadvisor.com, and booking.com) and store them in a database for further processing. The system downloaded 5639 review sentences. The average length of the reviews is 80 words. The total number of opinion words is 7631, and the average number of opinion words for each feature is 635. The stored reviews contained irrelevant reviews and

Conclusion

This paper proposes an opinion mining system based on fuzzy domain ontology and Support Vector Machine to automate online review classification. Because a number of realistic issues are effectively considered, for example, morphological analysis, tokenization, word tagging, feature extraction, irrelevant review filtering using SVM, the declaration of feature polarity values in the ontology, and polarity computation using FDO, the proposed system enhances the performance of opinion mining. Even, the

Acknowledgments

This work was supported by a Korean National Research Foundation (NRF) Grant funded by the Korean Government (No. 2014R1A1A2053339).

References (48)

  • A.C. Bukhari et al.

    A research on an intelligent multipurpose fuzzy semantic enhanced 3D virtual reality simulator for complex maritime missions

    Appl. Intell.

    (2013)
  • M.S. Chaves et al.

    Hontology: a multilingual ontology for the accommodation sector in the tourism industry

    International Conference on Knowledge Engineering and Ontology Development

    (2012)
  • M.Y. Chen

    A hybrid model for business failure prediction-utilization of particle swarm optimization and support vector machines

    Neural Netw. World

    (2011)
  • C.C. Chang et al.

    LIBSVM: a library for Support Vector Machines

    ACM Trans. Intell. Syst. Technol.

    (2011)
  • H. Cunningham et al.

    A general architecture for text engineering (GATE)

    Comput. Hum.

    (2002)
  • M.K. Dalal et al.

    Semi supervised learning based opinion summarization and classification for online product reviews

    Appl. Comput. Intell. Soft. Comput.

    (2013)
  • X. Ding et al.

    The utility of linguistic rules in opinion mining

    International Conference on Research and Development in Information Retrieval

    (2007)
  • Erlin et al.

    Text message categorization of collaborative learning skills in online discussion using Support Vector Machine

    International Conference on Computer Control Information and Its Applications

    (2013)
  • M. Gamon et al.

    Mining customer opinions from free text

    International Symposium on Intelligent Data Analysis

    (2005)
  • J. Han et al.

    Data Mining: Concepts and Techniques

    (2006)
  • T. Hassan et al.

    Utilizing Support Vector Machines in mining online customer reviews

    ICCTA

    (2012)
  • B. Hohrmann, P.L. Hegaret, T. Pixley, Document Object Model (DOM) Level 3 Events Specification, W3C,...
  • H. Jeong et al.

    FEROM: feature extraction and refinement for opinion mining

    ETRI

    (2011)
  • N. Kaji et al.

    Automatic construction of polarity-tagged corpus from HTML documents

    International Conference on Computational Linguistics Morriston

    (2006)