A spatiotemporal multi-feature extraction framework for opinion mining
Introduction
The rapid development of communication and Internet technologies has promoted the popularization of various network platforms. More and more Internet users have become information publishers and are willing to share their opinions on commodities, movies and current affairs. Through statistics and analysis of user posts on the Internet, the emotional colour or emotional tendency of users can be perceived, which is very helpful for gathering and further processing public opinion. For example, it can help consumers understand feedback about certain products so as to make more appropriate shopping decisions, and it gives the government stronger support for decision-making through the analysis of public opinion. Fig. 1 shows the typical processing framework of opinion mining.
The typical methods used in opinion mining can be divided into three categories: emotion dictionary methods, machine learning methods and deep learning methods. The emotion dictionary method matches the input text against a constructed dictionary to judge the emotional tendency of the text [1], [2], [3], [4]. However, the emotion dictionary is constructed manually, so it cannot contain all possible explicit and implicit emotional expressions. In addition, as language and society evolve, new words keep emerging, and dictionary updates lag behind them.
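The dictionary matching described above can be illustrated with a minimal sketch. The lexicon entries, the negator list and the one-token negation rule are all hypothetical choices made for illustration; they are not taken from the cited dictionary-based methods.

```python
import re

# Hypothetical toy lexicon (word -> polarity score). A real emotion dictionary
# is much larger and manually curated.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}
# Assumed negation rule: a negator flips the polarity of the next token only.
NEGATORS = {"not", "never", "no"}


def lexicon_polarity(text, lexicon=LEXICON):
    """Return (summed polarity, number of matched dictionary words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    matched = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        if token in lexicon:
            value = lexicon[token]
            score += -value if negate else value
            matched += 1
        negate = False  # the negator only reaches the token right after it
    return score, matched


def label(text):
    """Map the summed polarity to a coarse emotional tendency."""
    score, matched = lexicon_polarity(text)
    if matched == 0:
        return "unknown"  # nothing in the dictionary matched
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A sentence such as "This phone is lit" is labelled "unknown" here because the slang word is absent from the toy lexicon, which is the coverage and timeliness limitation noted above.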
Machine learning methods process text and extract text features to obtain the emotional tendency. Naive Bayes (NB) [5], decision tree (DT) [6], k-nearest neighbour (KNN) [7], support vector machine (SVM) [8] and logistic regression (LR) [9] are commonly used machine learning algorithms. Sharma et al. used boosting to combine SVM-based classifiers into an ensemble, and obtained better classification results than a single SVM [10]. In addition, Mohd Nafis and Awang proposed an enhanced hybrid feature selection technique that hybridizes term frequency-inverse document frequency (TF-IDF) and support vector machine-recursive feature elimination (SVM-RFE) to improve emotion classification based on machine learning methods [11]. Tripathy et al. used NB, maximum entropy, stochastic gradient descent, and SVM to classify text according to the number of words [12]. Implementing the above machine learning algorithms requires complex and time-consuming feature engineering, which scales poorly to the current explosive growth of data.
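To make the feature engineering step concrete, the following minimal sketch computes TF-IDF weights in plain Python. It assumes one common variant (tf = count / document length, idf = log(N / df)) and whitespace tokenization; the cited works may use different smoothing or normalisation, and the example documents are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented toy corpus for illustration only.
docs = ["great phone great battery", "terrible phone", "great movie"]
vectors = tf_idf(docs)
```

In this toy corpus, "battery" occurs in only one document, so it receives a higher weight in the first document than "phone", which occurs in two.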
Because it avoids complex feature engineering, deep learning has been increasingly used in many fields. For example, the convolutional neural network (CNN) and the recurrent neural network (RNN) have been broadly applied in natural language processing (NLP). By extracting n-gram features of different scales from the data, Kim and Zhang applied CNN to text classification [13], [14]. Other researchers built deeper models on the basis of CNN; for example, Kalchbrenner et al. proposed a CNN algorithm with different depths [15], [16]. RNN is a neural network with a memory function. It has attracted the attention of many researchers because of its advantages in processing time series, and it has achieved good results on classification and prediction problems. Zheng et al. proposed a hybrid bidirectional recurrent CNN (RCNN) attention-based model, which combines bidirectional long short-term memory (LSTM) and CNN with the attention mechanism and word2vec, for fine-grained text classification [17]. LSTM and the gated recurrent unit (GRU) are the two main variants of RNN, and both have been utilized and improved for text processing [18], [19], [20], [21], [22], [23], [24]. Lin et al. proposed a novel deep neural network model, the attention-based bidirectional GRU (Bi-GRU)-CNN network, which not only extracts the features of Chinese questions effectively but also learns the context information of words, addressing the loss of positional features in the text-CNN model [25]. Similar to the deepening of the CNN algorithm, the bidirectional RNN structure was proposed to solve the semantic bias problem of RNN and was applied to contextual semantic learning tasks [26], [27]. Some researchers combined CNN and RNN models to obtain better performance [19], [28].
The RCNN algorithm uses the association between words for text classification based on a bidirectional recurrent structure [29]. Moreover, a word embedding method based on real-time sentiment (WRS), proposed by Rasool et al., combines different lexical resources and adopts the word2vec method to obtain the feature vectors [30].
However, for deep learning methods applied in the opinion mining field, there is still room for improvement in classification accuracy due to the following shortcomings: (1) Text data has obvious spatiotemporal dependence, but traditional algorithms do not extract spatiotemporal features sufficiently. (2) Existing methods do not consider the natural language characteristics of the text or the relationships between the levels of its language elements when processing text.
In view of the issues mentioned above, the main contributions of this paper are as follows:
1. A new spatiotemporal framework based on the simple recurrent unit (SRU), a multi-head attention mechanism, and a dilated convolution module is proposed, which considers multiple features from both the temporal and spatial dimensions.
2. Opinion mining is processed at four levels based on the characteristics of natural language. From the temporal perspective, extracting features at the word level and grammar level enriches the feature information. By applying the multi-head attention mechanism and using dilated convolution from the spatial perspective, the proposed framework can extract richer text features at the semantic and opinion levels.
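As an illustration of why dilated convolution helps on the spatial side, the sketch below implements a generic one-dimensional dilated convolution in plain Python and shows how the receptive field grows with the dilation rate. It is a minimal sketch under stated assumptions, not the module used in the proposed framework: the kernel, the dilation rates and the scalar input sequence are hypothetical, and a real model would convolve over feature vectors with learned weights.

```python
def dilated_conv1d(sequence, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form."""
    # Number of input positions covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(kernel[k] * sequence[start + k * dilation] for k in range(len(kernel)))
        for start in range(len(sequence) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated layers."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical input and kernel, chosen only to make the effect visible.
signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]

dense = dilated_conv1d(signal, kernel, dilation=1)  # each output sees 3 inputs
wide = dilated_conv1d(signal, kernel, dilation=2)   # each output sees 5 inputs
```

With the same three weights, a dilation of 2 lets each output compare positions four steps apart instead of two, and stacking layers with dilations 1, 2 and 4 covers 15 positions without adding parameters per layer.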
The rest of the paper is arranged as follows: Section 2 reviews related work on opinion mining; Section 3 elaborates the proposed framework; Section 4 presents the experimental environment and result analyses; Section 5 gives the conclusion.
Related work
The unprecedented development of deep learning has provided more ways to perform opinion mining, which has entered a new stage as a result. Some related work on opinion mining based on deep learning is as follows:
RNN is a neural network suited to processing time series data; its unique structure with a memory function has led to its wide use, with good performance, in fields such as NLP and activity recognition. As a variant of RNN, LSTM combines structures of input gate, forget
Proposed framework
In this part, a spatiotemporal framework based on SRU, a multi-head attention mechanism, and a dilated convolution module is introduced. This framework can extract multi-dimensional features of the input data and effectively improve the classification accuracy. Fig. 4 gives a visualization of the proposed framework. It can be seen that the framework is divided into five layers: embedding layer, temporal feature extraction layer, semantic feature extraction layer, spatial feature extraction
Experiments and analyses
To evaluate the performance of the proposed framework, experiments are conducted on a public dataset. This section describes the experiments and analyses for the proposed framework.
Conclusion
In this paper, we propose a new spatiotemporal framework to solve the problem of insufficient feature extraction in opinion mining. In this framework, SRU, a multi-head attention mechanism, and dilated convolution are combined and applied to extract as many features as possible from the temporal and spatial dimensions. At the same time, attention is paid to the text at the four levels of natural language features, i.e., word, grammar, semantics, and opinions, which can extract richer features
CRediT authorship contribution statement
Tiankuo Li: Writing - original draft, Data curation, Validation, Visualization. Hongji Xu: Project administration, Funding acquisition, Supervision, Writing - review & editing. Zhi Liu: Conceptualization, Investigation. Zheng Dong: Investigation, Methodology. Qiang Liu: Writing - review & editing. Juan Li: Writing - review & editing. Shidi Fan: Writing - review & editing. Xiaojie Sun: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was financially supported by the Natural Science Foundation of Shandong Province of China (ZR2020MF139) and the National Key Research and Development Program of China (2020YFC0833200).
References (44)
- et al. Supervised sentiment analysis in multilingual environments. Inf. Process. Manage. (2017)
- et al. Classification of sentiment reviews using n-gram machine learning approach. Expert Syst. Appl. (2016)
- et al. A neural network based approach for sentiment classification in the blogosphere. J. Informetr. (2011)
- C. Whissell, Objective analysis of text: II. Using an emotional compass to describe the emotional tone of situation...
- Using the revised dictionary of affect in language to quantify the emotional undertones of samples of natural language. Psychol. Rep. (2009)
- et al. Building emotional dictionary for sentiment analysis of online news. World Wide Web (2014)
- et al. Toward optimal feature selection in naive Bayes for text categorization. IEEE Trans. Knowl. Data Eng. (2016)
- et al. Semi-supervised self-training for decision tree classifiers. Int. J. Mach. Learn. Cybern. (2017)
- et al. Efficient KNN classification with different numbers of nearest neighbours. IEEE Trans. Neural Netw. Learn. Syst. (2018)
- et al. Aspect term extraction for sentiment analysis in large movie reviews using Gini index feature selection method and SVM classifier. World Wide Web (2017)
- Discrimination of mine seismic events and blasts using the fisher classifier, naive Bayesian classifier and logistic regression. Rock Mech. Rock Eng.
- A boosted SVM based ensemble classifier for sentiment analysis of online reviews. ACM SIGAPP Appl. Comput. Rev.
- An enhanced hybrid feature selection technique using term frequency-inverse document frequency and support vector machine-recursive feature elimination for sentiment classification. IEEE Access
- A hybrid bidirectional recurrent convolutional neural network attention-based model for text classification. IEEE Access
Cited by (3)
Product feature sentiment analysis based on GRU-CAP considering Chinese sarcasm recognition
2024, Expert Systems with Applications
Microseism Detection Method in Coal Mine Based on Spatiotemporal Characteristics and Support Vector Regression Algorithm
2023, Applied Sciences (Switzerland)
Tiankuo Li received the B.Eng. degree in light chemical engineering from Northeast Forestry University, Harbin, China, in 2015. From April 2014 to March 2015, he was a visiting undergraduate student with the Kitami Institute of Technology, Hokkaido, Japan. He is currently an M.S. student in the School of Information Science and Engineering, Shandong University, Qingdao, China. His research interests include smart home systems, deep learning and natural language processing.
Hongji Xu received the B.Eng. degree in electronic engineering from Shandong University of Technology, Jinan, China, in 1999, and received the M.S. degree in signal and information processing and the Ph.D. degree in communication and information system from Shandong University, Jinan, China, in 2001 and 2005, respectively. From 2010 to 2015, he was a Postdoctoral Research Fellow in Tsinghua University-Inspur Group Postdoctoral Scientific Research Station, China. From Dec. 2014 to Dec. 2015 and from Jan. 2018 to Apr. 2018, he was a Visiting Scholar in the University of California San Diego (UCSD), USA and the Virginia Polytechnic Institute and State University (Virginia Tech), USA, respectively. He is currently an Associate Professor in the School of Information Science and Engineering, Shandong University. His research interests include wireless communications, ubiquitous computing, blind signal processing, human-computer interaction, human activity recognition, and artificial intelligence.
Zhi Liu received the Ph.D. degree from the Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, in 2008. He is currently a Full Professor with the School of Information Science and Engineering, Shandong University, where he is the head of the Intelligent Information Processing Group. His current research interests are in applications of computational intelligence to linked multicomponent Big Data systems, medical images in the neurosciences, multimodal human-computer interaction, remote sensing image processing, content-based image retrieval, semantic modeling, data processing, classification, and data mining.
Zheng Dong received the B.S. and M.Eng. degrees from the School of Information Science and Engineering, Shandong University, Jinan, China, in 2009 and 2012, respectively, and the Ph.D. degree from the Department of Electrical and Computer Engineering, McMaster University, Canada, in 2016. He was a Postdoctoral Research Fellow in the School of Electrical and Information Engineering, University of Sydney, Australia. He is currently a Research Professor in the School of Information Science and Engineering, Shandong University, China. His research interests include the industrial Internet of Things and ultra-reliable low-latency communications.
Qiang Liu received the B.Eng. degree in electronic information engineering from Harbin University of Science and Technology, Harbin, China, in 2019. He is currently an M.S. student in the School of Information Science and Engineering, Shandong University, Qingdao, China. His research interests include ubiquitous computing, data fusion and human activity recognition.
Juan Li received the B.Eng. degree in electronic information engineering from Shandong University, Jinan, China, in 2018. She is currently an M.S. student in the School of Information Science and Engineering, Shandong University, Qingdao, China. Her research interests include human activity recognition, data fusion and ubiquitous computing.
Shidi Fan received the B.Eng. degree in electronic information engineering from Shandong University, Jinan, China, in 2018. She is currently an M.S. student in the School of Information Science and Engineering, Shandong University, Qingdao, China. Her research interests include ubiquitous computing, context awareness, inconsistency elimination, data fusion and quality of context.
Xiaojie Sun received the B.Eng. degree in communication engineering from Shandong Normal University, Jinan, China, in 2019. She is currently an M.S. student in the School of Information Science and Engineering, Shandong University, Qingdao, China. Her research interests include machine learning, artificial intelligence, activity recognition and context-aware computing.