Extraction of emotions from multilingual text using intelligent text processing and computational linguistics
Introduction
Emotion expression plays a vital role in many parts of everyday communication. In the past, it has been evaluated through a combination of indications such as facial expressions, gestures, and actions. Extracting emotions from facial expressions, gestures, and actions belongs to digital image processing and computer vision [1]. Extracting emotions from text is more difficult, especially from multilingual text such as social media posts and customer reviews. Such data is ambiguous, and the complexity of word meanings makes extraction harder still. Factors such as users' writing style, politeness, irony, and variability in language are among the main problems in emotion extraction [2]. A wide variety of state-of-the-art work has been carried out in opinion mining and sentiment analysis, but limited research has focused on the detection and extraction of emotions in multilingual text.
In English vocabulary, some words express emotion explicitly, whereas other words convey emotion implicitly depending on the context [3]. Emotion detection in text has recently attracted the scientific community, which seeks to draw meaningful inferences hidden in the data and to support decision-making [4]. Many authors classify emotions into multiple classes for better understanding; for example, Strapparava and Valitutti classified emotional words into two classes, 'direct affective words' and 'indirect affective words' [2]. Emotion research is important for building affective interfaces. These interfaces provide a better user experience in areas such as Human–Computer Interaction (HCI), Text-to-Speech (TTS) synthesis systems, and Computer-Mediated Communication (CMC) [5]. Computational techniques for extracting emotion from social media have received attention on the basis of multiple emotion modalities [6]. However, only limited work has been done on developing automatic emotion recognition systems [4], [6].
Multilingual text contains emotional words from different languages, and extracting these words improves the emotion identification ratio [7]. In most of the available literature, these words are treated as stop words in social media data [7]. This paper presents an advanced framework for automatic detection of emotions in multilingual text data. The emotion models used to develop the proposed framework draw on linguistics and psychology. The proposed framework uses Machine Learning techniques for learning and validation, and effective Natural Language Processing (NLP) pre-processing techniques for better extraction of the emotions present in the text.
This paper uses the emotion model given by Ekman [8] as a basis, with multiple feature sets to deal with multilingual data. The text under study comprises data collected from Twitter in three different domains: political elections, healthcare, and sports. The first task is to collect real-time data containing relevant keywords. This paper introduces a novel technique based on RSS (Rich Site Summary) feeds for collecting the keywords used in real-time data collection of events. Tweets containing images and emoticons are outside the scope of the proposed approach. An effective pre-processing technique is used to filter out irrelevant words while preserving words from other languages that represent emotion. The dataset is classified using popular machine learning techniques. This work represents the first systematic evaluation of emotion detection in real-time multilingual data in multiple domains. Another key contribution of the presented work is the practical application of emotion models in comparison with the corpus-driven approach, which assigns affective orientation or scores to words and word frequencies.
The rest of the paper is organized as follows. State-of-the-art methods are presented in Section 2. The proposed data collection methodology is presented in Section 3. The problem formulation, existing methods, and the proposed framework of the emotion extraction system are presented in Section 4. The experimental setup and outcomes, with discussion, are presented in Sections 5 and 6. In Section 7, the advantages of the proposed approach over state-of-the-art methods are identified. Finally, conclusions and the scope of future work are given in Section 7.
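The pre-processing idea described above (filtering irrelevant words while keeping emotion words from other languages) can be illustrated with a minimal sketch. The word lists below, the transliterated Hindi entries, and the function name `preprocess` are all hypothetical and chosen only for illustration; they are not taken from the paper, whose actual lexicons and rules are not given in this excerpt.

```python
from __future__ import annotations

import re

# Hypothetical English stop-word list (a real system would use a fuller one).
STOP_WORDS = {"the", "is", "a", "an", "of", "to", "and", "in", "this", "i", "am"}

# Hypothetical lexicon of transliterated Hindi emotion words mapped to
# Ekman-style categories. Entries are illustrative, not from the paper.
FOREIGN_EMOTION_WORDS = {
    "khush": "happiness",
    "dukhi": "sadness",
    "gussa": "anger",
    "dar": "fear",
}

# Hypothetical English vocabulary used to decide whether a token is "known".
ENGLISH_VOCAB = {"match", "result", "team", "happy", "sad", "today", "very"}

URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z]+")


def preprocess(tweet: str) -> list[str]:
    """Drop URLs, mentions, stop words and unknown tokens, but keep tokens
    that appear in the foreign-language emotion lexicon."""
    text = URL_OR_MENTION.sub(" ", tweet.lower())
    kept = []
    for token in TOKEN.findall(text):
        if token in FOREIGN_EMOTION_WORDS:
            kept.append(token)  # preserve the emotion word of another language
        elif token in STOP_WORDS:
            continue            # ordinary stop word: drop
        elif token in ENGLISH_VOCAB:
            kept.append(token)  # known content word: keep
        # any other token is treated as noise and dropped
    return kept
```

The lexicon check comes first on purpose: a pipeline that only tests tokens against an English vocabulary would discard `khush` as noise, which is the loss of signal the paragraph above warns about.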
Section snippets
Related work
Many research articles have recently been published on analyzing sentiment in social media data across multiple domains. This literature review discusses emotion extraction methods and sentiment classification methods in domains such as election prediction, healthcare, and sports analytics.
Proposed data collection methodology
In this section, an intelligent technique for data collection is presented. The most important variables for data collection from social media are keywords, which help identify relevant tweets. Most research on keyword selection is based on popular terms corresponding to the event [46], [49], [52], [57], [58], [62]. The methodology used here differs from those techniques: only keywords that are trending and dynamic are considered.
The process of
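As a minimal sketch of RSS-driven keyword selection, the example below parses item titles from an RSS 2.0 document and ranks terms by frequency. The sample feed, the stop-word list, and the function name `trending_keywords` are hypothetical; frequency ranking is an assumed stand-in for "trending", since the paper's own selection procedure is not shown in this excerpt.

```python
from __future__ import annotations

import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical RSS 2.0 snippet standing in for a live news feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>Election rally draws record crowd</title></item>
    <item><title>Election commission announces rally rules</title></item>
    <item><title>Cricket final postponed after rain</title></item>
  </channel>
</rss>"""

# Hypothetical stop-word list for headline text.
STOP_WORDS = {"after", "the", "a", "of", "in", "and"}


def trending_keywords(rss_xml: str, top_n: int = 2) -> list[str]:
    """Return the top_n most frequent non-stop-word terms in item titles."""
    root = ET.fromstring(rss_xml)
    counts: Counter[str] = Counter()
    for title in root.findall("./channel/item/title"):
        words = re.findall(r"[a-z]+", (title.text or "").lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    print(trending_keywords(SAMPLE_RSS))
```

In a live setting the feed would be fetched periodically, so the keyword set changes as headlines change; the resulting terms could then serve as search terms for tweet collection.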
Proposed methodology
In this section, the proposed emotion extraction framework, the emotion models with annotation of general terms, and the feature groups used in the framework are presented.
Experiments
In this section, the performance of the proposed emotion extraction system with corpus-based features is evaluated on the collected datasets. First, the corpus-based features present in the datasets are analyzed. Second, the proposed emotion extraction framework is evaluated experimentally on multiple datasets.
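To make the notion of corpus-based features concrete, the following sketch derives word-level emotion counts from a tiny labelled corpus and turns a new tweet into a six-dimensional vector over Ekman's basic emotions. The training examples, function names, and the argmax decision rule are assumptions made for illustration; the paper's actual features and classifiers are not specified in this excerpt.

```python
from __future__ import annotations

from collections import Counter, defaultdict

# Ekman's six basic emotions.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

# Hypothetical labelled tweets, already tokenised.
TRAIN = [
    (["great", "win", "khush"], "happiness"),
    (["lost", "match", "dukhi"], "sadness"),
    (["great", "match"], "happiness"),
    (["unfair", "result", "gussa"], "anger"),
]


def word_emotion_counts(corpus) -> dict[str, Counter]:
    """Count how often each word occurs under each emotion label."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for tokens, label in corpus:
        for token in tokens:
            counts[token][label] += 1
    return counts


def feature_vector(tokens: list[str], counts: dict[str, Counter]) -> list[int]:
    """Sum the per-emotion counts of every token in the tweet."""
    vector = [0] * len(EMOTIONS)
    for token in tokens:
        per_emotion = counts.get(token, Counter())  # unseen word: all zeros
        for i, emotion in enumerate(EMOTIONS):
            vector[i] += per_emotion[emotion]
    return vector


def predict(tokens: list[str], counts: dict[str, Counter]) -> str | None:
    """Pick the emotion with the largest summed count, or None if no evidence."""
    vector = feature_vector(tokens, counts)
    if not any(vector):
        return None
    return EMOTIONS[vector.index(max(vector))]
```

In practice such a vector would more likely be fed to a trained classifier than used with a bare argmax; the argmax is only the simplest way to show how word frequencies map to an emotion label.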
Results
In this section, the performance of the proposed emotion extraction system is evaluated on the collected datasets, and the meaningful inferences drawn from the datasets are presented. Different test datasets are used to predict results on the basis of events.
In the case of election outcome prediction, two test cases have been formed, based on party name and candidate name. In the first case, the emotion extraction model is applied to derive the emotion towards the CM candidate.
Advantages of proposed work
The proposed models have been used in multiple data-driven applications that focus on the hidden information contained in text. Applications such as topic-based text categorization, summarization, question answering systems, and information retrieval systems can be improved using the proposed method.
Emotion research is widely used in developing affective interfaces, which provide appropriate emotional responses and facilitate online communication through animated affective agents [91], [92].
Conclusion and scope of future work
Public emotions present in social media data offer unique challenges and opportunities for decision-making in different domains. The major contribution of this research is to show that it is feasible to apply intelligent computational techniques to identify and classify various types of emotions in text. An effective technique for data collection and extraction of emotions in social media data has been presented in this paper. Important meaningful inferences are
Vinay Kumar Jain received his Bachelor's Degree in 2009 from Rajiv Gandhi Proudyogiki Vishwavidyala, Bhopal, India and received his Master's Degree from Jaypee University of Engineering and Technology, India in 2012. Now, he is pursuing his Ph.D. degree from Jaypee University of Engineering and Technology, Guna, M.P., India.
References (92)
- et al. Twitter mood predicts the stock market. J. Comput. Sci. (2011)
- et al. An effective approach to track levels of Influenza-A (H1N1) pandemic in India using Twitter. Procedia Comput. Sci. (2015)
- et al. Towards building a social emotion detection system for online news. Future Gener. Comput. Syst. (2014)
- et al. Fusion of sparse representation and dictionary matching for identifications of humans in uncontrolled environment. J. Comput. Biol. Med. (2016)
- et al. Using text mining and sentiment analysis for online forums hotspot detection and forecast. Decis. Support Syst. (2010)
- Applying the integrative model of behavioral prediction and attitude functions in the context of social media use while viewing mediated sports. Comput. Human Behav. (2013)
- et al. World Cup 2014 in the Twitter World: a big data analysis of sentiments in U.S. sports fans’ tweets. Comput. Human Behav. (2015)
- et al. The psychological foundations of the affective lexicon. J. Pers. Soc. Psychol. (1987)
- et al. WordNet-Affect: an affective extension of WordNet
- et al. Identifying expressions of emotion in text
- Making Computers laugh: investigations in automatic humor recognition
- Hierarchical versus flat classification of emotions in text
- An argument for basic emotions. Cogn. Emot.
- Affect, Imagery, Consciousness. The Positive Affects
- Human Emotions
- Emotion: A Psychoevolutionary Synthesis
- The Cognitive Structure of Emotions
- The Number of Rasa
- The Measurement of Meaning
- Linguistics and poetics
- Towards a consensual structure of mood. Psychol. Bull.
- The language of emotions: an analysis of a semantic field. Cogn. Emot.
- WordNet: An Electronic Lexical Database
- Affective Norms for English Words (ANEW): Stimuli, Instruction Manual and Affective Ratings. Technical Report C-1, Gainesville, FL
- Words with attitude
- A model of textual affect sensing using real-world knowledge
- The Language of Evaluation: Appraisal in English
- Emotions from text: machine learning for text-based emotion prediction
- Experiments with mood classification in blog posts
- The affective weight of lexicon
- A corpus-based approach to finding happiness
- Exploitation in affect detection in open-ended improvisational text
- Using emoticons to reduce dependency in machine learning techniques for sentiment classification
- Analysis of affect expressed through the evolving language of online communication
- UPAR7: a knowledge-based system for headline sentiment tagging
- A high-order hidden Markov model for emotion detection from textual data
- Knowledge Management and Acquisition for Intelligent Systems
- Emotion extraction from real time chat messenger
- Emotion recognition from text based on automatically generated rules
- Comparative Geospatial Analysis of Twitter Sentiment Data During the 2008 and 2012 U.S. Presidential Elections. Master Thesis
- Using Roget's Thesaurus for fine-grained emotion recognition
- We feel: taking the emotional pulse of the world
- Opinion mining and sentiment analysis. Found. Trends Inf. Retrieval
- Recognizing Emotions in Text, Master Thesis
- Towards early discovery of salient health threats: a social media emotion classification technique
- Influence factor based opinion mining of Twitter data using supervised learning
Shishir Kumar is working as a Professor in the Department of Computer Science and Engineering at Jaypee University of Engineering and Technology, Guna, M.P., India. He earned his Ph.D. in Computer Science in 2005. He has 14 years of teaching and research experience.
Steven Fernandes is a member of the Core Research Group, Karnataka Government Research Centre of Sahyadri College of Engineering and Management, Mangalore, Karnataka. He received the Young Scientist Award from the Vision Group on Science and Technology, Government of Karnataka, India in 2014, and also received a grant from The Institution of Engineers (India), Kolkata, India. He completed his B.E. (Electronics and Communication Engineering) with Distinction at Visvesvaraya Technological University, Belagavi, Karnataka and his M.Tech. (Microelectronics) with Distinction at Manipal University, Manipal, Karnataka. His Ph.D. work “Match Composite Sketch with Drone Images” has received patent notification (Patent Application Number: 2983/CHE/2015) from the Government of India, Controller General of Patents, Designs & Trade Marks. He has 5 years of industry experience working at STMicroelectronics Pvt. Ltd. and Perform Group Pvt. Ltd. He has published several papers in peer-reviewed international journals with a Thomson Reuters Web of Science Impact Factor and in IEEE, Springer, and Elsevier international conferences. He is also serving as a reviewer and guest editor for several Science Citation Indexed and Scopus Indexed international journals.