Authors:
Sanidhya Vijayvargiya 1; Lov Kumar 2; Lalita Murthy 1; Sanjay Misra 3; Aneesh Krishna 4 and Srinivas Padmanabhuni 5
Affiliations:
1 BITS-Pilani Hyderabad, India
2 NIT Kurukshetra, India
3 Østfold University College, Halden, Norway
4 Curtin University, Australia
5 Testaing.Com, India
Keyword(s):
Sentiment Analysis, Deep Learning, Data Imbalance Methods, Feature Selection, Classification Techniques, Word Embedding.
Abstract:
Sentiment analysis for software engineering (SA4SE) is a research domain with considerable potential, with applications ranging from monitoring the emotional state of developers throughout a project to interpreting user feedback. There are two main approaches to sentiment analysis for this purpose: lexicon-based and machine learning (ML)-based. The lexicon-based approach has already been researched extensively, so this work explores the efficacy of the ML-based approach, using an LSTM model to classify the sentiment of text. Three data sets, StackOverflow, JIRA, and AppReviews, are used to check that performance is consistent across multiple applications of sentiment analysis. The work analyzes how LSTM models perform sentiment prediction across the various kinds of textual content produced in the software engineering industry, with the aim of improving on the predictive ability of existing state-of-the-art models.