Abstract
Social media produces unstructured data whose volume grows exponentially from day to day across many disciplines. Analyzing these data and understanding their semantics is a major challenge because of their variety and sheer volume. To address this gap, this work studies unstructured Arabic texts, which appear abundantly on social media Web sites. It addresses the difficulty of handling unstructured social media texts, particularly when the available data is very limited, by using an intelligent data augmentation technique to compensate for the scarcity of data. The article proposes a novel architecture for classifying and understanding Arabic words based on convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Moreover, the CNN technique is found to be the most powerful for analyzing Arabic tweets and for social network analysis. The classification architecture stacks a character-level CNN and a recurrent neural network on top of one another. Together, these two techniques achieve 95% accuracy on the Arabic texts dataset.
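The stacked design described in the abstract can be illustrated with a minimal pure-Python sketch of one forward pass: characters are mapped to ids, one-hot encoded, passed through a 1D convolution with ReLU and max pooling, and the pooled features are then read by a recurrent layer whose final state feeds a sigmoid output. Everything specific here is an assumption made for illustration and is not taken from the paper: the 28-letter alphabet plus space, the layer sizes, the plain tanh recurrent cell (the authors may well use a gated cell), and the random untrained weights. It shows only the data flow, not the reported model or its accuracy.

```python
import math
import random

# Hypothetical alphabet: basic Arabic letters plus space (an assumption;
# the abstract does not specify the character set).
ALPHABET = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي "
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # 0 = padding/unknown


def encode(text, max_len=16):
    """Map each character to an integer id, then pad or truncate to max_len."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def one_hot(ids, vocab_size):
    """Turn ids into one-hot rows; id 0 becomes an all-zero row."""
    return [[1.0 if i == j + 1 else 0.0 for j in range(vocab_size)] for i in ids]


def conv1d_relu(x, kernels, bias):
    """Valid 1D convolution over the character axis, followed by ReLU.
    x: [T][V], kernels: [F][K][V], bias: [F]; returns [T-K+1][F]."""
    k = len(kernels[0])
    out = []
    for t in range(len(x) - k + 1):
        row = []
        for f, kern in enumerate(kernels):
            s = bias[f]
            for dt in range(k):
                s += sum(w * v for w, v in zip(kern[dt], x[t + dt]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool(x, size):
    """Non-overlapping max pooling along the time axis."""
    return [[max(col) for col in zip(*x[t:t + size])]
            for t in range(0, len(x) - size + 1, size)]


def rnn_last_state(x, w_in, w_rec):
    """Plain tanh recurrent layer; returns the final hidden state."""
    h = [0.0] * len(w_rec)
    for step in x:
        h = [math.tanh(sum(wi * s for wi, s in zip(w_in[j], step)) +
                       sum(wr * hp for wr, hp in zip(w_rec[j], h)))
             for j in range(len(h))]
    return h


def rand_matrix(rng, rows, cols):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def forward(text, n_filters=4, kernel=3, pool=2, hidden=5, seed=0):
    """Character-level CNN feeding a recurrent layer; returns a score in (0, 1)."""
    rng = random.Random(seed)
    v = len(ALPHABET)
    x = one_hot(encode(text), v)
    kernels = [rand_matrix(rng, kernel, v) for _ in range(n_filters)]
    feats = max_pool(conv1d_relu(x, kernels, [0.0] * n_filters), pool)
    h = rnn_last_state(feats, rand_matrix(rng, hidden, n_filters),
                       rand_matrix(rng, hidden, hidden))
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    z = sum(w * hj for w, hj in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print(forward("سلام"))  # untrained score, meaningful only as a shape check
```

Working at the character level means the input layer needs no word segmentation or vocabulary, which is one plausible reason such a design suits noisy social media text. In practice the same stack would normally be built and trained in a deep learning framework instead of by hand.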
Index Terms
- Subword Attentive Model for Arabic Sentiment Analysis: A Deep Learning Approach