Abstract
Detecting depression or personality traits, tutoring and student behaviour systems, and identifying cases of cyberbullying are a few of the wide range of applications in which the automatic detection of emotion is crucial. This task can benefit business, society, politics and education. The main objective of our research is to improve the supervised emotion detection systems developed so far by defining and implementing a technique to annotate large-scale English emotional corpora automatically and with high standards of reliability. Our proposal is based on a bootstrapping process made up of two main steps: the creation of the seed using the NRC Emotion Lexicon and its extension employing distributional semantic similarity through word embeddings. The results obtained are promising and allow us to confirm the soundness of the bootstrapping technique combined with word embeddings to label emotional corpora automatically.
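The two-step bootstrapping process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the toy vectors, the function names, the "exactly one emotion" seed rule, the averaged sentence vectors and the 0.95 threshold are all assumptions chosen for illustration. A real system would use the NRC Emotion Lexicon and pretrained word embeddings instead.

```python
import math

# Hypothetical toy stand-in for an emotion lexicon (word -> emotion).
TOY_LEXICON = {"happy": "joy", "cry": "sadness", "furious": "anger"}

# Hypothetical toy word vectors; real embeddings would be pretrained.
TOY_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.85, 0.15, 0.05],
    "cry": [0.1, 0.9, 0.0],
    "weep": [0.12, 0.88, 0.05],
    "furious": [0.0, 0.1, 0.9],
    "table": [0.4, 0.4, 0.4],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_vector(tokens):
    """Average the vectors of the known tokens; None if no token is known."""
    vecs = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vecs:
        return None
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def create_seed(sentences):
    """Step 1: label sentences whose lexicon words agree on one emotion."""
    seed, unlabeled = [], []
    for sentence in sentences:
        tokens = sentence.lower().split()
        emotions = {TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON}
        if len(emotions) == 1:
            seed.append((sentence, emotions.pop()))
        else:
            unlabeled.append(sentence)
    return seed, unlabeled


def extend_seed(seed, unlabeled, threshold=0.95):
    """Step 2: give an unlabeled sentence the label of its most similar
    seed sentence, provided the similarity reaches the threshold."""
    labeled = list(seed)
    for sentence in unlabeled:
        vec = sentence_vector(sentence.lower().split())
        if vec is None:
            continue  # no known words, so nothing to compare
        best_label, best_sim = None, threshold
        for seed_sentence, label in seed:
            seed_vec = sentence_vector(seed_sentence.lower().split())
            if seed_vec is None:
                continue
            sim = cosine(vec, seed_vec)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label is not None:
            labeled.append((sentence, best_label))
    return labeled


corpus = [
    "I feel happy today",
    "They cry a lot",
    "I am so glad",
    "We weep quietly",
    "The table is wooden",
]
seed, unlabeled = create_seed(corpus)
annotated = dict(extend_seed(seed, unlabeled))
```

In this sketch the first two sentences form the seed, the next two are labeled through embedding similarity, and the last stays unlabeled because it is not close enough to any seed sentence. The threshold trades coverage against label reliability.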
Acknowledgment
This research has been supported by the FPI grant (BES-2013-065950) and the research stay grant (EEBB-I-15-10108) from the Spanish Ministry of Science and Innovation. It has also been funded by the Spanish Government (DIGITY ref. TIN2015-65136-C02-2-R) and the Valencian Government (grant no. PROMETEOII/2014/001).
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Canales, L., Strapparava, C., Boldrini, E., Martínez-Barco, P. (2017). Bootstrapping Technique + Embeddings = Emotional Corpus Annotated Automatically. In: Quesada, J., Martín Mateos, F.J., López Soto, T. (eds) Future and Emerging Trends in Language Technology. Machine Learning and Big Data. FETLT 2016. Lecture Notes in Computer Science, vol 10341. Springer, Cham. https://doi.org/10.1007/978-3-319-69365-1_9
DOI: https://doi.org/10.1007/978-3-319-69365-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69364-4
Online ISBN: 978-3-319-69365-1
eBook Packages: Computer Science, Computer Science (R0)