Bootstrapping Technique + Embeddings = Emotional Corpus Annotated Automatically

  • Conference paper
Future and Emerging Trends in Language Technology. Machine Learning and Big Data (FETLT 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10341)

Abstract

Automatic emotion detection is crucial in a wide range of applications, such as detecting depression or personality traits, tutoring and student behaviour systems, and identifying cases of cyberbullying. It can therefore benefit business, society, politics and education. The main objective of our research is to improve the supervised emotion detection systems developed so far by defining and implementing a technique to annotate large-scale English emotional corpora automatically and with high standards of reliability. Our proposal is based on a bootstrapping process made up of two main steps: the creation of the seed using the NRC Emotion Lexicon, and its extension employing distributional semantic similarity through word embeddings. The results obtained are promising and allow us to confirm the soundness of combining the bootstrapping technique with word embeddings to label emotional corpora automatically.
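
The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, the 3-dimensional word vectors, the averaging of word vectors into a sentence vector, the cosine threshold of 0.9, and all function names are assumptions made only for illustration. A real setup would presumably load the NRC Emotion Lexicon and pre-trained word embeddings instead of the toy data below.

```python
import math

# Toy stand-in for an emotion lexicon (hypothetical entries, NOT the real NRC data).
TOY_LEXICON = {"happy": "joy", "delighted": "joy", "afraid": "fear", "scared": "fear"}

# Toy 3-dimensional word vectors (hypothetical values, not trained embeddings).
TOY_VECTORS = {
    "happy": (0.9, 0.1, 0.0),
    "glad": (0.85, 0.15, 0.05),
    "scared": (0.0, 0.95, 0.1),
    "terrified": (0.05, 0.9, 0.05),
    "table": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sentence_vector(tokens):
    """Average the vectors of known tokens; None if no token has a vector."""
    vecs = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vecs:
        return None
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def create_seed(sentences):
    """Step 1: label a sentence when its lexicon words agree on one emotion."""
    seed, unlabeled = [], []
    for tokens in sentences:
        emotions = {TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON}
        if len(emotions) == 1:
            seed.append((tokens, emotions.pop()))
        else:
            unlabeled.append(tokens)
    return seed, unlabeled


def extend_seed(seed, unlabeled, threshold=0.9):
    """Step 2: give an unlabeled sentence the label of its most similar
    seed sentence, provided the similarity reaches the threshold."""
    labeled = list(seed)
    for tokens in unlabeled:
        vec = sentence_vector(tokens)
        if vec is None:
            continue
        best_label, best_sim = None, 0.0
        for seed_tokens, label in seed:
            seed_vec = sentence_vector(seed_tokens)
            if seed_vec is None:
                continue
            sim = cosine(vec, seed_vec)
            if sim > best_sim:
                best_label, best_sim = label, sim
        if best_label is not None and best_sim >= threshold:
            labeled.append((tokens, best_label))
    return labeled


SENTENCES = [
    ["i", "am", "happy"],
    ["she", "was", "scared"],
    ["we", "are", "glad"],
    ["he", "felt", "terrified"],
    ["the", "table"],
]

seed, unlabeled = create_seed(SENTENCES)
labeled = extend_seed(seed, unlabeled)
for tokens, label in labeled:
    print(" ".join(tokens), "->", label)
```

In this toy run the lexicon alone labels two sentences, the similarity step adds two more, and the neutral sentence stays unlabeled because it falls below the threshold. Raising the threshold trades coverage for precision, which is the usual tension in bootstrapped annotation.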

Notes

  1. http://wordspace.collocations.de/doku.php/course:acl2010:start

  2. http://www.natcorp.ox.ac.uk/

Acknowledgment

This research has been supported by the FPI grant (BES-2013-065950) and the research stay grant (EEBB-I-15-10108) from the Spanish Ministry of Science and Innovation. It has also been funded by the Spanish Government (DIGITY ref. TIN2015-65136-C02-2-R) and the Valencian Government (grant no. PROMETEOII/2014/001).

Author information

Corresponding author

Correspondence to Lea Canales.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Canales, L., Strapparava, C., Boldrini, E., Martínez-Barco, P. (2017). Bootstrapping Technique + Embeddings = Emotional Corpus Annotated Automatically. In: Quesada, J., Martín Mateos, FJ., López Soto, T. (eds) Future and Emerging Trends in Language Technology. Machine Learning and Big Data. FETLT 2016. Lecture Notes in Computer Science, vol 10341. Springer, Cham. https://doi.org/10.1007/978-3-319-69365-1_9

  • DOI: https://doi.org/10.1007/978-3-319-69365-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69364-4

  • Online ISBN: 978-3-319-69365-1

  • eBook Packages: Computer Science, Computer Science (R0)
