Concatenating or Averaging? Hybrid Sentences Representations for Sentiment Analysis

Orsenigo, Carlotta; Vercellis, Carlo; Volpetti, Claudia

doi:10.1007/978-3-030-03493-1_59

Conference paper
First Online: 09 November 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11314)

Abstract

Performance in sentiment analysis, the task of automatically classifying the huge volume of user opinions generated online, relies heavily on the representation used to transform words or sentences into numbers. In machine learning for sentiment analysis the most common representation is the bag of words (BOW) model, which works well in practice but is essentially a lexical conversion. Another well-known method is Word2vec, which instead attempts to capture the meaning of terms. Because the two models encode complementary information, the knowledge offered by Word2vec can enrich the information contained in the BOW scheme. Based on this assumption we designed and tested four hybrid sentence representations that combine the two approaches. Experiments on publicly available datasets confirm the effectiveness of the hybrid embeddings, which led to a stable increase in performance across different sentiment analysis domains.
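
The abstract does not spell out the four hybrid representations, so the following is only a minimal sketch of the two ideas named in the title, concatenating and averaging. The vocabulary, the 3-dimensional word vectors, the term weights, and all function names are hypothetical and made up for illustration; a real setup would load vectors from a trained Word2vec model and compute weights from a corpus.

```python
from collections import Counter

# Hypothetical toy vocabulary and word vectors (not taken from the paper).
VOCAB = ["good", "bad", "movie", "plot"]
EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "bad": [-0.8, 0.2, 0.1],
    "movie": [0.0, 0.7, 0.3],
    "plot": [0.1, 0.6, 0.4],
}
# Hypothetical per-term weights, standing in for something like TF-IDF scores.
TERM_WEIGHTS = {"good": 2.0, "bad": 2.0, "movie": 1.0, "plot": 1.0}
DIM = 3


def bow_vector(tokens):
    """Term-count vector over VOCAB; out-of-vocabulary tokens are ignored."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in VOCAB]


def mean_embedding(tokens):
    """Plain average of the word vectors of the tokens found in EMBEDDINGS."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * DIM
    return [sum(column) / len(known) for column in zip(*known)]


def concat_hybrid(tokens):
    """Concatenation: lexical BOW part followed by the averaged semantic part."""
    return bow_vector(tokens) + mean_embedding(tokens)


def weighted_average_hybrid(tokens):
    """Averaging: word vectors averaged with lexical term weights."""
    known = [t for t in tokens if t in EMBEDDINGS and t in TERM_WEIGHTS]
    total = sum(TERM_WEIGHTS[t] for t in known)
    if total == 0:
        return [0.0] * DIM
    return [
        sum(TERM_WEIGHTS[t] * EMBEDDINGS[t][i] for t in known) / total
        for i in range(DIM)
    ]


if __name__ == "__main__":
    sentence = ["good", "movie"]
    print(concat_hybrid(sentence))            # 4 BOW counts + 3 embedding dims
    print(weighted_average_hybrid(sentence))  # 3 embedding dims
```

In this sketch, concatenation keeps the lexical and semantic signals in separate dimensions at the cost of a longer vector, while the weighted average stays at the embedding dimension and uses the lexical information only as weights.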

Author information

Authors and Affiliations

Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Via Lambruschini 4b, 20156, Milan, Italy
Carlotta Orsenigo, Carlo Vercellis & Claudia Volpetti

Corresponding author

Correspondence to Claudia Volpetti.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Autonomous University of Madrid, Madrid, Spain
David Camacho
Campus of Gualtar, University of Minho, Braga, Portugal
Paulo Novais
University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros

About this paper

Cite this paper

Orsenigo, C., Vercellis, C., Volpetti, C. (2018). Concatenating or Averaging? Hybrid Sentences Representations for Sentiment Analysis. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. IDEAL 2018. Lecture Notes in Computer Science, vol 11314. Springer, Cham. https://doi.org/10.1007/978-3-030-03493-1_59

DOI: https://doi.org/10.1007/978-3-030-03493-1_59
Published: 09 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03492-4
Online ISBN: 978-3-030-03493-1
eBook Packages: Computer Science, Computer Science (R0)
