Abstract
Word embeddings are low-dimensional distributed representations produced by language modeling and feature learning techniques in Natural Language Processing (NLP): words or phrases from the vocabulary are mapped to vectors of real numbers in a low-dimensional space. In previous work, we proposed using an Extreme Learning Machine (ELM) to generate word embeddings. In this research, we apply ELM-based word embeddings to the NLP task of Text Categorization, specifically Sentiment Analysis and Sequence Labeling. The ELM-based approach is count-based, similar to the Global Vectors (GloVe) model: a word-context matrix is computed and then matrix factorization is applied. We conduct a comparative study with Word2Vec and GloVe, the two popular state-of-the-art models. The results show that ELM-based word embeddings slightly outperform both methods on the Sentiment Analysis and Sequence Labeling tasks, while requiring only one hyperparameter where the other methods require several. In addition, the count-based ELM model yields word similarities resembling both the count-based GloVe and the prediction-based Word2Vec models, with subtle differences.
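The count-then-factorize idea summarized in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' exact method: the corpus, window size, embedding dimension, and the use of a sigmoid hidden layer with a pseudoinverse solve are all assumptions standing in for the general ELM recipe (random untrained input weights, analytically solved output weights) applied to a word-context co-occurrence matrix.

```python
import numpy as np

# Toy corpus and vocabulary (illustrative only).
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
tokens = [t for sent in corpus for t in sent.split()]
vocab = sorted(set(tokens))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# 1) Count-based step: symmetric word-context co-occurrence matrix
#    accumulated within a fixed window (window size is an assumption).
window = 2
C = np.zeros((V, V))
for sent in corpus:
    words = sent.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                C[idx[w], idx[words[j]]] += 1.0

# 2) ELM-style factorization sketch: project co-occurrence rows through a
#    random, untrained hidden layer, then solve for the output weights
#    analytically with the Moore-Penrose pseudoinverse.
rng = np.random.default_rng(0)
d = 5                                # embedding dimension (the hyperparameter)
W_in = rng.standard_normal((V, d))   # random input weights, never trained
H = 1.0 / (1.0 + np.exp(-C @ W_in))  # sigmoid hidden-layer activations, V x d
beta = np.linalg.pinv(H) @ C         # least-squares output weights, d x V

embeddings = H                       # one d-dimensional vector per word
print(embeddings.shape)
```

Each row of `H` serves as a low-dimensional word vector; because the input weights are random and the output weights are obtained in closed form, the only quantity to tune is the hidden-layer size `d`, which mirrors the single-hyperparameter claim in the abstract.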
Funding
This work was partially supported by the National Natural Science Foundation of China under Grant No. 61502338.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Informed Consent
Consent was not required as no humans or animals were involved.
Human and Animal Rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Lauren, P., Qu, G., Yang, J. et al. Generating Word Embeddings from an Extreme Learning Machine for Sentiment Analysis and Sequence Labeling Tasks. Cogn Comput 10, 625–638 (2018). https://doi.org/10.1007/s12559-018-9548-y