Abstract
Language models are a key component of input methods because they provide good suggestions for the next candidate word given the preceding context. Recurrent neural network (RNN) language models achieve state-of-the-art quality, but they are notorious for their large size and high computational cost, and a main source of both parameters and computation is their embedding matrices. In this paper, we propose a sparse-representation-based method to compress the embedding matrices, reducing both the size and the computation of the models. We conduct experiments on the PTB dataset and also measure performance on cellphones to demonstrate the method's effectiveness.
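To make the compression idea concrete, below is a minimal sketch of sparse word representations in the spirit of Chen et al.'s ACL 2016 work on compressing language models with sparse word representations: the most frequent "base" words keep dense embedding vectors, and every remaining word is stored only as a sparse linear combination of those base vectors. The vocabulary sizes, the sparsity level, and the use of orthogonal matching pursuit for fitting are illustrative assumptions, not necessarily the exact construction used in this paper.

```python
# Illustrative sketch only: compress an embedding matrix by keeping dense
# vectors for the most frequent "base" words and encoding every other word
# as a sparse linear combination of base embeddings. Sizes, the sparsity
# level, and the OMP fitting step are assumptions, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(0)

VOCAB_SIZE = 5000   # full vocabulary, assumed sorted by frequency
BASE_SIZE = 1000    # frequent words that keep dense embeddings
EMB_DIM = 200       # embedding dimension
N_NONZERO = 20      # nonzero coefficients stored per rare word

# Stand-in for a trained embedding matrix (one row per word).
embeddings = rng.standard_normal((VOCAB_SIZE, EMB_DIM)).astype(np.float32)
base = embeddings[:BASE_SIZE]  # the only part stored densely

# Fit a sparse code x per rare word so that base.T @ x ~= its embedding;
# only the (index, value) pairs of the nonzeros need to be stored.
sparse_codes = {}
for w in range(BASE_SIZE, VOCAB_SIZE):
    x = orthogonal_mp(base.T, embeddings[w], n_nonzero_coefs=N_NONZERO)
    idx = np.flatnonzero(x)
    sparse_codes[w] = (idx.astype(np.int32), x[idx].astype(np.float32))

def lookup(w):
    """Reconstruct word w's embedding from the compressed representation."""
    if w < BASE_SIZE:
        return base[w]
    idx, vals = sparse_codes[w]
    return vals @ base[idx]  # sparse combination of base embeddings

dense_params = VOCAB_SIZE * EMB_DIM
compressed = BASE_SIZE * EMB_DIM + (VOCAB_SIZE - BASE_SIZE) * N_NONZERO * 2
print(f"parameter ratio: {dense_params / compressed:.1f}x smaller")
```

The printed ratio counts parameters only; in a deployed input method, the reconstruction in lookup could additionally be cached for the words a user actually types.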
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this paper
Ruan, C., Liu, Y. (2023). Sparse Word Representation for RNN Language Models on Cellphones. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13397. Springer, Cham. https://doi.org/10.1007/978-3-031-23804-8_5
DOI: https://doi.org/10.1007/978-3-031-23804-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23803-1
Online ISBN: 978-3-031-23804-8