Abstract
In this paper, we propose a dictionary screening method for embedding compression in text classification. The key idea is to evaluate the importance of each keyword in the dictionary. To this end, we first train a pre-specified recurrent neural network-based model using the full dictionary. This yields a benchmark model, which we use to obtain the predicted class probabilities for each sample in a dataset. Next, we develop a novel method for assessing how strongly each keyword affects these predicted class probabilities. Each keyword can then be screened, and only the most important keywords are retained. With these screened keywords, a new dictionary of considerably reduced size can be constructed, and the original text sequences can be substantially compressed. The proposed method leads to significant reductions in the number of parameters, the average text sequence length, and the dictionary size, while the prediction power remains highly competitive with the benchmark model. Extensive numerical studies are presented to demonstrate the empirical performance of the proposed method.
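The screening procedure described above can be illustrated with a minimal sketch. The abstract does not specify the exact importance criterion, so here we assume, for illustration only, that a keyword's importance is the average absolute change in the benchmark model's predicted class probabilities when that keyword is masked from the inputs; `predict_proba`, `screen_dictionary`, and `toy_model` are hypothetical names standing in for the benchmark model and the screening step.

```python
import numpy as np

def screen_dictionary(texts, dictionary, predict_proba, keep_ratio=0.2):
    """Rank keywords by how much masking them perturbs the benchmark
    model's predictions, then keep only the top fraction.

    Assumption: importance = mean |change in class probabilities| when
    the keyword is removed; the paper's actual criterion may differ.
    """
    base = predict_proba(texts)  # benchmark class probabilities
    scores = {}
    for word in dictionary:
        masked = [[w for w in t if w != word] for t in texts]
        probs = predict_proba(masked)
        scores[word] = float(np.mean(np.abs(probs - base)))
    ranked = sorted(dictionary, key=lambda w: scores[w], reverse=True)
    k = max(1, int(keep_ratio * len(dictionary)))
    return ranked[:k], scores

# Toy stand-in for the benchmark RNN: predictions depend only on "good".
def toy_model(texts):
    return np.array([[0.9, 0.1] if "good" in t else [0.5, 0.5] for t in texts])

texts = [["good", "movie"], ["bad", "movie"], ["good", "film"]]
kept, scores = screen_dictionary(
    texts, ["good", "bad", "movie", "film"], toy_model, keep_ratio=0.25
)
print(kept)  # the keyword whose removal most changes predictions survives
```

In practice the screened keywords define the reduced dictionary, and text sequences are re-encoded against it, shrinking both the embedding matrix and the average sequence length.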
Acknowledgements
Zhou’s research is supported in part by the National Natural Science Foundation of China (Nos. 72171226, 11971504) and the Beijing Municipal Social Science Foundation (No. 19GLC052). Wang’s research is partially supported by the National Natural Science Foundation of China (Nos. 12271012, 11831008) and the Open Research Fund of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science (KLATASDS-MOE-ECNU-KLATASDS2101).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhou, J., Jing, X., Liu, M., Wang, H. (2023). Compressing the Embedding Matrix by a Dictionary Screening Approach in Text Classification. In: Kashima, H., Ide, T., Peng, W.C. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2023. Lecture Notes in Computer Science, vol. 13935. Springer, Cham. https://doi.org/10.1007/978-3-031-33374-3_36
Print ISBN: 978-3-031-33373-6
Online ISBN: 978-3-031-33374-3