Abstract
This work presents the results of a comparison of text representations used for short text classification with SVM and neural networks when challenged with imbalanced data. We analyze both direct and indirect methods for selecting the proper category and improve them with various representation techniques. As a baseline, we set up a bag-of-words (BOW) method and then use more sophisticated approaches: word embeddings and transformer-based representations. The study was carried out on a dataset from the legal domain, where the task was to select the topic of a discussion with a lawyer. The experiments indicate that a pre-trained BERT model fine-tuned for this task gives the best results.
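To make the comparison concrete, the sketch below contrasts the two ends of the spectrum studied here: a bag-of-words baseline feeding a linear SVM, and fine-tuning a pre-trained BERT classifier. It is a minimal illustration under stated assumptions, not the authors' pipeline; the model checkpoint, example texts and topic labels are hypothetical placeholders.

# Illustrative sketch only: checkpoint, texts and labels are placeholders,
# not the paper's actual data or configuration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

texts = ["question about inheritance", "dispute over a rental contract"]  # placeholder short texts
labels = [0, 1]                                                           # placeholder topic ids

# Baseline: bag-of-words (TF-IDF) features fed to a linear SVM.
bow_svm = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
bow_svm.fit(texts, labels)
print(bow_svm.predict(["question about a will"]))

# Transformer-based approach: fine-tune a pre-trained BERT checkpoint
# (multilingual BERT chosen here only for illustration) as a sequence classifier.
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)

enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
target = torch.tensor(labels)
optim = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few fine-tuning steps on the toy batch
    optim.zero_grad()
    out = model(**enc, labels=target)  # cross-entropy loss over topic labels
    out.loss.backward()
    optim.step()

In practice the SVM baseline needs only word counts, while the BERT variant adapts all transformer weights to the topic-labelling task, which is what the reported experiments favour on this data.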
Acknowledgments
The work was supported by funds from the project "A semi-autonomous system for generating legal advice and opinions based on automatic query analysis using the transformer-type deep neural network architecture with multitasking learning", POIR.01.01.01-00-1965/20.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zymkowski, T. et al. (2023). Short Texts Representations for Legal Domain Classification. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2022. Lecture Notes in Computer Science, vol 13588. Springer, Cham. https://doi.org/10.1007/978-3-031-23492-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23491-0
Online ISBN: 978-3-031-23492-7
eBook Packages: Computer Science; Computer Science (R0)