
Short Texts Representations for Legal Domain Classification

  • Conference paper
  • First Online:
Artificial Intelligence and Soft Computing (ICAISC 2022)

Abstract

This work presents the results of a comparison of text representations used for short text classification with an SVM and a neural network when challenged with imbalanced data. We analyze both direct and indirect methods for selecting the proper category and improve them with various representation techniques. As a baseline, we set up a BOW method and then use more sophisticated approaches: word embeddings and transformer-based models. The study was done on a dataset from the legal domain, where the task was to select the topic of a discussion with a lawyer. The experiments indicate that a pre-trained BERT model fine-tuned for this task gives the best results.
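The abstract compares a bag-of-words baseline classified with an SVM against word-embedding and transformer-based representations. Below is a minimal sketch of such a BOW (TF-IDF) + linear SVM baseline, assuming scikit-learn; the short texts and topic labels are hypothetical placeholders rather than the paper's legal-advice dataset, and class_weight="balanced" is shown only as one common way to account for imbalanced classes.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical short legal queries and topic labels (illustrative only).
texts = [
    "question about inheritance of an apartment",
    "dispute with employer over unpaid overtime",
    "divorce and custody of the children",
]
labels = ["civil", "labour", "family"]

# Bag-of-words (TF-IDF) representation feeding a linear SVM classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LinearSVC(class_weight="balanced"),
)
model.fit(texts, labels)
print(model.predict(["who inherits the flat after the owner dies"]))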



Acknowledgments

The work was supported by funds of the project "A semi-autonomous system for generating legal advice and opinions based on automatic query analysis using the transformer-type deep neural network architecture with multitasking learning", POIR.01.01.01-00-1965/20.

Author information

Corresponding author

Correspondence to Julian Szymański.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zymkowski, T. et al. (2023). Short Texts Representations for Legal Domain Classification. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds.) Artificial Intelligence and Soft Computing. ICAISC 2022. Lecture Notes in Computer Science, vol 13588. Springer, Cham. https://doi.org/10.1007/978-3-031-23492-7_10


  • DOI: https://doi.org/10.1007/978-3-031-23492-7_10

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23491-0

  • Online ISBN: 978-3-031-23492-7

  • eBook Packages: Computer Science, Computer Science (R0)
