skip to main content
10.1145/3628797.3628965acmotherconferencesArticle/Chapter ViewAbstractPublication PagessoictConference Proceedingsconference-collections
research-article

A Question-Answering System for Vietnamese Public Administrative Services

Published:07 December 2023Publication History

ABSTRACT

In the realm of legal question-answering (QA) systems, information retrieval (IR) plays a pivotal role. Despite thorough research in numerous languages, the Vietnamese research community has shown limited interest in legal information retrieval, particularly in the context of public administrative services. In this paper, we propose the development of a QA system tailored to the Vietnamese language, specifically focusing on the domain of public administrative services. Our system provides legal-based responses, and it is built upon a combination of retrieval and re-ranking techniques. We employ both lexical-based and semantic-based retrieval models and integrate them to create the final model. Our research shows that the system outperforms existing models in retrieving public administrative information and answering questions related to Vietnamese legal documents.

References

  1. Arian Askari and Suzan Verberne. 2021. Combining Lexical and Neural Retrieval with Longformer-based Summarization for Effective Case Law Retrieval.. In DESIRES. 162–170.Google ScholarGoogle Scholar
  2. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).Google ScholarGoogle Scholar
  3. Tianyu Gao, Xingcheng Yao, and Danqi Chen. 2021. Simcse: Simple contrastive learning of sentence embeddings. arXiv preprint arXiv:2104.08821 (2021).Google ScholarGoogle Scholar
  4. Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data 7, 3 (2019), 535–547.Google ScholarGoogle ScholarCross RefCross Ref
  5. Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906 (2020).Google ScholarGoogle Scholar
  6. Phi Manh Kien, Ha-Thanh Nguyen, Ngo Xuan Bach, Vu Tran, Minh Le Nguyen, and Tu Minh Phuong. 2020. Answering Legal Questions by Learning Neural Attentive Text Representation. In Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online), 988–998. https://doi.org/10.18653/v1/2020.coling-main.86Google ScholarGoogle ScholarCross RefCross Ref
  7. Raghavan P. Manning, C.D. and H. Schutze. 2008. Introduction to Information Retrieval. Cambridge University Press, Cambridge.Google ScholarGoogle Scholar
  8. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013).Google ScholarGoogle Scholar
  9. Dat Quoc Nguyen and Anh Tuan Nguyen. 2020. PhoBERT: Pre-trained language models for Vietnamese. arXiv preprint arXiv:2003.00744 (2020).Google ScholarGoogle Scholar
  10. Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and Rabab Ward. 2016. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval. IEEE/ACM Transactions on Audio, Speech, and Language Processing 24, 4 (2016), 694–707.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Jingfang Xu, and Xueqi Cheng. 2017. Deeprank: A new deep architecture for relevance ranking in information retrieval. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. 257–266.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 1532–1543.Google ScholarGoogle ScholarCross RefCross Ref
  13. Nhat-Minh Pham, Ha-Thanh Nguyen, and Trong-Hop Do. 2022. Multi-stage Information Retrieval for Vietnamese Legal Texts. arXiv preprint arXiv:2209.14494 (2022).Google ScholarGoogle Scholar
  14. Nils Reimers and Iryna Gurevych. 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084 (2019).Google ScholarGoogle Scholar
  15. Stephen Robertson, Hugo Zaragoza, and Michael Taylor. 2004. Simple BM25 extension to multiple weighted fields. In Proceedings of the thirteenth ACM international conference on Information and knowledge management. 42–49.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Gerard Salton and Christopher Buckley. 1988. Term-weighting approaches in automatic text retrieval. Information processing & management 24, 5 (1988), 513–523.Google ScholarGoogle Scholar
  17. Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Grégoire Mesnil. 2014. A latent semantic model with convolutional-pooling structure for information retrieval. In Proceedings of the 23rd ACM international conference on conference on information and knowledge management. 101–110.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Zeynep Akkalyoncu Yilmaz, Shengjin Wang, Wei Yang, Haotian Zhang, and Jimmy Lin. 2019. Applying BERT to document retrieval with birch. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations. 19–24.Google ScholarGoogle Scholar

Index Terms

  1. A Question-Answering System for Vietnamese Public Administrative Services
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Other conferences
          SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology
          December 2023
          1058 pages
          ISBN:9798400708916
          DOI:10.1145/3628797

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 7 December 2023

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate147of318submissions,46%
        • Article Metrics

          • Downloads (Last 12 months)19
          • Downloads (Last 6 weeks)4

          Other Metrics

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        HTML Format

        View this article in HTML Format .

        View HTML Format