A Natural Language Processing Survey on Legislative and Greek Documents

Authors:
Panteleimon Krasadakis
Department of Informatics, University of Piraeus, Greece

Evangelos Sakkopoulos
Department of Informatics, University of Piraeus, Greece

Vassilios S. Verykios
School of Science and Technology, Hellenic Open University, Greece

PCI '21: Proceedings of the 25th Pan-Hellenic Conference on Informatics, November 2021, Pages 407–412. https://doi.org/10.1145/3503823.3503898

Published: 22 February 2022

ABSTRACT

Natural Language Processing (NLP) is developing rapidly alongside the complex applications that rely on it, and those applications will depend on it even more in the future. The field poses many challenges that require the attention of both researchers and businesses. State-of-the-art approaches usually involve deep neural networks. Our work is a rigorous survey of the literature in the field, focusing on legal and Greek documents. We also present the current challenges of the field and some future considerations.

Published in
PCI '21: Proceedings of the 25th Pan-Hellenic Conference on Informatics
November 2021
499 pages
ISBN: 9781450395557
DOI: 10.1145/3503823
Editors:
Michael Gr. Vassilakopoulos,
Nikitas N. Karanikolas,
George Stamoulis,
Vassilios S. Verykios,
Cleo Sgouropoulou
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Deep Learning
Information Extraction
Natural Language Processing
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 190 of 390 submissions, 49%