ABSTRACT
Natural Language Processing (NLP) is developing rapidly alongside the complex applications that make use of it, and these applications will depend on it even more in the future. The field poses many challenges that require the attention of both researchers and businesses. State-of-the-art approaches usually involve deep neural networks. Our work is a rigorous review of the literature in the field, focusing on legal and Greek documents. We also present the field's current challenges and some future considerations.
Index Terms
- A Natural Language Processing Survey on Legislative and Greek Documents