DOI: 10.1145/2514601.2514635
research-article

A system for classifying multi-label text into EuroVoc

Published: 10 June 2013

Abstract

In this work we present a working system for the automatic classification of text documents into the EuroVoc multilingual thesaurus. EuroVoc contains around 7,000 categories at different levels of specificity. The system relies on a simple approach to multi-label texts, in which each document may have more than one associated category. The classifier is based on the well-known Support Vector Machine algorithm and is trained on the JRC-Acquis corpus, which contains around 23,000 documents, each labeled with six EuroVoc categories on average. The demonstration scenario will show the system's ability to classify documents retrieved on site from the Eur-Lex web portal of the European Union, together with features for visualizing and navigating the texts at different levels of granularity.
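The abstract does not say which multi-label strategy the system uses, so the following is only a minimal sketch of one common "simple approach", binary relevance, in which one independent binary classifier is trained per category and a document receives every category whose classifier fires. To stay dependency-free, the sketch swaps the Support Vector Machine for a perceptron-style learner that updates whenever the margin falls below 1; it is a stand-in, not the authors' classifier. The toy corpus, the label names, and all function names are hypothetical and are not taken from JRC-Acquis or EuroVoc.

```python
from collections import Counter

# Hypothetical toy corpus: (text, set of labels). Invented for illustration only.
CORPUS = [
    ("fishing quota vessels", {"fisheries"}),
    ("import tariff customs", {"trade"}),
    ("fishing vessels import tariff", {"fisheries", "trade"}),
    ("farm subsidy crops", {"agriculture"}),
    ("farm crops import customs", {"agriculture", "trade"}),
]


def featurize(text):
    """Bag-of-words term counts, a toy stand-in for a vector space model."""
    return Counter(text.lower().split())


def score(weights, bias, features):
    """Linear decision value w.x + b for a sparse feature vector."""
    return bias + sum(weights.get(t, 0.0) * c for t, c in features.items())


def train_binary(samples, targets, epochs=100):
    """Margin perceptron (NOT a true SVM): update whenever y * score < 1."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, y in zip(samples, targets):
            if y * score(weights, bias, features) < 1.0:
                for term, count in features.items():
                    weights[term] = weights.get(term, 0.0) + y * count
                bias += y
    return weights, bias


def train_binary_relevance(corpus):
    """Train one independent binary model per label (binary relevance)."""
    samples = [featurize(text) for text, _ in corpus]
    labels = sorted(set().union(*(labs for _, labs in corpus)))
    models = {}
    for label in labels:
        # +1 if the document carries this label, -1 otherwise.
        targets = [1 if label in labs else -1 for _, labs in corpus]
        models[label] = train_binary(samples, targets)
    return models


def predict(models, text):
    """Return every label whose binary model gives a positive score."""
    features = featurize(text)
    return {lab for lab, (w, b) in models.items() if score(w, b, features) > 0}


def label_cardinality(corpus):
    """Average number of labels per document."""
    return sum(len(labs) for _, labs in corpus) / len(corpus)


MODELS = train_binary_relevance(CORPUS)
```

Calling `predict(MODELS, text)` returns a set of labels, which may be empty or contain several categories. `label_cardinality` computes the kind of statistic the abstract reports for JRC-Acquis (about six categories per document). With roughly 7,000 categories, binary relevance means training thousands of independent classifiers, which is simple and parallelizable but ignores correlations between categories.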

Published In

ICAIL '13: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Law
June 2013
277 pages
ISBN:9781450320801
DOI:10.1145/2514601
  • Conference Chair: Enrico Francesconi
  • Program Chair: Bart Verheij

Sponsors

  • ITTIG-CNR: Istituto di Teoria e Tecniche dell'Informazione Giuridica - Consiglio Nazionale delle Ricerche
  • IAAIL: International Association for Artificial Intelligence and Law

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

ICAIL '13 Paper Acceptance Rate 17 of 53 submissions, 32%;
Overall Acceptance Rate 69 of 169 submissions, 41%
