Relation-Level Vector Representation for Relation Extraction and Classification on Specialized Data

Gosset, Camille; Billami, Mokhtar Boumedyen; Lafourcade, Mathieu; Bortolaso, Christophe; Derras, Mustapha

doi:10.1007/978-3-031-08530-7_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13343)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

Over the last decade, word embedding models, which learn continuous vector representations of words, have become established and have been integrated into many Natural Language Processing (NLP) applications. These models have since been extended to learn representations of other textual objects, such as word senses or definitions, fragments of documents, and even whole texts. In this paper, we focus on creating continuous vector representations for relations. We propose a model in which relation embeddings are derived from (multi-)word embeddings. These representations are trained on a business corpus covering several specialized domains, such as health, justice, urbanism, and elections. Their quality is evaluated on the task of identifying and classifying lexical-semantic relations in texts, using binary classifiers (machine learning techniques). Given a relation embedding as input, the task is to predict the relation type it represents. The results are good and surpass those of a recent state-of-the-art system dedicated to building relation representations.
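
To make the pipeline described in the abstract more concrete, here is a minimal sketch in Python. The abstract does not say how relation vectors are composed or which classifiers are used, so everything below is an assumption made for illustration: the toy embeddings in `EMB` are invented, `term_vector` averages the word vectors of a multi-word term, `relation_vector` concatenates the two term vectors with their difference, and a plain perceptron stands in for the binary classifier of one hypothetical relation type ("is-a").

```python
# Toy 3-dimensional word embeddings; the values are invented for illustration.
EMB = {
    "heart": [0.9, 0.1, 0.0],
    "organ": [0.8, 0.3, 0.1],
    "polling": [0.1, 0.9, 0.2],
    "station": [0.0, 0.8, 0.4],
    "building": [0.1, 0.6, 0.7],
}


def term_vector(term):
    """Average the word vectors of a (multi-)word term (assumed composition)."""
    words = term.split()
    return [sum(col) / len(words) for col in zip(*(EMB[w] for w in words))]


def relation_vector(head, tail):
    """Assumed relation embedding: head, tail, and (tail - head), concatenated."""
    h, t = term_vector(head), term_vector(tail)
    return h + t + [b - a for a, b in zip(h, t)]


def score(model, x):
    """Signed score of a linear model (weights, bias) on feature vector x."""
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_perceptron(samples, labels, max_epochs=1000):
    """Train a binary perceptron; labels are +1 (has the relation) or -1."""
    model = ([0.0] * len(samples[0]), 0.0)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if y * score(model, x) <= 0:  # misclassified: move toward y
                weights, bias = model
                model = ([w + y * xi for w, xi in zip(weights, x)], bias + y)
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return model


def is_relation(model, head, tail):
    """Binary decision: does the pair (head, tail) carry the relation?"""
    return score(model, relation_vector(head, tail)) > 0


# Hypothetical training pairs for an "is-a" classifier; reversed pairs are negatives.
PAIRS = [
    ("heart", "organ"),
    ("polling station", "building"),
    ("organ", "heart"),
    ("building", "polling station"),
]
LABELS = [1, 1, -1, -1]
MODEL = train_perceptron([relation_vector(h, t) for h, t in PAIRS], LABELS)
```

In a setting like the one the abstract describes, one such binary classifier would be trained per relation type, and `is_relation(MODEL, "heart", "organ")` would be the per-type decision for a candidate pair.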

Author information

Authors and Affiliations

Berger-Levrault, 64 Rue Jean Rostand, 31670, Labège, France
Camille Gosset, Mokhtar Boumedyen Billami, Christophe Bortolaso & Mustapha Derras
LIRMM, 161 Rue Ada, 34095, Montpellier, France
Camille Gosset & Mathieu Lafourcade

Corresponding author

Correspondence to Camille Gosset.

Editor information

Editors and Affiliations

i-SOMET, Inc., Morioka-shi, Iwate, Japan
Hamido Fujita
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, China
Philippe Fournier-Viger
Texas State University, San Marcos, TX, USA
Moonis Ali
Shanghai University of Finance and Economics, Shanghai, China
Yinglin Wang

About this paper

Cite this paper

Gosset, C., Billami, M.B., Lafourcade, M., Bortolaso, C., Derras, M. (2022). Relation-Level Vector Representation for Relation Extraction and Classification on Specialized Data. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_27

DOI: https://doi.org/10.1007/978-3-031-08530-7_27
Published: 30 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08529-1
Online ISBN: 978-3-031-08530-7
eBook Packages: Computer Science, Computer Science (R0)
