Word embeddings-based transfer learning for boosted relational dependency networks

Luca, Thais; Paes, Aline; Zaverucha, Gerson

doi:10.1007/s10994-023-06404-y

Published: 20 September 2023

Machine Learning, volume 113, pages 1269–1302 (2024)

Abstract

Conventional machine learning methods assume data to be independent and identically distributed (i.i.d.) and ignore the relational structure of the data, which contains crucial information about how objects participate in relationships and events. Statistical Relational Learning (SRL) combines statistical and probabilistic modeling with relational learning to represent and learn in domains with complex relational and rich probabilistic structure. SRL models do not assume data to be i.i.d., but, like conventional machine learning models, they do assume that training and testing data are sampled from the same distribution. Transfer learning has emerged as an essential technique for scenarios where this assumption does not hold. It aims to let a method recognize knowledge previously learned in a source domain and apply it to a new model in a target domain as a starting point for solving a new task. For SRL models, the primary challenge is to transfer the learned structure by mapping the vocabulary across different domains. In this work, we propose TransBoostler, an algorithm that uses pre-trained word embeddings to guide the mapping. We assume that the name of a predicate has a semantic connotation that can be mapped to a vector space model. After the mapping, TransBoostler employs theory revision to adapt the mapped model to the target data. Experimental results show that TransBoostler successfully transfers trees across different domains. It performs as well as, or better than, previous systems and requires less training time in some of the investigated scenarios.
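To make the idea of mapping predicate names into a vector space concrete, the following is a minimal sketch and not the TransBoostler implementation. It assumes that a predicate name can be split into words on underscores, that a predicate vector is the average of its word vectors, and that cosine similarity is the comparison measure. The tiny `TOY_VECTORS` table and all function names are hypothetical stand-ins for real pre-trained embeddings.

```python
import math

# Hypothetical 3-dimensional vectors; a real run would load pre-trained embeddings.
TOY_VECTORS = {
    "actor": [0.9, 0.1, 0.0],
    "director": [0.8, 0.3, 0.1],
    "movie": [0.1, 0.2, 0.9],
}


def embed(words, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predicate_similarity(source, target, vectors=TOY_VECTORS):
    """Similarity of two predicate names, assuming underscore-separated words."""
    u = embed(source.split("_"), vectors)
    v = embed(target.split("_"), vectors)
    if u is None or v is None:
        return 0.0  # assumption: names with no known word count as dissimilar
    return cosine(u, v)
```

With these toy vectors, `predicate_similarity("actor", "director")` is higher than `predicate_similarity("actor", "movie")`, which is the kind of signal a mapping procedure could use to prefer one target predicate over another.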

Data availability

The data is publicly available at https://github.com/MeLL-UFF/TransBoostler.

Code availability

The code is available at https://github.com/MeLL-UFF/TransBoostler.

Notes

The source code and experiments are publicly available at https://github.com/MeLL-UFF/TransBoostler.
Munich Information Center of Protein Sequence.

Acknowledgements

We would like to thank the Brazilian Research Agencies CAPES, CNPq (311275/2020-6 (AP) and 308376/2021-8 (GZ)), and FAPERJ (E-26/202.914/2019 (247109) (AP)) for financial support. We thank the authors of RDN-B and TreeBoostler for making their code available and for answering our questions. We also thank the authors of the datasets and the anonymous reviewers for their valuable feedback.

Funding

This work was partially funded by CNPq, FAPERJ, and CAPES.

Author information

Authors and Affiliations

Department of Systems Engineering and Computer Science, COPPE, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
Thais Luca & Gerson Zaverucha
Institute of Computing, Universidade Federal Fluminense, Niteroi, RJ, Brazil
Aline Paes

Contributions

All authors contributed to building the ideas, defining the research goals and aims, and developing and designing the methodology used in this work. All authors were committed to applying statistical and computational techniques and to verifying the experimental results and other research outputs. Thais Luca was responsible for designing the computer programs, developing the software, and implementing the code used during the research. Thais was also responsible for conducting the research, performing the experiments, and preparing and creating the published work. Aline Paes and Gerson Zaverucha were responsible for supervising the research activity, mentorship, coordinating the planning and execution of the research activity, and acquiring the financial support for the project leading to this publication.

Corresponding author

Correspondence to Thais Luca.

Ethics declarations

Conflict of interest

Not applicable. There are no conflicts of interest.

Ethical approval

Not applicable. The research conducted here does not require ethics approval.

Consent to participate

Not applicable. This manuscript does not conduct any research with individuals.

Consent for publication

Not applicable. All images in the manuscript are produced by the authors.

Additional information

Editors: Alireza Tamaddoni-Nezhad, Alan Bundy, Luc De Raedt, Artur d’Avila Garcez, Sebastijan Dumančić, Cèsar Ferri, Pascal Hitzler, Nikos Katzouris, Denis Mareschal, Stephen Muggleton, Ute Schmid.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Mapping algorithms

A.1 Depth-first mapping algorithm

We present the algorithm of the depth-first mapping approach. As each source predicate appears in the tree structure, the algorithm computes its similarity to the target predicates and returns the most similar one.
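The depth-first idea can be sketched as follows. This is an illustrative sketch only, not the authors' algorithm: it assumes a one-to-one mapping in which each target predicate is used at most once, and it takes the similarity measure as a parameter. The predicate names, the `TOY_SCORES` table, and the function names are all hypothetical.

```python
# Hypothetical similarity scores between source and target predicate names.
TOY_SCORES = {
    ("professor", "director"): 0.80,
    ("professor", "actor"): 0.60,
    ("student", "director"): 0.95,
    ("student", "actor"): 0.70,
}


def toy_similarity(source, target):
    """Look up a toy score; unknown pairs are treated as dissimilar."""
    return TOY_SCORES.get((source, target), 0.0)


def depth_first_mapping(sources_in_tree_order, targets, similarity):
    """Map each source predicate, in the order it appears in the tree,
    to its most similar target predicate that is still free."""
    mapping = {}
    used_targets = set()
    for src in sources_in_tree_order:
        if src in mapping:
            continue  # predicate already mapped when it first appeared
        free = [t for t in targets if t not in used_targets]
        if not free:
            mapping[src] = None  # assumption: leave unmapped when no target is left
            continue
        best = max(free, key=lambda t: similarity(src, t))
        mapping[src] = best
        used_targets.add(best)
    return mapping


mapping = depth_first_mapping(
    ["professor", "student"], ["director", "actor"], toy_similarity
)
print(mapping)
```

In this toy run `professor` is visited first and takes `director`, so `student` is left with `actor` even though its score with `director` is higher. The result of this strategy depends on the order in which predicates appear in the tree.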

A.2 Ranked-first mapping algorithm

We present the algorithm of the ranked-first mapping approach. First, it computes the similarities between all pairs of source and target predicates. Then it orders the pairs by similarity and follows this ordered list, mapping the most similar predicates first.
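The ranked-first idea can be sketched as follows. Again this is only an illustrative sketch under stated assumptions, not the authors' algorithm: it assumes a one-to-one mapping, resolves the ordered list greedily, and uses a hypothetical `TOY_SCORES` table with hypothetical predicate and function names.

```python
from itertools import product

# Hypothetical similarity scores between source and target predicate names.
TOY_SCORES = {
    ("professor", "director"): 0.80,
    ("professor", "actor"): 0.60,
    ("student", "director"): 0.95,
    ("student", "actor"): 0.70,
}


def toy_similarity(source, target):
    """Look up a toy score; unknown pairs are treated as dissimilar."""
    return TOY_SCORES.get((source, target), 0.0)


def ranked_first_mapping(sources, targets, similarity):
    """Score every (source, target) pair, sort by decreasing similarity,
    and accept pairs in that order while both predicates are still free."""
    ranked = sorted(
        ((similarity(s, t), s, t) for s, t in product(sources, targets)),
        key=lambda item: item[0],
        reverse=True,
    )
    mapping = {}
    used_targets = set()
    for _score, src, tgt in ranked:
        if src in mapping or tgt in used_targets:
            continue
        mapping[src] = tgt
        used_targets.add(tgt)
    for src in sources:
        mapping.setdefault(src, None)  # assumption: leftovers stay unmapped
    return mapping


mapping = ranked_first_mapping(
    ["professor", "student"], ["director", "actor"], toy_similarity
)
print(mapping)
```

Here the highest-scoring pair, `student` and `director`, is accepted first regardless of where the predicates occur in the tree, and `professor` then falls back to `actor`.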

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luca, T., Paes, A. & Zaverucha, G. Word embeddings-based transfer learning for boosted relational dependency networks. Mach Learn 113, 1269–1302 (2024). https://doi.org/10.1007/s10994-023-06404-y

Received: 26 May 2022
Revised: 04 December 2022
Accepted: 16 August 2023
Published: 20 September 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s10994-023-06404-y
