Automated Ontology Extraction from Unstructured Texts using Deep Learning

Navarro-Almanza, Raúl; Juárez-Ramírez, Reyes; Licea, Guillermo; Castro, Juan R.

doi:10.1007/978-3-030-35445-9_50


Part of the book series: Studies in Computational Intelligence (SCI, volume 862)


Abstract

Ontologies are computational artifacts that represent knowledge through classes and the relations between them. Building these knowledge bases requires considerable human effort because it depends on domain experts and knowledge engineers. Ontology Learning aims to build ontologies automatically from data such as multimedia, web pages, databases, and unstructured text. In this work, we propose a methodology to automatically build an ontology that represents the concept map of a subject for use in an academic context. The main contribution of this methodology is that it does not require handcrafted features: it uses Deep Learning techniques to identify taxonomic and semantic relations between the concepts of a specific domain. In addition, because it relies on transfer learning, no domain-specific dataset is needed: the relation classification model is trained on Wikipedia and WordNet using distant supervision, and the knowledge is transferred to the specific domain through word embeddings. The results of this approach are promising considering the lack of human intervention and feature engineering.
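
The distant supervision step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base, the example sentences, and the function names (tokenize, distant_label) are all assumptions made for illustration, standing in for WordNet relations and Wikipedia text. The idea shown is only the labeling heuristic: any sentence that mentions both terms of a known pair is treated as a (noisy) training example for that pair's relation.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base standing in for WordNet relations:
# (head term, tail term) -> relation label.
KNOWLEDGE_BASE: Dict[Tuple[str, str], str] = {
    ("dog", "animal"): "hypernym",
    ("wheel", "car"): "part_of",
}

# Hypothetical sentences standing in for Wikipedia text.
SENTENCES: List[str] = [
    "A dog is a domesticated animal.",
    "The car lost a wheel on the highway.",
    "Ontologies represent knowledge.",
]


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def distant_label(
    sentences: List[str], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Label each sentence that mentions both terms of a known pair.

    Returns (sentence, head, tail, relation) tuples. The labels are noisy:
    co-occurrence does not guarantee the sentence expresses the relation.
    """
    examples = []
    for sentence in sentences:
        tokens = set(tokenize(sentence))
        for (head, tail), relation in kb.items():
            if head in tokens and tail in tokens:
                examples.append((sentence, head, tail, relation))
    return examples


labeled = distant_label(SENTENCES, KNOWLEDGE_BASE)
for example in labeled:
    print(example)
```

In a pipeline like the one the abstract describes, examples produced this way would feed a relation classifier over word embeddings; that classifier is outside the scope of this sketch.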

Acknowledgements

This research was supported/partially supported by MyDCI (Maestría y Doctorado en Ciencias e Ingeniería).

Author information

Authors and Affiliations

Calzada Universidad 14418, Universidad Autónoma de Baja California, Tijuana, Baja California, México
Raúl Navarro-Almanza, Reyes Juárez-Ramírez, Guillermo Licea & Juan R. Castro

Corresponding author

Correspondence to Raúl Navarro-Almanza.

Editor information

Editors and Affiliations

Division of Graduate Studies and Research, Tijuana Institute of Technology, Tijuana, Baja California, Mexico
Oscar Castillo
Division of Graduate Studies and Research, Tijuana Institute of Technology, Tijuana, Baja California, Mexico
Patricia Melin
Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Janusz Kacprzyk

About this chapter

Cite this chapter

Navarro-Almanza, R., Juárez-Ramírez, R., Licea, G., Castro, J.R. (2020). Automated Ontology Extraction from Unstructured Texts using Deep Learning. In: Castillo, O., Melin, P., Kacprzyk, J. (eds) Intuitionistic and Type-2 Fuzzy Logic Enhancements in Neural and Optimization Algorithms: Theory and Applications. Studies in Computational Intelligence, vol 862. Springer, Cham. https://doi.org/10.1007/978-3-030-35445-9_50

DOI: https://doi.org/10.1007/978-3-030-35445-9_50
Published: 28 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35444-2
Online ISBN: 978-3-030-35445-9
eBook Packages: Intelligent Technologies and Robotics (R0)