From Semi-automated to Automated Methods of Ontology Learning from Twitter Data

Alajlan, Saad; Coenen, Frans; Mandya, Angrosh

doi:10.1007/978-3-030-66196-0_10


Conference paper
First Online: 14 January 2021


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1297)

Abstract

This paper presents four different mechanisms for ontology learning from Twitter data. The learning process involves the identification of entities and relations from a specified Twitter data set, which are then used to produce an ontology. The initial two methods considered, the Stanford and GATE based ontology learning frameworks, are both semi-automated methods for identifying the relations in the desired ontology. Although the two frameworks effectively create an ontology-supported knowledge resource, they share a particular disadvantage: the time-consuming and cumbersome task of manually annotating relation extraction training data sets. As a result, two other ontology learning frameworks are proposed: one using regular expressions, which reduces the required resource, and one that combines Shortest Path Dependency parsing and Word Mover's Distance to fully automate the process of creating relation extraction training data. All four are analysed and discussed in this paper.
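The regular-expression idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each relation is described by one hand-written pattern with two named entity groups, and that every match is recorded as a labelled training example. The relation names, the patterns, and the function label_tweets are hypothetical and are not taken from the paper; the framework described there may differ substantially.

```python
import re

# Hypothetical relation patterns (illustrative only, not from the paper).
# Each pattern captures two capitalised tokens as the entities e1 and e2.
PATTERNS = {
    "located_in": re.compile(
        r"(?P<e1>[A-Z]\w+) is (?:located )?in (?P<e2>[A-Z]\w+)"
    ),
    "works_for": re.compile(
        r"(?P<e1>[A-Z]\w+) works (?:for|at) (?P<e2>[A-Z]\w+)"
    ),
}


def label_tweets(tweets):
    """Return (entity1, relation, entity2, text) tuples for every pattern match.

    The tuples could serve as automatically labelled relation extraction
    training examples, replacing a manual annotation step.
    """
    examples = []
    for text in tweets:
        for relation, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                examples.append(
                    (match.group("e1"), relation, match.group("e2"), text)
                )
    return examples


if __name__ == "__main__":
    sample = [
        "Liverpool is in England",
        "Saad works at Liverpool",
        "nothing to extract here",
    ]
    for example in label_tweets(sample):
        print(example)
```

A pattern set of this kind trades recall for annotation effort: only tweets that fit a pattern are labelled, so coverage depends entirely on how many patterns are written.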



Author information

Authors and Affiliations

Department of Computer Science, The University of Liverpool, Liverpool, UK
Saad Alajlan, Frans Coenen & Angrosh Mandya
College of Computer and Information Sciences, Al Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia
Saad Alajlan


Corresponding author

Correspondence to Saad Alajlan.

Editor information

Editors and Affiliations

Instituto de Telecomunicações, Lisbon, Portugal
Ana Fred
Federal University of Pernambuco, Recife, Brazil
Ana Salgado
University of Madeira, Funchal, Portugal
David Aveiro
Delft University of Technology, Delft, The Netherlands
Jan Dietz
Polytechnic Institute of Coimbra, Coimbra, Portugal
Jorge Bernardino
Polytechnic Institute of Setúbal, Setúbal, Portugal
Joaquim Filipe


About this paper

Cite this paper

Alajlan, S., Coenen, F., Mandya, A. (2020). From Semi-automated to Automated Methods of Ontology Learning from Twitter Data. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2019. Communications in Computer and Information Science, vol 1297. Springer, Cham. https://doi.org/10.1007/978-3-030-66196-0_10


DOI: https://doi.org/10.1007/978-3-030-66196-0_10
Published: 14 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66195-3
Online ISBN: 978-3-030-66196-0
eBook Packages: Computer Science, Computer Science (R0)
