From Semi-automated to Automated Methods of Ontology Learning from Twitter Data

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1297)

Abstract

This paper presents four different mechanisms for ontology learning from Twitter data. The learning process involves identifying entities and relations in a specified Twitter data set; these are then used to produce an ontology. The first two methods considered, the Stanford and GATE based ontology learning frameworks, are both semi-automated methods for identifying the relations in the desired ontology. Although both frameworks effectively create an ontology-supported knowledge resource, they share a particular disadvantage: the time-consuming and cumbersome task of manually annotating relation extraction training data sets. As a result, two further ontology learning frameworks are proposed: one using regular expressions, which reduces the resource required, and one that combines Shortest Path Dependency parsing and Word Mover’s Distance to fully automate the creation of relation extraction training data. All four are analysed and discussed in this paper.
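
As a rough illustration of the regular-expression idea mentioned in the abstract, the sketch below labels candidate relation triples in short texts so that they could serve as relation extraction training examples. It is not the authors' implementation: the pattern set, the relation names (located_in, works_for), the function label_relations and the sample tweets are all hypothetical, and capitalised words stand in for entities that a real pipeline would obtain from an entity recogniser.

```python
import re

# Hypothetical relation patterns (assumption: the abstract does not give
# the expressions actually used). Capitalised tokens stand in for entities.
PATTERNS = {
    "located_in": re.compile(
        r"(?P<e1>[A-Z]\w+) (?:is|was) (?:located )?in (?P<e2>[A-Z]\w+)"
    ),
    "works_for": re.compile(
        r"(?P<e1>[A-Z]\w+) works (?:for|at) (?P<e2>[A-Z]\w+)"
    ),
}


def label_relations(text):
    """Return (entity1, relation, entity2) triples matched in the text."""
    triples = []
    for relation, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            triples.append((match.group("e1"), relation, match.group("e2")))
    return triples


# Made-up sample texts, used only to exercise the patterns.
tweets = [
    "Liverpool is in England",
    "Alice works for Acme",
    "Nothing to see here",
]

# Each text is paired with the triples the patterns found in it; texts
# with no match get an empty list and can be discarded or kept as negatives.
training_data = [(tweet, label_relations(tweet)) for tweet in tweets]

for tweet, triples in training_data:
    print(tweet, "->", triples)
```

Hand-written patterns like these trade recall for effort: they avoid manual annotation of every sentence, but they only find relations phrased in the ways the patterns anticipate.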

Author information

Corresponding author

Correspondence to Saad Alajlan.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Alajlan, S., Coenen, F., Mandya, A. (2020). From Semi-automated to Automated Methods of Ontology Learning from Twitter Data. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2019. Communications in Computer and Information Science, vol 1297. Springer, Cham. https://doi.org/10.1007/978-3-030-66196-0_10

  • DOI: https://doi.org/10.1007/978-3-030-66196-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66195-3

  • Online ISBN: 978-3-030-66196-0

  • eBook Packages: Computer Science, Computer Science (R0)
