A multi-phase correlation search framework for mining non-taxonomic relations from unstructured text

  • Regular Paper
Knowledge and Information Systems

Abstract

Over the last decade, ontology engineering has been pursued by “learning” the ontology from domain-specific electronic documents. Most research has focused on extracting concepts and taxonomic relations, while the extraction of non-taxonomic relations has often been neglected and remains under-researched. In this paper, we present a multi-phase correlation search framework to extract non-taxonomic relations from unstructured text. Our framework addresses the two main problems in non-taxonomic relation extraction: (a) the discovery of non-taxonomic relations and (b) the labelling of non-taxonomic relations. First, the framework can extract correlated concepts beyond the ordinary search window of a single sentence. Interesting correlations are then filtered using association rule mining with the lift interestingness measure. Next, the framework distinguishes non-taxonomic concept pairs from taxonomic concept pairs based on an existing domain ontology. Finally, the framework uses domain-related verbs as labels for the non-taxonomic relations. The proposed framework has been tested in the marine biology domain. Results validated by domain experts show that the framework is reliable and that it improves significantly on the traditional association rule approach in the search for non-taxonomic relations from unstructured text.
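
To make the lift-based filtering step concrete, the following is a minimal sketch of how lift can be computed for co-occurring concept pairs. It is not the authors' implementation: the function name `lift_scores`, the toy "windows" of concepts, and the cut-off of 1.0 are assumptions made for illustration only. Lift is the standard measure lift(A, B) = P(A, B) / (P(A) · P(B)), where a value above 1 indicates that two concepts co-occur more often than would be expected if they were independent.

```python
from collections import Counter
from itertools import combinations


def lift_scores(transactions):
    """Return {(a, b): lift} for every concept pair that co-occurs at least once.

    Each transaction is the collection of concepts found in one search window.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for concepts in transactions:
        unique = sorted(set(concepts))  # count each concept once per window
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), joint in pair_counts.items():
        # lift = P(a, b) / (P(a) * P(b)) = joint * n / (count_a * count_b)
        scores[(a, b)] = joint * n / (item_counts[a] * item_counts[b])
    return scores


# Hypothetical toy windows (illustrative only, not data from the paper).
windows = [
    ["whale", "krill"],
    ["whale", "krill", "ocean"],
    ["seal", "ocean"],
    ["whale", "ocean"],
]

scores = lift_scores(windows)

# Assumed cut-off: keep only pairs that are positively correlated (lift > 1).
interesting = {pair: s for pair, s in scores.items() if s > 1.0}
```

In this sketch a pair such as ("krill", "whale") is kept because the two concepts appear together more often than their individual frequencies would predict, while a frequent but uninformative pairing with a very common concept falls below the cut-off. A real pipeline would normally combine lift with minimum support and confidence thresholds.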

Acknowledgments

This research is supported by an R&D grant from CANARIE, Canada, through the Network Enabled Platform program. We would also like to extend our gratitude to Dr. Isidora Katara for her valuable help in the evaluation of the proposed framework.

Author information

Corresponding author

Correspondence to Mei Kuan Wong.

About this article

Cite this article

Wong, M.K., Abidi, S.S.R. & Jonsen, I.D. A multi-phase correlation search framework for mining non-taxonomic relations from unstructured text. Knowl Inf Syst 38, 641–667 (2014). https://doi.org/10.1007/s10115-012-0593-7

  • DOI: https://doi.org/10.1007/s10115-012-0593-7
