Abstract
Over the last decade, ontology engineering has been pursued by “learning” the ontology from domain-specific electronic documents. Most of this research focuses on the extraction of concepts and taxonomic relations, while the extraction of non-taxonomic relations is often neglected and remains under-researched. In this paper, we present a multi-phase correlation search framework to extract non-taxonomic relations from unstructured text. Our framework addresses the two main problems in non-taxonomic relation extraction: (a) the discovery of non-taxonomic relations and (b) the labelling of non-taxonomic relations. First, our framework extracts correlated concepts beyond the ordinary search window of a single sentence. Interesting correlations are then filtered using association rule mining with the lift interestingness measure. Next, our framework distinguishes non-taxonomic concept pairs from taxonomic concept pairs based on an existing domain ontology. Finally, our framework uses domain-related verbs as labels for the non-taxonomic relations. The proposed framework has been tested in the marine biology domain. Domain experts validated the results, which proved reliable and showed a significant improvement over the traditional association rule approach in finding non-taxonomic relations in unstructured text.
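The lift-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `lift_scores`, the toy concept windows, and the threshold of 1.0 (the conventional point at which two concepts co-occur more often than independence would predict) are all assumptions made for illustration. Each "window" stands in for the set of concepts found in one search window of text.

```python
from collections import Counter
from itertools import combinations


def lift_scores(transactions):
    """Return the lift of every unordered concept pair that co-occurs.

    lift(a, b) = P(a, b) / (P(a) * P(b)), where probabilities are
    estimated as relative frequencies over the transactions.
    """
    n = len(transactions)
    single = Counter()  # number of windows containing each concept
    pair = Counter()    # number of windows containing each concept pair
    for concepts in transactions:
        unique = sorted(set(concepts))  # sorted so pair keys are canonical
        single.update(unique)
        pair.update(combinations(unique, 2))
    scores = {}
    for (a, b), count in pair.items():
        scores[(a, b)] = (count / n) / ((single[a] / n) * (single[b] / n))
    return scores


# Hypothetical toy data: concepts detected in four text windows.
windows = [
    {"whale", "krill"},
    {"whale", "krill"},
    {"whale", "ocean"},
    {"seal", "ocean"},
]

scores = lift_scores(windows)

# Keep only pairs whose lift exceeds 1.0 (assumed threshold).
interesting = {p: s for p, s in scores.items() if s > 1.0}
```

In this toy data the pair ("ocean", "whale") falls below the threshold because "whale" appears in most windows regardless of "ocean", which is the kind of uninformative co-occurrence a lift filter is meant to discard.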
Acknowledgments
This research is supported by an R&D grant from CANARIE, Canada, through the Network Enabled Platform program. We would also like to extend our gratitude to Dr. Isidora Katara for her valuable help in the evaluation of the proposed framework.
Cite this article
Wong, M.K., Abidi, S.S.R. & Jonsen, I.D. A multi-phase correlation search framework for mining non-taxonomic relations from unstructured text. Knowl Inf Syst 38, 641–667 (2014). https://doi.org/10.1007/s10115-012-0593-7
DOI: https://doi.org/10.1007/s10115-012-0593-7