Abstract
This paper proposes an ontology learning method that generates a graphical ontology structure called an ontology graph. The ontology graph defines the ontology and knowledge conceptualization model, while the ontology learning process defines a semiautomatic method for learning and generating ontology graphs from Chinese texts of different domains; the resulting graphs are called domain ontology graphs (DOGs). We also define two further ontological operations that can be carried out with the generated DOGs: document ontology graph generation and ontology graph-based text classification. This research focuses on Chinese text data, and we conduct two experiments, DOG generation and ontology graph-based text classification, with Chinese texts as the experimental data. The first experiment generates ten DOGs as ontology graph instances representing ten different domains of knowledge. The second experiment uses the generated DOGs for performance evaluation. The ontology graph-based approach achieves higher text classification accuracy (an f-measure of 92.3 %) than other text classification approaches (for example, an f-measure of 86.8 % for the tf–idf approach). The better performance in the comparative experiments indicates that the proposed ontology graph knowledge model, the ontology learning and generation process, and the ontological operations are feasible and effective.
Acknowledgments
The authors are very grateful to the editors and anonymous reviewers, whose many valuable and constructive comments and suggestions helped us significantly improve this work. This work was supported in part by the GRF Grant 5237/08E, by the CRG Grant G-U756 of The Hong Kong Polytechnic University, and by the National Natural Science Foundation of China under Grant 60903088 and Grant 61170040.
Cite this article
Liu, J.N.K., He, Yl., Lim, E.H.Y. et al. Domain ontology graph model and its application in Chinese text classification. Neural Comput & Applic 24, 779–798 (2014). https://doi.org/10.1007/s00521-012-1272-z
DOI: https://doi.org/10.1007/s00521-012-1272-z