Research article
DOI: 10.1145/3093241.3107883

Supervised Ontology-Based Document Classification Model

Published: 19 May 2017

ABSTRACT

Ontology-based document classification represents documents using background knowledge provided by ontologies. This knowledge is typically embedded in a document through exact matching: a term is mapped to a concept only when one of the concept's labels occurs explicitly in the document. Searching only for concept labels limits the ability to capture and exploit the full conceptualization behind user information and content meaning. To address this limitation, we propose a new ontology-based document classification model. The proposed model uses background knowledge derived from ontologies for document representation. It associates a document with a set of concepts not only through exact matching but also by identifying and extracting new terms that can be semantically related to the ontology's concepts. Additionally, the model employs a new concept weighting technique that computes the weight of a concept from its relevance and its importance. We conducted several experiments using a real ontology and a dataset to test the proposed model. Experiments were run with three different classification algorithms using the baseline ontology, the improved concept vector space model with the new concept weighting technique, and the enriched ontology; the results show that the proposed model achieves a considerable improvement in classification performance.
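The contrast between exact matching and an enriched ontology, and the idea of weighting a concept by relevance and importance, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy ontology, the importance scores, the single-word labels, and in particular the formula (relevance as a normalized label frequency, multiplied by importance). The abstract does not state how relevance, importance, or their combination are actually computed.

```python
import re
from collections import Counter

# Hypothetical toy ontology: concept -> labels (not taken from the paper).
BASELINE_LABELS = {
    "Loan": ["loan", "credit"],
    "Bank": ["bank"],
}

# Hypothetical enrichment: a new term judged semantically related to "Loan".
ENRICHED_LABELS = {
    "Loan": ["loan", "credit", "mortgage"],
    "Bank": ["bank"],
}

# Hypothetical importance scores; the abstract does not say how they are derived.
CONCEPT_IMPORTANCE = {"Loan": 0.8, "Bank": 0.5}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def exact_match_counts(text, labels):
    """Count, per concept, how often any of its labels occurs verbatim."""
    tokens = Counter(tokenize(text))
    return {
        concept: sum(tokens[label] for label in concept_labels)
        for concept, concept_labels in labels.items()
    }


def concept_weights(text, labels, importance):
    """Assumed weighting: (share of matches for the concept) * importance."""
    counts = exact_match_counts(text, labels)
    total = sum(counts.values())
    if total == 0:
        return {concept: 0.0 for concept in counts}
    return {
        concept: (count / total) * importance.get(concept, 0.0)
        for concept, count in counts.items()
    }


doc = "The bank approved the loan and the credit line."
weights = concept_weights(doc, BASELINE_LABELS, CONCEPT_IMPORTANCE)

# A document that never uses a baseline label is invisible to exact matching,
# but is picked up once the ontology is enriched with a related term.
unseen = "A mortgage was issued."
baseline_hits = exact_match_counts(unseen, BASELINE_LABELS)
enriched_hits = exact_match_counts(unseen, ENRICHED_LABELS)
```

In this sketch the resulting weight vector would serve as the document representation passed to a classifier; multi-word labels and real semantic relatedness measures are deliberately left out.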

Published in

ICCDA '17: Proceedings of the International Conference on Compute and Data Analysis
May 2017
307 pages
ISBN: 9781450352413
DOI: 10.1145/3093241

Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

research-article, Research, Refereed limited