DOI: 10.1145/1242572.1242778
Article

Altering document term vectors for classification: ontologies as expectations of co-occurrence

Published: 08 May 2007

Abstract

In this paper we extend the state of the art in utilizing background knowledge for supervised classification by exploiting the semantic relationships between terms explicated in ontologies. Preliminary evaluations indicate that the new approach generally improves precision and recall, more so for hard-to-classify cases, and reveals patterns indicating the usefulness of such background knowledge.

      Published In

      WWW '07: Proceedings of the 16th international conference on World Wide Web
      May 2007
      1382 pages
      ISBN:9781595936547
      DOI:10.1145/1242572

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. background domain knowledge
      2. ranking semantic relationships
      3. supervised document classification
      4. vector space models

      Qualifiers

      • Article

      Conference

      WWW '07: 16th International World Wide Web Conference
      May 8 - 12, 2007
      Banff, Alberta, Canada

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Cited By

      • (2022) Framework for the Analysis of Resilient Performance Conditionings in Integrated Operations of the Oil and Gas Industry. Resilience in a Digital Age, pp. 71-92. DOI: 10.1007/978-3-030-85954-1_6. Online publication date: 11-Mar-2022
      • (2018) Document Clustering Using an Ontology-Based Vector Space Model. Information Retrieval and Management, pp. 1860-1883. DOI: 10.4018/978-1-5225-5191-1.ch085. Online publication date: 2018
      • (2017) Semantic enrichment of product data supported by machine learning techniques. 2017 International Conference on Engineering, Technology and Innovation (ICE/ITMC), pp. 1472-1479. DOI: 10.1109/ICE.2017.8280056. Online publication date: Jun-2017
      • (2017) Mathematical Method of Translation into Ukrainian Sign Language Based on Ontologies. Advances in Intelligent Systems and Computing II, pp. 89-100. DOI: 10.1007/978-3-319-70581-1_7. Online publication date: 22-Nov-2017
      • (2016) Improving ontology-based text classification. Journal of Applied Logic 17:C, pp. 48-58. DOI: 10.1016/j.jal.2015.09.008. Online publication date: 1-Sep-2016
      • (2015) Document Clustering Using an Ontology-Based Vector Space Model. International Journal of Information Retrieval Research 5:3, pp. 39-60. DOI: 10.4018/IJIRR.2015070103. Online publication date: Jul-2015
      • (2015) Management of Knowledge Sources Supported by Domain Ontologies. International Journal of Intelligent Systems in Accounting and Finance Management 22:1, pp. 29-64. DOI: 10.1002/isaf.1361. Online publication date: 1-Jan-2015
      • (2014) Text Classification Techniques in Oil Industry Applications. International Joint Conference SOCO’13-CISIS’13-ICEUTE’13, pp. 211-220. DOI: 10.1007/978-3-319-01854-6_22. Online publication date: 2014
      • (2013) Distributional term representations for short-text categorization. Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2, pp. 335-346. DOI: 10.1007/978-3-642-37256-8_28. Online publication date: 24-Mar-2013
      • (2009) Exploiting term relationship to boost text classification. Proceedings of the 18th ACM conference on Information and knowledge management, pp. 1637-1640. DOI: 10.1145/1645953.1646192. Online publication date: 2-Nov-2009
