Rearranging Classified Items in Hierarchies Using Categorization Uncertainty

  • Conference paper
Advances in Data Analysis

Abstract

Classification into hierarchical structures is a problem of increasing importance, for example given the growing use of ontologies and keyword hierarchies in many web-based information systems. It is therefore not surprising that it is a field of ongoing research. Here, we propose an approach that utilizes hierarchy information in the classification process. In contrast to other methods, the hierarchy information is used independently of the classifier rather than being integrated into it directly. This enables the use of arbitrary standard classification methods. Furthermore, we discuss how hierarchical classification in general, and our setting in particular, can be evaluated appropriately. We present our algorithm and evaluate it on two datasets of web pages using Naïve Bayes and SVM as baseline classifiers.

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bade, K., Nürnberger, A. (2007). Rearranging Classified Items in Hierarchies Using Categorization Uncertainty. In: Decker, R., Lenz, H.J. (eds) Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70981-7_15
