Concept Hierarchy Construction by Combining Spectral Clustering and Subsumption Estimation

  • Conference paper
Web Information Systems – WISE 2006 (WISE 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4255)

Abstract

With the rapid development of the Web, adding structural guidance, in the form of concept hierarchies, to Web document navigation has become an active research topic. In this paper, we present a method for the automatic acquisition of concept hierarchies. Given a set of concepts, each concept is treated as a vertex in an undirected, weighted graph. Concept hierarchy construction is then cast as a modified graph partitioning problem and solved by spectral methods. Because an undirected graph cannot accurately represent hyponymy (is-a) relations between concepts, subsumption estimation is introduced to guide the spectral clustering algorithm. Experiments on real data show encouraging results.
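
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the edge weights, the modified partitioning objective, or how subsumption guides the clustering. The sketch assumes that each concept comes with a set of document ids, uses Jaccard overlap as the edge weight, splits the graph with a normalized-cut style Fiedler vector, and tests subsumption with a document co-occurrence rule in the style of Sanderson and Croft (threshold 0.8). All function names are hypothetical.

```python
import numpy as np


def subsumes(doc_sets, x, y, threshold=0.8):
    """True if x subsumes y: P(x|y) >= threshold and P(y|x) < 1.

    Probabilities are estimated from document co-occurrence (an assumption).
    """
    dx, dy = doc_sets[x], doc_sets[y]
    if not dx or not dy:
        return False
    both = len(dx & dy)
    return both / len(dy) >= threshold and both / len(dx) < 1.0


def pick_parent(concepts, doc_sets):
    """Return the concept that subsumes the most other concepts, or None."""
    best, best_count = None, 0
    for x in concepts:
        count = sum(subsumes(doc_sets, x, y) for y in concepts if y != x)
        if count > best_count:
            best, best_count = x, count
    return best


def similarity_matrix(concepts, doc_sets):
    """Symmetric edge weights: Jaccard overlap of document sets (an assumption)."""
    n = len(concepts)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            a, b = doc_sets[concepts[i]], doc_sets[concepts[j]]
            union = len(a | b)
            W[i, j] = W[j, i] = len(a & b) / union if union else 0.0
    return W


def spectral_bipartition(W):
    """Split vertices by the sign of the second-smallest generalized eigenvector.

    Uses the symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}; the graph
    is assumed to be connected. Returns a boolean mask over the vertices.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))  # guard isolated vertices
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]
    return fiedler >= 0
```

In this sketch, `pick_parent` proposes a parent concept for a group, and `spectral_bipartition` applied to `similarity_matrix` of the remaining concepts splits them into two subgroups; repeating both steps on each subgroup would yield deeper levels. The sign of an eigenvector is arbitrary, so the two sides of the mask carry no fixed label.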

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, J., Li, Q. (2006). Concept Hierarchy Construction by Combining Spectral Clustering and Subsumption Estimation. In: Aberer, K., Peng, Z., Rundensteiner, E.A., Zhang, Y., Li, X. (eds) Web Information Systems – WISE 2006. WISE 2006. Lecture Notes in Computer Science, vol 4255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11912873_22

  • DOI: https://doi.org/10.1007/11912873_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-48105-8

  • Online ISBN: 978-3-540-48107-2

  • eBook Packages: Computer Science, Computer Science (R0)
