Hierarchical Topic Term Extraction for Semantic Annotation in Chinese Bulletin Board System

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4185)

Abstract

With growing interest in the Semantic Web, the demand for ontological data has become urgent. Many structured and semi-structured documents have already been used for ontology learning and annotation. However, most electronic documents on the web are plain text, and these texts are still not well utilized for the Semantic Web. In this paper, we propose a novel method that automatically extracts topic terms to generate a concept hierarchy from the data of a Chinese Bulletin Board System (BBS), which is a collection of plain text. In addition, our work provides the text source associated with each extracted concept, which could be well suited to semantic search applications that fuse formal and implicit semantics. The experimental results indicate that our method is effective and that the extracted concept hierarchy is meaningful.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, X., Huang, S., Zhang, J., Yu, Y. (2006). Hierarchical Topic Term Extraction for Semantic Annotation in Chinese Bulletin Board System. In: Mizoguchi, R., Shi, Z., Giunchiglia, F. (eds) The Semantic Web – ASWC 2006. ASWC 2006. Lecture Notes in Computer Science, vol 4185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11836025_4

  • DOI: https://doi.org/10.1007/11836025_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38329-1

  • Online ISBN: 978-3-540-38331-4

  • eBook Packages: Computer Science, Computer Science (R0)
