Mining Context-Specific Web Knowledge: An Experimental Dictionary-Based Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5227)

Abstract

This work presents an experimental semantic approach to mining knowledge from the World Wide Web (WWW). The main goal is to build a context-specific knowledge base from web documents. The basic idea is to use the reference knowledge provided by a dictionary as the indexing structure for domain-specific computed knowledge instances, which are organised as interlinked text words. The WordNet lexical database serves as the reference knowledge for English web documents. Both the reference knowledge and the computed knowledge are conceived as word graphs, since graphs are considered here a powerful way to represent structured knowledge. This assumption affects how knowledge can be explored and how similar knowledge patterns can be identified. To identify context-specific elements in knowledge graphs, the novel semantic concept of “minutia” is introduced. A preliminary evaluation of the efficacy of the proposed approach has been carried out. A fair strategy for comparison with competing non-semantic approaches is currently under investigation.
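As a rough illustration of the word-graph idea described in the abstract, the following minimal Python sketch links words that co-occur in a sentence and keeps only words found in a reference dictionary. It is not the authors' method: the toy `REFERENCE_WORDS` set merely stands in for WordNet, and the function name `build_word_graph`, the sentence-level co-occurrence rule, and the edge weights are all assumptions made for illustration. The concept of “minutia” is not defined in the abstract and is not modelled here.

```python
from collections import defaultdict
from itertools import combinations
import re

# Hypothetical stand-in for the reference dictionary (the paper uses WordNet).
REFERENCE_WORDS = {"graph", "knowledge", "web", "document", "word"}


def build_word_graph(sentences, reference_words):
    """Link dictionary words that co-occur in a sentence.

    Returns a nested mapping word -> neighbour -> co-occurrence count.
    Words absent from the reference dictionary are ignored, so the
    dictionary acts as the index of the computed graph.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        known = sorted(tokens & reference_words)
        for a, b in combinations(known, 2):
            graph[a][b] += 1  # undirected edge, stored in both directions
            graph[b][a] += 1
    return graph


# Illustrative input only; not data from the paper.
sentences = [
    "A web document contains knowledge.",
    "Knowledge can be stored as a word graph.",
    "Each web document is a bag of word tokens.",
]
graph = build_word_graph(sentences, REFERENCE_WORDS)
```

Here `graph["web"]["document"]` counts the sentences in which both words appear, while a word such as "contains" never becomes a node because it is missing from the reference set.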

Editor information

De-Shuang Huang, Donald C. Wunsch II, Daniel S. Levine, Kang-Hyun Jo

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Di Lecce, V., Calabrese, M., Soldo, D. (2008). Mining Context-Specific Web Knowledge: An Experimental Dictionary-Based Approach. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence. ICIC 2008. Lecture Notes in Computer Science, vol 5227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85984-0_108

  • DOI: https://doi.org/10.1007/978-3-540-85984-0_108

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85983-3

  • Online ISBN: 978-3-540-85984-0

  • eBook Packages: Computer Science, Computer Science (R0)
