Mining Context-Specific Web Knowledge: An Experimental Dictionary-Based Approach

Di Lecce, Vincenzo; Calabrese, Marco; Soldo, Domenico

doi:10.1007/978-3-540-85984-0_108


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5227)

Abstract

This work presents an experimental semantic approach to mining knowledge from the World Wide Web (WWW). The main goal is to build a context-specific knowledge base from web documents. The basic idea is to use the reference knowledge provided by a dictionary as the indexing structure for domain-specific computed knowledge instances, which are organised as interlinked text words. The WordNet lexical database is used as the reference knowledge for English web documents. Both the reference knowledge and the computed knowledge are conceived as word graphs, since a graph is considered here a powerful way to represent structured knowledge. This assumption affects how knowledge can be explored and how similar knowledge patterns can be identified. To identify context-specific elements in knowledge graphs, the novel semantic concept of “minutia” is introduced. A preliminary evaluation of the efficacy of the proposed approach has been carried out. A fair strategy for comparison with competing non-semantic approaches is currently under investigation.
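
The abstract describes two word graphs, a reference graph taken from a dictionary and a computed graph built from web text and indexed by the dictionary's words. The following is a minimal sketch of one possible reading of that idea, not the authors' implementation. The toy dictionary, the co-occurrence window, and the function names `build_computed_graph` and `shared_edges` are all assumptions made for illustration; the paper uses WordNet as the reference, and the "minutia" concept is not defined in the abstract, so it is not modelled here.

```python
import re
from collections import defaultdict

# Toy stand-in for the reference dictionary (the paper uses WordNet).
# The entries and links below are invented for illustration only.
REFERENCE_GRAPH = {
    "car": {"vehicle", "engine"},
    "vehicle": {"car"},
    "engine": {"car"},
    "road": {"vehicle"},
}


def build_computed_graph(text, reference, window=2):
    """Build an undirected word graph from `text`.

    Only words that are dictionary entries are kept, so the dictionary
    acts as the index. Two kept words are linked when they lie within
    `window` positions of each other in the filtered word sequence
    (an assumed linking rule, not taken from the paper).
    """
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t in reference]
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def shared_edges(computed, reference):
    """Return computed-graph edges that also exist in the reference graph."""
    return {
        tuple(sorted((a, b)))
        for a, neighbours in computed.items()
        for b in neighbours
        if b in reference.get(a, set()) or a in reference.get(b, set())
    }


doc = "The car engine was loud. The road was full of every kind of vehicle."
computed = build_computed_graph(doc, REFERENCE_GRAPH)
print(shared_edges(computed, REFERENCE_GRAPH))
```

In this sketch, edges present in both graphs are those the document shares with the dictionary, while edges found only in the computed graph (for example `car` and `road` here) are candidates for context-specific links. How the paper actually scores or selects such elements is not stated in the abstract.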



Author information

Authors and Affiliations

Polytechnic of Bari – II Faculty of Engineering - DIASS
Vincenzo Di Lecce, Marco Calabrese & Domenico Soldo


Editor information

De-Shuang Huang, Donald C. Wunsch II, Daniel S. Levine, Kang-Hyun Jo


About this paper

Cite this paper

Di Lecce, V., Calabrese, M., Soldo, D. (2008). Mining Context-Specific Web Knowledge: An Experimental Dictionary-Based Approach. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence. ICIC 2008. Lecture Notes in Computer Science, vol 5227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85984-0_108


DOI: https://doi.org/10.1007/978-3-540-85984-0_108
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85983-3
Online ISBN: 978-3-540-85984-0
eBook Packages: Computer Science, Computer Science (R0)
