Knowledge Discovery over the Deep Web, Semantic Web and XML

Varde, Aparna; Suchanek, Fabian; Nayak, Richi; Senellart, Pierre

doi:10.1007/978-3-642-00887-0_73

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5463)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

In this tutorial we provide insight into Web Mining, i.e., discovering knowledge from the World Wide Web, especially with reference to the latest developments in Web technology. The topics covered are: the Deep Web, also known as the Hidden Web or Invisible Web; the Semantic Web, including standards such as RDFS and OWL; the eXtensible Markup Language (XML), a widespread communication medium for the Web; and domain-specific markup languages defined within the context of XML. We explain how each of these developments supports knowledge discovery from data stored over the Web, thereby assisting several real-world applications.

Author information

Authors and Affiliations

Department of Computer Science, Montclair State University, Montclair, NJ, USA
Aparna Varde
Databases and Information Systems, Max Planck Institute for Informatics, Saarbrücken, Germany
Fabian Suchanek
Faculty of Information Technology, Queensland University of Technology, Brisbane, Australia
Richi Nayak
Department of Computer Science and Networking, Télécom ParisTech, Paris, France
Pierre Senellart

Editor information

Editors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, QLD 4072, Brisbane, Australia
Xiaofang Zhou & Ke Deng
Tokyo Institute of Technology, Graduate School of Information Science and Engineering, 2-12-1 Oh-Okayama Meguro-ku, 152-8552, Tokyo, Japan
Haruo Yokota
CSIRO, Castray Esplanade, TAS 7000, Hobart, Australia
Qing Liu

About this paper

Cite this paper

Varde, A., Suchanek, F., Nayak, R., Senellart, P. (2009). Knowledge Discovery over the Deep Web, Semantic Web and XML. In: Zhou, X., Yokota, H., Deng, K., Liu, Q. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00887-0_73

DOI: https://doi.org/10.1007/978-3-642-00887-0_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00886-3
Online ISBN: 978-3-642-00887-0
eBook Packages: Computer Science, Computer Science (R0)
