Knowledge Discovery over the Deep Web, Semantic Web and XML

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5463)

Abstract

In this tutorial we provide insight into Web Mining, i.e., discovering knowledge from the World Wide Web, especially with reference to the latest developments in Web technology. The topics covered are: the Deep Web, also known as the Hidden Web or Invisible Web; the Semantic Web, including standards such as RDFS and OWL; the eXtensible Markup Language (XML), a widespread communication medium for the Web; and domain-specific markup languages defined within the context of XML. We explain how each of these developments supports knowledge discovery from data stored over the Web, thereby assisting several real-world applications.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Varde, A., Suchanek, F., Nayak, R., Senellart, P. (2009). Knowledge Discovery over the Deep Web, Semantic Web and XML. In: Zhou, X., Yokota, H., Deng, K., Liu, Q. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00887-0_73

  • DOI: https://doi.org/10.1007/978-3-642-00887-0_73

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00886-3

  • Online ISBN: 978-3-642-00887-0

  • eBook Packages: Computer Science, Computer Science (R0)
