A Conceptual Model for the Web

  • Conference paper
Conceptual Modeling — ER 2000 (ER 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1920)

Abstract

Most documents available over the web conform to the HTML specification. Such documents are hierarchically structured in nature. The existing graph-based or tree-based data models for the web only provide a very low level representation of such hierarchical structure. In this paper, we introduce a conceptual model for the web that is able to represent the complex hierarchical structure within the web documents at a high level that is close to human conceptualization/visualization of the documents. We also describe how to convert HTML documents based on this conceptual model. Using the conceptual model and conversion method, we can capture the essence (i.e., semistructure) of HTML documents in a natural and simple way.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, M., Ling, T.W. (2000). A Conceptual Model for the Web. In: Laender, A.H.F., Liddle, S.W., Storey, V.C. (eds) Conceptual Modeling — ER 2000. ER 2000. Lecture Notes in Computer Science, vol 1920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45393-8_17

  • DOI: https://doi.org/10.1007/3-540-45393-8_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41072-0

  • Online ISBN: 978-3-540-45393-2

  • eBook Packages: Springer Book Archive
