i-Cube: A Tool-Set for the Dynamic Extraction and Integration of Web Data Content

  • Conference paper
Electronic Commerce Technologies (ISEC 2001)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2040))

Abstract

This paper presents the i-Cube environment, a tool-set that allows Internet data and content, originally available as HTML Web pages and programmatic scripts, to be denoted, modeled, and represented as XML documents. These XML documents conform to specific Document Type Definitions and other structural constraints that are fully customizable by the end-user or the service provider. The approach is based on representing the data content of an HTML document as an annotated tree. Specific areas of interest and data content in the original HTML document that need to be encoded as XML are represented as a collection of annotated sub-trees in the tree that corresponds to a large HTML document. A service integration module allows different categories of analysis and presentation rules to be invoked according to script-based, user-defined logic.
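
The annotated-tree idea in the abstract can be illustrated with a minimal sketch. It is not the i-Cube implementation, which the abstract does not detail: the `Node`, `TreeBuilder`, `annotate`, and `extract` names are hypothetical, and so are the rule format (an HTML `class` value mapped to an XML element name) and the `catalog` root element. The sketch uses only the Python standard library and does not validate the result against a DTD.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Tags that never have a closing tag (assumed subset, enough for the sketch).
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """One node of the HTML tree; `annotation` is a hypothetical label
    marking the node as the root of a sub-tree of interest."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""
        self.annotation = None


class TreeBuilder(HTMLParser):
    """Builds a Node tree from HTML markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open ancestor with this tag, then close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def annotate(node, rules):
    """Label every node whose class attribute appears in `rules`."""
    label = rules.get(node.attrs.get("class"))
    if label:
        node.annotation = label
    for child in node.children:
        annotate(child, rules)


def to_xml(node, parent):
    """Copy annotated sub-trees under `parent`, skipping unannotated nodes."""
    target = parent
    if node.annotation:
        target = ET.SubElement(parent, node.annotation)
        target.text = node.text or None
    for child in node.children:
        to_xml(child, target)


def extract(html, rules, root_name="catalog"):
    """Parse HTML, annotate areas of interest, and emit them as XML text."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    annotate(builder.root, rules)
    root = ET.Element(root_name)
    to_xml(builder.root, root)
    return ET.tostring(root, encoding="unicode")
```

For example, with the rules `{"item": "product", "name": "title", "price": "amount"}`, the markup `<div class="item"><span class="name">Widget</span><span class="price">9.99</span></div>` yields `<catalog><product><title>Widget</title><amount>9.99</amount></product></catalog>`. Changing the rules changes the shape of the XML output without touching the parser, which loosely mirrors the customizable structural constraints the abstract describes.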

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Poon, F., Kontogiannis, K. (2001). i-Cube: A Tool-Set for the Dynamic Extraction and Integration of Web Data Content. In: Kou, W., Yesha, Y., Tan, C.J. (eds) Electronic Commerce Technologies. ISEC 2001. Lecture Notes in Computer Science, vol 2040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45415-2_8

  • DOI: https://doi.org/10.1007/3-540-45415-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41963-1

  • Online ISBN: 978-3-540-45415-1

  • eBook Packages: Springer Book Archive
