Abstract
This paper presents the i-Cube environment, a tool-set that allows Internet data and content originally available as HTML Web pages and programmatic scripts to be denoted, modeled, and represented in the form of XML documents. These XML documents conform to specific Document Type Definitions and other structural constraints that are fully customizable by the end-user or the service provider. The approach is based on representing HTML document data content in the form of annotated trees. Specific areas of interest and data content in the original HTML document that need to be encoded in the form of an XML representation are represented as a collection of annotated sub-trees in the tree that corresponds to a large HTML document. A service integration module allows different categories of analysis and presentation rules to be invoked according to script-based, user-defined logic.
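The annotated-tree idea described in the abstract can be illustrated with a minimal sketch. It is not the i-Cube implementation: the names `Node`, `TreeBuilder`, `annotate`, `to_xml`, and `html_to_xml`, the sample page, and the class-based matching rules are all assumptions made for illustration. The sketch parses HTML into a tree, labels the sub-trees of interest, and copies only those sub-trees into an XML document. Validation against a Document Type Definition is left out, since the Python standard library has no validating parser.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Optional
import xml.etree.ElementTree as ET


@dataclass
class Node:
    """One node of the HTML tree; `annotation` marks an area of interest."""
    tag: str
    attrs: dict
    text: str = ""
    children: list = field(default_factory=list)
    annotation: Optional[str] = None


class TreeBuilder(HTMLParser):
    """Builds a Node tree from HTML (hypothetical helper, not i-Cube code)."""

    def __init__(self):
        super().__init__()
        self.root = Node("document", {})
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def annotate(node, rules):
    """Label each node matched by a rule; `rules` maps label -> predicate."""
    for label, matches in rules.items():
        if matches(node):
            node.annotation = label
    for child in node.children:
        annotate(child, rules)


def to_xml(node, parent):
    """Copy annotated sub-trees into the XML tree, skipping everything else."""
    if node.annotation:
        parent = ET.SubElement(parent, node.annotation)
        if node.text:
            parent.text = node.text
    for child in node.children:
        to_xml(child, parent)


def html_to_xml(html, rules, root_tag):
    builder = TreeBuilder()
    builder.feed(html)
    annotate(builder.root, rules)
    root = ET.Element(root_tag)
    to_xml(builder.root, root)
    return ET.tostring(root, encoding="unicode")


# Assumed sample page and rules: areas of interest are picked by class name.
SAMPLE_HTML = """<html><body><table>
<tr class="item"><td class="name">Widget</td><td class="price">9.99</td></tr>
<tr class="item"><td class="name">Gadget</td><td class="price">19.50</td></tr>
</table></body></html>"""

RULES = {
    label: (lambda n, label=label: n.attrs.get("class") == label)
    for label in ("item", "name", "price")
}

print(html_to_xml(SAMPLE_HTML, RULES, "catalog"))
```

Keeping the rules as a mapping from output element name to predicate means the same tree can be re-annotated for a different target structure without re-parsing the page, which is one plausible reading of the customizable constraints the abstract mentions. Void elements such as `<br>` are not handled specially in this sketch.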
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Poon, F., Kontogiannis, K. (2001). i-Cube: A Tool-Set for the Dynamic Extraction and Integration of Web Data Content. In: Kou, W., Yesha, Y., Tan, C.J. (eds) Electronic Commerce Technologies. ISEC 2001. Lecture Notes in Computer Science, vol 2040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45415-2_8
DOI: https://doi.org/10.1007/3-540-45415-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41963-1
Online ISBN: 978-3-540-45415-1
eBook Packages: Springer Book Archive