DOI: 10.1145/2247596.2247659
research-article

Extending a general-purpose streaming system for XML

Published: 27 March 2012

ABSTRACT

General-purpose streaming systems support diverse application domains with powerful and user-defined stream operators. Most general-purpose streaming systems have their own, non-XML, internal data representation. However, streaming input is often either a sequence of small XML documents, or a scan of a huge document. Prior work on XML streaming focuses on filtering, not transforming, XML, and does not describe how to integrate with a general-purpose streaming system. This paper describes how to integrate an XML transformer with a streaming system by designing a specification syntax that is both consistent with the existing system and familiar to XML users. After type-checking the specification, we compile it to an efficient automaton driven by SAX events. Our approach extends the underlying streaming system with XML support without changing its core architecture, and the same technique could be used for other extensions beyond XML.
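As a rough illustration of what "an automaton driven by SAX events" can look like, the sketch below uses Python's standard `xml.sax` module to turn a stream of XML elements into flat tuples. It is not the paper's specification syntax or its compiler: the class name `TupleExtractor`, the helper `extract`, the trigger path, and the field names are all hypothetical, and the automaton state is simply the stack of open element names.

```python
import xml.sax


class TupleExtractor(xml.sax.ContentHandler):
    """Hypothetical sketch: tracks the current element path as automaton
    state and emits one flat tuple (a dict) when the trigger element closes."""

    def __init__(self, trigger, fields, emit):
        super().__init__()
        self.trigger = tuple(trigger)  # path whose end tag emits a tuple
        self.fields = set(fields)      # child element names to capture
        self.emit = emit               # callback receiving each tuple
        self.path = []                 # stack of open element names
        self.current = None            # tuple under construction
        self.text = []                 # text buffer for the open element

    def startElement(self, name, attrs):
        self.path.append(name)
        if tuple(self.path) == self.trigger:
            # Attributes of the trigger element become tuple fields.
            self.current = dict(attrs.items())
        self.text = []

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        in_field = (self.current is not None
                    and tuple(self.path[:-1]) == self.trigger
                    and name in self.fields)
        if in_field:
            # A direct child of the trigger element: store its text.
            self.current[name] = "".join(self.text).strip()
        elif tuple(self.path) == self.trigger:
            # The trigger element closed: the tuple is complete.
            self.emit(self.current)
            self.current = None
        self.path.pop()
        self.text = []


def extract(xml_bytes, trigger, fields):
    """Parse xml_bytes and return the list of emitted tuples."""
    out = []
    xml.sax.parseString(xml_bytes, TupleExtractor(trigger, fields, out.append))
    return out


doc = (b"<orders>"
       b"<order id='1'><item>pen</item><qty>2</qty></order>"
       b"<order id='2'><item>ink</item><qty>5</qty></order>"
       b"</orders>")
for row in extract(doc, ["orders", "order"], ["item", "qty"]):
    print(row)
```

Because the handler keeps only the open-element stack and the tuple under construction, memory use does not grow with the size of the input, which is the property that makes an event-driven design suitable both for a sequence of small documents and for a scan of one huge document.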


Published in

EDBT '12: Proceedings of the 15th International Conference on Extending Database Technology
March 2012
643 pages
ISBN: 9781450307901
DOI: 10.1145/2247596

Copyright © 2012 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 7 of 10 submissions, 70%
