Efficient Filtering Query Indexing in Data Stream

  • Conference paper
Web Information Systems – WISE 2006 Workshops (WISE 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4256)

Abstract

Filtering queries are widely used in data stream applications. As more and more filtering queries are registered in a high-speed data stream management system, processing efficiency becomes crucial. This paper presents an efficient query index structure based on a decision tree. The index structure makes full use of predicate indices on single attributes, as well as the conjunctive relationship between the predicates within a single query, and various predicate indices are easy to integrate into it. The choice of dividing attributes during construction is crucial to the performance of the index tree. Two dividing-attribute selection algorithms are described: one based on information gain (IG) and the other based on estimated time cost (ETC). The latter takes sample tuples as a training data set and is able to build more efficient trees, as our experiments demonstrate.
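To make the information-gain idea concrete, the following is a minimal sketch of how a dividing attribute might be chosen for a set of registered filtering queries. It is not the paper's algorithm: it applies the textbook ID3-style information gain under several assumptions that do not come from the abstract, namely that every query is treated as its own class, that predicates are simple equality tests, and that queries with no predicate on an attribute share a "don't care" branch. The query layout and the names `QUERIES`, `information_gain` and `select_dividing_attribute` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(queries, attribute):
    """Gain from partitioning the queries by their predicate on `attribute`.

    Assumption: each query is its own class, so a group of n distinct
    queries has entropy log2(n). Queries without a predicate on the
    attribute land in a shared None ("don't care") group.
    """
    ids = [q["id"] for q in queries]
    groups = {}
    for q in queries:
        value = q["predicates"].get(attribute)
        groups.setdefault(value, []).append(q["id"])
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / len(ids) * entropy(g) for g in groups.values())
    return entropy(ids) - remainder


def select_dividing_attribute(queries, attributes):
    """Pick the attribute whose split separates the queries best."""
    return max(attributes, key=lambda a: information_gain(queries, a))


# Hypothetical filtering queries, each a conjunction of equality predicates.
QUERIES = [
    {"id": "q1", "predicates": {"proto": "tcp", "port": 80}},
    {"id": "q2", "predicates": {"proto": "tcp", "port": 443}},
    {"id": "q3", "predicates": {"proto": "udp", "port": 53}},
    {"id": "q4", "predicates": {"proto": "tcp", "port": 22}},
]

if __name__ == "__main__":
    for attr in ("proto", "port"):
        print(attr, round(information_gain(QUERIES, attr), 4))
    print("dividing attribute:", select_dividing_attribute(QUERIES, ["proto", "port"]))
```

In this toy data, `port` separates all four queries while `proto` leaves three of them together, so `port` is selected. An ETC-style selection would instead estimate matching cost from sample tuples; the abstract gives no formula for it, so it is not sketched here.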

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Y., Bai, S., Tan, J., Guo, L. (2006). Efficient Filtering Query Indexing in Data Stream. In: Feng, L., Wang, G., Zeng, C., Huang, R. (eds) Web Information Systems – WISE 2006 Workshops. WISE 2006. Lecture Notes in Computer Science, vol 4256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11906070_1

  • DOI: https://doi.org/10.1007/11906070_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-47663-4

  • Online ISBN: 978-3-540-47664-1

  • eBook Packages: Computer Science, Computer Science (R0)
