Skip to main content

Query Refinement for Correlation-Based Time Series Exploration

  • Conference paper
  • First Online:
Databases Theory and Applications (ADC 2017)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10538))

Included in the following conference series:

  • 987 Accesses

Abstract

In this paper, we focus on the problem of exploring sequential data to discover time sub-intervals that satisfy certain pairwise correlation constraints. Differently than most existing works, we use the deviation from targeted pairwise correlation constraints as an objective to minimize in our problem. Moreover, we include users preferences as an objective in the form of maximizing similarity to users’ initial sub-intervals. The combination of these two objectives are prevalent in applications where users explore time series data to locate time sub-intervals in which targeted patterns exist. Discovering these sub-intervals among time series data is extremely useful in various application areas such as network and environment monitoring.

Towards finding the optimal sub-interval (i.e., optimal query) satisfying these objectives, we propose applying query refinement techniques to enable efficient processing of candidate queries. Specifically, we propose QFind, an efficient algorithm which refines a user’s initial query to discover the optimal query by applying novel pruning techniques. QFind applies two-level pruning techniques to safely skip processing unqualified candidate queries, and early abandon the computations of correlation for some pairs based on a monotonic property. We experimentally validate the efficiency of our proposed algorithm against state-of-the-art algorithm under different settings using real and synthetic data.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Blyth, C.R.: On simpson’s paradox and the sure-thing principle. J. Am. Stat. Assoc. 67(338), 364–366 (1972)

    Article  MathSciNet  MATH  Google Scholar 

  2. Chaudhuri, S.: Generalization and a framework for query modification. In: Proceedings of the Sixth International Conference on Data Engineering, Los Angeles, California, USA, 5–9 February 1990, pp. 138–145 (1990)

    Google Scholar 

  3. Gavrilov, M., Anguelov, D., Indyk, P., Motwani, R.: Mining the stock market (extended abstract): which measure is best? In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA, 20–23 August 2000, pp. 487–496 (2000)

    Google Scholar 

  4. Guo, T., Sathe, S., Aberer, K.: Fast distributed correlation discovery over streaming time-series data. In: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, CIKM 2015, Melbourne, VIC, Australia, 19–23 October 2015, pp. 1161–1170 (2015)

    Google Scholar 

  5. Li, Y., Huo U, L., Yiu, M.L., Gong, Z.: Efficient discovery of longest-lasting correlation in sequence databases. VLDB J. 25(6), 767–790 (2016)

    Article  Google Scholar 

  6. Lin, J., Keogh, E.J., Lonardi, S., Lankford, J.P., Nystrom, D.M.: Visually mining and monitoring massive time series. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, 22–25 August 2004, pp. 460–469 (2004)

    Google Scholar 

  7. Liu, J., Terzis, A.: Sensing data centres for energy efficiency. Philos. Trans. R. Soc. Lond. A Math. Phys. Eng. Sci. 370(1958), 136–157 (2012)

    Article  Google Scholar 

  8. Matsubara, Y., Sakurai, Y., Ueda, N., Yoshikawa, M.: Fast and exact monitoring of co-evolving data streams. In: 2014 IEEE International Conference on Data Mining, ICDM 2014, Shenzhen, China, 14–17 December 2014, pp. 390–399 (2014)

    Google Scholar 

  9. Mueen, A., Nath, S., Liu, J.: Fast approximate correlation for massive time-series data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, 6–10 June 2010, pp. 171–182 (2010)

    Google Scholar 

  10. Palpanas, T.: Data series management: the road to big sequence analytics. SIGMOD Rec. 44(2), 47–52 (2015)

    Article  Google Scholar 

  11. Pelkonen, T., Franklin, S., Cavallaro, P., Huang, Q., Meza, J., Teller, J., Veeraraghavan, K.: Gorilla: a fast, scalable, in-memory time series database. PVLDB 8(12), 1816–1827 (2015)

    Google Scholar 

  12. Rakthanmanon, T., Campana, B.J.L., Mueen, A., Batista, G.E.A.P.A., Westover, M.B., Zhu, Q., Zakaria, J., Keogh, E.J.: Searching and mining trillions of time series subsequences under dynamic time warping. In: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012, Beijing, China, 12–16 August 2012, pp. 262–270 (2012)

    Google Scholar 

  13. Reeves, G., Liu, J., Nath, S., Zhao, F.: Managing massive time series streams with multiscale compressed trickles. PVLDB 2(1), 97–108 (2009)

    Google Scholar 

  14. Reiss, C., Wilkes, J., Hellerstein, J.L.: Google cluster-usage traces: format + schema. Technical report, Google Inc., Mountain View, CA, USA, November 2011

    Google Scholar 

  15. Sakurai, Y., Papadimitriou, S., Faloutsos, C.: BRAID: stream mining through group lag correlations. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland, USA, 14–16 June 2005, pp. 599–610 (2005)

    Google Scholar 

  16. Tabachnick, B.G., Fidell, L.S.: Using Multivariate Statistics, 5th edn. Allyn & Bacon Inc., Needham Heights (2006)

    Google Scholar 

  17. Utomo, C., Li, X., Wang, S.: Classification based on compressive multivariate time series. In: Cheema, M.A., Zhang, W., Chang, L. (eds.) ADC 2016. LNCS, vol. 9877, pp. 204–214. Springer, Cham (2016). doi:10.1007/978-3-319-46922-5_16

    Chapter  Google Scholar 

  18. Vartak, M., Raghavan, V., Rundensteiner, E.A.: Qrelx: generating meaningful queries that provide cardinality assurance. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, 6–10 June 2010, pp. 1215–1218 (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Abdullah M. Albarrak .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Albarrak, A.M., Sharaf, M.A. (2017). Query Refinement for Correlation-Based Time Series Exploration. In: Huang, Z., Xiao, X., Cao, X. (eds) Databases Theory and Applications. ADC 2017. Lecture Notes in Computer Science(), vol 10538. Springer, Cham. https://doi.org/10.1007/978-3-319-68155-9_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-68155-9_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68154-2

  • Online ISBN: 978-3-319-68155-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics