Skip to main content

Location-Awareness in Time Series Compression

  • Conference paper
  • First Online:
Advances in Databases and Information Systems (ADBIS 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11019))

Included in the following conference series:

Abstract

We present our initial findings regarding the problem of the impact that time series compression may have on similarity-queries, in the settings in which the elements of the dataset are accompanied with additional contexts. Broadly, the main objective of any data compression approach is to provide a more compact (i.e., smaller size) representation of a given original dataset. However, as has been observed in the large body of works on compression of spatial data, applying a particular algorithm “blindly” may yield outcomes that defy the intuitive expectations – e.g., distorting certain topological relationships that exist in the “raw” data [7]. In this study, we quantify this distortion by defining a measure of similarity distortion based on Kendall’s \(\tau \). We evaluate this measure, and the correspondingly achieved compression ratio for the five most commonly used time series compression algorithms and the three most common time series similarity measures. We report some of our observations here, along with the discussion of the possible broader impacts and the challenges that we plan to address in the future.

X. Teng—Research supported by NSF grant III 1823267.

G. Trajcevski—Research supported by NSF grants III-1823279 and CNS-1823267, and ONR grant N00014-14-1-0215.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    Around the same time, there were other algorithms developed for polyline simplification, some of which had almost-identical methodologies with the DP algorithm. Most notably [16] which is the reason that sometimes the name Ramer-Douglas-Peuker is used in the literature.

  2. 2.

    https://github.com/XuTengNU/ADBIS2018.git.

References

  1. GPCC: Global Precipitation Climatology Centre. https://climatedataguide.ucar.edu/climate-data/gpcc-global-precipitation-climatology-centre

  2. Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Lomet, D.B. (ed.) FODO 1993. LNCS, vol. 730, pp. 69–84. Springer, Heidelberg (1993). https://doi.org/10.1007/3-540-57301-1_5

    Chapter  Google Scholar 

  3. Barequet, G., Chen, D.Z., Daescu, O., Goodrich, M.T., Snoeyink, J.: Efficiently approximating polygonal paths in three and higher dimensions. Algorithmica 33(2), 150–167 (2002)

    Article  MathSciNet  Google Scholar 

  4. Chan, W.S., Chin, F.: Approximation of polygonal curves with minimum number of line segments. Int. J. Comput. Geom. Appl. 6 (1992)

    Google Scholar 

  5. Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., Keogh, E.J.: Querying and mining of time series data: experimental comparison of representations and distance measures. PVLDB 1(2) (2008)

    Google Scholar 

  6. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitised line or its caricature. Can. Cartographer 10(2) (1973)

    Google Scholar 

  7. Estkowski, R., Mitchell, J.S.B.: Simplifying a polygonal subdivision while keeping it simple. In: Symposium on Computational Geometry (2001)

    Google Scholar 

  8. Ferrari, L., Rosi, A., Mamei, M., Zambonelli, F.: Extracting urban patterns from location-based social networks. In: Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Location-Based Social Networks, LBSN 2011 (2011)

    Google Scholar 

  9. Hey, T., Tansley, S., Tolle, K.M., et al.: The Fourth Paradigm: Data-Intensive Scientific Discovery, vol. 1. Microsoft Research, Redmond (2009)

    Google Scholar 

  10. Hirschberg, D., Lelewer, D.A.: Data compression. Comput. Surv. 19(3) (1987)

    Google Scholar 

  11. Kasperson, J.X., Kasperson, R.E., Turner II, B.L.: Regions at Risk: Comparisons of Threatened Environments. United Nations University Press, Tokyo (1995)

    Google Scholar 

  12. Kendall, M.G.: A new measure of rank correlation. Biometrika 30, 81–93 (1938)

    Article  Google Scholar 

  13. Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7, 358–386 (2005)

    Article  Google Scholar 

  14. Keogh, E.J., Chakrabarti, K., Pazzani, M.J., Mehrotra, S.: Dimensionality reduction for fast similarity search in large time series databases. Knowl. Inf. Syst. 3(3) (2001)

    Google Scholar 

  15. Pearson, K.: Note on regression and inheritance in the case of two parents. Proc. Roy. Soc. London (1895)

    Google Scholar 

  16. Ramer, U.: An iterative procedure for the polygonal approximation of plane curves. Comput. Graph. Image Process. 1, 244–256 (1972)

    Google Scholar 

  17. Reddy, M.J., Adarsh, S.: Time-frequency characterization of sub-divisional scale seasonal rainfall in India using the Hilbert-Huang transform. Stochast. Environ. Res. Risk Assess. 30, 1063–1085 (2016)

    Google Scholar 

  18. Rényi, A.: On measures of entropy and information. Technical report, Hungarian Academy of Sciences, Budapest, Hungary (1961)

    Google Scholar 

  19. Rienecker, M.M., et al.: Merra: Nasa’s modern-era retrospective analysis for research and applications. J. Climate 24(14), 3624–3648 (2011)

    Article  Google Scholar 

  20. Saalfeld, A.: Topologically consistent line simplification with the Douglas-Peucker algorithm. Cartography Geogr. Inf. Sci. 26(1), 7–18 (1999)

    Article  Google Scholar 

  21. Sayemuzzaman, M., Jha, M.K.: Seasonal and annual precipitation time series trend analysis in North Carolina, United States. Atmos. Res. 137 (2014)

    Google Scholar 

  22. Sayood, K.: Introduction to Data Compression. Morgan Kauffman, Burlington (1996)

    Google Scholar 

  23. Sharma, C.S., Panda, S.N., Pradhan, R.P., Singh, A., Kawamura, A.: Precipitation and temperature changes in Eastern India by multiple trend detection methods. Atmos. Res. 180 (2016)

    Google Scholar 

  24. Shekhar, S., Chawla, S.: Spatial Databases: A Tour. Prentice Hall, Upper Saddle River (2003)

    Google Scholar 

  25. Steinbach, M., Karypis, G., Kumar, V., et al.: A comparison of document clustering techniques. In: KDD Workshop on Text Mining, vol. 400, pp. 525–526 (2000)

    Google Scholar 

  26. Suarez, M.J., et al.: The geos-5 data assimilation system-documentation of versions 5.0. 1, 5.1. 0, and 5.2. 0 (2008)

    Google Scholar 

  27. Vaisman, A.A., Zimányi, E.: Data Warehouse Systems - Design and Implementation. Data-Centric Systems and Applications. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54655-6

    Book  Google Scholar 

  28. van Kreveld, M., Luo, J.: The definition and computation of trajectory and subtrajectory similarity. In: GIS (2007)

    Google Scholar 

  29. Visvalingam, M., Whyatt, J.D.: Line generalisation by repeated elimination of points. Cartographic J. 30(1) (1993)

    Google Scholar 

  30. Weibel, R.: Generalization of spatial data: principles and selected algorithms. In: van Kreveld, M., Nievergelt, J., Roos, T., Widmayer, P. (eds.) CISM School 1996. LNCS, vol. 1340, pp. 99–152. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-63818-0_5

    Chapter  Google Scholar 

  31. Yang, G., Züfle, A.: Spatio-temporal prediction of social connections. In: Proceedings of the Fourth International ACM Workshop on Managing and Mining Enriched Geo-Spatial Data, Chicago, IL, USA, 14 May 2017 (2017)

    Google Scholar 

Download references

Acknowledgments

We thank Praxxal Patel and Yash Thesia for their help in finalizing part of the experiments.

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Xu Teng , Andreas Züfle or Goce Trajcevski .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Teng, X., Züfle, A., Trajcevski, G., Klabjan, D. (2018). Location-Awareness in Time Series Compression. In: Benczúr, A., Thalheim, B., Horváth, T. (eds) Advances in Databases and Information Systems. ADBIS 2018. Lecture Notes in Computer Science(), vol 11019. Springer, Cham. https://doi.org/10.1007/978-3-319-98398-1_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-98398-1_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98397-4

  • Online ISBN: 978-3-319-98398-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics