Abstract
Interest in processing and exploring time series is increasing. Growing data volumes and the need for efficient processing have pushed research in new directions. This paper presents a lossless lightweight compression planner intended for use in a time series database system. We propose a novel compression method that is very fast and seeks the best achievable compression ratio by composing several lightweight algorithms tuned dynamically to the incoming data. The preliminary results are promising for data-intensive monitoring and analytic systems.
P. Przymus: The project was partially funded by the Marshal of the Kuyavian-Pomeranian Voivodeship in Poland with funds from the European Social Fund (EFS) in the form of PhD scholarships: “Krok w przyszłość – stypendia dla doktorantów V edycja” (Step into the future – PhD scholarships, 5th edition).
K. Kaczmarski: The project was partially funded by the National Science Centre, decision DEC-2012/07/D/ST6/02483.
Notes
1. In this work we understand optimal compression as the best compression achievable with the available lightweight algorithms.
2. Note that this is a simplification, i.e. it is used instead of \(c_{re}(0,h_{re},\bar{D})\), where \(\bar{D}\) is the dataset after removing all instances of the dominant value.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Przymus, P., Kaczmarski, K. (2014). Compression Planner for Time Series Database with GPU Support. In: Hameurlain, A., et al. Transactions on Large-Scale Data- and Knowledge-Centered Systems XV. Lecture Notes in Computer Science, vol 8920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45761-0_2
DOI: https://doi.org/10.1007/978-3-662-45761-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45760-3
Online ISBN: 978-3-662-45761-0
eBook Packages: Computer Science, Computer Science (R0)