A Framework of Write Optimization on Read-Optimized Out-of-Core Column-Store Databases

Yu, Feng; Hou, Wen-Chi

doi:10.1007/978-3-319-22849-5_12

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9261)

Abstract

Column-store databases offer faster reads and more efficient data compression than traditional row-based databases. Optimizing write operations in column stores, however, remains a well-known challenge, and most existing work on write performance focuses on main-memory column-store databases. In this work, we investigate optimizing write operations (update and deletion) on out-of-core (OOC, or external-memory) column-store databases. We propose a general framework that works for both normal OOC storage and big data storage, such as the Hadoop Distributed File System (HDFS). On normal OOC storage, we propose an innovative data storage format called the Timestamped Binary Association Table (TBAT). Based on TBAT, a new update method, the Asynchronous Out-of-Core Update (AOC Update), is designed to replace the traditional update. On big data storage, we extend TBAT onto HDFS and propose the Asynchronous Map-Only Update (AMO Update) to replace the traditional update. Fast selection methods are developed in both contexts to improve data retrieval speed. Extensive experiments show a significant speed improvement when performing write operations on TBAT in both normal and MapReduce environments.
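The abstract does not give the TBAT layout or the AOC Update procedure, so the following Python sketch illustrates only the general idea of a timestamped, append-only column. It assumes each entry is a (timestamp, oid, value) triple and that an update appends a new version instead of rewriting the old one in place. The class `TimestampedColumn` and its methods are hypothetical names chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class TimestampedColumn:
    """Hypothetical append-only column of (timestamp, oid, value) rows."""
    rows: List[Tuple[int, int, Any]] = field(default_factory=list)
    clock: int = 0  # logical clock standing in for a real timestamp

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def append(self, oid: int, value: Any) -> None:
        """Insert a new row at the tail of the column."""
        self.rows.append((self._tick(), oid, value))

    def update(self, oid: int, value: Any) -> None:
        """Assumed asynchronous-style update: append a newer version
        rather than seeking to the old row and overwriting it."""
        self.append(oid, value)

    def select(self, oid: int) -> Optional[Any]:
        """Return the value with the largest timestamp for this oid."""
        latest: Optional[Tuple[int, Any]] = None
        for ts, row_oid, value in self.rows:
            if row_oid == oid and (latest is None or ts > latest[0]):
                latest = (ts, value)
        return None if latest is None else latest[1]

    def merge(self) -> None:
        """Compact the column, keeping only the newest version per oid."""
        newest: Dict[int, Tuple[int, Any]] = {}
        for ts, oid, value in self.rows:
            if oid not in newest or ts > newest[oid][0]:
                newest[oid] = (ts, value)
        self.rows = sorted(
            ((ts, oid, v) for oid, (ts, v) in newest.items()),
            key=lambda row: row[1],
        )


col = TimestampedColumn()
col.append(1, "a")
col.append(2, "b")
col.update(1, "c")      # stored as a third row, not an in-place write
print(col.select(1))    # newest version for oid 1
col.merge()
print(len(col.rows))    # one row per oid after compaction
```

Under these assumptions an update becomes a sequential append, and reads pay for it by having to resolve the newest version, which is the kind of cost a fast selection method would aim to reduce.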

Author information

Authors and Affiliations

Youngstown State University, Youngstown, OH, 44555, USA
Feng Yu
Southern Illinois University, Carbondale, IL, 62901, USA
Wen-Chi Hou

Corresponding author

Correspondence to Feng Yu.

Editor information

Editors and Affiliations

Hewlett-Packard Enterprise, Sunnyvale, California, USA
Qiming Chen
Paul Sabatier University, Toulouse, France
Abdelkader Hameurlain
Blaise Pascal University, Aubiere, France
Farouk Toumani
University of Linz, Linz, Austria
Roland Wagner
Universidad Politécnica de Valencia, Valencia, Spain
Hendrik Decker

About this paper

Cite this paper

Yu, F., Hou, W.-C. (2015). A Framework of Write Optimization on Read-Optimized Out-of-Core Column-Store Databases. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015. Lecture Notes in Computer Science, vol 9261. Springer, Cham. https://doi.org/10.1007/978-3-319-22849-5_12

DOI: https://doi.org/10.1007/978-3-319-22849-5_12
Published: 11 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22848-8
Online ISBN: 978-3-319-22849-5
eBook Packages: Computer Science, Computer Science (R0)
