Abstract
Solid-state drives (SSDs) are widely used in large-scale data processing applications because they offer higher random-access throughput than HDDs and can process I/O requests in parallel. These advantages can resolve the I/O bottlenecks that database systems face on HDDs. However, access latency in the cache hierarchy may then become a new bottleneck in SSD-based databases. In this study, we quantitatively analyzed the behavior of SSD-based databases, using the hash join operation as a case study. We found that cache misses in SSD-based databases can be decreased by reducing the hash table size so that it fits into the cache. This works because the high throughput of SSDs keeps the I/O cost from increasing, even though the hash join partition files become fragmented. We also observed that cache misses do not increase for a query containing multiple hash joins. This is because the total size of the multiple hash tables can fit into the cache in SSD-based databases, in contrast to HDD-based databases, where the hash tables require almost all of the available memory. Overall, our analysis shows that the performance of multiple queries in SSD-based databases can be improved by considering the data access locality of the hash join operation and choosing a hash table size that fits into the cache.
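To make the idea of sizing hash tables to the cache concrete, the following is a minimal in-memory sketch of a partitioned (GRACE-style) hash join. It is not the authors' implementation or PostgreSQL's: the function names, the per-entry size, and the cache budget are all assumptions chosen for illustration, and Python lists stand in for the partition files that a real system would write to storage.

```python
import math
from collections import defaultdict


def choose_partition_count(build_rows, bytes_per_entry, cache_bytes):
    """Smallest partition count whose per-partition hash table fits the budget.

    bytes_per_entry and cache_bytes are illustrative assumptions, not
    measured values; even key distribution is also assumed.
    """
    total_bytes = build_rows * bytes_per_entry
    return max(1, math.ceil(total_bytes / cache_bytes))


def partitioned_hash_join(build, probe, build_key, probe_key, num_partitions):
    """Join two lists of tuples on build[build_key] == probe[probe_key]."""
    # Phase 1: split both inputs by the hash of the join key.
    # (A real system would write each partition to a file on the SSD.)
    build_parts = defaultdict(list)
    probe_parts = defaultdict(list)
    for row in build:
        build_parts[hash(row[build_key]) % num_partitions].append(row)
    for row in probe:
        probe_parts[hash(row[probe_key]) % num_partitions].append(row)

    # Phase 2: join one partition pair at a time, so only one small
    # hash table is live at any moment.
    result = []
    for p in range(num_partitions):
        table = defaultdict(list)
        for row in build_parts.get(p, []):
            table[row[build_key]].append(row)
        for row in probe_parts.get(p, []):
            for match in table.get(row[probe_key], []):
                result.append((match, row))
    return result


# Hypothetical sizing: 1,000,000 build rows, 32 bytes per hash table
# entry, and an 8 MiB cache budget.
n = choose_partition_count(1_000_000, 32, 8 * 1024 * 1024)

orders = [(1, "a"), (2, "b"), (3, "c")]
items = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
joined = partitioned_hash_join(orders, items, 0, 0, n)
```

More partitions mean smaller hash tables but more, smaller partition files. The abstract's observation is that on SSDs this fragmentation does not raise the I/O cost, which is what makes a cache-sized hash table a reasonable choice there.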
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Suzuki, K., Hayamizu, Y., Yokoyama, D., Nakano, M., Kitsuregawa, M. (2014). Comprehensive Analytics of Large Data Query Processing on Relational Database with SSDs. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_12
DOI: https://doi.org/10.1007/978-3-319-08608-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08607-1
Online ISBN: 978-3-319-08608-8
eBook Packages: Computer Science, Computer Science (R0)