Comprehensive Analytics of Large Data Query Processing on Relational Database with SSDs

  • Conference paper
In: Databases Theory and Applications (ADC 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8506)

Abstract

Solid-state drives (SSDs) are widely used in large-scale data processing applications because they offer higher random-access throughput than HDDs and can process I/O requests in parallel. These advantages allow SSDs to resolve the I/O bottlenecks that HDD-based database systems face. However, access latency in the cache hierarchy may become a new bottleneck in SSD-based databases. In this study, we quantitatively analyzed the behavior of SSD-based databases, using the hash join operation as a case study. We found that cache misses in SSD-based databases can be decreased by reducing the hash table size so that it fits into the cache. This works because the high throughput of SSDs keeps the I/O cost from increasing, even though the hash join partition files become fragmented. We also observed that cache misses do not increase when a query contains multiple hash joins. This is because the total size of the multiple hash tables can fit into the cache in SSD-based databases, in contrast to HDD-based databases, where the hash tables require almost all of the available memory. Overall, our analysis clarifies that the performance of multiple queries in SSD-based databases can be improved by considering the data access locality of the hash join operation and choosing a hash table size that fits into the cache.
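
The idea of sizing hash tables to fit the cache can be illustrated with a minimal sketch of a partitioned (GRACE-style) hash join. This is not the authors' implementation, which the abstract does not describe in detail; it is an illustration under stated assumptions. The function names `num_partitions` and `partitioned_hash_join`, the `fudge` overhead factor, and the use of in-memory lists in place of on-disk partition files are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def num_partitions(build_bytes, cache_bytes, fudge=1.2):
    """Smallest partition count such that one partition's hash table,
    inflated by an assumed overhead factor `fudge`, fits in the cache."""
    return max(1, math.ceil(build_bytes * fudge / cache_bytes))


def partitioned_hash_join(build, probe, build_key, probe_key, n_parts):
    """Join two lists of tuples on build[build_key] == probe[probe_key].

    Phase 1 scatters both inputs into n_parts partitions by hashing the
    join key. In a real system each partition would be a file on storage;
    more partitions means more, smaller files, which is cheap on SSDs.
    Phase 2 builds one small hash table per partition and probes it.
    """
    build_parts = [[] for _ in range(n_parts)]
    probe_parts = [[] for _ in range(n_parts)]
    for row in build:
        build_parts[hash(row[build_key]) % n_parts].append(row)
    for row in probe:
        probe_parts[hash(row[probe_key]) % n_parts].append(row)

    result = []
    for b_part, p_part in zip(build_parts, probe_parts):
        # Only this partition's hash table is live at a time, so its
        # working set is roughly 1/n_parts of the full build side.
        table = defaultdict(list)
        for row in b_part:
            table[row[build_key]].append(row)
        for row in p_part:
            for match in table.get(row[probe_key], ()):
                result.append((match, row))
    return result
```

Under these assumptions, a caller would first compute `n = num_partitions(build_bytes, cache_bytes)` from an estimate of the build-side size and the target cache size, then pass `n` as `n_parts`. The join result does not depend on the partition count; only the size of each per-partition hash table does.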

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Suzuki, K., Hayamizu, Y., Yokoyama, D., Nakano, M., Kitsuregawa, M. (2014). Comprehensive Analytics of Large Data Query Processing on Relational Database with SSDs. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_12

  • DOI: https://doi.org/10.1007/978-3-319-08608-8_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08607-1

  • Online ISBN: 978-3-319-08608-8

  • eBook Packages: Computer Science, Computer Science (R0)