HDStore: An SSD/HDD Hybrid Distributed Storage Scheme for Large-Scale Data

  • Conference paper
Web-Age Information Management (WAIM 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8597)

Abstract

Traditional data storage schemes are primarily based on Hard Disk Drives (HDD). However, with the appearance of large amounts of data on the Web, HDD-based read/write performance has reached a bottleneck. The emergence of Solid State Drives (SSD) provides an opportunity for storing the Web of data. In this paper, we propose an SSD/HDD hybrid distributed storage scheme for large-scale data, called HDStore, in which a single fixed-size journal file, written in append-only mode, is stored on SSD to support efficient reads and writes, while several read-oriented segment files are stored on HDD. Through a series of operations (build, split, move, and merge) between the journal and segment files, we construct the HDStore storage scheme based on the JS-model. The experimental results show that HDStore effectively optimizes data reads and writes; in particular, write performance increases by 15% compared with the traditional HDD-based scheme.
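
The journal/segment layout described above can be illustrated with a small in-memory sketch. It is not the authors' implementation: the abstract names the build, split, move, and merge operations but does not define them, so the semantics below are assumptions made for illustration only. The class `ToyJournalSegmentStore`, its parameters `journal_capacity` and `max_segment_size`, and the use of Python lists in place of files on SSD and HDD are all hypothetical.

```python
from bisect import bisect_left


class ToyJournalSegmentStore:
    """Toy in-memory model of a journal/segment layout (all semantics assumed)."""

    def __init__(self, journal_capacity=4, max_segment_size=8):
        self.journal_capacity = journal_capacity  # fixed journal size, in records
        self.max_segment_size = max_segment_size  # hypothetical split threshold
        self.journal = []    # stands in for the append-only journal file on SSD
        self.segments = []   # stands in for read-oriented segment files on HDD

    def write(self, key, value):
        # Writes only ever append to the journal.
        self.journal.append((key, value))
        if len(self.journal) >= self.journal_capacity:
            self._flush()

    def _build(self):
        # Assumed "build": keep the latest value per key, sorted by key.
        latest = dict(self.journal)
        return sorted(latest.items())

    def _split(self, segment):
        # Assumed "split": cut an oversized segment into bounded pieces.
        n = self.max_segment_size
        return [segment[i:i + n] for i in range(0, len(segment), n)]

    def _flush(self):
        # Assumed "move": hand built segments to the HDD tier, reset the journal.
        self.segments.extend(self._split(self._build()))
        self.journal = []

    def merge(self):
        # Assumed "merge": fold all segments into one; newer segments win.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        self.segments = self._split(sorted(merged.items()))

    def read(self, key):
        # Newest data first: journal (newest entry first), then segments.
        for k, v in reversed(self.journal):
            if k == key:
                return v
        for segment in reversed(self.segments):
            i = bisect_left(segment, (key,))  # binary search on the sorted keys
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None
```

In this sketch every write is a sequential append to the journal, which loosely mirrors the role the abstract assigns to the SSD, while reads fall back to sorted segments that can be searched with a binary search. Whether HDStore resolves duplicate keys or orders segments this way is not stated in the abstract.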

Acknowledgments

This work is supported by the National High-tech R&D Program of China (863 Program) (2013AA013204), the National Natural Science Foundation of China (61100049, 61373165), and the Special Fund for Fast Sharing of Science Paper in Net Era by CSTD (No. 2012008).

Author information

Corresponding author

Correspondence to Xin Wang.

Appendix A: The Query of LUBM Data Set

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Feng, Z., Feng, Z., Wang, X., Rao, G., Wei, Y., Li, Z. (2014). HDStore: An SSD/HDD Hybrid Distributed Storage Scheme for Large-Scale Data. In: Chen, Y., et al. Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8597. Springer, Cham. https://doi.org/10.1007/978-3-319-11538-2_20

  • DOI: https://doi.org/10.1007/978-3-319-11538-2_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11537-5

  • Online ISBN: 978-3-319-11538-2

  • eBook Packages: Computer Science, Computer Science (R0)
