A hybrid memory built by SSD and DRAM to support in-memory Big Data analytics

  • Regular Paper
Knowledge and Information Systems

Abstract

Big Data requires a shift in traditional computing architecture, and in-memory computing is a new paradigm for Big Data analytics. However, DRAM-based main memory is neither cost-effective nor energy-efficient. This work combines a flash-based solid state drive (SSD) with DRAM to build a hybrid memory that meets both requirements. Because the latency of SSD is much higher than that of DRAM, the hybrid architecture should ensure that most requests are served by DRAM rather than by SSD. Accordingly, we take two measures to raise the DRAM hit ratio. First, the hybrid memory employs an adaptive prefetching mechanism so that data are already in DRAM before they are demanded. Second, the DRAM employs a novel replacement policy that preferentially evicts data that are easy to prefetch, because such data can be brought back by prefetching when they are demanded again. Conversely, data that are hard to prefetch are protected in DRAM. Both the prefetching mechanism and the replacement policy rely on the access patterns of files, so we propose a novel pattern recognition method that improves the LZ data compression algorithm to detect access patterns. We evaluate our proposals via a prototype and trace-driven simulations. Experimental results demonstrate that the hybrid memory is able to extend the DRAM by more than a factor of two.
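
The interplay of the three ideas in the abstract (pattern detection, prefetching, and prefetch-aware replacement) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe their improved LZ method, so the sketch falls back on a classic LZ78-style parse tree as the predictor. The class names `LZ78Predictor` and `PrefetchAwareCache`, the `threshold` parameter, and the use of block numbers instead of files are all assumptions made for illustration, and the cache capacity is assumed to be at least 1.

```python
from collections import OrderedDict


class LZ78Predictor:
    """Classic LZ78-style parse tree over an access stream (illustrative only)."""

    def __init__(self):
        self.root = {"count": 0, "children": {}}
        self.node = self.root  # current position inside the parse tree

    def update(self, block):
        """Record one access and advance the LZ78 parse."""
        children = self.node["children"]
        if block in children:
            # The current phrase continues: count the transition and descend.
            children[block]["count"] += 1
            self.node = children[block]
        else:
            # The phrase ends here: add a leaf and restart from the root.
            children[block] = {"count": 1, "children": {}}
            self.node = self.root

    def predict(self):
        """Return (block, confidence) for the likeliest next access, or (None, 0.0)."""
        children = self.node["children"]
        if not children:
            return None, 0.0
        total = sum(child["count"] for child in children.values())
        block, child = max(children.items(), key=lambda kv: kv[1]["count"])
        return block, child["count"] / total


class PrefetchAwareCache:
    """DRAM cache model that evicts easy-to-prefetch blocks before others."""

    def __init__(self, capacity, threshold=0.5):
        self.capacity = capacity    # number of blocks DRAM can hold (>= 1)
        self.threshold = threshold  # hypothetical confidence cut-off
        self.blocks = OrderedDict() # block -> "easy to prefetch" flag, LRU first
        self.predictor = LZ78Predictor()
        self.hits = 0
        self.misses = 0

    def insert(self, block, predictable):
        """Place a block in DRAM, evicting a predictable block first if full."""
        if block in self.blocks:
            self.blocks[block] = predictable
            self.blocks.move_to_end(block)
            return
        if len(self.blocks) >= self.capacity:
            # Prefer the least recently used predictable block, else plain LRU.
            victim = next((b for b, easy in self.blocks.items() if easy), None)
            if victim is None:
                victim = next(iter(self.blocks))
            del self.blocks[victim]
        self.blocks[block] = predictable

    def access(self, block):
        """Serve one request; return True if it was a DRAM hit."""
        hit = block in self.blocks
        self.hits += hit
        self.misses += not hit
        # A demanded block that the predictor foresaw is cheap to fetch again.
        guess, conf = self.predictor.predict()
        self.insert(block, guess == block and conf >= self.threshold)
        self.predictor.update(block)
        # Prefetch from the SSD only when the prediction is confident enough.
        guess, conf = self.predictor.predict()
        if guess is not None and conf >= self.threshold and guess not in self.blocks:
            self.insert(guess, True)
        return hit


if __name__ == "__main__":
    cache = PrefetchAwareCache(capacity=2)
    for block in [1, 2, 3] * 20:  # cyclic scan larger than the cache
        cache.access(block)
    print(cache.hits, cache.misses)
```

In this toy model, a two-block cache scanning three blocks cyclically would never hit under plain LRU, whereas the sketch starts to hit once the parse tree has learned the loop. The confidence threshold stands in for the adaptive element only loosely; the policy evaluated in the paper may differ substantially.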

Acknowledgments

We are grateful to our anonymous reviewers for their suggestions to improve this paper. This work is supported by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013201, the National Natural Science Foundation of China under Grant Nos. 61025009, 61120106005, 61232003, 61170288, 61379145, and 61332003.

Author information

Corresponding author

Correspondence to Yutong Lu.

About this article

Cite this article

Chen, Z., Lu, Y., Xiao, N. et al. A hybrid memory built by SSD and DRAM to support in-memory Big Data analytics. Knowl Inf Syst 41, 335–354 (2014). https://doi.org/10.1007/s10115-013-0727-6

  • DOI: https://doi.org/10.1007/s10115-013-0727-6
