DOI: 10.1145/2818950.2818981
research-article

Herniated Hash Tables: Exploiting Multi-Level Phase Change Memory for In-Place Data Expansion

Published: 05 October 2015

Abstract

Hash tables are a data structure commonly used in many algorithms and applications. As applications and data scale, the efficient implementation of hash tables becomes increasingly important and challenging. In particular, memory capacity becomes a growing constraint, and entries can become asymmetrically chained across hash buckets. This chaining prevents two forms of parallelism: memory-level parallelism (allowing multiple prefetch requests to overlap) and memory-computation parallelism (allowing computation to overlap memory operations). We propose herniated hash tables, a technique that exploits multi-level phase change memory (PCM) storage to expand storage at each hash bucket and increase parallelism without increasing physical space.
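To illustrate why chaining limits memory-level parallelism, the following minimal Python sketch contrasts the two access patterns. It is not taken from the paper: `Node`, `chained_dependent_loads`, and `inplace_prefetch_addresses` are hypothetical names, and "addresses" are plain integers standing in for memory locations. In a chain, the address of each entry is known only after the previous entry has been loaded; when entries sit at computable offsets from the bucket, all addresses are known up front and their requests could overlap.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """One chained entry (hypothetical layout, for illustration only)."""
    key: str
    next_addr: Optional[int]  # address of the next entry, None at the end


def chained_dependent_loads(memory: Dict[int, Node], head: int, key: str) -> int:
    """Count the serial loads needed to find `key` by walking a chain.

    Each load depends on the previous one, because the next address is
    stored inside the node that was just loaded.
    """
    loads = 0
    addr: Optional[int] = head
    while addr is not None:
        node = memory[addr]
        loads += 1
        if node.key == key:
            break
        addr = node.next_addr
    return loads


def inplace_prefetch_addresses(bucket_base: int, entry_size: int,
                               occupancy: int) -> List[int]:
    """Addresses of all entries of a bucket stored at fixed offsets.

    None of these addresses depends on loaded data, so a prefetcher
    could issue all of them at once (assumed layout, not the paper's).
    """
    return [bucket_base + i * entry_size for i in range(occupancy)]


# A three-entry chain scattered across memory.
memory = {100: Node("a", 548), 548: Node("b", 212), 212: Node("c", None)}
serial_loads = chained_dependent_loads(memory, head=100, key="c")
prefetchable = inplace_prefetch_addresses(bucket_base=100, entry_size=16, occupancy=3)
```

Here the chained lookup needs three loads in sequence, while the three in-place addresses can be computed before any load is issued.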
The technique works by increasing the number of bits stored within the same resistance range of an individual PCM cell. We pack more data into the same cell by decreasing noise margins, and we pay for this higher density with higher-latency reads and writes, which must resolve resistance values more precisely. Furthermore, our organization, coupled with an addressing and prefetching scheme, increases the memory parallelism of the herniated data structure.
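The density and noise-margin trade-off can be made concrete with a small, idealized calculation. This is a sketch under stated assumptions rather than the paper's model: it assumes the resistance range is split into equal-width bands (real PCM levels are not evenly spaced and drift over time), and the function names are hypothetical.

```python
def bits_per_cell(levels: int) -> int:
    """Bits stored by a cell that distinguishes `levels` resistance levels."""
    if levels < 2 or levels & (levels - 1):
        raise ValueError("levels must be a power of two, at least 2")
    return levels.bit_length() - 1  # log2 of a power of two


def relative_margin(levels: int) -> float:
    """Width of one level's band as a fraction of the full resistance range.

    Assumes an idealized equal split of the range across all levels.
    """
    return 1.0 / levels


def bucket_capacity_bits(cells: int, levels: int) -> int:
    """Total bits held by a bucket of `cells` physical cells."""
    return cells * bits_per_cell(levels)


# The same 64 physical cells, read at increasing precision.
for levels in (2, 4, 8):
    print(levels, bucket_capacity_bits(64, levels), relative_margin(levels))
```

Under these assumptions, moving from 2 to 8 levels triples the capacity of the same cells while each band shrinks to a quarter of its original width, which is why more precise (and slower) reads and writes are needed.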
We simulate our system with a variety of hash table applications and evaluate its density and performance benefits against a number of baseline systems. Compared with conventional chained hash tables on single-level PCM, herniated hash tables can achieve 4.8x density on a 4-level PCM while improving performance by up to 67%.



Published In

MEMSYS '15: Proceedings of the 2015 International Symposium on Memory Systems
October 2015
278 pages
ISBN: 9781450336048
DOI: 10.1145/2818950
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MEMSYS '15: International Symposium on Memory Systems
October 5 - 8, 2015
Washington, DC, USA

Cited By

  • (2019) Quick-and-Dirty: An Architecture for High-Performance Temporary Short Writes in MLC PCM. IEEE Transactions on Computers, pages 1-1. DOI: 10.1109/TC.2019.2900036. Online publication date: 2019
  • (2017) Thermal-aware, heterogeneous materials for improved energy and reliability in 3D PCM architectures. Proceedings of the International Symposium on Memory Systems, pages 223-236. DOI: 10.1145/3132402.3132407. Online publication date: 2-Oct-2017
  • (2017) Memory cocktail therapy. Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, pages 232-244. DOI: 10.1145/3123939.3124548. Online publication date: 14-Oct-2017
  • (2017) Quick-and-Dirty: Improving Performance of MLC PCM by Using Temporary Short Writes. 2017 IEEE International Conference on Computer Design (ICCD), pages 585-588. DOI: 10.1109/ICCD.2017.101. Online publication date: Nov-2017
  • (2017) Balancing Performance and Lifetime of MLC PCM by Using a Region Retention Monitor. 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 385-396. DOI: 10.1109/HPCA.2017.45. Online publication date: Feb-2017
  • (2017) An Efficient Network-on-Chip Router for Dataflow Architecture. Journal of Computer Science and Technology, 32:1, pages 11-25. DOI: 10.1007/s11390-017-1703-5. Online publication date: 11-Jan-2017
  • (2016) Mellow writes. ACM SIGARCH Computer Architecture News, 44:3, pages 519-531. DOI: 10.1145/3007787.3001192. Online publication date: 18-Jun-2016
  • (2016) Mellow writes. Proceedings of the 43rd International Symposium on Computer Architecture, pages 519-531. DOI: 10.1109/ISCA.2016.52. Online publication date: 18-Jun-2016
