Research Article | Open Access
DOI: 10.1145/3559009.3569653

FlatPack: Flexible Compaction of Compressed Memory

Published: 27 January 2023

Abstract

The capacity and bandwidth of main memory are increasingly important factors in computer system performance. Memory compression and compaction have been combined to increase effective capacity and reduce costly page faults. However, existing systems typically maintain compaction at the expense of bandwidth. One major cause of extra traffic in such systems is page overflows, which occur when data compressibility degrades and compressed pages must be reorganized. This paper introduces FlatPack, a novel approach to memory compaction that mitigates this overhead by reorganizing compressed data dynamically with less data movement. Reorganization is carried out by an addition to the memory controller, without intervention from software. FlatPack maintains memory capacity competitive with current state-of-the-art memory compression designs while reducing mean memory traffic by up to 67%. This yields average improvements in performance and total system energy consumption over existing memory compression solutions of 31--46% and 11--25%, respectively. In total, FlatPack improves on baseline performance and energy consumption by 108% and 40%, respectively, in a single-core system, and by 83% and 23%, respectively, in a multi-core system.
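To make the overflow problem concrete, the toy model below contrasts the data movement of two page layouts when one compressed block outgrows its slot. It is a minimal, hypothetical Python sketch under assumed parameters (a 4 KiB page holding 64 compressed cache lines, a fixed per-page slack region); it is not FlatPack's actual mechanism, which the paper implements as an addition to the memory controller.

# Hypothetical illustration (not FlatPack's actual design): compare the bytes
# moved when one compressed block in a page outgrows its slot under
# (a) a tightly packed layout that must shift the rest of the page, and
# (b) a layout that reserves per-page slack so only the grown block is rewritten.

PAGE_BLOCKS = 64  # assumed: 64-byte cache lines per 4 KiB page


def packed_overflow_traffic(block_sizes, idx, new_size):
    """Packed layout: blocks are stored back to back, so growing block `idx`
    forces every later block to shift; count the bytes that must move."""
    moved = new_size                      # rewrite the grown block itself
    moved += sum(block_sizes[idx + 1:])   # shift everything stored after it
    return moved


def slack_overflow_traffic(block_sizes, idx, new_size, slack):
    """Layout with a reserved slack region: if the grown block fits there,
    only that block is rewritten; otherwise fall back to re-packing."""
    if new_size <= slack:
        return new_size
    return packed_overflow_traffic(block_sizes, idx, new_size)


if __name__ == "__main__":
    # A page whose 64 lines compress to 32 bytes each; line 5 degrades to 64 B.
    sizes = [32] * PAGE_BLOCKS
    grown_index, grown_size = 5, 64

    packed = packed_overflow_traffic(sizes, grown_index, grown_size)
    flexible = slack_overflow_traffic(sizes, grown_index, grown_size, slack=256)
    print(f"packed layout moves   {packed} bytes on overflow")
    print(f"flexible layout moves {flexible} bytes on overflow")

Running this sketch shows the packed layout moving roughly the remainder of the page (about 1.9 KB here) for a single 64-byte overflow, while the slack-based layout rewrites only the grown block; this difference in data movement is the kind of overflow traffic the abstract refers to.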

Information

Published In

PACT '22: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques
October 2022
569 pages
ISBN:9781450398688
DOI:10.1145/3559009
This work is licensed under a Creative Commons Attribution 4.0 International License.

In-Cooperation

  • IFIP WG 10.3
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. memory compression
  2. memory system

Qualifiers

  • Research-article

Funding Sources

  • Swedish Foundation for Strategic Research

Conference

PACT '22

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%
