DOI: 10.1145/2925426.2926293
research-article

Simulation and Analysis Engine for Scale-Out Workloads

Published: 01 June 2016

Abstract

We introduce a system-level Simulation and Analysis Engine (SAE) framework based on dynamic binary instrumentation for fine-grained and customizable instruction-level introspection of everything that executes on the processor. SAE can instrument the BIOS, kernel, drivers, and user processes. It can also instrument multiple systems simultaneously using a single instrumentation interface, which is essential for studying scale-out applications. SAE is an x86 instruction set simulator designed specifically to enable rapid prototyping, evaluation, and validation of architectural extensions and program analysis tools using its flexible APIs. It is fast enough to execute full platform workloads (a modern operating system can boot in a few minutes), thus enabling research, evaluation, and validation of complex functionalities related to multicore configurations, virtualization, security, and more. To reach high speeds, SAE couples tightly with a virtual platform and employs both a just-in-time (JIT) compiler that helps simulate simple instructions efficiently and a fast interpreter for simulating new or complex instructions. We describe SAE's architecture and instrumentation engine design and show the framework's usefulness for single- and multi-system architectural and program analysis studies.
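
The split between a JIT path for simple instructions and an interpreter for new or complex ones can be illustrated with a toy model. The sketch below is not SAE's API or implementation: the three-instruction ISA, the `ToyEngine` class, the `add_hook` method, and the `mulmod` opcode are all hypothetical names chosen for illustration, and the "JIT" is approximated by caching Python closures rather than emitting machine code. It only shows the general shape of the idea: instrumentation callbacks see every instruction, simple instructions are translated once and reused from a code cache, and anything else falls back to an interpreter.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy ISA (not x86): an instruction is (opcode, immediate).
Instr = Tuple[str, int]
State = Dict[str, int]
Hook = Callable[[int, Instr], None]

# Assumption: these opcodes count as "simple" and take the translated fast path.
SIMPLE_OPS = {"add", "sub"}


class ToyEngine:
    """Toy hybrid engine: cached translations plus an interpreter fallback."""

    def __init__(self) -> None:
        self.state: State = {"acc": 0}
        self.hooks: List[Hook] = []
        # Code cache keyed by program counter, standing in for a JIT code cache.
        self.code_cache: Dict[int, Callable[[State], None]] = {}
        self.stats = {"translated": 0, "cache_hits": 0, "interpreted": 0}

    def add_hook(self, hook: Hook) -> None:
        """Register an instrumentation callback invoked before each instruction."""
        self.hooks.append(hook)

    def _translate(self, instr: Instr) -> Callable[[State], None]:
        """'Compile' a simple instruction into a closure specialized on its operand."""
        op, imm = instr
        if op == "add":
            def run(state: State) -> None:
                state["acc"] += imm
        else:  # "sub"
            def run(state: State) -> None:
                state["acc"] -= imm
        return run

    def _interpret(self, instr: Instr) -> None:
        """Slow path for instructions the translator does not handle."""
        op, imm = instr
        if op == "mulmod":  # hypothetical "complex" instruction
            self.state["acc"] = (self.state["acc"] * imm) % 256
        else:
            raise ValueError(f"unknown opcode: {op}")

    def run(self, program: List[Instr]) -> int:
        """Execute the program once and return the accumulator value."""
        for pc, instr in enumerate(program):
            for hook in self.hooks:
                hook(pc, instr)
            if instr[0] in SIMPLE_OPS:
                fn = self.code_cache.get(pc)
                if fn is None:
                    fn = self._translate(instr)
                    self.code_cache[pc] = fn
                    self.stats["translated"] += 1
                else:
                    self.stats["cache_hits"] += 1
                fn(self.state)
            else:
                self.stats["interpreted"] += 1
                self._interpret(instr)
        return self.state["acc"]
```

Running the same program twice on one `ToyEngine` translates each simple instruction on the first pass and serves it from the cache on the second, while the hypothetical `mulmod` instruction goes through the interpreter both times. A hook registered with `add_hook` observes every instruction regardless of which path executes it.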

Published In

ICS '16: Proceedings of the 2016 International Conference on Supercomputing
June 2016
547 pages
ISBN: 9781450343619
DOI: 10.1145/2925426
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Analysis
  2. JIT
  3. big data
  4. full-system
  5. instrumentation
  6. multicore
  7. multisystem
  8. scale-out
  9. transparency

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICS '16
Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Article Metrics

  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 15 Feb 2025

Cited By

  • (2023) Memory-Efficient Hashed Page Tables. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 1221-1235. DOI: 10.1109/HPCA56546.2023.10071061. Online publication date: Feb-2023
  • (2022) Parallel virtualized memory translation with nested elastic cuckoo page tables. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 84-97. DOI: 10.1145/3503222.3507720. Online publication date: 28-Feb-2022
  • (2020) Elastic Cuckoo Page Tables. Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 1093-1108. DOI: 10.1145/3373376.3378493. Online publication date: 9-Mar-2020
  • (2020) P-INSPECT: Architectural Support for Programmable Non-Volatile Memory Frameworks. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 509-524. DOI: 10.1109/MICRO50266.2020.00050. Online publication date: Oct-2020
  • (2020) Draco: Architectural and Operating System Support for System Call Security. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 42-57. DOI: 10.1109/MICRO50266.2020.00017. Online publication date: Oct-2020
  • (2020) BabelFish. Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture, pp. 501-514. DOI: 10.1109/ISCA45697.2020.00049. Online publication date: 30-May-2020
  • (2019) Architectural Implications of Function-as-a-Service Computing. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1063-1075. DOI: 10.1145/3352460.3358296. Online publication date: 12-Oct-2019
  • (2019) PageSeer: Using Page Walks to Trigger Page Swaps in Hybrid Memory Systems. 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 596-608. DOI: 10.1109/HPCA.2019.00012. Online publication date: Feb-2019
  • (2017) Pageforge. Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 302-314. DOI: 10.1145/3123939.3124540. Online publication date: 14-Oct-2017
  • (2017) StressRight: Finding the right stress for accurate in-development system evaluation. 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 205-216. DOI: 10.1109/ISPASS.2017.7975292. Online publication date: Apr-2017
