DOI: 10.1145/3502181.3531469

FPVM: Towards a Floating Point Virtual Machine

Published: 27 June 2022

Abstract

Alternatives to IEEE floating point arithmetic have become all the rage. Some extract more representational power out of the available bits. Others offer the potential for lower or higher precision than is available in IEEE-compatible hardware. Even an "interface to the real numbers" has recently been proposed. Using such alternative arithmetic systems within an existing scientific or other significant codebase is a major challenge, however. We explore how to address this challenge through virtualizing the IEEE floating point hardware, specifically on x64. The goal of the floating point virtual machine (FPVM) is to allow an existing application binary to be seamlessly extended to support the desired alternative arithmetic system, with overheads determined by that system and not by the virtualization mechanisms. We describe the prospects, issues, and tradeoffs for four different approaches for building FPVM: trap-and-emulate, trap-and-patch, binary transformation, and IR transformation. We then describe the design and implementation of our current approach, which combines static binary analysis/translation and trap-and-emulate execution. We evaluate our FPVM implementation on several benchmarks, virtualizing them to use posits and MPFR. Finally, we comment on kernel- and hardware-level innovations that could further reduce overheads for floating point virtualization.
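
To make the trap-and-emulate idea concrete, the following is a minimal conceptual sketch in Python. It is not the FPVM implementation, which operates on x64 binaries. Every name in it (InexactTrap, hw_add, vm_add) is hypothetical, the "trap" is modeled as a Python exception raised whenever a binary64 addition has to round, and Python's exact Fraction type stands in for an alternative arithmetic system such as MPFR or posits. Inputs are assumed to be finite.

```python
from fractions import Fraction


class InexactTrap(Exception):
    """Toy stand-in for a hardware floating point exception (hypothetical)."""


def hw_add(a: float, b: float) -> float:
    """Pretend hardware add: traps whenever the binary64 result had to round."""
    result = a + b
    # Fraction(float) converts exactly, so this compares against the true sum.
    if Fraction(result) != Fraction(a) + Fraction(b):
        raise InexactTrap()
    return result


def vm_add(a, b):
    """Add under the toy VM: native fast path, emulation after a trap."""
    if isinstance(a, float) and isinstance(b, float):
        try:
            return hw_add(a, b)
        except InexactTrap:
            pass  # fall through to the emulation path
    # Emulation: Fraction plays the role of the alternative arithmetic system.
    return Fraction(a) + Fraction(b)


exact = vm_add(0.5, 0.25)        # no rounding, stays a native float
promoted = vm_add(0.1, 0.2)      # rounds in binary64, so it is emulated
chained = vm_add(promoted, 0.5)  # an already-promoted operand stays emulated
```

In this sketch the cost of the fast path is one native addition plus the trap check, while the cost of the slow path is dominated by the alternative arithmetic itself, which loosely mirrors the stated goal that overheads be determined by the arithmetic system rather than by the virtualization mechanism.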

Cited By

  • (2023) CARAT KOP: Towards Protecting the Core HPC Kernel from Linux Kernel Modules. Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1596-1605. DOI: 10.1145/3624062.3624237. Online publication date: 12-Nov-2023

Information & Contributors

Information

Published In

HPDC '22: Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing
June 2022
314 pages
ISBN: 9781450391993
DOI: 10.1145/3502181
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. floating point arithmetic
  2. ieee 754
  3. software development
  4. virtualization

Qualifiers

  • Research-article

Conference

HPDC '22

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 161
  • Downloads (Last 6 weeks): 27
Reflects downloads up to 05 Mar 2025
