Reducing division latency with reciprocal caches

Speeding up division by caching reciprocal values (translated from the Russian title)

Reliable Computing

Abstract

Floating-point division is generally regarded as a high-latency operation in typical floating-point applications. Many techniques exist for increasing division performance, often at the cost of increased chip area, a longer cycle time, or both. This paper presents two methods for reducing the latency of division. The methods are evaluated on applications from the SPECfp92 and NAS benchmark suites to determine their effect on overall system performance. The notion of recurring computation is introduced, and it is shown how recurring division can be exploited with an additional, dedicated division cache. For multiplication-based division algorithms, reciprocal caches can be used to store recurring reciprocals. Results show that reciprocal caches can speed up division by nearly a factor of two for reasonable cache sizes.
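The reciprocal-cache idea in the abstract can be illustrated with a small software model. The sketch below is not the authors' design: the direct-mapped organization, the index function, the class name `ReciprocalCache`, and the default of 64 entries are all assumptions made for illustration. The abstract states only that recurring reciprocals are stored for multiplication-based division. On a hit, the quotient costs one multiplication; on a miss, the reciprocal is computed at full latency and stored.

```python
import struct


class ReciprocalCache:
    """Direct-mapped software model of a reciprocal cache (illustrative only)."""

    def __init__(self, entries=64):
        self.entries = entries
        self.tags = [None] * entries   # divisor bit pattern stored in each slot
        self.values = [0.0] * entries  # cached reciprocal for each slot
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _bits(x):
        # Reinterpret the 64-bit double as an unsigned integer.
        return struct.unpack("<Q", struct.pack("<d", x))[0]

    def reciprocal(self, divisor):
        bits = self._bits(divisor)
        # Hypothetical index function: fold the two 32-bit halves, then reduce.
        index = (bits ^ (bits >> 32)) % self.entries
        if self.tags[index] == bits:
            self.hits += 1
            return self.values[index]
        self.misses += 1
        recip = 1.0 / divisor  # stands in for the full-latency reciprocal computation
        self.tags[index] = bits
        self.values[index] = recip
        return recip

    def divide(self, dividend, divisor):
        # Multiplication-based division: quotient = dividend * (1 / divisor).
        return dividend * self.reciprocal(divisor)


# Eight divisions by the same divisor: one miss, then seven hits.
cache = ReciprocalCache(entries=64)
quotients = [cache.divide(float(i), 4.0) for i in range(8)]
hit_rate = cache.hits / (cache.hits + cache.misses)
```

In this model `dividend * (1.0 / divisor)` can differ from a correctly rounded `dividend / divisor` in the last bit for some operands; a hardware implementation would have to handle final rounding separately, which the sketch does not attempt.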

Abstract (translated from the Russian)

Floating-point division usually takes a long time in applications that use floating-point arithmetic. Many methods have been proposed for making division more efficient, and many of them require a larger die area, a lower clock frequency, or both. Two methods for speeding up the division operation are presented. Data on how these methods affect overall system performance, obtained with test programs from the SPECfp92 and NAS suites, are given. The notion of recurring computation is introduced, and a way of implementing recurring division with an additional cache dedicated to this operation is proposed. In multiplication-based division algorithms, a cache can be used to store recurring reciprocals. The results indicate that a reciprocal cache can provide an almost twofold increase in division speed at a comparatively small cache size.




© S. F. Oberman, M. J. Flynn, 1996


About this article

Cite this article

Oberman, S.F., Flynn, M.J. Reducing division latency with reciprocal caches. Reliable Comput 2, 147–153 (1996). https://doi.org/10.1007/BF02425917


  • DOI: https://doi.org/10.1007/BF02425917
