Enhancing Speedup in Network Processing Applications by Exploiting Instruction Reuse with Flow Aggregation

  • Chapter
Embedded Software for SoC

Abstract

Instruction Reuse (IR) is a microarchitectural technique that improves the execution time of a program by removing redundant computations at run-time. Although removing redundancy is normally the job of an optimizing compiler, compilers often fail to do so because they have limited knowledge of run-time data. In this article we concentrate on integer ALU and load instructions in packet processing applications and examine how IR can be used to obtain better performance. In addition, we attempt to answer the following questions: (1) Can IR be improved by reducing interference in the reuse buffer (RB)? (2) What characteristics of network applications can be exploited to improve IR? (3) What is the effect of IR on resource contention and memory accesses? We propose an aggregation scheme that combines a high-level concept of network traffic, namely “flows”, with a low-level microarchitectural feature of programs, namely the repetition of instructions and data, and we propose an architecture that exploits temporal locality in incoming packet data to improve IR by reducing interference in the RB. We find that the benefits that can be achieved by exploiting IR vary widely depending on the nature of the application and the input data. For the benchmarks considered, we find that IR varies between 1% and 50%, while the speedup achieved varies between 1% and 24%.
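
The idea of reducing reuse-buffer interference through flow aggregation can be illustrated with a small software model. This is only a sketch under stated assumptions, not the architecture evaluated in the chapter: the abstract does not specify how the RB is organized or how flows are mapped to aggregates, so the direct-mapped `ReuseBuffer`, the `AggregatedReuseBuffer` class, the `flow_id % num_aggregates` mapping, and the sample trace are all hypothetical. The model executes one address-masking instruction per packet and counts how often its result can be reused.

```python
from dataclasses import dataclass, field


@dataclass
class ReuseBuffer:
    """Assumed direct-mapped reuse buffer: one entry per slot, indexed by PC."""
    num_entries: int
    entries: dict = field(default_factory=dict)
    hits: int = 0

    def execute(self, pc, operands, compute):
        slot = pc % self.num_entries
        entry = self.entries.get(slot)
        # Reuse only if the same instruction ran earlier with the same operands.
        if entry is not None and entry[:2] == (pc, operands):
            self.hits += 1
            return entry[2]
        result = compute(*operands)
        self.entries[slot] = (pc, operands, result)  # evicts the previous entry
        return result


class AggregatedReuseBuffer:
    """Hypothetical flow-aggregated RB: one small RB per flow aggregate."""

    def __init__(self, num_aggregates, entries_per_rb):
        self.rbs = [ReuseBuffer(entries_per_rb) for _ in range(num_aggregates)]

    def execute(self, flow_id, pc, operands, compute):
        # Assumption: a flow is mapped to an aggregate by a simple modulo.
        return self.rbs[flow_id % len(self.rbs)].execute(pc, operands, compute)

    @property
    def hits(self):
        return sum(rb.hits for rb in self.rbs)


def compare(trace, pc=0x40, mask=0xFFFFFF00):
    """Run one masking instruction per packet on both RB organizations."""
    shared = ReuseBuffer(num_entries=4)  # same total capacity as below
    aggregated = AggregatedReuseBuffer(num_aggregates=2, entries_per_rb=2)
    for flow_id, dst in trace:
        operands = (dst, mask)
        a = shared.execute(pc, operands, lambda x, y: x & y)
        b = aggregated.execute(flow_id, pc, operands, lambda x, y: x & y)
        assert a == b == dst & mask  # reuse must never change the result
    return shared.hits, aggregated.hits


# Made-up trace: packets from two flows arrive interleaved, four from each.
TRACE = [(0, 0x0A000001), (1, 0x0A000102)] * 4
```

With this trace, `compare(TRACE)` returns `(0, 6)`: in the shared buffer the two flows keep evicting each other's entry for the same instruction, whereas with one buffer per aggregate every packet after the first in each flow reuses the stored result. The numbers describe only this toy trace and say nothing about the benchmarks in the chapter.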

Copyright information

© 2003 Kluwer Academic Publishers

About this chapter

Cite this chapter

Surendra, G., Banerjee, S., Nandy, S.K. (2003). Enhancing Speedup in Network Processing Applications by Exploiting Instruction Reuse with Flow Aggregation. In: Jerraya, A.A., Yoo, S., Verkest, D., Wehn, N. (eds) Embedded Software for SoC. Springer, Boston, MA. https://doi.org/10.1007/0-306-48709-8_27

  • DOI: https://doi.org/10.1007/0-306-48709-8_27

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4020-7528-5

  • Online ISBN: 978-0-306-48709-5

  • eBook Packages: Springer Book Archive
