Abstract
Energy efficiency has become the primary design metric in chip development. To achieve higher energy efficiency, processor architects should reduce or eliminate unnecessary energy dissipation. Indirect-branch prediction has become a performance bottleneck, especially for applications written in object-oriented languages. Previous hardware-based indirect-branch predictors are generally inefficient: they either require significant hardware storage or predict indirect-branch targets slowly. In this paper, we propose an energy-efficient indirect-branch prediction technique called TAP (target address pointer) prediction. Its key idea has two parts: using dedicated hardware pointers to accelerate the indirect-branch prediction flow, and reusing existing processor components to reduce additional hardware cost and power consumption. When an indirect branch is fetched, TAP prediction first obtains pointers, called target address pointers, from the conditional branch predictor, and then uses these pointers to generate virtual addresses that index the indirect-branch targets. The technique takes a similar amount of time to predict as dedicated-storage techniques, without requiring large amounts of additional storage. Our evaluation shows that TAP prediction combined with several representative state-of-the-art branch predictors improves performance significantly over the baseline processor. Compared with other hardware-based indirect-branch predictors, the TAP-Perceptron scheme achieves a performance improvement equivalent to that of an 8 K-entry TTC predictor, and also outperforms the VPC predictor.
Cite this article
Xie, ZC., Tong, D. & Huang, MK. A General Low-Cost Indirect Branch Prediction Using Target Address Pointers. J. Comput. Sci. Technol. 29, 929–946 (2014). https://doi.org/10.1007/s11390-014-1480-3
DOI: https://doi.org/10.1007/s11390-014-1480-3