Abstract
Data prefetching is an effective way to improve performance by hiding long memory latency. Existing profile-feedback optimizations handle prefetching for pointer-based linked data structures well. However, these optimizations, which instrument and optimize source code at compile time or post-link, usually incur heavy overhead during the profiling stage. Furthermore, they cannot optimize a program when its source code is unavailable. This work designs and implements an Event Sampling based Prefetching Optimizer, a post-link prefetching optimizer driven by event sampling of hardware performance counters. Evaluation on the SW26010 processor shows that the proposed approach speeds up 9 of the 29 SPEC2006 programs by about 4.3% on average, with an average sampling overhead below 10%.
Acknowledgement
The authors would like to thank all colleagues who provided inspiring suggestions and helpful support. This material is based upon work supported by the National Science and Technology Major Project (NSTMP) (Grant No. 2017ZX01028-101). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSTMP.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wei, H., Wang, F., Li, Z. (2018). A Post-link Prefetching Based on Event Sampling. In: Li, C., Wu, J. (eds) Advanced Computer Architecture. ACA 2018. Communications in Computer and Information Science, vol 908. Springer, Singapore. https://doi.org/10.1007/978-981-13-2423-9_5
DOI: https://doi.org/10.1007/978-981-13-2423-9_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2422-2
Online ISBN: 978-981-13-2423-9
eBook Packages: Computer Science, Computer Science (R0)