Trin-Trin: Who’s Calling? A Pin-Based Dynamic Call Graph Extraction Framework

International Journal of Parallel Programming

Abstract

Multi-core systems are ubiquitous in data centers. Efficiently exploiting the hardware parallelism these systems provide is imperative on multiple fronts: minimizing latency and power consumption, and maximizing throughput. This in turn calls for advanced program analysis and optimization, for which call graphs have long been used. Although several static call graph extraction techniques have been proposed, they cannot be applied to programs already running in production. Likewise, existing dynamic call graph extraction tools are of limited use in production, for reasons that include a lack of support for capturing the wall clock time spent in the functions of a given program and a lack of means to analyze the call graph information captured at run time. In this paper, we present Trin-Trin, a Pin-based dynamic call graph extraction framework. The framework enables extraction of complete and precise dynamic call graphs and can be used seamlessly with already running applications. Furthermore, an analytics engine facilitates advanced program analysis; for example, the different multithreading contexts of any function can be extracted in a demand-driven fashion. We evaluate the overhead of Trin-Trin using several Unix utilities, applications from the industry-standard SPEC CINT2006 and CFP2006 benchmark suites, and Yahoo! properties. Additionally, we present a case study illustrating how Trin-Trin can be used to analyze performance bottlenecks and performance regressions.

Author information

Correspondence to Arun Kejariwal.

About this article

Cite this article

Jalan, R., Kejariwal, A. Trin-Trin: Who’s Calling? A Pin-Based Dynamic Call Graph Extraction Framework. Int J Parallel Prog 40, 410–442 (2012). https://doi.org/10.1007/s10766-012-0193-x

  • DOI: https://doi.org/10.1007/s10766-012-0193-x
