Orders-of-magnitude performance increases in GPU-accelerated correlation of images from the International Space Station

  • Original Research Paper
  • Journal of Real-Time Image Processing

Abstract

We implement image correlation, a fundamental component of many real-time imaging and tracking systems, on a graphics processing unit (GPU) using NVIDIA’s CUDA platform. We use our code to analyze images of liquid-gas phase separation in a model colloid-polymer system, photographed in the absence of gravity aboard the International Space Station (ISS). Our GPU code is 4,000 times faster than simple MATLAB code performing the same calculation on a central processing unit (CPU), 130 times faster than simple C code, and 30 times faster than optimized C++ code using single-instruction, multiple-data (SIMD) extensions. The speed increases from these parallel algorithms enable us to rapidly analyze images downlinked from the ISS and to send feedback to astronauts on orbit while the experiments are still running.
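For readers unfamiliar with the operation, the following is a minimal sketch of direct (non-FFT) image correlation in plain Python, comparable in spirit to the "simple" CPU baselines mentioned above. It is not the authors' code: the function name `cross_correlation`, the `max_shift` parameter, and the choice to average the pixel products over the overlapping region are all assumptions made for illustration, since the preview does not give the exact formula used in the paper.

```python
def cross_correlation(a, b, max_shift):
    """Direct cross-correlation of two equally sized grayscale images.

    a, b      : lists of rows of numbers (hypothetical input format)
    max_shift : largest integer shift, in pixels, tested in each direction

    Returns a dict mapping (dy, dx) to the mean of a[y][x] * b[y+dy][x+dx]
    over the region where the two images overlap at that shift.
    """
    rows, cols = len(a), len(a[0])
    result = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total = 0.0
            count = 0
            # Visit only pixels whose shifted partner lies inside image b.
            for y in range(max(0, -dy), min(rows, rows - dy)):
                for x in range(max(0, -dx), min(cols, cols - dx)):
                    total += a[y][x] * b[y + dy][x + dx]
                    count += 1
            result[(dy, dx)] = total / count if count else 0.0
    return result


# Toy example: one bright pixel, displaced by (dy, dx) = (1, 2) in image b.
a = [[0.0] * 4 for _ in range(4)]
b = [[0.0] * 4 for _ in range(4)]
a[1][1] = 1.0
b[2][3] = 1.0

corr = cross_correlation(a, b, max_shift=2)
peak = max(corr, key=corr.get)  # shift with the largest correlation value
```

Each entry of the result depends only on the input images and not on any other entry, so the work can in principle be divided among many GPU threads. How the paper's CUDA kernels actually partition the calculation is not described in this preview.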

Acknowledgments

This work was supported by NASA grant NNX08AE09G and the NVIDIA professor partnership program. We thank A. Bik, J. Curley, A. Ghuloum, D. Luebke, E. Phillips, B. Saar, H. Saito, M. Schnubbel-Stutz, L. Vogt, and many helpful individuals throughout NASA and its contractors.

Author information

Corresponding author

Correspondence to Peter J. Lu.

About this article

Cite this article

Lu, P.J., Oki, H., Frey, C.A. et al. Orders-of-magnitude performance increases in GPU-accelerated correlation of images from the International Space Station. J Real-Time Image Proc 5, 179–193 (2010). https://doi.org/10.1007/s11554-009-0133-1

  • DOI: https://doi.org/10.1007/s11554-009-0133-1
