Distributed and hardware accelerated computing for clinical medical imaging using proton computed tomography (pCT)
Introduction
Proton therapy is a form of radiation treatment that uses protons as agents for delivering dose. Because of the proton’s physical properties, dose delivery can be highly directed, or localized, to organs of interest. To fully exploit this capability for targeted dose delivery, it is necessary to have a priori knowledge of how a proton loses (or deposits) its energy as it traverses material media. The characteristics of the energy deposition depend on the size and generally non-homogeneous electron density of the medium. Proton energy deposition is quantified by the relative stopping power (RSP), the ratio of the proton’s stopping power in the medium to its stopping power in water.
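In symbols (a standard formulation of these quantities, not reproduced from this paper), the RSP and the water-equivalent path length (WEPL) that a pCT scanner measures along a proton path L can be written as:

```latex
\mathrm{RSP}(\mathbf{r}) \;=\; \frac{S_{\mathrm{medium}}(\mathbf{r}, E)}{S_{\mathrm{water}}(E)},
\qquad
\mathrm{WEPL} \;=\; \int_{L} \mathrm{RSP}(\mathbf{r})\,\mathrm{d}\ell ,
```

where S denotes the stopping power; over therapeutic energies the RSP is, to a good approximation, treated as energy-independent, which is what makes it a property of the medium alone.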
Currently, RSPs used for proton treatment planning are obtained from measurements with X-ray computed tomography (xCT). There is a necessary calibration step that converts xCT measurements (in Hounsfield units) to RSPs. Because the energies involved in xCT and proton therapy are orders of magnitude different, the calibration of xCT-derived RSPs is not very accurate, with margins of error of about 3%. Even such small uncertainties in RSPs might prove critical in delivering dose to the correct location(s).
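For illustration only, here is a minimal sketch of the kind of HU-to-RSP lookup that xCT-based planning relies on. The breakpoints below are invented placeholders, not clinical calibration data; real calibrations are derived stoichiometrically from tissue-substitute measurements.

```python
# Hypothetical HU -> RSP calibration table (invented breakpoints,
# for illustration only).
HU_RSP_TABLE = [(-1000, 0.001), (0, 1.0), (1000, 1.5), (3000, 2.4)]

def hu_to_rsp(hu):
    """Piecewise-linear interpolation through the calibration breakpoints,
    clamping to the table's endpoints outside its range."""
    if hu <= HU_RSP_TABLE[0][0]:
        return HU_RSP_TABLE[0][1]
    if hu >= HU_RSP_TABLE[-1][0]:
        return HU_RSP_TABLE[-1][1]
    for (h0, r0), (h1, r1) in zip(HU_RSP_TABLE, HU_RSP_TABLE[1:]):
        if hu <= h1:
            return r0 + (r1 - r0) * (hu - h0) / (h1 - h0)
```

The ~3% uncertainty cited above enters through a mapping of this kind, because the HU axis and the RSP axis are measured at very different energies; pCT sidesteps the mapping entirely.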
Proton computed tomography (pCT) is an imaging modality developed for the purpose of directly obtaining the RSPs. The concept for pCT was first proposed by Cormack in 1976 [7], but recently there has been growing interest in the development of clinical pCT devices [27], [6], [30]. The omission of the calibration step that depends on a completely different imaging modality makes pCT-measured RSPs potentially more accurate.
In order for pCT to be a useful tool for proton treatment planning, it is necessary to provide images within clinically meaningful times of ten to fifteen minutes [22]. There are two major obstacles to achieving this. The first is that in order to obtain statistically reliable images, a large number of protons must be collected. The second is that the physics of a proton’s interactions with material media results in non-straight paths. Therefore, data-reduction techniques used in other imaging modalities with probes that travel in straight lines (such as photons in emission and computed tomography) cannot be used in pCT. Reconstruction in pCT must take into account the path of each proton on an event-by-event basis. Performing this event-by-event analysis on a large number of protons, and coalescing the results into a coherent tomographic image, requires considerable computation and memory. As a real-world example, image reconstruction from 131 million protons on a single-CPU, single-GPU computer (i.e., sequential code) takes almost seventy minutes, which extrapolates to almost nine hours for one billion protons.
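The extrapolation can be checked directly, assuming (as the extrapolation itself does) that the sequential reconstruction time scales roughly linearly with the number of protons:

```python
# Reported sequential timing: 131 million protons in almost 70 minutes.
protons_measured = 131e6
minutes_measured = 70.0

# Assume time scales linearly with proton count (the stated extrapolation).
rate = protons_measured / minutes_measured   # protons per minute
hours_for_billion = (1e9 / rate) / 60.0      # ~8.9 hours
```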
In this paper, we present our investigations on reducing the time to reconstruct pCT images to within clinically meaningful time scales, while preserving image quality, via a suitable combination of the Message Passing Interface (MPI) and GPUs (i.e., parallel code). There has been growing interest in the use of GPUs in medical physics, where they have been shown to improve image reconstruction [24]. GPUs have been used to accelerate tomographic reconstruction in many fields, including examples from positron emission tomography [2], magnetic induction tomography [19], optoacoustic tomography [32], and transmission tomography [31]. These imaging methods rely on a different physical model from pCT, producing fundamentally different approaches to reconstruction. Like xCT, all of these other imaging methods are based on probes that travel in straight lines. One study in electron tomography [13] used parallelized implementations of an iterative algorithm, using MPI on a CPU cluster. The performance of implementations of iterative solvers on GPUs has been widely studied [10], [34], [35], [18], demonstrating dramatic speedups with one or more GPUs versus the CPU in a single-node system. Of particular relevance to our study are recent reports on the advantages of GPUs in processing list-mode (event-by-event) data in emission computed tomography [3], [8].
Section snippets
Theory
Protons in pCT are typically detected with a configuration [27] such as that shown in Fig. 1. The object to be imaged (the target) is positioned in a rotating assembly between two “telescopes”, each of which comprises a pair of position-sensing planes. A mono-energetic beam of protons is directed from the left toward the leftmost plane as depicted in Fig. 1. The sensor planes in each telescope measure two positions through which a proton has passed. The positions measured at the left and right
Faster image reconstruction through parallelization and hardware acceleration
The basic parallelization strategy is to partition the proton history data across a set of distributed-memory compute nodes, to coordinate the algorithm through message passing, and to utilize the GPUs on each node.
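A minimal sketch of the partitioning step, assuming a simple contiguous split of the history list across ranks; the function name and the remainder-handling policy here are ours for illustration, not taken from the paper’s code:

```python
def partition(num_histories, rank, nprocs):
    """Return (offset, count) for the contiguous slice of proton histories
    owned by MPI rank `rank`, giving any remainder to the lowest ranks.
    In a real MPI code, `rank` and `nprocs` come from the communicator."""
    base, rem = divmod(num_histories, nprocs)
    count = base + (1 if rank < rem else 0)
    offset = rank * base + min(rank, rem)
    return offset, count

# Example: 10 histories over 3 ranks -> slices (0, 4), (4, 3), (7, 3),
# which cover all histories exactly once with no overlap.
slices = [partition(10, r, 3) for r in range(3)]
```

Because each rank can compute its own (offset, count) pair independently, every MPI process can seek to its slice of the binary input file and read only its own histories, with no coordination needed for the read itself.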
The reconstruction begins by reading all binary input histories into memory. The parallel code has each MPI process reading its own subset of histories. Preliminary calculations on the input data are done before the actual reconstruction (i.e., solving the linear equation). These
Material and methods
The baseline for this study is code from the dissertation of Penfold [22]. The baseline code (i.e., the sequential code) implements pCT reconstruction on a single-CPU, single-GPU computer. The code groups proton histories into blocks, and uses NVIDIA’s CUDA (Compute Unified Device Architecture) version 5.0 [21] to process the histories in each block simultaneously. The sequential code solves the linear equation with a version of a block-iterative algorithm called diagonally-relaxed
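To make the flavor of such block-iterative solvers concrete, here is a toy NumPy sketch that averages relaxed Kaczmarz-style row projections within each block of rows. This is a simplified stand-in in the spirit of the DROP family of methods, not the CUDA implementation used in the study.

```python
import numpy as np

def block_iterative_solve(A, b, blocks, iters=200, relax=1.0):
    """Toy block-iterative projection solver for A x = b.
    `blocks` is a list of row-index lists; within each block the relaxed
    per-row projection updates are averaged, mirroring how blocks of
    proton histories are processed together; blocks are swept in order."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        for rows in blocks:
            update = np.zeros_like(x)
            for i in rows:
                a = A[i]
                # Kaczmarz-style correction toward the hyperplane a.x = b[i].
                update += (b[i] - a @ x) / (a @ a) * a
            x += relax * update / len(rows)
    return x
```

For a consistent system the iterates converge to a solution; in the pCT setting each row of A corresponds to one proton history, which is what makes a block a natural unit of GPU work.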
Results and discussion
Because the primary goal of pCT is to map the distribution of RSP values, it is important to demonstrate that the results from the parallelized version are consistent with those from the sequential code. Such a comparison was performed via a quantitative analysis of the computed RSP measurements from both codes. Table 1 summarizes the mean RSP values for the four materials in the LUCY phantom. These mean RSP values were obtained from an ROI analysis on these four materials using identical
Conclusions
We have demonstrated that image reconstruction in pCT can be accelerated through a hybrid approach that uses both MPI and GPUs. Our implementation scales almost linearly with problem size and scale. This has significant implications for proton therapy and treatment planning, where it is important to obtain pCT images in clinically reasonable times. Through our hybrid MPI–GPU approach we have reconstructed pCT images, while preserving the state-of-the-art image quality, from proton data sets
Future work
We will continue our investigation to reduce image reconstruction time by attempting to address the load imbalance in the linear solver step, porting the linear solver step to GPUs, and investigating MIC-based processors. We will also continue to work with our collaborators at LLUMC to improve the state-of-the-art in reconstructed image quality.
Acknowledgments
US Department of Defense contract no. W81XWH-10-1-0170 and US Department of Energy contract no. DE-SC0005135 sponsored this work. Access to the University of Chicago’s Petascale Active Data Store (PADS) system was supported in part by the National Science Foundation under grant OCI-0821678 and by the University of Chicago. Access to the Magellan system Argonne National Laboratory was provided by the Argonne Leadership Computing Facility and supported with American Recovery and Reinvestment
References (35)
- et al., Block-iterative projection methods for parallel computation of solutions to convex feasibility problems, Linear Algebra Appl. (1989)
- et al., Iterative algorithms for large partitioned systems, with applications to image reconstruction, Linear Algebra Appl. (1981)
- et al., GPU computing with Kaczmarz’s and other iterative algorithms for linear systems, Parallel Comput. (2010)
- et al., Efficient parallel implementation of iterative reconstruction algorithms for electron tomography, J. Parallel Distrib. Comput. (2008)
- et al., Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography, J. Theoret. Biol. (1970)
- et al., Iterative reconstruction for transmission tomography on GPU using Nvidia CUDA, Tsinghua Sci. Technol. (2010)
- et al., On the efficiency of iterative ordered subset reconstruction algorithms for acceleration on GPUs, Comput. Methods Programs Biomed. (2010)
- T. Beisel, S. Lietsch, K. Thielemans, A method for OSEM PET reconstruction on parallel architectures using STIR, in: ...
- L. Caucci, L.R. Furenlid, H.H. Barrett, Maximum likelihood event estimation and list-mode image reconstruction on GPU ...
- et al., Averaging strings of sequential iterations for convex feasibility problems
- et al., On diagonally-relaxed orthogonal projection methods, SIAM J. Sci. Comput.
- Monte Carlo studies of a proton computed tomography system, IEEE Trans. Nucl. Sci.
- Quantitative proton tomography: preliminary experiments, Phys. Med. Biol.
- Fully 3D list-mode time-of-flight PET image reconstruction on GPUs using CUDA, Med. Phys.
- A comprehensive study of the most likely path formalism for proton-computed tomography, Phys. Med. Biol.
- Practical cone-beam algorithm, J. Opt. Soc. Amer. A
- Component-averaged row projections: a robust block-parallel scheme for sparse linear systems, SIAM J. Sci. Comput.
Nicholas T. Karonis, Ph.D., is a professor of computer science at Northern Illinois University. His research interest is high-performance computing applications and tools, particularly as they apply to computational grids and cloud computing. Karonis received a Ph.D. in computer science from Syracuse University. Contact him at [email protected].
Kirk L. Duffin, Ph.D., is an associate professor of computer science at Northern Illinois University. In addition to his work in high performance computing, his research interests include scientific visualization, virtual reality, image based modeling and rendering, and spatio-temporal data representations. Duffin received a Ph.D. in computer science from Brigham Young University. Contact him at [email protected].
Caesar E. Ordoñez, Ph.D., has been working in the field of medical imaging for over twenty years. His experience is in the simulation and analysis of novel medical imaging devices for positron emission tomography (PET) and single-photon emission computed tomography. Ordoñez received his Ph.D. in nuclear physics from the Massachusetts Institute of Technology.
Bela Erdelyi, Ph.D., is a member of the Beam Physics group at NIU and the Physics Division at ANL. He has worked on a variety of problems involving the creation, acceleration, transport and manipulation of charged particle beams, with an array of applications to nuclear, particle and medical physics. He has also been researching the connections among beam physics, modern mathematics and advanced numerical methods in physics research.
Thomas D. Uram is a member of the research staff at Argonne National Laboratory and the Computation Institute at the University of Chicago. His research interests include large-scale data intensive computing, GPGPU computing, heterogeneous computing architectures, and hybrid programming models.
Eric C. Olson is a software developer specializing in large scale parallel computing, visualization, and interactive applications. Eric worked on interactive scientific data visualization at Argonne National Labs and the University of Chicago for ten years. He is now working on large scale problems as a senior software developer at Webfilings LLC.
George Coutrakon, Ph.D., earned his Ph.D. in physics from the State University of New York in 1983. He is currently a medical physicist and associate professor of physics at Northern Illinois University (NIU). Before coming to NIU in October 2008, Dr. Coutrakon had been director of proton accelerator operations at Loma Linda University Medical Center (LLUMC) in Southern California since 1990. His current research is in the field of proton computed tomography, or proton CT. He has written proton therapy papers on microdosimetry, beam optics, proton accelerators, and proton CT. Dr. Coutrakon currently teaches medical physics courses and works with NIU, Fermilab, and Loma Linda researchers in building a new proton CT scanner designed to image human anatomy using proton beams instead of conventional X-rays. He is the principal investigator for a DOD grant with oversight for the new proton CT scanner, to be completed for beam testing by mid-2013.
Michael E. Papka, Ph.D., is a senior scientist at Argonne National Laboratory, where he also fulfills two senior leadership roles in the Computing, Environment and Life Sciences Directorate: deputy associate laboratory director, and director of the Argonne Leadership Computing Facility. He is a senior fellow of the University of Chicago/Argonne Computation Institute and an associate professor of computer science at Northern Illinois University. He conducts interdisciplinary research in large-scale scientific visualization and data analysis. His interests include technology in support of science and its use in educational settings. He is a senior member of IEEE and ACM.