Abstract
Four-dimensional imaging (4D-imaging) plays a critical role in achieving precise motion management in radiation therapy. However, challenges remain in 4D-imaging such as a long imaging time, suboptimal image quality, and inaccurate motion estimation. With the tremendous success of artificial intelligence (AI) in the image domain, particularly deep learning, there is great potential in overcoming these challenges and improving the accuracy and efficiency of 4D-imaging without the need for hardware modifications. In this review, we provide a comprehensive overview of how these AI-based methods could drive the evolution of 4D-imaging for motion management. We discuss the inherent issues associated with multiple 4D modalities and explore the current research progress of AI in 4D-imaging. Furthermore, we delve into the unresolved challenges and limitations in 4D-imaging and provide insights into the future direction of this field.
1 Introduction
Radiation therapy (radiotherapy) employs high doses of radiation to kill tumor cells, serving as a vital component in the treatment of approximately two-thirds of global cancer patients (Baskar et al. 2012; Gutt et al. 2021). Since its inception, radiotherapy has been continually advanced through innovations such as image guidance, adaptive radiotherapy, heavy-particle therapy, and ultra-high dose rate (FLASH) radiotherapy (Park et al. 2018; Panta et al. 2012). However, in the radiotherapy treatment of thoracic and abdominal cancers, respiration-induced motion poses a significant limitation to further improving radiotherapy outcomes (Rietzel et al. 2005). Variance in organ shape and position during motion can lead to substantial errors in imaging, treatment planning, and delivery. Inaccuracies in the knowledge of the target’s shape and trajectory often necessitate larger field margins, resulting in suboptimal dose conformation (Cai et al. 2011). Addressing these limitations requires employing advanced imaging technologies capable of precise observation and analysis of movement, enabling effective motion management and facilitating conformal treatment planning.
Recent advancements in four-dimensional imaging (4D-imaging) techniques, such as 4D computed tomography (4D-CT), 4D magnetic resonance imaging (4D-MRI), 4D cone-beam computed tomography (4D-CBCT), and four-dimensional positron emission tomography (4D-PET), have increased clinical applicability and played crucial roles in managing respiratory motion (Rietzel et al. 2005; Cai et al. 2011). Compared with traditional static imaging, which captures anatomical structures during a breath-hold or at a specific time, 4D-imaging techniques can provide tumor and organ motion information through the respiration cycle in addition to the three-dimensional (3D) anatomical structures. This added capability offers substantial potential for enhancing the accuracy and efficiency of tumor localization. However, 4D-imaging encounters certain inherent challenges such as insufficient spatiotemporal resolution, motion artifacts, and increased radiation doses (Zhang et al. 2021; Terpstra et al. 2023; Noid et al. 2017).
Artificial intelligence (AI) involves developing and utilizing complex computer algorithms that emulate aspects of human intelligence in tasks such as visual perception, pattern recognition, decision-making, and problem-solving, often achieving comparable or enhanced performance (Huynh et al. 2020; Li et al. 2022). In recent years, the availability of large datasets and high-performance computers has led to the emergence of more sophisticated AI agents, presenting immense potential to address unresolved challenges in medical imaging. Specifically, deep learning (DL), a subset of AI, is commonly used for various tasks. DL has demonstrated remarkable capabilities in enhancing imaging quality, efficiency, and diagnostic accuracy (Akagi et al. 2019; Ahishakiye et al. 2021; Litjens et al. 2017). Thus, it has the potential to address many of the challenges faced in 4D-imaging for motion management and, consequently, enhance the quality of radiation therapy. Figure 1 provides an example of 4D-MRI outlining the major steps in 4D-imaging for radiation therapy and highlighting the steps in which AI is involved.
In this review article, we discuss the applications of AI in advancing 4D-imaging with a specific emphasis on motion management. We outline the inherent challenges in current 4D-imaging practices and provide examples of how AI can address these challenges to increase efficiency, accuracy, and image quality. Additionally, we offer insight into future directions in this field. Notably, several studies have attempted real-time reconstruction of 3D CT images from two-dimensional (2D) images to track irregular motion patterns for radiotherapy (Shen et al. 2019; Montoya et al. 2022; Loÿen et al. 2023). However, this review focuses on 4D-imaging technologies and thus does not cover 2D-to-3D reconstruction. The remainder of this paper is organized as follows. Section 2 describes the search strategy and inclusion criteria for this review. Section 3 provides a concise overview of current 4D-imaging techniques. In Sect. 4, we analyze the challenges specific to each imaging modality and discuss the current progress made in leveraging AI to overcome these challenges. Section 5 discusses the remaining challenges and outlines potential future directions for advancing 4D-imaging. Section 6 concludes the review.
2 Search strategy and inclusion criteria
The search was conducted in November 2023 using databases including PubMed, Google Scholar, IEEE, ScienceDirect, and Elsevier. The search terms included a comprehensive list of descriptors covering the constructs “acquisition mode”, “imaging modality”, “artificial intelligence”, and “clinical application” to ensure exhaustive coverage of the search space. Table 1 provides illustrative examples of search terms used for each category.
This paper aims to provide a comprehensive and accurate introduction to AI’s application in 4D-imaging, with a specific focus on motion management. Studies were selected based on the following criteria: (1) primary research studies involving respiratory motion management; (2) focused on addressing the existing issues in 4D-imaging techniques or supporting the clinical application of 4D-imaging; and (3) included any type of AI algorithms, such as deep neural networks (DNNs).
Papers were excluded if they (1) did not focus on respiratory motion but instead examined other types of physiological movements (such as cardiac motion or gastrointestinal motion), tumor dynamic changes, or blood flow, (2) conducted research and experiments solely on static imaging, or (3) investigated only non-AI methods. Furthermore, we also excluded non-English papers, conference abstracts, posters, and theses for academic degrees.
3 4D-imaging modalities
3.1 4D-CT
4D-CT is a powerful technique for observing internal organ motion and integrating motion information in treatment planning (Vergalasova and Cai 2020). This technique consists of a series of phase-resolved 3D-CT images, each representing a specific breathing bin of the patient’s respiratory cycle (Keall et al. 2006). 4D-CT can be acquired via either prospective or retrospective strategies (Hugo and Rosu 2012). In prospective respiratory-gated acquisition, images are obtained at specific respiratory phases. In contrast, retrospective 4D-CT can be performed using cine mode or helical mode (He et al. 2015). In cine mode, the CT scanner continuously captures multiple axial images at one position before moving to the next. This process is repeated until the entire target region is scanned. In helical mode, the table moves at a constant low speed while the CT scanner continuously acquires images, capturing numerous axial images over multiple respiratory cycles. For retrospective acquisition, a separate signal related to the patient’s breathing state must be acquired simultaneously and synchronized with image acquisition. This respiratory signal is then used to sort the partial images or the acquired projection data into the correct respiratory phase.
In 4D-CT imaging, there are two common image-binning approaches: phase binning and amplitude binning (Abdelnour et al. 2007). Phase binning associates each 3D-CT image with a specific phase or fraction of the breathing cycle period, offering a straightforward interpretation of temporal information (the breathing cycle is equidistantly sampled) that can be directly used for concepts like mid-position and mid-ventilation (Werner et al. 2017). In contrast, amplitude binning assigns each 3D-CT image to bins on the basis of the full amplitude of the breathing signal. This approach can provide more detailed information on breathing motion and leads to fewer artifacts. However, amplitude binning may necessitate longer acquisition times during prospective gating because of breathing variations such as baseline drifts (Stemkens et al. 2018). In retrospective reconstruction with identical acquisition times, amplitude sorting may suffer from insufficiently populated bins (Zhang et al. 2024).
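To make the two binning schemes concrete, the following sketch bins a simulated respiratory trace both by phase and by amplitude; the sinusoidal trace, peak detection, and bin count are illustrative assumptions rather than a clinical implementation.

```python
# A minimal sketch of retrospective respiratory binning; all names illustrative.
import numpy as np

def phase_bins(signal, peaks, n_bins=10):
    """Assign each sample a phase bin in [0, n_bins) by linearly
    interpolating the phase between consecutive end-inhale peaks."""
    phase = np.zeros_like(signal, dtype=float)
    for start, end in zip(peaks[:-1], peaks[1:]):
        phase[start:end] = np.linspace(0.0, 1.0, end - start, endpoint=False)
    return np.minimum((phase * n_bins).astype(int), n_bins - 1)

def amplitude_bins(signal, n_bins=10):
    """Assign each sample a bin covering the full amplitude range equidistantly."""
    lo, hi = signal.min(), signal.max()
    idx = (signal - lo) / (hi - lo + 1e-12) * n_bins
    return np.minimum(idx.astype(int), n_bins - 1)

# A drifting sinusoidal breathing trace sampled at 25 Hz (4 s period + drift)
t = np.arange(0, 60, 0.04)
trace = np.sin(2 * np.pi * t / 4.0) + 0.01 * t
peaks = np.where((trace[1:-1] > trace[:-2]) & (trace[1:-1] > trace[2:]))[0] + 1
pb, ab = phase_bins(trace, peaks), amplitude_bins(trace)
```

Phase binning samples each cycle equidistantly in time, so every bin stays populated, whereas the baseline drift shifts samples across the fixed amplitude bins, illustrating how amplitude sorting can leave some bins under-populated.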
3.2 4D-MRI
Compared with 4D-CT, 4D-MRI offers superior soft tissue contrast and does not expose patients to ionizing radiation (Harris et al. 2017; Liu et al. 2016). 4D-MRI can be implemented using either prospective or retrospective approaches. Prospective 4D-MRI is commonly achieved using 3D acquisition. The acquired MRI data must be reordered in k-space before image reconstruction because data are collected multiple times in a 3D readout (Li et al. 2017). Most 4D-MRI techniques are retrospective: images are continuously acquired over the whole region of interest (ROI) and retrospectively sorted into respiratory phases (Yang et al. 2014). Retrospective 4D-MRI mainly adopts a multi-slice 2D acquisition approach with T2-weighted turbo spin echo (T2-TSE) or balanced steady-state free precession (bSSFP) sequences.
To reorder 4D-MRI data, three common methods are used: external surrogates, internal surrogates, and self-navigation (Kavaluus et al. 2020). External surrogates, similar to those in 4D-CT, encounter challenges in MRI such as signal saturation and synchronization issues (Stemkens et al. 2018). MRI-specific internal surrogates include pencil-beam navigators. Self-navigation can be performed in both the image and frequency domains, using 2D image navigators and changes in body surface area (Celicanin et al. 2015; Cai et al. 2011). Similar to 4D-CT, the sorting of 4D-MRI data can be done via either phase-based or amplitude-based methods.
3.3 4D-CBCT
Integrating a CBCT scanner with linear accelerators offers significant advantages for assessing tumor and organ motion during treatment. By examining patients in their treatment position directly before or during radiotherapy, CBCT enhances target localization for beam delivery (Hong et al. 2022). However, conventional 3D-CBCT cannot capture the full range of tumor motion, limiting its ability to localize moving targets accurately. To address this, 4D-CBCT has been developed in recent years as a powerful tool for providing respiration-resolved images that improve the localization of moving targets.
4D-CBCT retrospectively sorts images in the projection space, yielding subsets of projections corresponding to specific respiratory phases (Sonke et al. 2005). Each subset is then reconstructed into phase-resolved images (PRIs), resulting in a set of respiration-resolved images. In practice, 10-phase 4D-CBCT reconstruction typically involves dividing the full projection dataset into 10 subsets, each containing approximately one-tenth of the total projections, which turns PRI reconstruction into an extremely sparse-view CT problem. The sparse-view nature of 4D-CBCT reconstruction leads to various artifacts, including view aliasing (streaks) and blurring at high-contrast boundaries (Leng et al. 2008). These artifacts can significantly hinder the clinical utility of 4D-CBCT images, making it essential to develop strategies to mitigate these effects and improve image quality.
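As a concrete illustration of this sorting step, the short sketch below groups a projection set into per-phase subsets from per-view phase labels; the view count, detector size, and random labels are stand-ins for a real acquisition.

```python
# A hedged sketch of retrospective phase sorting for 4D-CBCT projections.
import numpy as np

def sort_projections(projections, phase_labels, n_phases=10):
    """Group projections into per-phase subsets; each subset holds roughly
    1/n_phases of the views, making every phase a sparse-view problem."""
    return [projections[phase_labels == p] for p in range(n_phases)]

projections = np.random.rand(680, 256, 256)        # ~680 views in one scan
phase_labels = np.random.randint(0, 10, size=680)  # from a breathing signal
subsets = sort_projections(projections, phase_labels)
print([s.shape[0] for s in subsets])               # ~68 views per phase
```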
3.4 4D-PET
4D-PET has been developed alongside 4D-CT to address the impact of patient motion, including respiration, on PET imaging, which affects lesion size, shape, and measured standardized uptake value (SUV) (Nehmeh et al. 2004). Two primary acquisition methods for 4D-PET are commonly used: prospective gated acquisition and retrospective stacked acquisition (Grootjans et al. 2016). In prospective gated PET, the respiratory signal is monitored, and scans are conducted only during a specific breathing state for a pre-determined time window. Retrospective acquisition uses a “list mode”: data are acquired continuously, and individual counts are tagged with the breathing state from a separate respiratory signal and then sorted into separate bins for each breathing state before reconstruction.
4 AI in 4D-imaging
In this review, we cover 64 studies on AI applications in 4D-imaging for respiratory motion management. Figure 2 depicts trends in AI-based studies per modality over the years and the distribution of studies per task. AI-related 4D-imaging is a relatively new research area, with the first article published in 2018. In recent years, there has been a remarkable increase in publications in this field, indicating researchers’ growing interest and recognition. Most studies have focused on CT, MRI, and CBCT, whereas studies specifically focusing on PET imaging have been comparatively limited. Image post-processing and motion estimation are extensively explored fields. Figure 3 provides a knowledge graph summarizing major challenges, researched tasks, current approaches, and strategies for each modality, which may inspire researchers in this field.
4.1 AI in 4D-CT
4D-CT imaging is an essential component of radiotherapy for treating thoracic and abdominal tumors (Madesta et al. 2024). Despite extensive research conducted in 4D-CT, challenges associated with the use of 4D-CT imaging in clinical applications persist. First, current CT scanners cannot cover the entire anatomical ROI within a single gantry rotation, leading to artifacts from organ movement across multiple cycles. Second, achieving accurate motion estimation in 4D-CT is challenging because of irregular motion patterns and changing air density in the lungs throughout the respiratory cycle. DL models have emerged as a powerful tool in 4D-CT, improving imaging quality, modeling motion, and enabling automatic target delineation. Since this review focuses on motion management, studies on ventilation image generation and 4D-CT generation are excluded. The related studies are categorized into three groups: artifact reduction, motion estimation, and target delineation, as shown in Table 2.
4.1.1 AI in 4D-CT artifact reduction
Double structure (DS) and interpolation (INT) artifacts are commonly observed in 4D-CT data (Madesta et al. 2024), as illustrated in Fig. 4. DS artifacts result from breathing variability during acquisition, causing inconsistent representations of anatomical structures across different breathing phases, whereas INT artifacts arise due to insufficient projection data for reconstructing image slices at the desired breathing phases and couch positions. These motion artifacts can significantly degrade image quality and affect the accuracy of target volume delineation and dose calculation in radiotherapy (Mori et al. 2019). Traditional post-processing methods, such as registration-based image interpolation and graph-based structure alignment, have been employed to mitigate artifacts, but they are time-consuming and only consider DS artifacts. Recently, DNNs have been used to reduce artifacts in 4D-CT. Mori et al. (2019) initially proposed a DL-based inpainting method for DS artifacts using an autoencoder to translate artifact-affected images into artifact-free images, as shown in Fig. 5a. However, this approach had limitations, including the use of simulated artifacts, a lack of clinical evaluation, and an insufficient reduction of artifacts in 3D. On this basis, Madesta et al. (2024) developed a convolutional neural network (CNN)-based conditional inpainting model that incorporated patient-specific artifact-free images as prior information and operates on 3D patches, as depicted in Fig. 5b. Their method significantly reduced the average root mean squared error (RMSE) by 60% for DS structures and 42% for INT structures on the in-house evaluation data. Nevertheless, incorporating artifact-free images remains challenging in clinical practice. The authors selected prior images that were typically less affected by artifacts: the end-exhalation phase for DS artifacts and the temporal average CT for INT inpainting. However, even with these selections, end-exhalation phase images can still be affected by DS artifacts. Similarly, for INT artifacts, the temporal average CT becomes increasingly blurred with larger motion amplitudes, leading to the loss of fine structural details during inpainting.
Fig. 4 Example of 4D-CT data with INT and DS artifacts. Figure reprinted from Madesta et al. (2024)
Fig. 5 Common deep learning schemes for 4D-imaging enhancement and motion estimation. a Single image enhancement: enhances individual images independently. b Prior-image guided enhancement: uses a prior image to guide the enhancement of the input image. c Supervised DIR: guided by the reference DVF. d Unsupervised DIR: constrained by the distance between the warped and fixed images. e Motion compensation enhancement: combines multiple phase images via DIR to enhance the target phase. The gray dashed line in (e) indicates that the reconstructed result can iteratively improve the DIR process in reverse, as adopted in methods like SMEIR. DIR deformable image registration, DVF deformable vector field, STN spatial transformation network, WP warped phase
4.1.2 AI in 4D-CT motion estimation
Deformable image registration (DIR) is a promising tool for processing 4D-CT images, enabling accurate motion tracking of internal organs and fiducial markers during respiratory cycles. Fast and accurate DIR of 4D-CT assists in treatment planning, including target definition, tumor tracking, and organ-at-risk sparing. Traditional DIR methods for 4D-CT datasets minimize dissimilarity measures to find the optimal transformation mapping between two phase images. However, these methods have drawbacks such as long computational times, manual parameter tuning, and the risk of being trapped in local optima (Wei et al. 2021). Moreover, the repeated application of spatial filters throughout the iteration process leads to over-smoothed motion fields and false deformation of bony structures with minimal motion. The large appearance variances and low image contrast of abdominal 4D-CT present additional challenges for accurate registration. To address these challenges, numerous DL-based studies, spanning supervised and unsupervised methods, have extensively explored improved DIR techniques.
Supervised learning-based registration Supervised DIR involves the use of ground truth deformation vector fields (DVFs) from conventional algorithms to guide the training process, as shown in Fig. 5c. Sentker et al. (2018) developed GDL-FIRE, the first CNN-based registration method for 4D-CT, employing ground truth DVFs obtained from three traditional methods. GDL-FIRE achieved a target registration error (TRE) comparable to traditional DIRs with a roughly 60-fold speed-up in computation time. Teng et al. (2021) developed a patch-based CNN for inter-phase registration, using DVFs from VelocityAI as the ground truth for training. This method proved effective not only on 4D-CT but also on 4D-CBCT scans, demonstrating robustness against artifacts. Despite these promising results, manual preparation of training datasets remains laborious, subjective, and prone to error. To overcome this issue, Eppenhof and Pluim (2018) provided a solution that uses synthetic random transformations to train the network, eliminating the need to manually annotate ground truth DVFs. However, the artificial transformations may differ significantly from actual lung motion.
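A minimal training-step sketch of the supervised scheme in Fig. 5c follows; the toy network, tensor sizes, and reference DVF are illustrative assumptions, not any published architecture.

```python
# Supervised DIR sketch: regress a DVF against a reference from classic DIR.
import torch
import torch.nn as nn

class DVFNet(nn.Module):
    """Toy 3D CNN mapping a (moving, fixed) image pair to a 3-channel DVF."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 3, 3, padding=1))
    def forward(self, moving, fixed):
        return self.net(torch.cat([moving, fixed], dim=1))

model = DVFNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# One step with random stand-in tensors of shape (B, 1, D, H, W)
moving, fixed = torch.rand(1, 1, 32, 64, 64), torch.rand(1, 1, 32, 64, 64)
ref_dvf = torch.rand(1, 3, 32, 64, 64)       # e.g., output of a traditional DIR
loss = nn.functional.mse_loss(model(moving, fixed), ref_dvf)
opt.zero_grad(); loss.backward(); opt.step()
```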
Unsupervised learning-based registration Unsupervised registration is highly desirable when ground truth DVFs are unavailable. This approach relies solely on moving and fixed image pairs, without the need for ground truth DVFs. In unsupervised methods, a moving image is deformed via a spatial transformer network (STN) (Jaderberg et al. 2015), and models are trained by minimizing the error between the fixed and deformed images, as depicted in Fig. 5d. However, additional regularization is necessary to improve the reliability and accuracy of DVF predictions. To address this issue, numerous studies have employed generative adversarial networks (GANs) to enforce DVF regularization and prevent unrealistic DVF prediction. Lei et al. (2019) proposed a pioneering GAN-based method for 4D-CT abdominal images, integrating a dilated inception module to extract multi-scale structural features, resulting in robust motion estimation. They further developed MS-DIRNet (Lei et al. 2020), which combines global and local registration networks with a self-attention mechanism in the generator, improving the differentiation of minimally moving structures. Fu et al. (2019) introduced a cascaded model for 4D-CT lung registration, comprising CoarseNet and FineNet. CoarseNet predicts a rough DVF on downsampled images, whereas the patch-based FineNet model predicts local lung motion on a fine-scale image. Additionally, they enhanced vessel contrast by extracting pulmonary vascular structures before registration, leading to improved accuracy compared with that of conventional methods. Similarly, Yang et al. (2021) and Jiang et al. (2020) proposed multi-scale unsupervised frameworks for pulmonary 4D-CT registration, using three cascaded models at different resolutions to progressively refine DIR. The submodels are initially trained independently at each resolution and then jointly optimized in a multi-scale framework to increase the end-to-end registration accuracy. The experimental results demonstrated that the proposed MANet (Yang et al. 2021) outperformed other registration methods, with an average TRE of 1.53 ± 1.02 mm on the Dir-Lab dataset. Moreover, the DVF estimation process is completed in approximately 1 s and requires no manual parameter tuning.
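The sketch below illustrates the unsupervised setup of Fig. 5d: the moving image is warped with a predicted DVF by an STN-style resampler, and training minimizes image dissimilarity plus a smoothness penalty on the DVF. The zero-initialized DVF stands in for a network prediction, and all shapes and weights are illustrative.

```python
# Unsupervised DIR sketch: similarity loss on the warped image + DVF smoothness.
import torch
import torch.nn.functional as F

def warp(moving, dvf):
    """Apply a voxel-displacement DVF (B, 3, D, H, W) to a 3D image
    by resampling with grid_sample; channels ordered (dx, dy, dz)."""
    B, _, D, H, W = moving.shape
    zz, yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, D), torch.linspace(-1, 1, H),
        torch.linspace(-1, 1, W), indexing="ij")
    identity = torch.stack([xx, yy, zz], dim=-1).unsqueeze(0)  # (1, D, H, W, 3)
    norm = torch.tensor([2.0 / (W - 1), 2.0 / (H - 1), 2.0 / (D - 1)])
    grid = identity + dvf.permute(0, 2, 3, 4, 1) * norm        # voxels -> [-1, 1]
    return F.grid_sample(moving, grid, align_corners=True)

def smoothness(dvf):
    """Mean squared finite differences: discourages unrealistic DVFs."""
    dz = dvf[:, :, 1:] - dvf[:, :, :-1]
    dy = dvf[:, :, :, 1:] - dvf[:, :, :, :-1]
    dx = dvf[:, :, :, :, 1:] - dvf[:, :, :, :, :-1]
    return (dz ** 2).mean() + (dy ** 2).mean() + (dx ** 2).mean()

moving, fixed = torch.rand(1, 1, 32, 64, 64), torch.rand(1, 1, 32, 64, 64)
pred_dvf = torch.zeros(1, 3, 32, 64, 64, requires_grad=True)  # stand-in output
loss = F.mse_loss(warp(moving, pred_dvf), fixed) + 0.01 * smoothness(pred_dvf)
loss.backward()   # in practice, gradients flow back into the registration CNN
```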
In addition, researchers have explored various approaches to incorporate prior information as regularization terms for constraints in thoracic-abdominal 4D-CT registration. Lu et al. (2021) employed recurrent networks to leverage the temporal continuities of 4D-CT, aiming to reduce the influence of artifacts in certain phases. Wei et al. (2021) introduced a U-Net-based model for intra-subject registration, achieving improvements in all ROIs, particularly for tumor volumes. Duan et al. (2023) integrated a lung segmentation network into the registration network to create a spatially adaptive regularization term, accommodating smooth and sliding motion. Furthermore, Iqbal et al. (2024) employed Jacobian regularization to prevent undesirable deformation and folding in the displacement field. Considering complex and large motion patterns in abdominal 4D-CT, Xu et al. (2023) proposed a recursive cascaded full-resolution residual network that performs progressive registration cooperatively. Recently, Xu et al. (2023) adopted a recursive registration strategy using ordinary differential equation (ODE) integration of voxel velocities. Their method outperformed other learning-based methods, producing the smallest TREs of 1.24 mm and 1.26 mm on two publicly available lung 4D-CT datasets, Dir-Lab and Popi. This method produced less than 0.001% unrealistic image folding (fraction of negative values in the Jacobian determinant) and computed each CT volume in under 1 s.
Despite the impressive results achieved by DL-based methods, they have certain limitations. A primary challenge is their heavy reliance on large amounts of training data. Another limitation is their inability to register images that are significantly different from the training images. To address this issue, one-shot learning methods for DIR have been proposed (Fechter and Baltas 2020; Zhang et al. 2021; Chi et al. 2022). These methods, such as GroupRegNet (Zhang et al. 2021), use CNNs as feature extractors to register multiple 3D-CT images. The computed transformation is used to warp the input image into a common space, and the weights are iteratively updated through backpropagation. Convergence criteria are evaluated to determine when to terminate the iterative process. Compared with the other DL-based methods, GroupRegNet reduced the original TRE from 8.12 ± 4.77 mm to 1.03 ± 0.64 mm on the public dataset, Popi, achieving a 44% reduction in the RMSE. However, these one-shot methods still require optimization for registering unseen images, making them similar to traditional iterative optimization methods. This can lead to overfitting and a lack of stability. Additionally, the registration process is typically slower than that of end-to-end DL methods, requiring anywhere from several minutes to 30 min to complete (Zhang et al. 2021).
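Because TRE is the headline metric across these studies, a small sketch of how it is computed may help; the landmark ordering, DVF layout, and voxel spacing below are illustrative assumptions.

```python
# Target registration error (TRE) sketch for landmark pairs and a dense DVF.
import numpy as np

def tre(moving_lms, fixed_lms, dvf, spacing):
    """Mean Euclidean distance (mm) between landmarks propagated by the DVF
    and their fixed-image counterparts. dvf: (3, D, H, W) voxel displacements
    ordered (dz, dy, dx); landmarks: (N, 3) voxel coordinates ordered (z, y, x)."""
    idx = np.round(moving_lms).astype(int)
    disp = dvf[:, idx[:, 0], idx[:, 1], idx[:, 2]].T      # (N, 3)
    diff_mm = (moving_lms + disp - fixed_lms) * np.asarray(spacing)
    return np.linalg.norm(diff_mm, axis=1).mean()

dvf = np.zeros((3, 16, 64, 64))                            # identity field
lms = np.array([[8.0, 32.0, 32.0], [4.0, 16.0, 48.0]])
print(tre(lms, lms, dvf, spacing=(2.5, 1.0, 1.0)))         # 0.0 mm
```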
4.1.3 AI in 4D-CT tumor delineation
AI-driven automatic target segmentation in 4D-CT imaging is also an intensively investigated field. Manual delineation in each phase of 4D-CT can be time-consuming, laborious, and prone to subjective errors due to variations in tumor location caused by respiratory motion. Therefore, there is a demand for computer-aided methods for automatic, fast, and accurate tumor segmentation in 4D-CT. Li et al. (2018) utilized an Inception V3 architecture pre-trained on the ImageNet dataset to segment the gross tumor volume (GTV) on each phase and combined the results to predict the internal GTV (iGTV) for non-small cell lung cancer, demonstrating the potential of DL approaches in improving target delineation accuracy. Ma et al. (2023) explored 3D U-Net and its variants to leverage multiple phases of 4D-CT for automated intelligent delineation of the iGTV in lung cancer. Momin et al. (2021) developed a motion region CNN that automated 4D-CT lung data delineation by incorporating global and local motion estimation networks and employing a self-attention strategy. Zhou et al. (2022) proposed a patient-specific target contour prediction model for the pancreas, which achieved a high Dice similarity coefficient (DSC) of 98% for tumor positioning without the need for pancreas segmentation. Yang et al. (2024) introduced a dual-encoding network for liver tumor segmentation, yielding promising results with a mean DSC of 0.869 for GTVs and 0.882 for iGTVs. Overall, these studies highlight the viability of employing AI techniques to increase the precision and efficiency of tumor delineation in 4D-CT imaging across various types of cancer, including lung, liver, and pancreatic tumors.
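A brief sketch of the DSC metric and the phase-union iGTV construction used by several of the studies above; the random masks are stand-ins for per-phase GTV segmentations.

```python
# Dice similarity coefficient (DSC) and a union-of-phases iGTV sketch.
import numpy as np

def dice(pred, truth):
    """DSC = 2 * |A ∩ B| / (|A| + |B|) for boolean masks."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + 1e-12)

# iGTV as the union of per-phase GTV masks (10 breathing phases)
phase_masks = [np.random.rand(16, 64, 64) > 0.7 for _ in range(10)]
igtv = np.logical_or.reduce(phase_masks)
print(dice(igtv, phase_masks[0]))
```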
4.2 AI in 4D-MRI
Currently, 4D-MRI is still under investigation, with challenges that need to be overcome before it can be fully adopted for clinical use (Yuan et al. 2019). One major challenge is the tradeoff between spatial and temporal resolution. To achieve a reasonable imaging time, 4D-MRI images are often heavily undersampled, resulting in low spatial resolution and motion artifacts that can blur fast-moving structures. This challenge poses difficulties in modeling DVFs from 4D-MR images for tumor tracking, especially in the abdominal region, which has complex soft anatomical variations. Recently, DL has been employed in 4D-MRI. A detailed summary of DL-based studies is presented in Table 3. Most studies in 4D-MRI have focused on alleviating the spatiotemporal tradeoff during the reconstruction and post-processing stages, whereas others have aimed to improve motion modeling accuracy despite poor image quality. These papers can be categorized into three classes: 4D-MRI reconstruction, 4D-MRI super-resolution, and motion estimation. Figure 6 presents an example of enhanced 4D-MRI results using different deep algorithms.
Fig. 6 Visual example of low-quality 4D-MRI (a) and the super-resolved images produced by EDSR (b), Pixel2pixel (c), and 2.5D-cGAN (d). The selected ROI (yellow rectangle) represents the detailed features affected by respiratory motion. Figure reprinted from Zhi et al. (2023)
4.2.1 AI in 4D-MRI reconstruction
In recent years, researchers have explored various techniques to reconstruct high-quality MR images from undersampled acquisitions, such as parallel imaging and compressed sensing (CS) (Lustig et al. 2007). However, these methods have limitations in efficiently removing artifacts and noise, especially at high acceleration rates. Moreover, selecting the regularization parameters in constrained reconstruction methods is often empirical, computationally intensive, and time-consuming. DL has emerged as an alternative that can bypass these issues by unrolling the iterative process and learning the parameters through network training. Several studies have proposed AI-based approaches for 4D-MRI reconstruction. Zhang et al. (2021) proposed a hybrid approach using the parallel non-Cartesian convolutional recurrent neural network (PNCRNN) for undersampled abdominal dynamic parallel MR data. The PNCRNN combines CRNN-based blocks to learn spatial and temporal redundancies, along with non-Cartesian data-consistency (DC) layers that imitate gradient descent for non-Cartesian data fidelity. The PNCRNN achieves high image quality and fast convergence within only a few iterations, and it can be combined with other unrolled networks for abdominal imaging with non-Cartesian sampling. Küstner et al. (2020) proposed a motion-corrected reconstruction network that unrolls the alternating direction method of multipliers (ADMM) algorithm via a cascaded (3+1)D U-Net to exploit spatial-temporal redundancies. They also introduced a self-supervised approach to improve the accuracy and reliability of the registration network (Küstner et al. 2022). Another study proposed the stDLNN (Wang et al. 2023), a method that combines model-based techniques with a spatial-temporal dictionary-learning approach to increase the efficiency and quality of 4D-MRI reconstruction. Experimental results showed that the stDLNN outperformed other state-of-the-art (SOTA) methods in terms of reconstruction quality and computational efficiency. Furthermore, Murray et al. (2024) proposed Movienet, which exploits space-time-coil correlations and motion preservation instead of k-space data consistency, to accelerate the acquisition and reconstruction of dynamic MR images. Overall, DL approaches have demonstrated the potential to improve the speed and quality of 4D-MRI reconstruction, especially for non-Cartesian acquisitions. These techniques can reduce noise and artifacts effectively, resulting in clearer and more accurate images, even at high acceleration rates. Moreover, DL-based models can automatically select appropriate reconstruction parameters, eliminating the need for time-consuming empirical parameter selection.
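To illustrate the unrolling idea shared by these methods, the sketch below alternates a small learned denoiser with a data-consistency step. It is deliberately simplified to single-coil Cartesian sampling (the works cited above handle non-Cartesian, multi-coil data), and the unroll depth and denoiser architecture are illustrative assumptions.

```python
# Schematic unrolled reconstruction: CNN prior + k-space data consistency.
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Tiny residual CNN acting on real/imaginary channels (B, 2, H, W)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1))
    def forward(self, x):
        return x - self.net(x)

def to_complex(x):  return torch.complex(x[:, 0], x[:, 1])
def to_channels(z): return torch.stack([z.real, z.imag], dim=1)

def data_consistency(x, kspace, mask):
    """Overwrite sampled k-space locations of the estimate with measured data."""
    k_est = torch.fft.fft2(to_complex(x))
    return to_channels(torch.fft.ifft2(torch.where(mask, kspace, k_est)))

denoisers = nn.ModuleList([Denoiser() for _ in range(5)])   # 5 unrolled blocks

def unrolled_recon(kspace, mask):
    x = to_channels(torch.fft.ifft2(kspace))    # zero-filled initialization
    for d in denoisers:                         # alternate CNN prior and DC
        x = data_consistency(d(x), kspace, mask)
    return x

mask = torch.rand(1, 96, 96) > 0.7              # ~30% of k-space sampled
kspace = torch.fft.fft2(torch.randn(1, 96, 96, dtype=torch.complex64)) * mask
recon = unrolled_recon(kspace, mask)
```

Training such a network end-to-end learns the regularization that constrained reconstruction would otherwise tune by hand.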
4.2.2 AI in 4D-MRI super-resolution
Super-resolution methods can address the spatial-temporal tradeoff by improving spatial resolution and reducing artifacts in 4D-MR image post-processing. These methods can be broadly categorized as single-image super-resolution and prior image-based super-resolution methods.
Single-image methods generate high-resolution images from single low-resolution input images via image-to-image translation models. However, applying these methods to 4D-MRI faces challenges such as scarce training data and potential mismatches with the ground truth due to respiratory movements. To address these issues, researchers have developed data generation and augmentation modules. Chun et al. (2019) proposed a cascaded model with a downsampling network to generate perfectly paired low- and high-resolution data for training. Gao et al. (2023) developed a 3D GAN and proposed a novel data augmentation approach by gating into multiple respiratory states. Eldeniz et al. (2021) used unsupervised learning to minimize reconstruction artifacts by exploiting incoherent artifact patterns. Park et al. (2021) proposed an in-plane super-resolution method named ACNS, which achieves high image quality with significantly reduced computational time. However, most researchers have used 2D networks, which cannot capture the rich structural information that 3D networks could provide. Moreover, 2D-based methods only enhance the in-plane resolution, leaving the slice thickness unchanged from the original value.
Existing prior image-based super-resolution methods utilize multiple low-quality images or high-quality patient-specific MR images as reference images to leverage additional information regarding the fine anatomical structures. Gulamhussene et al. (2022) proposed a DL-based model that directly learns the relationship between the navigator and static volume slices, enabling high-quality 4D full-liver MRI reconstruction in near real time. Recently, transfer learning has been incorporated to address the time-consuming training for each patient and overcome domain shifting (Gulamhussene et al. 2023). Sarasaen et al. (2021) proposed a U-Net-based super-resolution model with fine-tuning using subject-specific static high-resolution MRI, resulting in high-resolution dynamic images. Terpstra et al. (2023) introduced MODEST, which uses low-dimensional subnetworks to reconstruct 4D-MRI by registering the exhale phase to every other respiratory phase using undersampled 4D-MRI and computed DVFs as input. More recently, Jafari et al. (2023) proposed GRASPNET, which sequentially leverages spatial and temporal correlations, to remove aliasing artifacts in the image domain, while achieving rapid reconstruction within seconds.
4.2.3 AI in 4D-MRI motion estimation
In addition to improving image quality from undersampled acquisitions, obtaining reliable 3D motion fields from compromised images with inconsistent tumor contrast, severe artifacts, and limited spatial resolution is of significant interest and remains challenging. Lv et al. (2018) introduced an unsupervised CNN-based registration method for motion analysis in abdominal images. This method outperforms non-motion-corrected and local affine registration methods in visual score and vessel sharpness, with a substantial reduction in registration time from one hour to one minute. Küstner et al. (2020, 2022) proposed an aliasing-free motion estimation method in k-space using optical flow equations, which demonstrated improved reconstruction quality compared with that of image-based motion-corrected reconstruction. Moreover, Terpstra et al. (2021) developed TEMPEST, a multi-resolution CNN for analyzing DVFs in 3D cine-MRI data. Interestingly, TEMPEST also showed promising results on a public 4D-CT dataset without any retraining, indicating its excellent generalizability.
The integration of motion estimation with reconstruction or super-resolution techniques has also been explored. Xiao et al. (2022) proposed DDEM, a dual-supervised model that mitigates the challenges of noise and artifacts in unsupervised methods by incorporating reference DVFs as supplementary constraints. With DDEM, 4D-DVFs are computed and used to deform prior images, resulting in high-quality 4D-MRI with improved accuracy. Recently, Xiao et al. (2023) extended this method by integrating a DenseNet-based reconstruction module, demonstrating the feasibility of real-time imaging even at high downsampling factors up to 500. In addition, Zhi et al. (2023) developed a cascaded model named CoSF-Net that simultaneously enhances the DIR and image quality of 4D-MRI. It incorporates two registration submodels for coarse-to-fine registration and a 2.5D cGAN super-resolution module. Experimental results showed that CoSF-Net outperformed SOTA networks and algorithms in motion estimation and image resolution enhancement for 4D-MRI.
4.3 AI in 4D-CBCT
The use of 4D-CBCT improves both target coverage and normal tissue avoidance in thoracic image-guided radiation therapy (IGRT) (Rusanov et al. 2022). However, in 4D-CBCT, sparse-view sampling at each respiratory phase leads to noise and streak artifacts in images reconstructed with the clinical back-projection algorithm, adversely affecting target localization accuracy. As a result, improving the quality of 4D-CBCT reconstructions is essential for ensuring the precision of radiation therapy delivery. In addition, motion estimation in 4D-CBCT is itself degraded by streak artifacts; existing methods therefore improve motion estimation accuracy by first enhancing image quality. To avoid repetition, this review discusses these methods within the context of motion compensation-based enhancement. Table 4 reports DL-based approaches for 4D-CBCT enhancement.
4.3.1 AI in 4D-CBCT enhancement
In previous years, various algorithms have been developed to address the intra-phase undersampling issue in 4D-CBCT. Notably, compressed-sensing (CS)-based methods have been applied to sparse-view CT/CBCT reconstruction and demonstrated high image quality by leveraging the sparsity characteristics in specific domains (e.g., the gradient domain or other transform domains) (Jiang et al. 2019). However, CS-reconstructed images may lose some fine structures and over-smooth the edge information. The prior-deformation-based methods (Zhang et al. 2018; Ren et al. 2014) assume that the onboard 4D-CBCT is a deformation of the prior 4D-CT. They reconstruct high-quality 4D-CBCT by deforming the prior 4D-CT using the DVFs solved under data fidelity and bending energy constraints. However, the deformation accuracy can be compromised in low-contrast regions. In addition, both CS- and prior-deformation-based algorithms require manual tuning of hyper-parameters and iterative optimization, which can require minutes or even hours to complete. Another category of 4D-CBCT reconstruction methods is motion-compensated algorithms. These methods apply motion models to deform other phases onto the target phase to overcome the intra-phase undersampling. However, the inter-phase deformation accuracy is limited by the poor quality of the initial or intermediate-phase images.
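In compact form, the CS-type reconstruction described above solves, for each phase \(p\) (notation ours):

\[
\hat{x}_p = \arg\min_{x_p} \; \tfrac{1}{2}\,\lVert A_p x_p - b_p \rVert_2^2 + \lambda \,\mathrm{TV}(x_p),
\]

where \(A_p\) is the projection operator for that phase, \(b_p\) the sparse-view measurements, and \(\lambda\) the manually tuned weight balancing data fidelity against the total-variation prior; prior-deformation methods instead parameterize \(x_p\) as the prior 4D-CT warped by phase-specific DVFs.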
Recently, DL has also been utilized for improving the image quality of sparse-view 4D-CBCT. The existing methods generally fall into three categories: projection pre-processing, image reconstruction, and image post-processing.
Projection pre-processing Projection pre-processing methods utilize DL approaches to interpolate or synthesize unmeasured projection views, after which an analytical algorithm is adopted for reconstruction. For example, Beaudry et al. (2019) proposed a DL method to reconstruct high-quality 4D-CBCT images from sparse-view acquisitions. They estimated projection data for each respiratory bin by borrowing projections from adjacent bins and applying linear interpolation, then fed the result into a CNN model to predict the full projection data. This approach successfully promoted streaking artifact removal and noise reduction in FDK-reconstructed images. However, DL-based projection pre-processing has not been extensively studied because the raw data of commercial scanners are usually unavailable to most researchers, and improper operations on the projection data may lead to secondary artifacts in the reconstructed images.
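A hedged sketch of the interpolation step that precedes the CNN in such pipelines follows; the simple averaging of adjacent bins stands in for linear interpolation, and the shapes and mask are illustrative.

```python
# Filling unmeasured views of one respiratory bin from its neighbors.
import numpy as np

def fill_missing_views(bin_proj, prev_proj, next_proj, measured):
    """bin_proj: (V, H, W) views for one bin; measured: (V,) bool mask.
    Unmeasured views are averaged from the adjacent bins before a CNN
    refines the result into full projection data."""
    filled = bin_proj.copy()
    filled[~measured] = 0.5 * (prev_proj[~measured] + next_proj[~measured])
    return filled

V, H, W = 68, 128, 128
bin_proj = np.random.rand(V, H, W)
prev_proj, next_proj = np.random.rand(V, H, W), np.random.rand(V, H, W)
measured = np.random.rand(V) > 0.5
full_views = fill_missing_views(bin_proj, prev_proj, next_proj, measured)
```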
Image reconstruction DL approaches have been explored to enhance the image quality of 4D-CBCT images. Similar to 4D-MRI, hybrid methods integrate data fidelity, domain transformation knowledge, and image restoration into one DL framework to improve reconstruction performance. Several studies have attempted to exploit spatiotemporal correlation using DNNs as the constraint term in the objective reconstruction model. These methods usually adopt a joint learning strategy for optimization in both the projection and image domains. For example, Liu et al. (2019) incorporated a prior deformation model derived from a CNN into the iterative reconstruction framework to compensate the CBCT volume and optimized it via a variable-splitting algorithm. Chen et al. (2020) applied a proximal forward-backward splitting method in their proposed 4D-AirNet model. Hu et al. (2022) proposed a framework termed PRIOR for 4D-CBCT, which uses a well-trained neural network as the regularization constraint to improve the reconstructed image via an effective iterative strategy. These deep models have achieved promising performance by synergizing iterative and DL methods for image reconstruction. Xiao et al. (2023) developed a motion-sensitive cascaded model for real-time 4D-CBCT reconstruction. This model combines a dual attention mechanism, a residual network, and a principal component analysis model to map single projections from different breathing phases to each phase of 3D-CBCT, enabling real-time 4D-CBCT reconstruction.
Image post-processing Most studies have focused on improving 4D-CBCT via image post-processing, which aims to correct errors or artifacts in the images after reconstruction. These methods take the initially reconstructed image as input and enhance the image quality to better align with the fully-sampled images. Numerous efforts have been devoted to using spatial-temporal information in conventional analytic or total variation (TV)-based images to mitigate streaking artifacts and noise or recover structural details. These methods can be divided into three categories: group data-driven methods, motion-compensated methods, and patient-specific prior image-guided methods.
Group data-driven methods involve training DNNs with datasets containing groups of patients to learn the mapping from initial reconstructed images to target images. Jiang et al. (2019) proposed the use of a symmetric residual CNN to increase the sharpness of edges in TV-regularized undersampled CBCT. Lee et al. (2019) constructed a residual U-Net with a wavelet-based process to remove streaking artifacts from FBP-reconstructed images. Sun et al. (2021) incorporated transfer learning to fine-tune a group-trained model with a patient-specific dataset for individual patients, demonstrating superior performance in recovering small lung textures and eliminating noise. In the above studies, researchers prepared paired training samples by simulating 4D-CBCT from ground truth 4D-CT and then trained supervised models with pixel-level loss functions (e.g., L1 loss, L2 loss). However, it is worth noting that there may be a significant difference between the simulated data and real data, which would inevitably decrease the model performance when applied in the clinical setting. In contrast, Madesta et al. (2020) proposed a self-contained method by training a CNN with pseudo-average and time-average CBCT images to suppress streaking artifacts without additional data requirements.
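The patient-specific transfer-learning idea reported by Sun et al. (2021) can be sketched as follows; the stand-in network, random data, learning rate, and (commented-out) checkpoint path are all illustrative assumptions.

```python
# Fine-tuning a group-trained enhancement model on one patient's data.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(                          # stand-in enhancement CNN
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1))
# model.load_state_dict(torch.load("group_trained.pt"))  # hypothetical checkpoint
opt = torch.optim.Adam(model.parameters(), lr=1e-5)      # small LR preserves
                                                         # group-level knowledge
for _ in range(20):                                      # brief adaptation
    degraded = torch.rand(4, 1, 64, 64)                  # stand-in patient slices
    target = torch.rand(4, 1, 64, 64)                    # stand-in references
    loss = F.l1_loss(model(degraded), target)
    opt.zero_grad(); loss.backward(); opt.step()
```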
Recently, GAN-based models, particularly CycleGANs, have gained attention for weakly supervised or even unsupervised learning, specifically tailored to 4D-CBCT, where obtaining perfectly matched image pairs is challenging owing to respiratory movements. Dong et al. (2022) built a CycleGAN to learn the relationship between unpaired undersampled CBCT images and high-quality CT images with a contrastive loss function to preserve the anatomical structure in the corrected image. Usui et al. (2022) utilized CycleGAN to train unpaired thoracic 4D-CBCT images with high-quality multi-slice CT (MSCT), resulting in enhanced images with fewer artifacts and improved visibility of lung tumor regions. More recently, Zhang et al. (2021, 2022) demonstrated the effectiveness of GAN-based models in enhancing 4D-CBCT for radiomics analysis.

Motion compensation (MoCo) compensates for the respiratory motion of each phase-correlated image by employing interphase DVFs (Zhang et al. 2019), as shown in Fig. 5e. Compared with earlier MoCo methods (Zhang et al. 2019; Wang and Gu 2013), DL has improved the efficiency and accuracy of MoCo by enhancing the prior motion estimation model. For instance, Huang et al. utilized two DNNs to obtain high-quality DVFs and embedded them into the SMEIR workflow to produce refined 4D images. However, these methods still relied on the estimation of DVFs from low-quality initial images, and the performance of MoCo reconstruction heavily depends on registration accuracy. To solve this issue, researchers have proposed alternative approaches from two different perspectives. On the one hand, Zhang et al. (2023) hypothesized that high-quality initial 4D-CBCT images would improve motion estimation accuracy. Thus, they incorporated a 3D CNN to reduce the structural artifacts in initial FDK-reconstructed images and then estimated motion on the basis of the artifact-mitigated initial images, which could further restore the lost information. On the other hand, Jiang et al. (2022) developed FeaCo-DCN with deformable convolution networks (DCNs) to align adjacent phases to the target phase at the feature level instead of explicitly deriving interphase DVFs from low-quality images. The model achieved SOTA performance in the SPARE challenge with Monte-Carlo 4D-CBCT datasets. However, the image quality may degrade when applying the model to clinical 4D-CBCT scans because of noise variations. This is a common issue for DL-based methods, which can be mitigated by fine-tuning the model with real projections.
Fig. 7 Examples of 4D-CBCT images and results from various algorithms. A depicts the prior CT image, while B–E present the CBCT images reconstructed by the FDK, ASD-POCS, 3D U-Net, and proposed models, respectively. F shows the corresponding ground truth CBCT. The red arrows indicate image details for visual inspection. FDK Feldkamp-Davis-Kress. Figure reprinted from Jiang et al. (2021b)
Another category is the patient-specific prior image-guided method, in which the intra-patient prior image is incorporated into the phase-by-phase reconstruction process. Jiang et al. (2021b) proposed a merging-encoder CNN (MeCNN) that leverages patient-specific information from the prior CT image to enhance under-sampled image reconstruction (Fig. 7). They also introduced a dual-encoder CNN for average-image-constrained enhancement that extracts features from both the average 4D-CBCT image and the target phase image (Jiang et al. 2021a). Zhi et al. (2021) developed N-Net and its enhanced version, CycN-Net, to refine phase-resolved images by incorporating prior images reconstructed from the full 4D-CBCT projection set. In CycN-Net, five consecutive phase-resolved images and the prior image are independently encoded, and the extracted feature maps are fused during decoding to predict the target phase. The experimental results demonstrated that the CycN-Net outperformed other 4D-CBCT methods in preserving delicate structures and reducing artifacts and noise. However, the prior image may still contain blurred artifacts from CT exposure, which can result in residual artifacts in the reconstructions.
4.4 AI in 4D-PET
DL-based studies for 4D-PET imaging in the abdomen are relatively limited, with only four papers focusing on image enhancement, particularly denoising, as shown in Table 5. PET acquisition is typically completed in 10 to 20 min (Manber et al. 2015), during which patient breathing and movement cause motion artifacts that affect image quality. Conventional image post-processing methods such as Gaussian filtering (Floberg and Holden 2013) and non-local means filtering (Dutta et al. 2013) can improve image quality to some extent, but they often result in over-smoothing in ultra-low-dose data. This limitation has led to the exploration of DL-based approaches, which can be classified into two categories: those that use only low-quality PET data as input (Gong et al. 2018; Zhou et al. 2021) and those that incorporate MR/CT images as input (Munoz et al. 2021). While both approaches have achieved superior denoising performance on static PET data, none have addressed motion estimation and denoising for respiratory-gated PET.
To address the issues of motion estimation and denoising for respiratory-gated PET, several MoCo approaches have been proposed. Similar to 4D-CBCT, these approaches typically involve initial image reconstruction gate-by-gate and image registration for motion estimation among different gates. However, noisy gated images can lead to inaccurate motion estimation, and iterative DIR is time-consuming. To address the noise, Zhou et al. (2020) proposed a siamese adversarial network (SAN) to estimate motion between pairs of low-dose gated images. They first denoised the low-dose gated images and then estimated motion on the basis of the denoised images. Building upon this work, they introduced the MDPET (Zhou et al. 2021), which combines motion correction and denoising into a unified framework. MDPET uses an RNN-based motion estimation network to leverage temporal information, and a denoising network to generate high-quality denoised PET images, as shown in Fig. 8. In addition, Li et al. (2020) proposed an unsupervised non-rigid image registration framework to estimate deformation fields in respiratory-gated images. On this basis, Li et al. (2021) developed a joint estimation method that incorporates DL-based image registration into a constrained image reconstruction algorithm. This unsupervised learning approach does not require ground truth for training, which is often unavailable.
Fig. 8 Examples of 4D-PET and denoising results by various algorithms. The average low-dose gated images generated from different motion estimation methods are shown in the 1st row. The corresponding denoised images are shown in the 2nd row. From left to right: ground truth, U-Net denoising from the averaged image without any deformation, U-Net denoising on the averaged image based on NRB-derived deformation fields, U-Net denoising on the averaged image based on VM-derived deformation fields, U-Net denoising on the averaged image based on SAN-derived deformation fields, and the end-to-end output from MDPET. Figure reprinted from Zhou et al. (2021)
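A minimal sketch of the gate-warping-and-averaging principle underlying these MoCo approaches; `warp_gate` stands in for any DIR-derived resampling, and the gate volumes and identity DVFs below are illustrative.

```python
# Motion-compensated averaging of respiratory-gated PET volumes.
import numpy as np
from scipy.ndimage import map_coordinates

def warp_gate(gate, dvf):
    """Resample one gated volume with a displacement field dvf of shape
    (3, D, H, W), ordered (dz, dy, dx) in voxels."""
    grid = np.indices(gate.shape).astype(float) + dvf
    return map_coordinates(gate, grid, order=1)

def moco_average(gates, dvfs_to_ref):
    """Warp every gate onto the reference gate, then average to recover
    the counts that gating alone would spread across noisy bins."""
    return np.mean([warp_gate(g, d) for g, d in zip(gates, dvfs_to_ref)], axis=0)

gates = [np.random.rand(8, 32, 32) for _ in range(6)]
dvfs = [np.zeros((3, 8, 32, 32)) for _ in range(6)]      # identity fields
avg = moco_average(gates, dvfs)
```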
5 Discussion
In the past five years, AI has made remarkable advancements in the field of 4D-imaging, leading to improvements in imaging speed and quality. Additionally, AI approaches hold significant promise for motion management, such as the use of deep models to replace traditional iterative registration for real-time tumor tracking. However, some research challenges still need to be addressed. Moving forward, it is crucial to focus on developing and optimizing DL technologies for 4D-imaging in the context of clinical practice. This section offers a comprehensive overview of the achievements and limitations of current research on AI approaches in 4D-imaging. We also discuss the remaining challenges in this field and propose future research directions.
5.1 Achieved advances
The advances achieved by AI in 4D-imaging for motion management can be summarized as follows:
Improved image quality: AI techniques have shown great promise in enhancing image quality in 4D-imaging during the reconstruction and post-processing stages. For example, DL-based algorithms have successfully reduced DS artifacts in 4D-CT, enhanced spatial resolution in 4D-MRI, suppressed streak artifacts in 4D-CBCT, and mitigated noise in 4D-PET. These improvements enhance tumor visibility, facilitating accurate target localization and treatment delivery, as well as augmenting radiomics analysis.
Accelerated acquisition and processing: AI algorithms have enabled faster acquisition and processing of 4D images. For example, a cascaded model has been proposed to reconstruct 4D-MRI at downsampling factors of up to 500, enabling real-time applications of ultra-fast 4D-MRI (Xiao et al. 2023). Moreover, data-driven methods have replaced iterative processes in image reconstruction and registration with single-step predictions, reducing the computation time from minutes to microseconds (Sentker et al. 2018; Zhang et al. 2021; Xiao et al. 2022). This acceleration greatly reduces the processing time for tasks like image reconstruction and motion estimation, thus expediting the entire workflow.
More accurate motion modeling: AI techniques have been leveraged to enhance motion estimation in 4D imaging. By addressing the challenges associated with poor image quality, such as low spatial resolution, streak artifacts, and noise, AI-driven approaches, including cascaded image refinement and motion estimation, as well as incorporating patient-specific temporal information, have the potential to improve the accuracy and robustness of motion estimation algorithms (Lu et al. 2021; Zhi et al. 2023).
Reduced manual intervention and costs: AI-based automation has reduced manual intervention and costs in 4D-imaging. Model-based reconstruction processes can be unrolled by deep networks, eliminating the need for manual parameter tuning (Liang et al. 2019). Similarly, AI-based DIR bypasses the parameter selection process in traditional registration algorithms. Additionally, AI approaches have shown promise for fully automated delineation in 4D-imaging, reducing the workload of physicians in performing manual delineation on all breathing phases, minimizing time consumption, and potentially lowering costs.
5.2 Current limitations
Despite notable achievements, current research in 4D-imaging still has several limitations that need attention. These limitations include the following:
Limited data volume and lack of external validation: Currently, 4D-imaging has not been fully integrated into clinical practice, especially 4D-MRI, resulting in a small volume of available data. Studies often involve a limited number of patients (\(\le\) 50) from a single institution and lack external validation. Some studies use simulated datasets instead of real patient data, potentially deviating from real-world clinical scenarios and thus hindering a comprehensive evaluation of AI methods’ accuracy, generalizability, and clinical applicability.
One-sided evaluation: Despite the extensive study of AI approaches in medical imaging, the evaluation standards remain somewhat insufficient. For instance, in reconstruction and enhancement, most studies still rely on evaluation metrics commonly used for general images, such as the RMSE and the structural similarity index measure (SSIM) (Akagi et al. 2019; Chun et al. 2019; Zhi et al. 2023). However, these metrics may not adequately assess a model’s performance, particularly its effectiveness in addressing respiratory motion-related issues in clinical scenarios.
Absence of public datasets: 4D-imaging is still a relatively new field, and there is a lack of publicly available datasets for cross-comparison. Public datasets are crucial for advancing algorithm research and facilitating algorithm translation into clinical practice. While public datasets exist for 4D-CT DIR, enabling direct comparison of the registration accuracy improvements achieved by different AI algorithms, other 4D-imaging tasks and modalities suffer from a scarcity of public datasets (Table 6). This challenge limits intuitive comparison of algorithms’ strengths and weaknesses, as studies often rely on private datasets.
5.3 Remaining challenges
Although significant progress has been made in AI applications of 4D-imaging for motion management, this field is still in its early stages, and several challenges remain. These challenges need to be addressed to further advance research in this area.
Limited data: As mentioned earlier, 4D-imaging studies often suffer from limited data, which poses a significant challenge for AI algorithms, especially DL models. Limited data can easily lead to overfitting, poor generalizability, and decreased performance, particularly when dealing with irregular respiratory patterns in real-life applications.
Lack of ground truth: Data-driven models in 4D-imaging struggle with the absence of ground truth data. In motion estimation, traditional methods are used to obtain the ground truth DVFs for training and validating deep models. However, these DVFs are not the true ground truth and may introduce errors. In image enhancement, obtaining high-quality reference images at the pixel level is challenging because of respiratory motion, making supervised learning models difficult to train and prone to structural distortions. Additionally, reference-based evaluation metrics may be inaccurate for evaluation.
Inability to restore details and avoid distortion: Restoring image quality from undersampled acquisitions is a critical challenge. Current studies predominantly rely on data-driven post-processing approaches. However, these approaches face two primary challenges. First, post-processing methods are incapable of creating details out of nothing (Jiang et al. 2019). Second, the use of deep models for artifact removal and denoising may introduce distortions and loss of anatomical structures due to difficulties in distinguishing specific features. Although incorporating patient-specific prior images has been identified as a potential solution, these prior images may still contain imperfections such as noise, motion artifacts, or blurring, potentially affecting the quality of enhanced images. Additionally, misalignment between static prior images and 4D-imaging can lead to unintended distortions in the enhanced results.
Insufficient motion estimation: The current level of registration accuracy in 4D-imaging remains unsatisfactory, particularly when dealing with complex motion patterns, poor contrast, extremely low spatial resolution, and various artifacts. These factors pose additional challenges to AI models in accurately extracting anatomical features for motion estimation while mitigating potential interference. Furthermore, irregular respiratory motion has yet to be thoroughly investigated.
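A common workaround for the missing ground-truth DVFs discussed above is unsupervised training: the loss compares the warped moving image with the fixed image and penalizes non-smooth displacement fields, so no reference DVFs are required. The 2D PyTorch sketch below illustrates such a loss under stated assumptions (a hypothetical network produces dvf, displacements are given in pixels, and MSE serves as the similarity term).

import torch
import torch.nn.functional as F

def warp(moving, dvf):
    # Warp a (B,1,H,W) image with a (B,2,H,W) displacement field given in pixels.
    _, _, h, w = moving.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    gx = (xs + dvf[:, 0]) / (w - 1) * 2 - 1  # normalize x to [-1, 1]
    gy = (ys + dvf[:, 1]) / (h - 1) * 2 - 1  # normalize y to [-1, 1]
    grid = torch.stack((gx, gy), dim=-1)  # (B,H,W,2), x-coordinate first
    return F.grid_sample(moving, grid, align_corners=True)

def unsupervised_loss(moving, fixed, dvf, lam=0.01):
    sim = F.mse_loss(warp(moving, dvf), fixed)  # image similarity, no DVF labels
    smooth = (dvf[:, :, 1:, :] - dvf[:, :, :-1, :]).pow(2).mean() + \
             (dvf[:, :, :, 1:] - dvf[:, :, :, :-1]).pow(2).mean()  # gradient penalty
    return sim + lam * smooth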
5.4 Future directions
To advance the field of AI-based 4D-imaging and facilitate its integration into clinical practice, future research should focus on several key directions:
Reliability: It is essential to develop AI approaches that are more reliable, precise, and explainable. Researchers should explore methods that integrate AI with prior information instead of relying solely on black-box models. By incorporating prior knowledge and constraints, AI models can produce more interpretable and trustworthy outputs, enhancing their practical value in clinical settings.
Efficiency: While current studies primarily emphasize improving accuracy, factors such as processing time, computational cost, and memory usage should also be considered. Real-time applications of 4D-imaging in treatment workflows require low latency, high speed, and computational efficiency (a simple latency-measurement helper is sketched after this list). Therefore, future studies should aim to develop AI models that not only achieve high accuracy but also meet the efficiency requirements of real-time applications.
Generalizability: To advance AI in 4D-imaging for clinical practice, it is necessary to move beyond proof-of-concept studies with limited single-source data or simulated data. Conducting multi-institutional studies involving diverse patient populations and imaging systems is crucial for enhancing the generalizability of AI methods. These studies provide valuable insights into the performance and limitations of AI-based 4D-imaging techniques across various clinical settings, enabling more comprehensive feedback and validation of these methods.
Clinical validation: More comprehensive clinical evaluation and validation metrics are needed to assess the true clinical impact of AI-based 4D-imaging beyond traditional image-quality metrics. Future research should consider factors such as the integration of 4D-imaging into the clinical workflow and treatment outcomes. This will help verify the effectiveness of AI-based 4D-imaging in real-world clinical practice, facilitating its adoption in clinical radiation therapy.
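Because real-time use imposes a hard latency budget, inference speed deserves routine reporting alongside accuracy. The helper below is an assumed setup (model and input x are placeholders): it measures the mean forward-pass time in milliseconds, synchronizing the GPU so asynchronous execution does not distort the timing.

import time
import torch

def mean_latency_ms(model, x, n_warmup=5, n_runs=20):
    model.eval()
    with torch.no_grad():
        for _ in range(n_warmup):  # warm-up runs: kernel launch/caching effects
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()  # ensure queued GPU work has finished
        t0 = time.perf_counter()
        for _ in range(n_runs):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
        return (time.perf_counter() - t0) / n_runs * 1e3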
6 Conclusion
The growth of AI methods, particularly DL methods, has notably accelerated the advancement of 4D-imaging techniques. Numerous studies have demonstrated the potential of AI models in enhancing the efficiency and accuracy of 4D-imaging. Moreover, AI-based approaches have facilitated the application of 4D-imaging in motion management, resulting in substantial reductions in time consumption and human intervention. Despite these remarkable achievements, there are still limitations and challenges in the field that need to be addressed. Future studies should focus on the reliability (or transparency), efficiency, and generalizability of the developed methods and systems to integrate AI-based 4D-imaging into clinical practice for motion management.
Data availability
No datasets were generated or analysed during the current study.
References
Abdelnour A, Nehmeh S, Pan T, Humm J, Vernon P, Schöder H et al (2007) Phase and amplitude binning for 4d-ct imaging. Phys Med Biol 52(12):3515
Ahishakiye E, Bastiaan Van Gijzen M, Tumwiine J, Wario R, Obungoloch J (2021) A survey on deep learning in medical image reconstruction. Intell Med 1(03):118–127
Akagi M, Nakamura Y, Higaki T, Narita K, Honda Y, Zhou J, Awai K (2019) Deep learning reconstruction improves image quality of abdominal ultra-high-resolution ct. Eur Radiol 29:6163–6171
Balik S, Weiss E, Jan N, Roman N, Sleeman WC, Fatyga M et al (2013) Evaluation of 4-dimensional computed tomography to 4-dimensional cone-beam computed tomography deformable image registration for lung cancer adaptive radiation therapy. Int J Radiat Oncol Biol Phys 86(2):372–379
Baskar R, Lee KA, Yeo R, Yeoh K-W (2012) Cancer and radiation therapy: current advances and future directions. Int J Med Sci 9(3):193
Beaudry J, Esquinas PL, Shieh C-C (2019) Learning from our neighbours: a novel approach on sinogram completion using bin-sharing and deep learning to reconstruct high quality 4dcbct. Med Imaging: Phys Med Imaging 10948:1025–1035
Cai J, Chang Z, Wang Z, Paul Segars W, Yin F-F (2011) Four-dimensional magnetic resonance imaging (4d-mri) using image-based respiratory surrogate: a feasibility study. Med Phys 38(12):6384–6394
Castillo R, Castillo E, Guerra R, Johnson VE, McPhail T, Garg AK, Guerrero T (2009) A framework for evaluation of deformable image registration spatial accuracy using large landmark point sets. Phys Med Biol 54(7):1849
Celicanin Z, Bieri O, Preiswerk F, Cattin P, Scheffler K, Santini F (2015) Simultaneous acquisition of image and navigator slices using caipirinha for 4d mri. Magn Reson Med 73(2):669–676
Chen G, Zhao Y, Huang Q, Gao H (2020) 4d-airnet: a temporally-resolved cbct slice reconstruction method synergizing analytical and iterative method with deep learning. Phys Med Biol 65(17):175020
Chi W, Xiang Z, Guo F (2022) Few-shot learning for deformable image registration in 4dct images. Br J Radiol 95(1129):20210819
Chun J, Zhang H, Gach HM, Olberg S, Mazur T, Green O et al (2019) Mri super-resolution reconstruction for mri-guided adaptive radiotherapy using cascaded deep learning: in the presence of limited training data and unknown translation model. Med Phys 46(9):4148–4164
Dong G, Zhang C, Deng L, Zhu Y, Dai J, Song L, Xie Y (2022) A deep unsupervised learning framework for the 4d cbct artifact correction. Phys Med Biol 67(5):055012
Duan L, Cao Y, Wang Z, Liu D, Fu T, Yuan G, Zheng J (2023) Boundary-aware registration network for 4d-ct lung image with sliding motion. Biomed Signal Process Control 86:105333
Dutta J, Leahy RM, Li Q (2013) Non-local means denoising of dynamic pet images. PLoS ONE 8(12):e81390
Eldeniz C, Gan W, Chen S, Fraum TJ, Ludwig DR, Yan Y et al (2021) Phase2phase: Respiratory motion-resolved reconstruction of free-breathing magnetic resonance imaging using deep learning without a ground truth for improved liver imaging. Invest Radiol 56(12):809–819
Eppenhof KA, Pluim JP (2018) Pulmonary ct registration through supervised learning with convolutional neural networks. IEEE Trans Med Imaging 38(5):1097–1105
Fechter T, Baltas D (2020) One-shot learning for deformable medical image registration and periodic motion tracking. IEEE Trans Med Imaging 39(7):2506–2517
Floberg J, Holden J (2013) Nonlinear spatio-temporal filtering of dynamic pet data using a four-dimensional gaussian filter and expectation-maximization deconvolution. Phys Med Biol 58(4):1151
Freedman JN, Gurney-Champion OJ, Nill S, Shiarli A-M, Bainbridge HE, Mandeville HC (2021) Rapid 4d-mri reconstruction using a deep radial convolutional neural network (dracula). Radiother Oncol 159:209–217
Fu Y, Wu X, Thomas AM, Li HH, Yang D (2019) Automatic large quantity landmark pairs detection in 4dct lung images. Med Phys 46(10):4490–4501
Gao C, Ghodrati V, Shih S-F, Wu HH, Liu Y, Nickel MD (2023) Undersampling artifact reduction for free-breathing 3d stack-of-radial mri based on a deep adversarial learning network. Magn Resonance Imaging 95:70–79
Gong K, Guan J, Liu C-C, Qi J (2018) Pet image denoising using a deep neural network through fine tuning. IEEE Trans Radiat Plasma Med Sci 3(2):153–161
Grootjans W, Tixier F, van der Vos CS, Vriens D, Le Rest CC, Bussink J, Visser EP (2016) The impact of optimal respiratory gating and image noise on evaluation of intratumor heterogeneity on 18f-fdg pet imaging of lung cancer. J Nucl Med 57(11):1692–1698
Gulamhussene G, Meyer A, Rak M, Bashkanov O, Omari J, Pech M, Hansen C (2022) Predicting 4d liver mri for mr-guided interventions. Comput Med Imaging Graph 101:102122
Gulamhussene G, Meyer A, Rak M, Bashkanov O, Omari J, Pech M, Hansen C (2023) Transfer-learning is a key ingredient to fast deep learning-based 4d liver mri reconstruction. Sci Rep 13(1):11227
Gutt R, Malhotra S, Hagan MP, Lee SP, Faricy-Anderson K, Kelly MD et al (2021) Palliative radiotherapy within the veterans health administration: barriers to referral and timeliness of treatment. JCO Oncol Pract 17(12):e1913–e1922
Harris W, Yin F-F, Wang C, Zhang Y, Cai J, Ren L (2017) Accelerating volumetric cine mri (vc-mri) using undersampling for real-time 3d target localization/tracking in radiation therapy: a feasibility study. Phys Med Biol 63(1):01NT01
He T, Xue Z, Teh BS, Wong ST (2015) Reconstruction of four-dimensional computed tomography lung images by applying spatial and temporal anatomical constraints using a bayesian model. J Med Imaging 2(2):024004–024004
Hong J, Reyngold M, Crane C, Cuaron J, Hajj C, Mann J et al (2022) Ct and cone-beam ct of ablative radiation therapy for pancreatic cancer with expert organ-at-risk contours. Sci Data 9(1):637
Hu D, Zhang Y, Liu J, Zhang Y, Coatrieux JL, Chen Y (2022) Prior: Prior-regularized iterative optimization reconstruction for 4d cbct. IEEE J Biomed Health Inform 26(11):5551–5562
Huang X, Zhang Y, Chen L, Wang J (2020) U-net-based deformation vector field estimation for motion-compensated 4d-cbct reconstruction. Med Phys 47(7):3000–3012
Hugo GD, Rosu M (2012) Advances in 4d radiation therapy for managing respiration: part i–4d imaging. Z Med Phys 22(4):258–271
Huynh E, Hosny A, Guthier C, Bitterman DS, Petit SF, Haas-Kogan DA, Mak RH (2020) Artificial intelligence in radiation oncology. Nat Rev Clin Oncol 17(12):771–781
Iqbal MZ, Razzak I, Qayyum A, Nguyen TT, Tanveer M, Sowmya A (2024) Hybrid unsupervised paradigm based deformable image fusion for 4d ct lung image modality. Inform Fusion 102:102061
Jaderberg M, Simonyan K, Zisserman A et al (2015) Spatial transformer networks. Adv Neural Inform Process Syst 28:1
Jafari R, Do RKG, LaGratta MD, Fung M, Bayram E, Cashen T, Otazo R (2023) Graspnet: Fast spatiotemporal deep learning reconstruction of golden-angle radial data for free-breathing dynamic contrast-enhanced magnetic resonance imaging. NMR Biomed 36(3):e4861
Jiang Z, Chang Y, Zhang Z, Yin F-F, Ren L (2022) Fast four-dimensional cone-beam computed tomography reconstruction using deformable convolutional networks. Med Phys 49(10):6461–6476
Jiang Z, Chen Y, Zhang Y, Ge Y, Yin F-F, Ren L (2019) Augmentation of cbct reconstructed from under-sampled projections using deep learning. IEEE Trans Med Imaging 38(11):2705–2715
Jiang Z, Yin F-F, Ge Y, Ren L (2020) A multi-scale framework with unsupervised joint training of convolutional neural networks for pulmonary deformable image registration. Phys Med Biol 65(1):015011
Jiang Z, Zhang Z, Chang Y, Ge Y, Yin F-F, Ren L (2021) Enhancement of 4-d cone-beam computed tomography (4d-cbct) using a dual-encoder convolutional neural network (decnn). IEEE Trans Radiat Plasma Med Sci 6(2):222–230
Jiang Z, Zhang Z, Chang Y, Ge Y, Yin F-F, Ren L (2021) Prior image-guided cone-beam computed tomography augmentation from under-sampled projections using a convolutional neural network. Quant Imaging Med Surg 11(12):4767
Kavaluus H, Seppälä T, Koivula L, Salli E, Collan J, Saarilahti K, Tenhunen M (2020) Retrospective four-dimensional magnetic resonance imaging of liver: Method development. J Appl Clin Med Phys 21(12):304–313
Keall PJ, Mageras GS, Balter JM, Emery RS, Forster KM, Jiang SB et al (2006) The management of respiratory motion in radiation oncology report of aapm task group 76 a. Med Phys 33(10):3874–3900
Küstner T, Pan J, Gilliam C, Qi H, Cruz G, Hammernik K (2020) Deep-learning based motion-corrected image reconstruction in 4d magnetic resonance imaging of the body trunk. In: 2020 Asia-pacific signal and information processing association annual summit and conference (apsipa asc). pp 976–985
Küstner T, Pan J, Gilliam C, Qi H, Cruz G, Hammernik K et al (2022) Self-supervised motion-corrected image reconstruction network for 4d magnetic resonance imaging of the body trunk. APSIPA Trans Signal Inform Process 11(1):e12
Lee D, Kim K, Kim W, Kang S, Park C, Cho H et al (2019) Four-dimensional cbct reconstruction based on a residual convolutional neural network for improving image quality. J Korean Phys Soc 75:73–79
Lei Y, Fu Y, Harms J, Wang T, Curran WJ, Liu T, Yang X (2019) 4d-ct deformable image registration using an unsupervised deep convolutional neural network. In: Artificial intelligence in radiation therapy: First international workshop, Airt 2019, held in conjunction with Miccai 2019, Shenzhen, China, October 17, 2019, proceedings 1. pp 26–33
Lei Y, Fu Y, Wang T, Liu Y, Patel P, Curran WJ, Yang X (2020) 4d-ct deformable image registration using multiscale unsupervised deep learning. Phys Med Biol 65(8):085003
Leng S, Zambelli J, Tolakanahalli R, Nett B, Munro P, Star-Lack J, Chen G-H (2008) Streaking artifacts reduction in four-dimensional cone-beam computed tomography. Med Phys 35(10):4649–4659
Li C, Li W, Liu C, Zheng H, Cai J, Wang S (2022) Artificial intelligence in multiparametric magnetic resonance imaging: A review. Med Phys 49(10):e1024–e1054
Li G, Wei J, Olek D, Kadbi M, Tyagi N, Zakian K, Hunt M (2017) Direct comparison of respiration-correlated four-dimensional magnetic resonance imaging reconstructed using concurrent internal navigator and external bellows. Int J Radiat Oncol Biol Phys 97(3):596–605
Li T, Zhang M, Qi W, Asma E, Qi J (2020) Motion correction of respiratory-gated pet images using deep learning based image registration framework. Phys Med Biol 65(15):155003
Li T, Zhang M, Qi W, Asma E, Qi J (2021) Deep learning based joint pet image reconstruction and motion estimation. IEEE Trans Med Imaging 41(5):1230–1241
Li X, Deng Z, Deng Q, Zhang L, Niu T, Kuang Y (2018) A novel deep learning framework for internal gross target volume definition from 4d computed tomography of lung cancer patients. IEEE Access 6:37775–37783
Liang D, Cheng J, Ke Z, Ying L (2019) Deep mri reconstruction: unrolled optimization algorithms meet neural networks. arXiv preprint arXiv:1907.11711
Liang X, Lin S, Liu F, Schreiber D, Yip M (2023) Orrn: An ode-based recursive registration network for deformable respiratory motion estimation with lung 4dct images. IEEE Trans Biomed Eng 70:3265
Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88
Liu J, Kang Y, Hu D, Chen Y (2019) 4d-cbct reconstruction via motion compensation learning induced sparse tensor constraint. In: 2019 12th International congress on image and signal processing, biomedical engineering and informatics (cisp-bmei). pp 1–5
Liu Y, Yin F-F, Rhee D, Cai J (2016) Accuracy of respiratory motion measurement of 4d-mri: a comparison between cine and sequential acquisition. Med Phys 43(1):179–187
Loÿen E, Dasnoy-Sumell D, Macq B (2023) Patient-specific three-dimensional image reconstruction from a single x-ray projection using a convolutional neural network for on-line radiotherapy applications. Phys Imaging Radiat Oncol 26:100444
Lu J, Jin R, Song E, Ma G, Wang M (2021) Lung-crnet: a convolutional recurrent neural network for lung 4dct image registration. Med Phys 48(12):7900–7912
Lustig M, Donoho D, Pauly JM (2007) Sparse mri: the application of compressed sensing for rapid mr imaging. Magn Resonance Med: Off J Int Soc Magn Resonance Med 58(6):1182–1195
Lv J, Yang M, Zhang J, Wang X (2018) Respiratory motion correction for free-breathing 3d abdominal mri using cnn-based image registration: a feasibility study. Br J Radiol 91:20170788
Ma Y, Mao J, Liu X, Dai Z, Zhang H, Zhang X, Li Q (2023) Deep learning-based internal gross target volume definition in 4d ct images of lung cancer patients. Med Phys 50(4):2303–2316
Madesta F, Sentker T, Gauer T, Werner R (2020) Self-contained deep learning-based boosting of 4d cone-beam ct reconstruction. Med Phys 47(11):5619–5631
Madesta F, Sentker T, Gauer T, Werner R (2024) Deep learning-based conditional inpainting for restoration of artifact-affected 4d ct images. Med Phys 51:3437
Manber R, Thielemans K, Hutton BF, Barnes A, Ourselin S, Arridge S, Atkinson D (2015) Practical pet respiratory motion correction in clinical pet/mr. J Nucl Med 56(6):890–896
Momin S, Lei Y, Tian Z, Wang T, Roper J, Kesarwala AH, Yang X (2021) Lung tumor segmentation in 4d ct images using motion convolutional neural networks. Med Phys 48(11):7141–7153
Montoya JC, Zhang C, Li Y, Li K, Chen G-H (2022) Reconstruction of three-dimensional tomographic patient models for radiation dose modulation in ct from two scout views using deep learning. Med Phys 49(2):901–916
Mori S, Hirai R, Sakata Y (2019) Using a deep neural network for four-dimensional ct artifact reduction in image-guided radiotherapy. Physica Med 65:67–75
Munoz C, Ellis S, Nekolla SG, Kunze KP, Vitadello T, Neji R, Prieto C (2021) Mri-guided motion-corrected pet image reconstruction for cardiac pet/mri. J Nucl Med 62(12):1768–1774
Murray V, Siddiq S, Crane C, El Homsi M, Kim T-H, Wu C, Otazo R (2024) Movienet: deep space-time-coil reconstruction network without k-space data consistency for fast motion-resolved 4d mri. Magn Resonance Med 91(2):600–614
Nehmeh S, Erdi Y, Pan T, Pevsner A, Rosenzweig K, Yorke E et al (2004) Four-dimensional (4d) pet/ct imaging of the thorax: 4d pet/ct. Med Phys 31(12):3179–3186
Noid G, Tai A, Chen G-P, Robbins J, Li XA (2017) Reducing radiation dose and enhancing imaging quality of 4dct for radiation therapy using iterative reconstruction algorithms. Adv Radiat Oncol 2(3):515–521
Panta RK, Segars P, Yin F-F, Cai J (2012) Establishing a framework to implement 4d xcat phantom for 4d radiotherapy research. J Cancer Res Ther 8(4):565–570
Park S, Farah R, Shea SM, Tryggestad E, Hales R, Lee J (2018) Simultaneous tumor and surrogate motion tracking with dynamic mri for radiation therapy planning. Phys Med Biol 63(2):025015
Park S, Gach HM, Kim S, Lee SJ, Motai Y (2021) Autoencoder-inspired convolutional network-based super-resolution method in mri. IEEE J Transl Eng Health Med 9:1–13
Ren L, Zhang Y, Yin F-F (2014) A limited-angle intrafraction verification (live) system for radiation therapy. Med Phys 41(2):020701
Rietzel E, Chen GT, Choi NC, Willet CG (2005) Four-dimensional image-based treatment planning: target volume segmentation and dose calculation in the presence of respiratory motion. Int J Radiat Oncol Biol Phys 61(5):1535–1550
Rit S, Wolthaus JW, van Herk M, Sonke J-J (2009) On-the-fly motion-compensated cone-beam ct using an a priori model of the respiratory motion. Med Phys 36:2283–2296
Rusanov B, Hassan GM, Reynolds M, Sabet M, Kendrick J, Rowshanfarzad P, Ebert M (2022) Deep learning methods for enhancing cone-beam ct image quality toward adaptive radiation therapy: A systematic review. Med Phys 49(9):6019–6054
Sarasaen C, Chatterjee S, Breitkopf M, Rose G, Nürnberger A, Speck O (2021) Fine-tuning deep learning model parameters for improved super-resolution of dynamic mri with prior-knowledge. Artif Intell Med 121:102196
Sentker T, Madesta F, Werner R (2018) Gdl-fire: Deep learning-based fast 4d ct image registration. In: International conference on medical image computing and computer-assisted intervention. pp 765–773
Shen L, Zhao W, Xing L (2019) Patient-specific reconstruction of volumetric computed tomography images from a single projection view via deep learning. Nat Biomed Eng 3(11):880–888
Sonke J-J, Zijp L, Remeijer P, Van Herk M (2005) Respiratory correlated cone beam ct. Med Phys 32(4):1176–1186
Stemkens B, Paulson ES, Tijssen RH (2018) Nuts and bolts of 4d-mri for radiotherapy. Phys Med Biol 63(21):21TR01
Sun L, Jiang Z, Chang Y, Ren L (2021) Building a patient-specific model using transfer learning for four-dimensional cone beam computed tomography augmentation. Quant Imaging Med Surg 11(2):540
Teng X, Chen Y, Zhang Y, Ren L (2021) Respiratory deformation registration in 4d-ct/cone beam ct using deep learning. Quant Imaging Med Surg 11(2):737
Terpstra ML, Maspero M, Bruijnen T, Verhoeff JJ, Lagendijk JJ, van den Berg CA (2021) Real-time 3d motion estimation from undersampled mri using multi-resolution neural networks. Med Phys 48(11):6597–6613
Terpstra ML, Maspero M, Verhoeff JJ, van den Berg CA (2023) Accelerated respiratory-resolved 4d-mri with separable spatio-temporal neural networks. Med Phys 50(9):5331–5342
Usui K, Ogawa K, Goto M, Sakano Y, Kyogoku S, Daida H (2022) Image quality improvement for chest four-dimensional cone-beam computed tomography by cycle-generative adversarial network. Med Imaging Technol 40(2):37–47
Usui K, Ogawa K, Goto M, Sakano Y, Kyougoku S, Daida H (2022) A cycle generative adversarial network for improving the quality of four-dimensional cone-beam computed tomography images. Radiat Oncol 17(1):69
Vandemeulebroucke J, Sarrut D, Clarysse P, et al (2007) The popi-model, a point-validated pixel-based breathing thorax model. In: XVth International conference on the use of computers in radiation therapy (iccr). 2, pp 195–199
Vergalasova I, Cai J (2020) A modern review of the uncertainties in volumetric imaging of respiratory-induced target motion in lung radiotherapy. Med Phys 47(10):e988–e1008
Wang J, Gu X (2013) Simultaneous motion estimation and image reconstruction (smeir) for 4d cone-beam ct. Med Phys 40(10):101912
Wang Z, She H, Zhang Y, Du YP (2023) Parallel non-cartesian spatial-temporal dictionary learning neural networks (stdlnn) for accelerating 4d-mri. Med Image Anal 84:102701
Wei D, Yang W, Paysan P, Liu H (2021) An unsupervised learning based deformable registration network for 4d-ct images. In: Computational biomechanics for medicine: solid and fluid mechanics informing therapy. pp 63–72
Werner R, Hofmann C, Mücke E, Gauer T (2017) Reduction of breathing irregularity-related motion artifacts in low-pitch spiral 4d ct by optimized projection binning. Radiat Oncol 12:1–8
Weykamp F, Hoegen P, Regnery S, Katsigiannopulos E, Renkamp CK, Lang K et al (2023) Long-term clinical results of mr-guided stereotactic body radiotherapy of liver metastases. Cancers 15(10):2786
Xiao H, Chen K, You T, Liu D, Zhang W, Xue X, Dang J (2023) Real time 4d-cone beam ct accurate estimation based on single-angle projection via dual attention mechanism residual network. IEEE Trans Radiat Plasma Med Sci 7:618
Xiao H, Han X, Zhi S, Wong Y-L, Liu C, Li W (2023) Ultra-fast multi-parametric 4d-mri image reconstruction for real-time applications using a downsampling-invariant deformable registration (d2r) model. Radiother Oncol 189:109948
Xiao H, Ni R, Zhi S, Li W, Liu C, Ren G et al (2022) A dual-supervised deformation estimation model (ddem) for constructing ultra-quality 4d-mri based on a commercial low-quality 4d-mri for liver cancer radiation therapy. Med Phys 49(5):3159–3170
Xu L, Jiang P, Tsui T, Liu J, Zhang X, Yu L, Niu T (2023) 4d-ct deformable image registration using unsupervised recursive cascaded full-resolution residual networks. Bioeng Transl Med 8(6):e10587
Yang J, Cai J, Wang H, Chang Z, Czito BG, Bashir MR, Yin F-F (2014) Is diaphragm motion a good surrogate for liver tumor motion? Int J Radiat Oncol Biol Phys 90(4):952–958
Yang J, Sharp G, Veeraraghavan H, Van Elmpt W, Dekker A, Lustberg T, Gooding M (2017) Data from lung ct segmentation challenge 2017 (lctsc)
Yang J, Yang J, Zhao F, Zhang W (2021) An unsupervised multi-scale framework with attention-based network (manet) for lung 4d-ct registration. Phys Med Biol 66(13):135008
Yang Z, Yang X, Cao Y, Shao Q, Tang D, Peng Z, Li S (2024) Deep learning based automatic internal gross target volume delineation from 4d-ct of hepatocellular carcinoma patients. J Appl Clin Med Phys 25(1):e14211
Yuan J, Wong OL, Zhou Y, Chueng KY, Yu SK (2019) A fast volumetric 4d-mri with sub-second frame rate for abdominal motion monitoring and characterization in mri-guided radiotherapy. Quant Imaging Med Surg 9(7):1303
Zhang W, Oraiqat I, Litzenberg D, Chang K-W, Hadley S, Sunbul NB (2023) Real-time, volumetric imaging of radiation dose delivery deep into the liver during cancer treatment. Nat Biotechnol 41(8):1160–1167
Zhang Y, Deng X, Yin F-F, Ren L (2018) Image acquisition optimization of a limited-angle intrafraction verification (live) system for lung radiotherapy. Med Phys 45(1):340–351
Zhang Y, Huang X, Wang J (2019) Advanced 4-dimensional cone-beam computed tomography reconstruction by combining motion estimation, motion-compensated reconstruction, biomechanical modeling and deep learning. Vis Comput Ind, Biomed, Art 2(1):23
Zhang Y, Jiang Z, Zhang Y, Ren L (2024) A review on 4d cone-beam ct (4d-cbct) in radiation therapy: Technical advances and clinical applications. Med Phys 51(8):5164–5180
Zhang Y, She H, Du YP (2021) Dynamic mri of the abdomen using parallel non-cartesian convolutional recurrent neural networks. Magn Reson Med 86(2):964–973
Zhang Y, Wu X, Gach HM, Li H, Yang D (2021) Groupregnet: a groupwise one-shot deep learning-based 4d image registration method. Phys Med Biol 66(4):045030
Zhang Z, Huang M, Jiang Z, Chang Y, Lu K, Yin F-F, Ren L (2022) Patient-specific deep learning model to enhance 4d-cbct image for radiomics analysis. Phys Med Biol 67(8):085003
Zhang Z, Huang M, Jiang Z, Chang Y, Torok J, Yin F-F, Ren L (2021) 4d radiomics: impact of 4d-cbct image quality on radiomic analysis. Phys Med Biol 66(4):045023
Zhang Z, Liu J, Yang D, Kamilov US, Hugo GD (2023) Deep learning-based motion compensation for four-dimensional cone-beam computed tomography (4d-cbct) reconstruction. Med Phys 50(2):808–820
Zhi S, Kachelrieß M, Pan F, Mou X (2021) Cycn-net: a convolutional neural network specialized for 4d cbct images refinement. IEEE Trans Med Imaging 40(11):3054–3064
Zhi S, Wang Y, Xiao H, Bai T, Li B, Tang Y (2023) Coarse-super-resolution-fine network (cosf-net): a unified end-to-end neural network for 4d-mri with simultaneous motion estimation and super-resolution. IEEE Trans Med Imaging 43:162
Zhou B, Tsai Y-J, Chen X, Duncan JS, Liu C (2021) Mdpet: a unified motion correction and denoising adversarial network for low-dose gated pet. IEEE Trans Med Imaging 40(11):3154–3164
Zhou B, Tsai Y-J, Liu C (2020) Simultaneous denoising and motion estimation for low-dose gated pet using a siamese adversarial network with gate-to-gate consistency learning. In: Medical image computing and computer assisted intervention–miccai 2020: 23rd international conference, Lima, Peru, October 4–8, 2020, proceedings, part vii 23. pp 743–752
Zhou D, Nakamura M, Mukumoto N, Yoshimura M, Mizowaki T (2022) Development of a deep learning-based patient-specific target contour prediction model for markerless tumor positioning. Med Phys 49(3):1382–1390
Acknowledgements
This research was partly supported by research grants of Project of Strategic Importance Fund (P0035421), Projects of RISA (P0043001) and Projects of RI-IWEAR (P0038684) from The Hong Kong Polytechnic University, General Research Fund (15103520, 15104323, 15104822), Innovation and Technology Support Programme (ITS/049/22FP), Health and Medical Research Fund (07183266, 09200576, 10211606), the Health Bureau, The Government of the Hong Kong Special Administrative Region, and the National Natural Science Foundation of China Young Scientist Fund (82202941).
Author information
Contributions
W. Y. conducted the review of all included papers, wrote the manuscript, and prepared figures and tables. X. H., W. L., and L. W. contributed to the revision of the content and structure of section 4.2. W. J., Q. J., and R. G. revised the content and structure of section 4.1 and provided overall polishing of the manuscript. J. Z. and Z. S. revised the content and structure of section 4.3. D. J. and M. K. worked on the structural revision of section 4.3. R. L., Y. X., L. T., and C. J analyzed current studies, offered future research recommendations, and polished the entire manuscript with figures and tables.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yinghui, W., Haonan, X., Jing, W. et al. Artificial intelligence in four-dimensional imaging for motion management in radiation therapy. Artif Intell Rev 58, 103 (2025). https://doi.org/10.1007/s10462-025-11109-w