Abstract
Forecasting tumor progression and quantifying the uncertainty of such forecasts play a crucial role in clinical settings, especially for determining disease outlook and making informed treatment decisions. In this work, we propose TGM-ONets, a physics-informed deep operator network (PI-DeepONet)-based computational framework that combines bioimaging and tumor growth modeling (TGM) for enhanced prediction of tumor growth. Deep neural operators have recently emerged as a powerful tool for learning solution maps between function spaces, and once trained, they generalize to unseen input instances. Incorporating physical laws into the loss function of a deep neural operator can significantly reduce the amount of training data required. The design of TGM-ONets is novel in its use of a convolutional block attention module (CBAM) and a gating mechanism, i.e., a mixture of experts (MoE), to extract features from the input images. Our results show that TGM-ONets not only captures the detailed morphological characteristics of mild and aggressive tumors within and outside the training domain but also predicts the long-term dynamics of both mild and aggressive tumor growth for up to 6 months, with a maximum error below 6.7 \(\times 10^{-2}\) for unseen input instances when two or three snapshots are added. We also systematically study the effects of the number of training snapshots and of noisy data on the performance of TGM-ONets, and we quantify the uncertainty of the model predictions. We demonstrate the efficiency and accuracy of TGM-ONets by comparing its performance with three state-of-the-art (SOTA) baseline models.
In summary, we propose a new deep learning model capable of integrating TGM and sequential observations of tumor morphology to improve current approaches for predicting tumor growth, thus providing an advanced computational tool for patient-specific tumor prognosis.































Data availability statement
The data supporting this study's findings are available from the corresponding author upon reasonable request.
References
Lorenzo G, Heiselman J S, Liss M A, Miga M I, Gomez H, Yankeelov T E, Reali A, Hughes T J. Patient-specific computational forecasting of prostate cancer growth during active surveillance using an imaging-informed biomechanistic model, arXiv preprint arXiv:2310.00060
Xu J, Wang Y, Gomez H, Feng X-Q. Biomechanical modelling of tumor growth with chemotherapeutic treatment: A review, Smart Materials and Structures https://doi.org/10.1088/1361-665X/acf79a
Lorenzo G, Ahmed S R, Hormuth II D A, Vaughn B, Kalpathy-Cramer J, Solorio L, Yankeelov T E, Gomez H. Patient-specific, mechanistic models of tumor growth incorporating artificial intelligence and big data, arXiv preprint arXiv:2308.14925
Yankeelov TE, Atuegwu N, Hormuth D, Weis JA, Barnes SL, Miga MI, Rericha EC, Quaranta V (2013) Clinically relevant modeling of tumor growth and treatment response. Science Translational Medicine 5(187):187ps9-187ps9. https://doi.org/10.1126/scitranslmed.3005686
Lorenzo G, Scott MA, Tew K, Hughes TJ, Zhang YJ, Liu L, Vilanova G, Gomez H (2016) Tissue-scale, personalized modeling and simulation of prostate cancer growth. Proc Natl Acad Sci 113(48):E7663–E7671. https://doi.org/10.1073/pnas.1615791113
Lorenzo G, Scott M, Tew K, Hughes T, Gomez H (2017) Hierarchically refined and coarsened splines for moving interface problems, with particular application to phase-field models of prostate tumor growth. Comput Methods Appl Mech Eng 319:515–548. https://doi.org/10.1016/j.cma.2017.03.009
Lorenzo G, Hughes TJ, Dominguez-Frojan P, Reali A, Gomez H (2019) Computer simulations suggest that prostate enlargement due to benign prostatic hyperplasia mechanically impedes prostate cancer growth. Proc Natl Acad Sci 116(4):1152–1161. https://doi.org/10.1073/pnas.1815735116
Colli P, Gomez H, Lorenzo G, Marinoschi G, Reali A, Rocca E (2020) Mathematical analysis and simulation study of a phase-field model of prostate cancer growth with chemotherapy and antiangiogenic therapy effects. Math Models Methods Appl Sci 30(07):1253–1295. https://doi.org/10.1142/S0218202520500220
Benítez JM, García-Mozos L, Santos A, Montáns FJ, Saucedo-Mora L (2022) A simple agent-based model to simulate 3D tumor-induced angiogenesis considering the evolution of the hypoxic conditions of the cells. Engineering with Computers 38(5):4115–4133. https://doi.org/10.1007/s00366-022-01625-6
Feng Y, Fuentes D, Hawkins A, Bass J, Rylander MN, Elliott A, Shetty A, Stafford RJ, Oden JT (2009) Nanoshell-mediated laser surgery simulation for prostate cancer treatment. Engineering with Computers 25:3–13. https://doi.org/10.1007/s00366-008-0109-y
Srinivasan A, Moure A, Gomez H (2023) Computational modeling of flow-mediated angiogenesis: Stokes–Darcy flow on a growing vessel network, Engineering with Computers 1–19 https://doi.org/10.1007/s00366-023-01889-6
Lagergren JH, Nardini JT, Baker RE, Simpson MJ, Flores KB (2020) Biologically-informed neural networks guide mechanistic modeling from sparse experimental data. PLoS Comput Biol 16(12):e1008462. https://doi.org/10.1371/journal.pcbi.1008462
Oden JT, Lima EA, Almeida RC, Feng Y, Rylander MN, Fuentes D, Faghihi D, Rahman MM, DeWitt M, Gadde M et al (2016) Toward predictive multiscale modeling of vascular tumor growth. Archives of Computational Methods in Engineering 23(4):735–779. https://doi.org/10.1007/s11831-015-9156-x
Fritz M, Jha PK, Köppl T, Oden JT, Wagner A, Wohlmuth B (2021) Modeling and simulation of vascular tumors embedded in evolving capillary networks. Comput Methods Appl Mech Eng 384:113975. https://doi.org/10.1016/j.cma.2021.113975
Wise SM, Lowengrub JS, Frieboes HB, Cristini V (2008) Three-dimensional multispecies nonlinear tumor growth-I: model and numerical method. J Theor Biol 253(3):524–543. https://doi.org/10.1016/j.jtbi.2008.03.027
Frieboes HB, Jin F, Chuang Y-L, Wise SM, Lowengrub JS, Cristini V (2010) Three-dimensional multispecies nonlinear tumor growth-II: tumor invasion and angiogenesis. J Theor Biol 264(4):1254–1278. https://doi.org/10.1016/j.jtbi.2010.02.036
Macklin P, McDougall S, Anderson AR, Chaplain MA, Cristini V, Lowengrub J (2009) Multiscale modelling and nonlinear simulation of vascular tumour growth. J Math Biol 58(4):765–798. https://doi.org/10.1007/s00285-008-0216-9
Anderson AR, Quaranta V (2008) Integrative mathematical oncology. Nat Rev Cancer 8(3):227–234. https://doi.org/10.1038/nrc2329
Cristini V, Lowengrub J (2010) Multiscale modeling of cancer: An integrated experimental and mathematical modeling approach. Cambridge University Press, Cambridge
Oden JT (2018) Adaptive multiscale predictive modelling. Acta Numer 27:353–450. https://doi.org/10.1017/S096249291800003X
Rahman MM, Feng Y, Yankeelov TE, Oden JT (2017) A fully coupled space-time multiscale modeling framework for predicting tumor growth. Comput Methods Appl Mech Eng 320:261–286. https://doi.org/10.1016/j.cma.2017.03.021
Rocha H, Almeida R, Lima E, Resende A, Oden J, Yankeelov T (2018) A hybrid three-scale model of tumor growth. Math Models Methods Appl Sci 28(01):61–93. https://doi.org/10.1142/S0218202518500021
Lima E, Oden J, Almeida R (2014) A hybrid ten-species phase-field model of tumor growth. Math Models Methods Appl Sci 24(13):2569–2599. https://doi.org/10.1142/S0218202514500304
Shen D, Wu G, Suk H-I (2017) Deep learning in medical image analysis. Annu Rev Biomed Eng 19:221–248. https://doi.org/10.1146/annurev-bioeng-071516-044442
Haque IRI, Neubert J (2020) Deep learning approaches to biomedical image segmentation. Informatics in Medicine Unlocked 18:100297. https://doi.org/10.1016/j.imu.2020.100297
Zhang Q, Sampani K, Xu M, Cai S, Deng Y, Li H, Sun JK, Karniadakis GE (2022) AOSLO-net: a deep learning-based method for automatic segmentation of retinal microaneurysms from adaptive optics scanning laser ophthalmoscopy images. Translational Vision Science & Technology 11(8):7–7. https://doi.org/10.1167/tvst.11.8.7
Pereira SP, Oldfield L, Ney A, Hart PA, Keane MG, Pandol SJ, Li D, Greenhalf W, Jeon CY, Koay EJ et al (2020) Early detection of pancreatic cancer. The Lancet Gastroenterology & Hepatology 5(7):698–710. https://doi.org/10.1016/S2468-1253(19)30416-9
Giampaolo F, De Rosa M, Qi P, Izzo S, Cuomo S (2022) Physics-informed neural networks approach for 1D and 2D Gray-Scott systems. Advanced Modeling and Simulation in Engineering Sciences 9(1):1–17. https://doi.org/10.1186/s40323-022-00219-7
Weng Y, Zhou D (2022) Multiscale physics-informed neural networks for stiff chemical kinetics. J Phys Chem A 126(45):8534–8543. https://doi.org/10.1021/acs.jpca.2c06513
Colin T, Iollo A, Lagaert J-B, Saut O (2014) An inverse problem for the recovery of the vascularization of a tumor. Journal of Inverse and Ill-posed Problems 22(6):759–786. https://doi.org/10.1515/jip-2013-0009
Feng X, Hormuth DA, Yankeelov TE (2019) An adjoint-based method for a linear mechanically-coupled tumor model: Application to estimate the spatial variation of murine glioma growth based on diffusion weighted magnetic resonance imaging. Comput Mech 63:159–180. https://doi.org/10.1007/s00466-018-1589-2
Gholami A, Mang A, Biros G (2016) An inverse problem formulation for parameter estimation of a reaction-diffusion model of low grade gliomas. J Math Biol 72(1):409–433. https://doi.org/10.1007/s00285-015-0888-x
Hogea C, Davatzikos C, Biros G (2008) An image-driven parameter estimation problem for a reaction-diffusion glioma growth model with mass effects. J Math Biol 56(6):793–825. https://doi.org/10.1007/s00285-007-0139-x
Knopoff DA, Fernández DR, Torres GA, Turner CV (2013) Adjoint method for a tumor growth pde-constrained optimization problem. Computers & Mathematics with Applications 66(6):1104–1119. https://doi.org/10.1016/j.camwa.2013.05.028
Subramanian S, Scheufele K, Mehl M, Biros G (2020) Where did the tumor start? An inverse solver with sparse localization for tumor growth models. Inverse Prob 36(4):045006. https://doi.org/10.1088/1361-6420/ab649c
Chen X, Summers RM, Yao J (2012) Kidney tumor growth prediction by coupling reaction-diffusion and biomechanical model. IEEE Trans Biomed Eng 60(1):169–173
Konukoglu E, Clatz O, Menze BH, Stieltjes B, Weber M-A, Mandonnet E, Delingette H, Ayache N (2009) Image guided personalization of reaction-diffusion type tumor growth models using modified anisotropic eikonal equations. IEEE Trans Med Imaging 29(1):77–95
Mi H, Petitjean C, Dubray B, Vera P, Ruan S (2014) Prediction of lung tumor evolution during radiotherapy in individual patients with PET. IEEE Trans Med Imaging 33(4):995–1003
Wong KC, Summers RM, Kebebew E, Yao J (2016) Pancreatic tumor growth prediction with elastic-growth decomposition, image-derived motion, and FDM-FEM coupling. IEEE Trans Med Imaging 36(1):111–123
Hormuth DA II, Weis JA, Barnes SL, Miga MI, Rericha EC, Quaranta V, Yankeelov TE (2015) Predicting in vivo glioma growth with the reaction diffusion equation constrained by quantitative magnetic resonance imaging data. Phys Biol 12(4):046006. https://doi.org/10.1088/1478-3975/12/4/046006
Scheufele K, Mang A, Gholami A, Davatzikos C, Biros G, Mehl M (2019) Coupling brain-tumor biophysical models and diffeomorphic image registration. Comput Methods Appl Mech Eng 347:533–567. https://doi.org/10.1016/j.cma.2018.12.008
Raissi M (2018) Deep hidden physics models: Deep learning of nonlinear partial differential equations. The Journal of Machine Learning Research 19(1):932–955
Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707. https://doi.org/10.1016/j.jcp.2018.10.045
Li S, Wang G, Di Y, Wang L, Wang H, Zhou Q (2023) A physics-informed neural network framework to predict 3D temperature field without labeled data in process of laser metal deposition. Eng Appl Artif Intell 120:105908. https://doi.org/10.1016/j.engappai.2023.105908
Cai S, Li H, Zheng F, Kong F, Dao M, Karniadakis GE, Suresh S (2021) Artificial intelligence velocimetry and microaneurysm-on-a-chip for three-dimensional analysis of blood flow in physiology and disease. Proc Natl Acad Sci 118(13):e2100697118. https://doi.org/10.1073/pnas.2100697118
Kissas G, Yang Y, Hwuang E, Witschey WR, Detre JA, Perdikaris P (2020) Machine learning in cardiovascular flows modeling: Predicting arterial blood pressure from non-invasive 4D flow MRI data using physics-informed neural networks. Comput Methods Appl Mech Eng 358:112623. https://doi.org/10.1016/j.cma.2019.112623
Sahli Costabal F, Yang Y, Perdikaris P, Hurtado DE, Kuhl E (2020) Physics-informed neural networks for cardiac activation mapping. Frontiers in Physics 8:42. https://doi.org/10.3389/fphy.2020.00042
Lei J, Liu Q, Wang X (2022) Physics-informed multi-fidelity learning-driven imaging method for electrical capacitance tomography. Eng Appl Artif Intell 116:105467. https://doi.org/10.1016/j.engappai.2022.105467
Ouyang H, Zhu Z, Chen K, Tian B, Huang B, Hao J (2023) Reconstruction of hydrofoil cavitation flow based on the chain-style physics-informed neural network. Eng Appl Artif Intell 119:105724. https://doi.org/10.1016/j.engappai.2022.105724
Nguyen TNK, Dairay T, Meunier R, Mougeot M (2022) Physics-informed neural networks for non-Newtonian fluid thermo-mechanical problems: An application to rubber calendering process. Eng Appl Artif Intell 114:105176. https://doi.org/10.1016/j.engappai.2022.105176
Ren P, Rao C, Sun H, Liu Y. SeismicNet: Physics-informed neural networks for seismic wave modeling in semi-infinite domain, arXiv preprint arXiv:2210.14044
Lorenzo G, Hormuth DA II, Jarrett AM, Lima EA, Subramanian S, Biros G, Oden JT, Hughes TJ, Yankeelov TE (2022) Quantitative in vivo imaging to enable tumour forecasting and treatment optimization. In: Cancer, Complexity, Computation. Springer, New York, pp 55–97
Zhang E, Dao M, Karniadakis GE, Suresh S (2022) Analyses of internal structures and defects in materials using physics-informed neural networks. Sci Adv 8(7):eabk0644. https://doi.org/10.1126/sciadv.abk0644
Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L (2021) Physics-informed machine learning. Nature Reviews Physics 3(6):422–440. https://doi.org/10.1038/s42254-021-00314-5
Cai S, Mao Z, Wang Z, Yin M, Karniadakis G E (2022) Physics-informed neural networks (PINNs) for fluid mechanics: A review, Acta Mechanica Sinica 1–12 https://doi.org/10.1007/s10409-021-01148-1
Jagtap AD, Kharazmi E, Karniadakis GE (2020) Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Comput Methods Appl Mech Eng 365:113028. https://doi.org/10.1016/j.cma.2020.113028
Yang L, Meng X, Karniadakis GE (2021) B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data. J Comput Phys 425:109913. https://doi.org/10.1016/j.jcp.2020.109913
Du P, Zhu X, Wang J-X (2022) Deep learning-based surrogate model for three-dimensional patient-specific computational fluid dynamics. Phys Fluids 34(8):081906. https://doi.org/10.1063/5.0101128
Chen Q, Ye Q, Zhang W, Li H, Zheng X (2023) TGM-Nets: A deep learning framework for enhanced forecasting of tumor growth by integrating imaging and modeling. Eng Appl Artif Intell 126:106867. https://doi.org/10.1016/j.engappai.2023.106867
Ruiz Herrera C, Grandits T, Plank G, Perdikaris P, Sahli Costabal F, Pezzuto S (2022) Physics-informed neural networks to learn cardiac fiber orientation from multiple electroanatomical maps, Engineering with Computers 38(5), 3957–3973. https://doi.org/10.1007/s00366-022-01709-3
Tajdari M, Tajdari F, Shirzadian P, Pawar A, Wardak M, Saha S, Park C, Huysmans T, Song Y, Zhang YJ et al (2022) Next-generation prognosis framework for pediatric spinal deformities using bio-informed deep learning networks. Engineering with Computers 38(5):4061–4084. https://doi.org/10.1007/s00366-022-01742-2
Lee SY, Park C-S, Park K, Lee HJ, Lee S (2023) A physics-informed and data-driven deep learning approach for wave propagation and its scattering characteristics. Engineering with Computers 39(4):2609–2625. https://doi.org/10.1007/s00366-022-01640-7
Fallah A, Aghdam M M (2023) Physics-informed neural network for bending and free vibration analysis of three-dimensional functionally graded porous beam resting on elastic foundation, Engineering with Computers 1–18 https://doi.org/10.1007/s00366-023-01799-7
Mai H T, Mai D D, Kang J, Lee J, Lee J (2023) Physics-informed neural energy-force network: a unified solver-free numerical simulation for structural optimization, Engineering with Computers 1–24 https://doi.org/10.1007/s00366-022-01760-0
Wang S, Wang H, Perdikaris P (2021) Learning the solution operator of parametric partial differential equations with physics-informed DeepONets. Sci Adv 7(40):eabi8605. https://doi.org/10.1126/sciadv.abi8605
Koric S, Viswantah A, Abueidda D W, Sobh N A, Khan K (2023) Deep learning operator network for plastic deformation with variable loads and material properties, Engineering with Computers 1–13 https://doi.org/10.1007/s00366-023-01822-x
Linka K, Schäfer A, Meng X, Zou Z, Karniadakis GE, Kuhl E (2022) Bayesian physics informed neural networks for real-world nonlinear dynamical systems. Comput Methods Appl Mech Eng 402:115346. https://doi.org/10.1016/j.cma.2022.115346
Zakir Ullah M, Zheng Y, Song J, Aslam S, Xu C, Kiazolu GD, Wang L (2021) An attention-based convolutional neural network for acute lymphoblastic leukemia classification. Appl Sci 11(22):10662. https://doi.org/10.3390/app112210662
Yin W, Schütze H, Xiang B, Zhou B (2016) Abcnn: Attention-based convolutional neural network for modeling sentence pairs. Transactions of the Association for computational linguistics 4:259–272. https://doi.org/10.1162/tacl_a_00097
Ling H, Wu J, Huang J, Chen J, Li P (2020) Attention-based convolutional neural network for deep face recognition. Multimedia Tools and Applications 79:5595–5616. https://doi.org/10.1007/s11042-019-08422-2
Shen Y, Huang X-J (2016) Attention-based convolutional neural network for semantic relation extraction, in: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2526–2536
Jacobs RA, Jordan MI, Nowlan SJ, Hinton GE (1991) Adaptive mixtures of local experts. Neural Comput 3(1):79–87
Wang S, Perdikaris P (2023) Long-time integration of parametric evolution equations with physics-informed deeponets. J Comput Phys 475:111855. https://doi.org/10.1016/j.jcp.2022.111855
Michałowska K, Goswami S, Karniadakis G E, Riemer-Sørensen S. Neural operator learning for long-time integration in dynamical systems with recurrent neural networks, arXiv preprint arXiv:2303.02243
Zhu M, Zhang H, Jiao A, Karniadakis GE, Lu L (2023) Reliable extrapolation of deep neural operators informed by physics or sparse observations. Comput Methods Appl Mech Eng 412:116064. https://doi.org/10.1016/j.cma.2023.116064
Osband I, Aslanides J, Cassirer A. Randomized prior functions for deep reinforcement learning. Advances in Neural Information Processing Systems 31
Xu J, Vilanova G, Gomez H (2016) A mathematical model coupling tumor growth and angiogenesis. PLoS ONE 11(2):e0149422. https://doi.org/10.1371/journal.pone.0149422
Xu S, Xu Z, Kim OV, Litvinov RI, Weisel JW, Alber M (2017) Model predictions of deformation, embolization and permeability of partially obstructive blood clots under variable shear flow. J R Soc Interface 14(136):20170441. https://doi.org/10.1098/rsif.2017.0441
Xu J, Vilanova G, Gomez H (2020) Phase-field model of vascular tumor growth: Three-dimensional geometry of the vascular network and integration with imaging data. Comput Methods Appl Mech Eng 359:112648. https://doi.org/10.1016/j.cma.2019.112648
Kobayashi R (2010) A brief introduction to phase field method, in: AIP Conference Proceedings, Vol. 1270, American Institute of Physics, 282–291. https://doi.org/10.1063/1.3476232
Lu L, Jin P, Pang G, Zhang Z, Karniadakis GE (2021) Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence 3(3):218–229
Chen T, Chen H (1995) Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans Neural Networks 6(4):911–917
Deng B, Shin Y, Lu L, Zhang Z, Karniadakis GE (2022) Approximation rates of DeepONets for learning operators arising from advection-diffusion equations. Neural Netw 153:411–426. https://doi.org/10.1016/j.neunet.2022.06.019
Lu L, Jin P, Karniadakis GE. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators, arXiv preprint arXiv:1910.03193
Lu L, Meng X, Cai S, Mao Z, Goswami S, Zhang Z, Karniadakis GE (2022) A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Comput Methods Appl Mech Eng 393:114778. https://doi.org/10.1016/j.cma.2022.114778
He J, Kushwaha S, Park J, Koric S, Abueidda D, Jasiuk I (2024) Sequential Deep Operator networks (S-DeepONet) for predicting full-field solutions under time-dependent loads. Eng Appl Artif Intell 127:107258. https://doi.org/10.1016/j.engappai.2023.107258
Sun Y, Moya C, Lin G, Yue M. DeepGraphONet: A deep graph operator network to learn and zero-shot transfer the dynamic response of networked systems. IEEE Systems Journal
Goswami S, Yin M, Yu Y, Karniadakis GE (2022) A physics-informed variational deeponet for predicting crack path in quasi-brittle materials. Comput Methods Appl Mech Eng 391:114587. https://doi.org/10.1016/j.cma.2022.114587
Goswami S, Bora A, Yu Y, Karniadakis GE (2023) Physics-informed deep neural operator networks. In: Machine Learning in Modeling and Simulation: Methods and Applications. Springer, New York, pp 219–254
Koric S, Abueidda DW (2023) Data-driven and physics-informed deep learning operators for solution of heat conduction equation with parametric heat source. Int J Heat Mass Transf 203:123809. https://doi.org/10.1016/j.ijheatmasstransfer.2022.123809
Hao Y, Di Leoni PC, Marxen O, Meneveau C, Karniadakis GE, Zaki TA (2023) Instability-wave prediction in hypersonic boundary layers with physics-informed neural operators. Journal of Computational Science 73:102120. https://doi.org/10.1016/j.jocs.2023.102120
Iqbal S, Ghani MU, Saba T, Rehman A (2018) Brain tumor segmentation in multi-spectral MRI using convolutional neural networks (CNN). Microsc Res Tech 81(4):419–427. https://doi.org/10.1002/jemt.22994
Chen L, Wu Y, DSouza A M, Abidin A Z, Wismüller A, Xu C (2018) MRI tumor segmentation with densely connected 3D CNN, in: Medical Imaging 2018: Image Processing, Vol. 10574, SPIE, pp. 357–364. https://doi.org/10.1117/12.2293394
Pereira S, Pinto A, Alves V, Silva CA (2016) Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans Med Imaging 35(5):1240–1251
Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin P-M, Larochelle H (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31. https://doi.org/10.1016/j.media.2016.05.004
Havaei M, Dutil F, Pal C, Larochelle H, Jodoin P-M (2016) A convolutional neural network approach to brain tumor segmentation, in: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: First International Workshop, Brainles 2015, Held in Conjunction with MICCAI 2015, Munich, Germany, October 5, 2015, Revised Selected Papers 1, Springer, pp. 195–208. https://doi.org/10.1007/978-3-319-30858-6_17
Woo S, Park J, Lee J-Y, Kweon I S (2018) CBAM: Convolutional block attention module, in: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift, in: International Conference on Machine Learning, PMLR, pp. 448–456
Zhou Y, Li D, Huo S, Kung S-Y (2021) Shape autotuning activation function. Expert Syst Appl 171:114534. https://doi.org/10.1016/j.eswa.2020.114534
Wang S, Wang H, Perdikaris P (2022) Improved architectures and training algorithms for deep operator networks. J Sci Comput 92(2):35. https://doi.org/10.1007/s10915-022-01881-0
Waterhouse S, Cook G. Ensemble methods for phoneme classification. Advances in Neural Information Processing Systems 9
Nguyen MH, Abbass HA, Mckay RI (2006) A novel mixture of experts model based on cooperative coevolution. Neurocomputing 70(1–3):155–163. https://doi.org/10.1016/j.neucom.2006.04.009
Ebrahimpour R, Kabir E, Yousefi MR (2007) Face detection using mixture of MLP experts. Neural Process Lett 26:69–82. https://doi.org/10.1007/s11063-007-9043-z
Übeyli ED, Ilbay K, Ilbay G, Sahin D, Akansel G (2010) Differentiation of two subtypes of adult hydrocephalus by mixture of experts. J Med Syst 34:281–290. https://doi.org/10.1007/s10916-008-9239-4
Ebrahimpour R, Nikoo H, Masoudnia S, Yousefi MR, Ghaemi MS (2011) Mixture of MLP-experts for trend forecasting of time series: A case study of the tehran stock exchange. Int J Forecast 27(3):804–816. https://doi.org/10.1016/j.ijforecast.2010.02.015
Kingma DP, Ba J. Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980
Raissi M, Yazdani A, Karniadakis GE (2020) Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 367(6481):1026–1030. https://doi.org/10.1126/science.aaw4741
Yin M, Zheng X, Humphrey JD, Karniadakis GE (2021) Non-invasive inference of thrombus material properties with physics-informed neural networks. Comput Methods Appl Mech Eng 375:113603. https://doi.org/10.1016/j.cma.2020.113603
Shi X, Chen Z, Wang H, Yeung D-Y, Wong W-K, Woo W-c. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Advances in Neural Information Processing Systems 28
Kirby RM, Karniadakis GE. Spectral element and hp methods. Encyclopedia of Computational Mechanics
Hanahan D, Weinberg RA (2011) Hallmarks of cancer: the next generation. Cell 144(5):646–674. https://doi.org/10.1016/j.cell.2011.02.013
Lu L, Dao M, Kumar P, Ramamurty U, Karniadakis GE, Suresh S (2020) Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci 117(13):7052–7062. https://doi.org/10.1073/pnas.1922210117
Sanga S, Sinek JP, Frieboes HB, Ferrari M, Fruehauf JP, Cristini V (2006) Mathematical modeling of cancer progression and response to chemotherapy. Expert Rev Anticancer Ther 6(10):1361–1376. https://doi.org/10.1586/14737140.6.10.1361
Ayensa-Jiménez J, Doweidar MH, Sanz-Herrera JA, Doblare M (2022) Understanding glioblastoma invasion using physically-guided neural networks with internal variables. PLoS Comput Biol 18(4):e1010019. https://doi.org/10.1371/journal.pcbi.1010019
Gao Q, Lin H, Qian J, Liu X, Cai S, Li H, Fan H, Zheng Z (2023) A deep learning model for efficient end-to-end stratification of thrombotic risk in left atrial appendage. Eng Appl Artif Intell 126:107187. https://doi.org/10.1016/j.engappai.2023.107187
Qi CR, Su H, Mo K, Guibas LJ (2017) PointNet: Deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 652–660
Garcia-Garcia A, Gomez-Donoso F, Garcia-Rodriguez J, Orts-Escolano S, Cazorla M, Azorin-Lopez J (2016) PointNet: A 3D convolutional neural network for real-time object class recognition. In: 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, pp 1578–1584
Aoki Y, Goforth H, Srivatsan RA, Lucey S (2019) PointNetLK: Robust & efficient point cloud registration using PointNet. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7163–7172
Acknowledgements
Q.C. and X.Z. gratefully acknowledge the support from the starting fund of Jinan University, Guangzhou, Guangdong Province, China.
Author information
Authors and Affiliations
Contributions
Qijing Chen: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Writing-original draft (equal), Writing-review & editing (equal). He Li: Conceptualization (equal); Writing-original draft (equal); Writing-review & editing (equal). Xiaoning Zheng: Conceptualization (equal); Funding acquisition (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Supervision (equal); Writing-original draft (equal); Writing-review & editing (equal).
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
A Convergence of spectral/hp element (Nektar) results for TGMs
We give a brief introduction to the spectral/hp element method, which we use to solve the PDEs for tumor growth and generate the synthetic data. More details about the spectral/hp element method can be found in [110]. We first define the weak form of the PDE and impose the boundary conditions. Then we discretize the computational domain into subdomains. Below we use the one-dimensional Poisson equation on the interval \(0 < x \le 1\) for illustration: \(\Delta u + f = 0\), with \(u(x = 0) = g_D = 1\) and \(\frac{\partial u}{\partial x}(x = 1) = g_N = 1\).
1. We obtain the weak form by multiplying the equation by a test function \(v^{\delta }\) from a discrete test space and integrating the second-order derivative by parts: \(\int _{0}^{1}\frac{\partial v^{\delta }}{\partial x}\frac{\partial u^{\delta }}{\partial x}\,dx = \int _{0}^{1} v^{\delta }f \,dx + v^{\delta }(1)g_{N}.\)
2. We lift the known Dirichlet data by decomposing the solution as \(u^{\delta } = u^{D}+u^{H}\), where \(u^{D}\) satisfies the Dirichlet boundary condition and \(u^{H}\) is the homogeneous part, so that the weak form becomes \(\int _{0}^{1}\frac{\partial v^{\delta }}{\partial x}\frac{\partial u^{H}}{\partial x}\,dx = \int _{0}^{1} v^{\delta }f \,dx + v^{\delta }(1)g_{N}-\int _{0}^{1}\frac{\partial v^{\delta }}{\partial x}\frac{\partial u^{D}}{\partial x}\,dx.\)
We use piecewise linear basis functions and decompose the domain into two subdomains. Refining the mesh yields h-convergence, while raising the polynomial order of the basis functions yields p-type convergence. For the linear two-subdomain case, the approximate expansion has the form \(u^{\delta } = \sum _{i = 0}^{2} \hat{u}_{i}\Phi _i(x)\), where the \(\Phi _i(x)\) are the piecewise linear functions. We then represent \(f\) in terms of the basis functions, \(f(x) = \sum _{i = 0}^2\hat{f}_i\Phi _i(x)\). Finally, we solve the resulting linear system to obtain the numerical solution for \(u\), which for this example coincides with a finite element approximation.
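As a concrete companion to the steps above, the following is a minimal piecewise-linear FEM solve of this 1D Poisson problem in plain NumPy. It is an illustrative sketch only (the function name, element count, and quadrature choice are our own), not the spectral/hp Nektar implementation used for the tumor growth equations:

```python
import numpy as np

def solve_poisson_1d(n_el=4, g_D=1.0, g_N=1.0, f=lambda x: np.zeros_like(x)):
    """Piecewise-linear FEM for u'' + f = 0 on (0, 1],
    with u(0) = g_D (Dirichlet) and u'(1) = g_N (Neumann)."""
    nodes = np.linspace(0.0, 1.0, n_el + 1)
    h = nodes[1] - nodes[0]
    n = n_el + 1
    K = np.zeros((n, n))
    F = np.zeros(n)
    for e in range(n_el):
        # element stiffness matrix for linear elements: (1/h) [[1,-1],[-1,1]]
        K[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        # trapezoidal-rule approximation of the element load vector
        F[e:e + 2] += f(nodes[e:e + 2]) * h / 2.0
    F[-1] += g_N            # Neumann boundary term v(1) * g_N
    F -= K[:, 0] * g_D      # lift the Dirichlet data: u = u_D + u_H
    u = np.empty(n)
    u[0] = g_D
    u[1:] = np.linalg.solve(K[1:, 1:], F[1:])
    return nodes, u

# With f = 0, the exact solution is u(x) = 1 + x, which linear elements
# reproduce up to round-off.
nodes, u = solve_poisson_1d()
print(np.max(np.abs(u - (1.0 + nodes))))
```

With nonzero \(f\), refining `n_el` exhibits the h-convergence discussed above.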
We solve Eqs. 1–4 in Sect. 2.1 for tumor growth using the spectral/hp element solver Nektar with \(\Delta t\) = \(1.0\,\times \,10^{-2}\), \(1.0\,\times \,10^{-3}\), and \(1.0\,\times \,10^{-4}\), using \(\Delta t\) = \(1.0\,\times \,10^{-5}\) as the reference. The characteristic length and time are 1 mm and 1 day, respectively. We find that for \(\Delta t \le 1.0\,\times \,10^{-3}\), the differences between the results for successive time step sizes are marginal. For aggressive tumors, the maximum difference (i.e., the maximum pointwise absolute difference between \(\phi\) from two simulation runs) relative to the reference \(\Delta t = 1.0\,\times \,10^{-5}\) is \(2.08\,\times \,10^{-2}\) for \(\Delta t = 1.0\,\times \,10^{-2}\), \(2.99\,\times \,10^{-3}\) for \(\Delta t = 1.0\,\times \,10^{-3}\), and \(1.32\,\times \,10^{-3}\) for \(\Delta t = 1.0\,\times \,10^{-4}\). For all the numerical simulations conducted with Nektar, we use polynomial order 3, time step size \(\Delta t\) = \(1.0\,\times \,10^{-3}\), and a mesh size of \(6.67\,\times \,10^{-3}\) in both the x- and y-directions, which results in 22,500 quadrilateral elements, and we run the solver on 256 CPU nodes in parallel. Table 43 shows the parameters used in the simulations. It takes about 1.1 h to run one mild tumor case up to 80 days and 3.9 h to run one aggressive tumor case up to 200 days.
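The convergence metric above (the maximum pointwise absolute difference between two \(\phi\) fields sampled on the same grid) is straightforward to compute; a minimal sketch, with made-up field values purely for illustration:

```python
import numpy as np

def max_pointwise_diff(phi_a, phi_b):
    # Maximum pointwise absolute difference between two fields
    # sampled on the same grid (the timestep-convergence metric).
    return float(np.max(np.abs(np.asarray(phi_a) - np.asarray(phi_b))))

# Illustrative 2x2 fields; the actual comparisons use the full
# 22,500-element Nektar solutions at different time step sizes.
phi_coarse = [[0.00, 0.50], [0.90, 1.00]]
phi_ref    = [[0.00, 0.48], [0.92, 1.00]]
print(max_pointwise_diff(phi_coarse, phi_ref))
```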
B Forecast the tumor growth using the initial density of nutrients as the input for the branch net
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the initial density of nutrients. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: R = 0.07 mm; right: R = 0.21 mm (R: the length of the minor axis of the initial ellipsoidal tumor)
Prediction for tumor cells and nutrient dynamics for aggressive tumor cases mapping from the initial density of nutrients. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: R = 0.07 mm; right: R = 0.23 mm (R: the length of the minor axis of the initial ellipsoidal tumor)
We also test the performance of TGM-ONets in forecasting tumor growth using the initial density of nutrients as the input for the branch net. The hyper-parameters (i.e., the initial learning rate, the decay step, \(\omega _{PDE}\), and \(\omega _{data}\)) are the same as in Sect. 3.1.1. We parameterize the initial density of nutrients for both mild and aggressive tumors as:
which represents an ellipsoidal nutrient field corresponding to an ellipsoidal tumor in the computational domain, for which the y-semiaxis is twice as long as the x-semiaxis. We use the same training datasets as in Sect. 3.1.1 to train TGM-ONets.
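A sketch of such an ellipsoidal input field is given below. Since the exact parameterization (the equation referenced above) is not reproduced in this excerpt, the smooth tanh profile and its width `eps` are assumptions for illustration; only the geometry (x-semiaxis R, y-semiaxis delta·R, so delta = 2 doubles the y-axis) follows the text.

```python
import numpy as np

def ellipsoidal_field(nx=101, ny=101, center=(0.5, 0.5), R=0.07,
                      delta=2.0, eps=0.01):
    """Smooth ellipsoidal field on the unit square.

    R is the x-semiaxis and delta the ratio of the y-semiaxis to the
    x-semiaxis. The tanh interface of width eps is an assumed profile,
    not the paper's exact formula.
    """
    x = np.linspace(0.0, 1.0, nx)
    y = np.linspace(0.0, 1.0, ny)
    X, Y = np.meshgrid(x, y, indexing="ij")
    # normalized "ellipse radius": < 1 inside, > 1 outside
    r = np.sqrt(((X - center[0]) / R) ** 2
                + ((Y - center[1]) / (delta * R)) ** 2)
    return 0.5 * (1.0 - np.tanh((r - 1.0) * R / eps))
```

The same construction (with different `center`, `R`, and `delta`) also covers the varying-shape input functions used in Appendix C.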
For mild tumor cases, the prediction errors for all training and testing cases given by TGM-ONets are shown in Fig. 32a, b, which shows that the average prediction errors for both \(\phi\) and \(\sigma\) are under \(1.0\,\times \,10^{-3}\) in the training datasets and \(2.0\,\times \,10^{-2}\) in the testing datasets. The maximum prediction errors are around \(2.0\,\times \,10^{-3}\) for \(\phi\) and \(5.0\,\times \,10^{-4}\) for \(\sigma\) in the training datasets, and \(6.0\,\times \,10^{-2}\) for \(\phi\) and \(5.0\,\times \,10^{-2}\) for \(\sigma\) in the testing datasets. Predictions for two specific cases, R = 0.07 mm (in-distribution) and 0.21 mm (out-of-distribution), are illustrated in Fig. 32c, d.
For aggressive tumor cases, the prediction errors for all training and testing cases given by TGM-ONets are shown in Fig. 33a, b, which shows that the average prediction errors for both \(\phi\) and \(\sigma\) are under \(5.0\,\times \,10^{-4}\) in the training datasets and \(2.0\,\times \,10^{-2}\) in the testing datasets. The maximum prediction errors are under \(2.0\,\times \,10^{-3}\) for both \(\phi\) and \(\sigma\) in the training datasets and \(4.0\,\times \,10^{-2}\) for both \(\phi\) and \(\sigma\) in the testing datasets. Predictions for two specific cases, R = 0.07 mm (in-distribution) and 0.23 mm (out-of-distribution), are illustrated in Fig. 33c, d.
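The prediction errors reported throughout these appendices can be computed as relative \(L^2\) errors over the grid, which is the metric named for the long-time predictions in Appendix E; reading the per-snapshot errors here the same way is an assumption, since the formula is not restated in this excerpt.

```python
import numpy as np

def rel_l2_error(pred, true):
    """Relative L2 error ||pred - true||_2 / ||true||_2 over all grid
    points, matching the mean relative L2 errors tabulated in
    Appendix E; applying it per snapshot is an assumed reading."""
    pred = np.asarray(pred, float)
    true = np.asarray(true, float)
    return float(np.linalg.norm(pred - true) / np.linalg.norm(true))
```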
C Forecast the tumor growth using the initial density of tumor cells with varying shapes as the input for the branch net
We evaluate the performance of TGM-ONets in forecasting tumor growth using the initial density of tumor cells with varying shapes as the input for the branch net. For mild tumors, we vary the ratio of the y-semiaxis to the x-semiaxis (\(\delta\)) (C.1), the position within the domain (C.2), the length of the minor axis of the initial ellipsoidal tumor centered at (0.5, 0.5) (C.3) and not centered at (0.5, 0.5) (C.4), circular shapes (C.5), and oblique ellipsoidal tumors (C.6). For aggressive tumors, we vary the ratio of the y-semiaxis to the x-semiaxis (\(\delta\)) (C.7) and the position (C.8). The hyper-parameters (i.e., the initial learning rate, the decay step, \(\omega _{PDE}\), and \(\omega _{data}\)) are the same as in Sect. 3.1.1.
C.1 Forecast the mild tumor growth using the initial density of tumor cells with varying ratio (\(\delta\)) of the y-semiaxis to the x-semiaxis
For mild tumor cases, we use TGM-ONets to learn the mapping from the initial density of tumor cells with varying \(\delta\) to the solutions of tumor cells and nutrients on the entire computation domain. The growth rate and the length of the minor axis of the initial ellipsoidal tumor R remain the same, 1.5 1/day and 0.05 mm, respectively. We sample 1000 values of \(\delta\) from a uniform distribution U(1.0, 2.6). Assuming we have 8 cases of data recording the density of tumor cells and nutrients every 0.5 days up to 70.5 days with different values of \(\delta\) sampled from U(1.0, 2.6), we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of \(\delta\) sampled from U(1.0, 2.9). The accuracy of predictions for both the training and testing datasets is shown in Fig. 34a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(6.0\,\times \,10^{-4}\) and \(3.0\,\times \,10^{-4}\) in the training datasets, and by \(8.0\,\times \,10^{-4}\) and \(2.0\,\times \,10^{-3}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(3.0\,\times \,10^{-4}\) in the training datasets and \(5.0\,\times \,10^{-4}\) in the testing datasets. Predictions for two specific cases, \(\delta\) = 1.4 (in-distribution) and \(\delta\) = 2.9 (out-of-distribution), are illustrated in Fig. 34b.
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the initial density of tumor cells with varying \(\delta\). a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: \(\delta\) = 1.4; right: \(\delta\) = 2.9 (\(\delta\): the ratio of the y-semiaxis to the x-semiaxis)
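The sampling and snapshot setup above can be sketched as follows. The random seed, the rule for drawing the 8 training cases from the 1000-value pool, and the inclusion of t = 0 among the snapshots are assumptions, as these details are not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed assumed for reproducibility

# 1000 candidate aspect ratios sampled from U(1.0, 2.6), as in the text
delta_pool = rng.uniform(1.0, 2.6, size=1000)

# 8 training cases drawn from the pool (random selection is an assumption)
train_delta = rng.choice(delta_pool, size=8, replace=False)

# testing values may exceed the training range: U(1.0, 2.9)
test_delta = rng.uniform(1.0, 2.9, size=100)

# snapshots every 0.5 days up to 70.5 days (t = 0 included by assumption)
snap_times = np.arange(0.0, 70.5 + 1e-9, 0.5)
```

Values of `test_delta` above 2.6 then constitute the out-of-distribution cases, such as the \(\delta = 2.9\) example shown in Fig. 34b.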
C.2 Forecast the mild tumor growth using the initial density of tumor cells with varying positions within the domain
In this subsection, we use TGM-ONets to learn the mapping from the initial density of tumor cells with varying positions within the domain to the solutions of tumor cells and nutrients on the entire computation domain for mild tumor cases. The growth rate and the length of the minor axis of the initial ellipsoidal tumor R remain the same, 1.5 1/day and 0.05 mm, respectively. Let \((x^{*}, y^{*})\) denote the center position of tumor cells and nutrients; we sample 1000 values of \(x^{*}\) and \(y^{*}\) from a uniform distribution U(0.4, 0.6). Assuming we have 8 cases of data recording the density of tumor cells and nutrients every 0.5 days up to 70.5 days with different values of \(x^{*}\) and \(y^{*}\) sampled from U(0.4, 0.6), we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of \(x^{*}\) and \(y^{*}\) sampled from U(0.4, 0.6). The accuracy of predictions for both the training and testing datasets is shown in Fig. 35a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(1.0\,\times \,10^{-3}\) in the training datasets and \(1.5\,\times \,10^{-1}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(1.0\,\times \,10^{-2}\) in the training datasets and \(8.0\,\times \,10^{-2}\) in the testing datasets. Predictions for two specific cases, \((x^{*}, y^{*})\) = (0.58, 0.42) (in-distribution) and \((x^{*}, y^{*})\) = (0.42, 0.58) (in-distribution), are illustrated in Fig. 35b.
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the initial density of tumor cells with varying positions within the computation domain. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: \((x^{*}, y^{*})\) = (0.58, 0.42); right: \((x^{*}, y^{*})\) = (0.42, 0.58) (\((x^{*}, y^{*})\): the center position of tumor cells and nutrients within computation domain)
C.3 Forecast the mild tumor growth using the initial density of tumor cells with varying length of the minor axis of the initial ellipsoidal tumor centered at (0.5, 0.5)
In this subsection, we use TGM-ONets to learn the mapping from the initial density of tumor cells with varying lengths of the minor axis of the initial ellipsoidal tumor R and a larger ratio of the y-semiaxis to the x-semiaxis (\(\delta\) = 3) to the solutions of tumor cells and nutrients on the entire computation domain. The growth rate remains the same, 1.5 1/day. We sample 1000 values of R from a uniform distribution U(0.06, 0.20). Assuming we have 9 cases of data with different values of R sampled from U(0.06, 0.20) and an additional case of data with R = 0.22 mm recording the density of tumor cells and nutrients every 0.5 days up to 70.5 days, we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of R sampled from U(0.06, 0.23). The accuracy of predictions for both the training and testing datasets is shown in Fig. 36a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(2.0\,\times \,10^{-3}\) and \(6.0\,\times \,10^{-4}\) in the training datasets, and by \(1.3\,\times \,10^{-2}\) and \(1.0\,\times \,10^{-2}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(1.0\,\times \,10^{-3}\) in the training datasets and \(5.0\,\times \,10^{-3}\) in the testing datasets. Predictions for two specific cases, R = 0.07 mm (in-distribution) and R = 0.23 mm (out-of-distribution), are illustrated in Fig. 36b.
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the initial density of tumor cells with varying lengths of the minor axis of the initial ellipsoidal tumor R and a larger \(\delta\) = 3. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: R = 0.07 mm; right: R = 0.23 mm. (R: the length of the minor axis of the initial ellipsoidal tumor)
C.4 Forecast the mild tumor growth using the initial density of tumor cells with varying length of the minor axis of the initial ellipsoidal tumor not centered at (0.5, 0.5)
In this subsection, we use TGM-ONets to learn the mapping from the off-center initial density of tumor cells with varying lengths of the minor axis of the initial ellipsoidal tumor R to the solutions of tumor cells and nutrients on the entire computation domain. The growth rate remains the same, 1.5 1/day. We sample 1000 values of R from a uniform distribution U(0.06, 0.20). Assuming we have 9 cases of data with different values of R sampled from U(0.06, 0.20) and an additional case of data with R = 0.22 mm recording the density of tumor cells and nutrients every 0.5 days up to 70.5 days, we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of R sampled from U(0.06, 0.23). The accuracy of predictions for both the training and testing datasets is shown in Fig. 37a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(3.0\,\times \,10^{-3}\) in the training datasets and \(5.0\,\times \,10^{-2}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(1.0\,\times \,10^{-3}\) in the training datasets and \(2.0\,\times \,10^{-2}\) in the testing datasets. Predictions for two specific cases, R = 0.07 mm (in-distribution) and R = 0.23 mm (out-of-distribution), are illustrated in Fig. 37b.
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the off-center initial density of tumor cells with varying scaling factors of R. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: R = 0.07 mm; right: R = 0.23 mm. (R: the length of the minor axis of the initial ellipsoidal tumor)
C.5 Forecast the mild tumor growth using the initial density of tumor cells with varying radii of the initial circular tumor
In this subsection, we use TGM-ONets to learn the mapping from the initial density of tumor cells with varying radii of the initial circular tumor R to the solutions of tumor cells and nutrients on the entire computation domain for mild tumor cases. The growth rate remains the same, 1.5 1/day. We sample 1000 values of R from a uniform distribution U(0.06, 0.20). Assuming we have 9 cases of data with different values of R sampled from U(0.06, 0.20) and an additional case of data with R = 0.22 mm recording the density of tumor cells and nutrients every 0.5 days up to 70.5 days, we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of R sampled from U(0.06, 0.23). The accuracy of predictions for both the training and testing datasets is shown in Fig. 38a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(8.0\,\times \,10^{-4}\) and \(5.0\,\times \,10^{-4}\) in the training datasets and \(1.0\,\times \,10^{-2}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(4.0\,\times \,10^{-4}\) in the training datasets and \(5.0\,\times \,10^{-3}\) in the testing datasets. Predictions for two specific cases, R = 0.07 mm (in-distribution) and R = 0.23 mm (out-of-distribution), are illustrated in Fig. 38b.
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the circular initial density of tumor cells with varying scaling factors of R. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: R = 0.07 mm; right: R = 0.23 mm. (R: the radius of the initial circular tumor)
C.6 Forecast the mild tumor growth using the initial density of tumor cells with varying oblique angle (\(\theta\)) and ratio of the y-semiaxis to the x-semiaxis (\(\delta\) = 2 or 4)
In this subsection, we use TGM-ONets to learn the mapping from the initial density of tumor cells with varying oblique angle (\(\theta\)) and ratios of the y-semiaxis to the x-semiaxis (\(\delta\) = 2 or 4) to the solutions of tumor cells and nutrients on the entire computation domain. The growth rate remains the same, 1.5 1/day. We sample 1000 values of \(\theta\) from a uniform distribution \(U(0, 2\pi )\). Assuming we have 8 cases of data with different values of \(\theta\) sampled from \(U(0, 2\pi )\), we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of \(\theta\) sampled from \(U(0, 2\pi )\). The accuracy of predictions for both the training and testing datasets is shown in Fig. 39a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are around \(3.0\,\times \,10^{-3}\) and \(1.5\,\times \,10^{-3}\) in the training datasets, and \(2.0\,\times \,10^{-1}\) and \(8.0\,\times \,10^{-2}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(2.0\,\times \,10^{-3}\) in the training datasets and \(1.0\,\times \,10^{-1}\) in the testing datasets. Predictions for two specific cases, (\(\theta\) = 0.3\(\pi\), \(\delta\) = 4) (in-distribution) and (\(\theta\) = 0.1\(\pi\), \(\delta\) = 2) (in-distribution), are illustrated in Fig. 39b.
Prediction for tumor cells and nutrient dynamics for mild tumor cases mapping from the initial density of tumor cells with varying oblique angle \(\theta\) and ratios of the y-semiaxis to the x-semiaxis (\(\delta\) = 2 or 4). a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: \(\theta\) = 0.3\(\pi\), \(\delta\) = 4; right: \(\theta\) = 0.1\(\pi\), \(\delta\) = 2. (\(\theta\): oblique angle of the initial density of tumor cells, \(\delta\): the ratio of the y-semiaxis to the x-semiaxis)
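An oblique initial density can be generated by rotating the ellipse axes by \(\theta\) before evaluating the field, as sketched below. As before, the smooth tanh interface of width `eps` is an assumption, since the paper's exact initial-condition formula is not reproduced in this excerpt.

```python
import numpy as np

def oblique_ellipse_field(theta, delta=2.0, R=0.05, center=(0.5, 0.5),
                          n=101, eps=0.01):
    """Initial tumor-cell density for an ellipse rotated by theta.

    The ellipse (x-semiaxis R, y-semiaxis delta*R) is evaluated in
    coordinates rotated by theta; the tanh profile of width eps is
    an assumed smooth indicator for illustration.
    """
    x = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    dx, dy = X - center[0], Y - center[1]
    # rotate coordinates into the ellipse frame
    xr = np.cos(theta) * dx + np.sin(theta) * dy
    yr = -np.sin(theta) * dx + np.cos(theta) * dy
    r = np.sqrt((xr / R) ** 2 + (yr / (delta * R)) ** 2)
    return 0.5 * (1.0 - np.tanh((r - 1.0) * R / eps))
```

Setting \(\theta = 0\) recovers the axis-aligned ellipses of C.1, and by symmetry \(\theta\) and \(\theta + \pi\) give the same field.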
C.7 Forecast the aggressive tumor growth using the initial density of tumor cells with varying ratio of the y-semiaxis to the x-semiaxis (\(\delta\))
For aggressive tumor cases, here we use TGM-ONets to learn the mapping from the ellipsoidal initial density of tumor cells with varying \(\delta\) to the solutions of tumor cells and nutrients on the entire computation domain. The growth rate, the length of the minor axis of the initial ellipsoidal tumor R, and \(\gamma _{c}\) remain the same, 1.0 1/day, 0.05 mm, and 17.5 g/L/day, respectively. We sample 1000 values of \(\delta\) from a uniform distribution U(1.0, 3.0). Assuming we have 8 cases of data recording the density of tumor cells and nutrients every 0.5 days up to 200.5 days with different values of \(\delta\) sampled from U(1.0, 3.0), we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of \(\delta\) sampled from U(1.0, 3.0). The accuracy of predictions for both the training and testing datasets is shown in Fig. 40a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(4.0\,\times \,10^{-3}\) and \(8.0\,\times \,10^{-3}\) in both the training and testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(5.0\,\times \,10^{-3}\) in both the training and testing datasets. Predictions for two specific cases, \(\delta\) = 1.4 (in-distribution) and 2.7 (in-distribution), are illustrated in Fig. 40b.
Prediction for tumor cells and nutrient dynamics for aggressive tumor cases mapping from the ellipsoidal initial density of tumor cells with varying \(\delta\). a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: \(\delta\) = 1.4; right: \(\delta\) = 2.7. (\(\delta\): the ratio of the y-semiaxis to the x-semiaxis)
C.8 Forecast the aggressive tumor growth using the initial density of tumor cells with varying positions within the domain
In this subsection, we use TGM-ONets to learn the mapping from the initial density of tumor cells with varying positions within the domain to the solutions of tumor cells and nutrients on the entire computation domain for aggressive tumor cases. The growth rate, the length of the minor axis of the initial ellipsoidal tumor R, and \(\gamma _{c}\) remain the same, 1.0 1/day, 0.05 mm, and 17.5 g/L/day, respectively. Let \((x^{*}, y^{*})\) denote the center position of tumor cells and nutrients; we sample 1000 values of \(x^{*}\) and \(y^{*}\) from a uniform distribution U(0.4, 0.6). Assuming we have 10 cases of data recording the density of tumor cells and nutrients every 0.5 days up to 200.5 days with different values of \(x^{*}\) and \(y^{*}\) sampled from U(0.4, 0.6), we follow the same training procedure as in Sect. 3.1.1. We evaluate the performance of TGM-ONets on testing datasets with different values of \(x^{*}\) and \(y^{*}\) sampled from U(0.4, 0.6). The accuracy of predictions for both the training and testing datasets is shown in Fig. 41a, from which we can see that the maximum prediction errors for \(\phi\) and \(\sigma\) are bounded by \(3.0\,\times \,10^{-3}\) in the training datasets and \(3.0\,\times \,10^{-2}\) in the testing datasets. The average prediction errors for \(\phi\) and \(\sigma\) are under \(2.0\,\times \,10^{-3}\) in the training datasets and \(2.0\,\times \,10^{-2}\) in the testing datasets. Predictions for two specific cases, \((x^{*}, y^{*})\) = (0.54, 0.54) (in-distribution) and \((x^{*}, y^{*})\) = (0.54, 0.46) (in-distribution), are illustrated in Fig. 41b.
Prediction for tumor cells and nutrient dynamics for aggressive tumor cases mapping from the ellipsoidal initial density of tumor cells with varying positions within computation domain. a Prediction errors for training and testing datasets. The blue lines represent the mean of prediction errors in training datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors. b Predictions of the tumor morphologies \(\phi\) at different times. left: (\(x^{*}, y^{*}\)) = (0.54, 0.54); right: (\(x^{*}, y^{*}\)) = (0.54, 0.46). ((\(x^{*}, y^{*}\)): the center position of tumor cells and nutrients within computation domain)
D Significance tests for the difference of prediction errors obtained by TGM-ONets using different input functions for the branch net
In this section, we conduct Kruskal–Wallis and Dunn's tests to check whether the differences in prediction errors obtained by TGM-ONets using varying input functions are statistically significant. We take the average prediction errors over time to compute the statistics for the Kruskal–Wallis and Dunn's tests.
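A minimal numpy sketch of the Kruskal–Wallis H statistic (with tie correction) is given below. In practice, the p-value follows from the chi-squared distribution with k−1 degrees of freedom (e.g., via `scipy.stats.kruskal`), and Dunn's post hoc test is available in packages such as scikit-posthocs; only the statistic itself is sketched here.

```python
import numpy as np

def kruskal_wallis_H(*groups):
    """Kruskal-Wallis H statistic (tie-corrected) for comparing samples
    of time-averaged prediction errors across input-function choices."""
    sizes = [len(g) for g in groups]
    data = np.concatenate([np.asarray(g, float) for g in groups])
    n = data.size
    order = np.argsort(data, kind="mergesort")
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1, dtype=float)
    for v in np.unique(data):            # average ranks over ties
        tie = data == v
        ranks[tie] = ranks[tie].mean()
    h, start = 0.0, 0
    for m in sizes:                      # sum of (group rank-sum)^2 / m
        h += ranks[start:start + m].sum() ** 2 / m
        start += m
    h = 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)
    _, counts = np.unique(data, return_counts=True)
    c = 1.0 - np.sum(counts**3 - counts) / (n**3 - n)  # tie correction
    return h / c
```

Here each group would hold the time-averaged \(\epsilon _{\phi }\) (or \(\epsilon _{\sigma }\)) values obtained with one choice of input function.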
For mild tumors, we vary the input functions from the initial density of tumor cells to the growth rate. The results of the Kruskal–Wallis test are summarized in the first two rows of Table 44, from which we can see that the p-values are lower than \(1.0\,\times \,10^{-2}\) only in the training datasets, indicating that the \(\epsilon _{\phi }\) and \(\epsilon _{\sigma }\) in the training datasets obtained by TGM-ONets using the initial density of tumor cells and the growth rate as inputs for the branch net are significantly different. However, the generalization ability of TGM-ONets for unseen datasets of mild tumors is consistent irrespective of the input function. The medians of \(\epsilon _{\phi }\) and \(\epsilon _{\sigma }\) in the training datasets are summarized in Table 45. These results suggest that TGM-ONets fit the training datasets better for mild tumors when mapping from the initial density of tumor cells.
For aggressive tumor cases, we vary the input functions from the initial density of tumor cells to the nutrient uptake. The results of the Kruskal–Wallis test are summarized in the last two rows of Table 44, and the medians of \(\epsilon _{\phi }\) and \(\epsilon _{\sigma }\) in the training datasets are summarized in Table 46. These results also suggest that the performance of TGM-ONets on unseen datasets is consistent irrespective of the input function, while the fit to the training datasets is significantly better for aggressive tumors mapping from the initial density of tumor cells. We infer that the CBAM blocks utilized in the CNN-based branch nets account for the better fit to the training datasets for mild and aggressive tumors mapping from the initial density of tumor cells.
E Long-time predictions using TGM-ONets
In this subsection, we present additional results on long-time prediction using TGM-ONets. The three scenarios are the same as in Sect. 3.2: (1) sparse observations are available at the last time step in the testing domain, (2) sparse observations are available at the early stage in the testing domain, and (3) no additional observations are available in the testing domain.
Tables 47 and 48 show the mean relative \(L^2\) errors for \(\phi\) and \(\sigma\) in long-time predictions for mild tumors mapping from the initial density of tumor cells in scenario 1, respectively. Tables 49 and 50 show the mean relative \(L^2\) errors for \(\phi\) and \(\sigma\) in long-time predictions for aggressive tumors mapping from the initial density of tumor cells in scenario 1, respectively. We see that as time and the size of the initial density of tumor cells increase, the mean relative \(L^2\) errors increase slightly. However, even at T = 270 with R > 0.2 mm, the maximum mean relative \(L^2\) errors for \(\phi\) and \(\sigma\) are still below approximately \(7\,\times \,10^{-2}\).
Tables 51 and 52 show the mean relative \(L^2\) errors for \(\phi\) and \(\sigma\) in long-time predictions for mild tumors mapping from the initial density of tumor cells in scenario 2, respectively. Tables 53 and 54 show the mean relative \(L^2\) errors for \(\phi\) and \(\sigma\) in long-time predictions for aggressive tumors mapping from the initial density of tumor cells in scenario 2, respectively.
Tables 55 and 56 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time predictions for mild tumors mapping from the initial density of tumor cells in scenario 3, respectively. Tables 57 and 58 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time predictions for aggressive tumors mapping from the initial density of tumor cells in scenario 3, respectively.
Tables 59 and 60 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time prediction for mild tumors mapping from the concomitant changes in the initial density of tumor cells and growth rate in scenario 1, respectively. Tables 61 and 62 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time prediction for aggressive tumors mapping from the concomitant changes in the initial density of tumor cells and nutrient uptake in scenario 1, respectively.
Tables 63 and 64 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time prediction for mild tumors mapping from the concomitant changes in the initial density of tumor cells and growth rate in scenario 2, respectively. Tables 65 and 66 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time prediction for aggressive tumors mapping from the concomitant changes in the initial density of tumor cells and nutrient uptake in scenario 2, respectively.
Tables 67 and 68 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time prediction for mild tumors mapping from the concomitant changes in the initial density of tumor cells and growth rate in scenario 3, respectively. Tables 69 and 70 show the mean relative \(L^2\) error for \(\phi\) and \(\sigma\) in long-time prediction for aggressive tumors mapping from the concomitant changes in the initial density of tumor cells and nutrient uptake in scenario 3, respectively.
F Ranges of prediction errors in training datasets for examining the robustness of TGM-ONets
In this subsection, we provide the ranges of prediction errors in training datasets for examining the robustness of TGM-ONets.
For the results of examining the effects of the number of training snapshots, Fig. 42 shows the ranges of prediction errors in training datasets for mild tumor cases mapping from the initial density of tumor cells, Fig. 43 shows the ranges of prediction errors in training datasets for aggressive tumor cases mapping from the initial density of tumor cells, and Fig. 44 shows the ranges of prediction errors in training datasets for aggressive tumor cases mapping from the initial density of tumor cells using sparser data.
For the results of examining the effects of the noisy measurements, Fig. 45 shows the ranges of prediction errors in training datasets for mild tumor cases mapping from the initial density of tumor cells, and Fig. 46 shows the ranges of prediction errors in training datasets for aggressive tumor cases mapping from the initial density of tumor cells.
Prediction errors in training datasets for examining the effects of the number of training points for mild tumor cases mapping from the initial density of tumor cells. The blue lines represent the mean of prediction errors in training datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Prediction errors in training datasets for examining the effects of the number of training points for aggressive tumor cases mapping from the initial density of tumor cells. The blue lines represent the mean of prediction errors in training datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Prediction errors in training datasets for examining the effects of the number of training points for aggressive tumor cases mapping from the initial density of tumor cells using sparser data. The blue lines represent the mean of prediction errors in training datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Prediction errors in training datasets for examining the effects of the noisy measurements for mild tumor cases mapping from the initial density of tumor cells. The blue lines represent the mean of prediction errors in training datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Prediction errors in training datasets for examining the effects of the noisy measurements for aggressive tumor cases mapping from the initial density of tumor cells. The blue lines represent the mean of prediction errors in training datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
G Ablation & grid studies of TGM-ONets
For all cases considered in Sect. 4 and in this subsection, the distributions for the parameters of the mechanistic model for training and testing are the same as in Sect. 3.
In this subsection, we first consider the contributions of the CBAM and MoE blocks in aggressive tumor cases mapping from the initial density of tumor cells as well as in mild tumor cases mapping from the growth rate \(\rho\). We use the same settings mentioned in the first ablation study in Sect. 4. Model performances are summarized in Figs. 47 and 48. We also conduct single-tailed Wilcoxon tests to check whether the prediction errors given by the vanilla PI-DeepONet are significantly greater than those given by our proposed methods. The null hypothesis and the alternative hypothesis are the same as in Sect. 4. We take the average prediction errors over time for each training and test sample to compute the statistics for the single-tailed Wilcoxon tests. The results are summarized in Tables 71 and 72, from which we can see that the p-values are lower than \(1.0\,\times \,10^{-2}\) for \(\epsilon _{\phi }\) and \(\epsilon _{\sigma }\) in the testing datasets for both mild and aggressive tumor cases. These results further showcase the improvement in the generalization ability of TGM-ONets for unseen datasets achieved by our proposed methods.
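A sketch of such a one-sided Wilcoxon signed-rank test is given below, using the normal approximation for the statistic; for small samples, the exact test via `scipy.stats.wilcoxon(alternative='greater')` is preferable. Dropping zero differences and the normal approximation are standard simplifying choices, not details taken from the text.

```python
import math
import numpy as np

def wilcoxon_one_sided(errors_baseline, errors_ours):
    """One-sided Wilcoxon signed-rank test (normal approximation).

    H1: the baseline's paired errors are greater than ours. Returns
    (W+, p), where W+ is the rank sum of positive differences and p
    the approximate one-sided p-value. Zero differences are dropped.
    """
    d = np.asarray(errors_baseline, float) - np.asarray(errors_ours, float)
    d = d[d != 0.0]
    n = d.size
    abs_d = np.abs(d)
    order = np.argsort(abs_d, kind="mergesort")
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1, dtype=float)
    for v in np.unique(abs_d):          # average ranks over ties
        tie = abs_d == v
        ranks[tie] = ranks[tie].mean()
    w_plus = ranks[d > 0].sum()         # large when baseline > ours
    mu = n * (n + 1) / 4.0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mu) / sigma
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return w_plus, p
```

The inputs would be the time-averaged per-sample errors of the vanilla PI-DeepONet and of TGM-ONets on the same test cases; a small p-value then supports the alternative that the baseline errors are larger.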
Additionally, we investigate the effect of the total number of hidden layers in TGM-ONets for mild tumors mapping from the nutrient uptake \(\gamma _{c}\), from the initial density of tumor cells, and from the growth rate \(\rho\), as well as for aggressive tumors mapping from the initial density of tumor cells. Model performance is summarized in Figs. 49, 50, 51 and 52. We also conduct Kruskal–Wallis and Dunn's tests to check whether the differences among the prediction errors given by TGM-ONets with varying total numbers of hidden layers are statistically significant, again taking the average prediction error over time for each training and test sample to compute the test statistics. The results of the Kruskal–Wallis test are summarized in Tables 73–76, which show that the effect of the total number of hidden layers varies from case to case. For mild tumors mapping from the nutrient uptake \(\gamma _{c}\), the p-values are below \(1.0\,\times \,10^{-2}\) in both the training and testing datasets. For mild tumors mapping from the initial density of tumor cells, the p-values are below \(1.0\,\times \,10^{-2}\) only in the training datasets. For mild tumors mapping from the growth rate \(\rho\), the p-values are above \(1.0\,\times \,10^{-2}\) in both the training and testing datasets. For aggressive tumors mapping from the initial density of tumor cells, the p-values are below \(1.0\,\times \,10^{-2}\) only in the training datasets. These results indicate that increasing the total number of hidden layers can enhance performance, but the improvement is sometimes minor and may not offset the additional computational cost of the larger number of model parameters. Further results of Dunn's test for each case are summarized in Tables 77–83.
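A minimal sketch of the Kruskal–Wallis step, assuming the time-averaged errors for each depth setting are grouped into separate arrays (the depths and error values below are synthetic placeholders; a significant result would then be followed by Dunn's pairwise post-hoc test, e.g. via the third-party scikit-posthocs package):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)
# Synthetic time-averaged prediction errors for three network depths
# (illustrative only; real values come from trained TGM-ONets).
errors_by_depth = {
    6: rng.normal(0.060, 0.010, size=20),
    8: rng.normal(0.045, 0.010, size=20),
    10: rng.normal(0.044, 0.010, size=20),
}

# H0: all depth groups share the same distribution of prediction errors.
h_stat, p_value = kruskal(*errors_by_depth.values())
print(f"H = {h_stat:.2f}, p = {p_value:.3e}")
if p_value < 1.0e-2:
    print("Depth has a significant effect; run Dunn's pairwise post-hoc test.")
```

The Kruskal–Wallis test only reports that at least one group differs; Dunn's test is what localizes which pairs of depths differ.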
Quantitative results of the enhancement derived from the CBAM and MoE blocks are also provided in this section. Tables 84–89 show the mean relative \(L^{2}\) errors for \(\phi\) and \(\sigma\) for mild tumors mapping from the initial density of tumor cells with R = 0.05 mm, 0.18 mm and 0.23 mm using varying architectures of deep operator networks. Tables 90–95 show the corresponding errors for aggressive tumors mapping from the nutrient uptake by tumor cells with \(\gamma _{c}\) = 16.1, 16.9 and 18.9 g/L/day. Tables 96–101 show the corresponding errors for aggressive tumors mapping from the initial density of tumor cells with \(R = 0.09\) mm, 0.16 mm and 0.21 mm. Tables 102–107 show the corresponding errors for mild tumors mapping from the growth rate with \(\rho\) = 1.2, 2.1 and 2.7 1/day. In all the tables, the best results are in bold and the second-best are underlined.
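The relative \(L^{2}\) error reported in these tables is commonly defined as \(\lVert u_{pred} - u_{ref} \rVert_{2} / \lVert u_{ref} \rVert_{2}\) for a predicted field \(u_{pred}\) against the reference \(u_{ref}\), then averaged over samples; a minimal NumPy sketch (the field values below are illustrative stand-ins for \(\phi\) or \(\sigma\)):

```python
import numpy as np

def relative_l2_error(pred: np.ndarray, ref: np.ndarray) -> float:
    """Relative L2 error ||pred - ref||_2 / ||ref||_2 for one field snapshot."""
    return float(np.linalg.norm(pred - ref) / np.linalg.norm(ref))

# Illustrative fields on a 2-D grid (stand-ins for phi or sigma).
ref = np.ones((64, 64))
pred = ref + 0.01  # a uniform 1% offset

err = relative_l2_error(pred, ref)
print(f"relative L2 error = {err:.4f}")  # -> 0.0100
```

Normalizing by \(\lVert u_{ref} \rVert_{2}\) makes the metric comparable across tumor cases whose field magnitudes differ.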
We also provide the ranges of the prediction errors for each case considered in Sect. 4 and in this subsection. From Figs. 53–69, we can see that the ranges of the training errors are roughly no larger than those of the testing errors for all the cases considered in the ablation and grid studies. Figures 53, 54 and 65 show that TGM-ONets with CBAM and MoE blocks have smaller ranges of prediction errors than vanilla PI-DeepONets. Figures 55 and 56 show that \(w_{data} = 100\) yields smaller ranges of prediction errors than the other values of \(w_{data}\). Figures 57, 66, 67, 68 and 69 show that 8 or 9 hidden layers yield smaller ranges of prediction errors than the other depths. Figure 58 shows that two expert networks in the MoE block yield smaller ranges of testing errors, whereas three expert networks yield smaller ranges of training errors. Figure 59 shows that initial learning rates of \(1\,\times \,10^{-3}\), \(6\,\times \,10^{-4}\) and \(2\,\times \,10^{-4}\) yield smaller ranges of prediction errors. Figure 60 shows that 1000 decay steps yield smaller ranges of prediction errors. Figure 61 shows that continuous activation functions yield smaller ranges of prediction errors. Figure 62 shows that different numbers of training datasets (i.e., 7, 8, 9 and 10) yield roughly the same ranges of prediction errors. Figures 63 and 64 show that the ranges of prediction errors change little with or without the boundary loss.
Ablation study: Prediction errors for inferring state variables for mild tumor cases mapping from the initial density of tumor cells with varying network structures. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Ablation study: Prediction errors for inferring state variables for aggressive tumor cases mapping from the nutrient uptake by tumor cells with varying network structures. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the initial density of tumor cells with varying \(\omega _{data}\). a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for aggressive tumor cases mapping from the initial density of tumor cells with varying \(\omega _{data}\). a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for aggressive tumor cases mapping from the nutrient uptake by tumor cells with varying number of total hidden layers. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the initial density of tumor cells with varying number of expert networks in MoE block. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the growth rate with varying initial learning rate. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the growth rate with varying decay steps for the optimizer. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the growth rate with varying activation functions. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the growth rate with varying number of training datasets. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the growth rate with or without boundary loss. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for aggressive tumor cases mapping from the nutrient uptake by tumor cells with or without boundary loss. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Ablation study: Prediction errors for inferring state variables for aggressive tumor cases mapping from the initial density of tumor cells with varying network structures. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the nutrient uptake \(\gamma _{c}\) with varying number of total hidden layers. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the initial density of tumor cells with varying number of total hidden layers. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for mild tumor cases mapping from the growth rate \(\rho\) with varying number of total hidden layers. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Grid study: Prediction errors for inferring state variables for aggressive tumor cases mapping from the initial density of tumor cells with varying number of total hidden layers. a Prediction errors for training datasets. The blue lines represent the mean of prediction errors in training datasets. b Prediction errors for testing datasets. The red lines represent the mean of prediction errors in testing datasets. The shaded region represents the region encompassed by the maximum and minimum of the prediction errors
Comparison with SOTA models. a Prediction errors for mild tumor case with R = 0.16 mm in scenario 1 (i.e., using a sparse measurement at T = 270). b Prediction errors for aggressive tumor case with R = 0.18 mm in scenario 1 (i.e., using sparse measurements at T = 200, 300 and 400). c Prediction errors for mild tumor case with R = 0.16 mm in scenario 2 (i.e., using no additional measurements in testing domain). d Prediction errors for aggressive tumor case with R = 0.18 mm in scenario 2 (i.e., using no additional measurements in testing domain). In scenario 1, the prediction range is T\(\in\)[0, 270] for mild tumors while T\(\in\)[0, 400] for aggressive tumors. In scenario 2, the prediction range is T\(\in\)[0, 130] for mild tumors while T\(\in\)[0, 230] for aggressive tumors
Comparison with SOTA models. a Prediction errors for mild tumor case with R = 0.21 mm in scenario 1 (i.e., using 1 snapshot at T = 270). b Prediction errors for aggressive tumor case with R = 0.21 mm in scenario 1 (i.e., using 3 snapshots at T = 200, 300 and 400). c Prediction errors for mild tumor case with R = 0.21 mm in scenario 2 (i.e., using no additional snapshots in testing domain). d Prediction errors for aggressive tumor case with R = 0.21 mm in scenario 2 (i.e., using no additional snapshots in testing domain). In scenario 1, the prediction range is T\(\in\)[0, 270] for mild tumors while T\(\in\)[0, 400] for aggressive tumors. In scenario 2, the prediction range is T\(\in\)[0, 130] for mild tumors while T\(\in\)[0, 230] for aggressive tumors
H Comparison with three state-of-the-art (SOTA) models
In this subsection, we further compare the ability of TGM-ONets to predict mild tumor growth with R = 0.16 mm and 0.21 mm and aggressive tumor growth with R = 0.18 mm and 0.21 mm against three SOTA models. We consider the same two scenarios as in Sect. 5.
For the first scenario, prediction errors for the mild tumor cases with R = 0.16 mm and 0.21 mm are displayed in Figs. 70a and 71a, and prediction errors for the aggressive tumor cases with R = 0.18 mm and 0.21 mm are displayed in Figs. 70b and 71b.
For the second scenario, prediction errors for the mild tumor cases with R = 0.16 mm and 0.21 mm are displayed in Figs. 70c and 71c, and prediction errors for the aggressive tumor cases with R = 0.18 mm and 0.21 mm are displayed in Figs. 70d and 71d.
The results presented above show that our fine-tuning method provides more stable and accurate predictions than the other three SOTA models, which demonstrates the effectiveness and efficiency of our proposed method.
Quantitative results are also provided for all cases considered in Sect. 5 and in this subsection. Tables 108 and 109 show the mean relative \(L^2\) errors for the mild tumor case with R = 0.08 mm in scenario 1. Tables 110 and 111 show those for the mild tumor case with R = 0.16 mm in scenario 1. Tables 112 and 113 show those for the mild tumor case with R = 0.21 mm in scenario 1.
Tables 114 and 115 show the mean relative \(L^2\) errors for the aggressive tumor case with R = 0.05 mm in scenario 1. Tables 116 and 117 show those for the aggressive tumor case with R = 0.18 mm in scenario 1. Tables 118 and 119 show those for the aggressive tumor case with R = 0.21 mm in scenario 1.
Tables 120 and 121 show the mean relative \(L^2\) errors for the mild tumor case with R = 0.08 mm in scenario 2. Tables 122 and 123 show those for the mild tumor case with R = 0.16 mm in scenario 2. Tables 124 and 125 show those for the mild tumor case with R = 0.21 mm in scenario 2.
Tables 126 and 127 show the mean relative \(L^2\) errors for the aggressive tumor case with R = 0.05 mm in scenario 2. Tables 128 and 129 show those for the aggressive tumor case with R = 0.18 mm in scenario 2. Tables 130 and 131 show those for the aggressive tumor case with R = 0.21 mm in scenario 2.
Chen, Q., Li, H. & Zheng, X. A deep neural network for operator learning enhanced by attention and gating mechanisms for long-time forecasting of tumor growth. Engineering with Computers 41, 423–533 (2025). https://doi.org/10.1007/s00366-024-02003-0