Abstract
This research models a medical resource-sharing problem with three powerful analytics techniques: mixed-integer linear programming (MILP), deep Q-learning (DQL), and double deep Q-learning (DDQL). These models can be used by a group of collaborating entities in emergency situations to decide how to share resources and mitigate adverse effects on human lives. Because of sudden spikes in demand, the allocation of healthcare resources becomes dynamic in nature. Medical resource management faced an unprecedented level of difficulty in the modern history of our healthcare system because of the coronavirus pandemic. At the beginning of the pandemic, we witnessed a severe shortage of ventilators in several countries, including the U.S. Shortly thereafter, India experienced a severe shortage of medical oxygen for treating patients who tested positive for the virus. Many lives were unnecessarily lost because no resource-sharing framework was available for such adverse situations. When a hospital or region experiences a sudden spike in demand for resources, the key decision involves sending resources from other places. In particular, a swift, optimal decision must be made about who will send the resources and when they will be sent to the places of need. To tackle this problem, we developed three analytics models (MILP, DQL, and DDQL) based on the Institute for Health Metrics and Evaluation (IHME) public ventilator-allocation data. To the best of our knowledge, DQL and DDQL are introduced in this context for the first time. Our models can be used either to share critical resources among collaborators or to transfer patients between the collaborators' locations. We devise two policies for the DQL and DDQL models while optimizing their parameters: the normal policy and the just ship policy. The just ship policy promptly ships healthcare resources to save patients in other locations, even at the risk of not meeting its own demand in the future. The normal policy conserves resources at a location so that future demand there is met, rather than satisfying current demand at other locations. To demonstrate the performance of our models, we applied them to the IHME ventilator data and present results for all fifty states in the U.S. We show how these models optimize ventilator allocation under the assumption that all fifty states collaborate as a task force and implement the solution that emerges from the analytics. The experimental results show that the proposed models effectively handle the resource-sharing problem presented by COVID-19. This research provides a decision-making framework for resource allocation during any disease outbreak, such as new variants of the coronavirus or other outbreaks like mpox (monkeypox).
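To make the methodological distinction concrete, the sketch below (a minimal Python illustration, not the authors' implementation; all function names, variable names, and numbers are ours) contrasts the bootstrap targets of deep Q-learning and double deep Q-learning: standard DQL selects and evaluates the next action with the same target network, which is known to overestimate action values, whereas DDQL selects the action with the online network and evaluates it with the target network (van Hasselt et al., 2015).

```python
import numpy as np

def dqn_target(reward, q_target_next, gamma=0.99, done=False):
    # Standard deep Q-learning: the target network both selects and
    # evaluates the next action, which tends to overestimate Q-values.
    return reward + (0.0 if done else gamma * float(np.max(q_target_next)))

def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    # Double deep Q-learning (van Hasselt et al., 2015): the online
    # network selects the next action and the target network evaluates
    # it, reducing the overestimation bias of standard deep Q-learning.
    if done:
        return reward
    best_action = int(np.argmax(q_online_next))
    return reward + gamma * float(q_target_next[best_action])

# Toy usage with hypothetical Q-values over three resource-shipping
# actions (e.g., ship 0, 50, or 100 ventilators to a state in need).
q_online_next = np.array([1.2, 0.4, 0.9])
q_target_next = np.array([0.8, 1.1, 0.7])
print(dqn_target(1.0, q_target_next))                        # ~2.089
print(double_dqn_target(1.0, q_online_next, q_target_next))  # ~1.792
```

The gap between the two printed targets shows why DDQL is often preferred when overestimated values would lead an agent to act too aggressively; in a fuller implementation, preferences such as the normal and just ship policies described above would be encoded through the reward and training setup.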
References
Acimovic, J., & Goentzel, J. (2016). Models and metrics to assess humanitarian response capacity. Journal of Operations Management, 45, 11–29.
Al-Ebbini, L., Oztekin, A., & Chen, Y. (2016). FLAS: Fuzzy lung allocation system for us-based transplantations. European Journal of Operational Research, 248(3), 1051–1065.
Alelaiwi, A. (2020). Resource allocation management in patient-to-physician communications based on deep reinforcement learning in smart healthcare services. In 2020 IEEE international conference on multimedia & expo workshops (ICMEW) (pp. 1–5). IEEE.
Argerich, M. F., Fürst, J., & Cheng, B. (2020). Tutor4rl: Guiding reinforcement learning with external knowledge. In AAAI spring symposium: combining machine learning with knowledge engineering.
Barnhart, C., Johnson, E. L., Nemhauser, G. L., Savelsbergh, M. W., & Vance, P. H. (1998). Branch-and-price: Column generation for solving huge integer programs. Operations Research, 46(3), 316–329.
Barnhart, C., Cohn, A. M., Johnson, E. L., Klabjan, D., Nemhauser, G. L. & Vance, P. H. (2003). Airline crew scheduling. In Handbook of transportation science (pp. 517–560).
Barto, A., Bradtke, S., & Singh, S. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72, 81–138. https://doi.org/10.1016/0004-3702(94)00011-O
Bellman, R. (1966). Dynamic programming. Science, 153(3731), 34–37.
Beneke, R. R., & Winterboer, R. (1984). Linear programming: Applications to agriculture. Aedos.
Bertsekas, D. P. (2001). Dynamic programming and optimal control. Athena Scientific.
Bertsekas, D. P., & Tsitsiklis, J. N. (1996). Neuro-dynamic programming. Athena Scientific.
Bertsimas, D., & Tsitsiklis, J. N. (1997). Introduction to linear optimization (Vol. 6). Athena Scientific.
Birge, J. R., & Wets, R.J.-B. (1986). Designing approximation schemes for stochastic optimization problems, in particular for stochastic programs with recourse. Stochastic Programming, 84 Part I, 54–102.
Butterworth, K. (1972). Practical application of integer programming to farm planning. Farm Management. Kenilworth, England.
CDC. (2023). Number of Covid-19 cases in the United States. https://covid.cdc.gov/covid-data-tracker/#datatracker-home.
Chen, V. C., Ruppert, D., & Shoemaker, C. A. (1999). Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming. Operations Research, 47(1), 38–53.
Chen, Z., Hu, J., & Min, G. (2019). Learning-based resource allocation in cloud data center using advantage actor-critic. In ICC 2019-2019 IEEE international conference on communications (ICC) (pp. 1–6). IEEE.
Dantzig, G. B. (2002). Linear programming. Operations Research, 50(1), 42–47.
Deo, S., Iravani, S., Jiang, T., Smilowitz, K., & Samuelson, S. (2013). Improving health outcomes through better capacity allocation in a community-based chronic care model. Operations Research, 61(6), 1277–1294.
Dillon, R. L., & Tinsley, C. H. (2008). How near-misses influence decision making under risk: A missed opportunity for learning. Management Science, 54(8), 1425–1440.
Du, B., Wu, C., & Huang, Z. (2019). Learning resource allocation and pricing for cloud profit maximization. In Proceedings of the AAAI conference on artificial intelligence (vol. 33, pp. 7570–7577).
Duan, Y., Chen, X., Houthooft, R., Schulman, J., & Abbeel, P. (2016). Benchmarking deep reinforcement learning for continuous control. In ICML’16: proceedings of the 33rd international conference on international conference on machine learning (vol. 48, pp. 1329–1338).
ElHalawany, B. M., Wu, K., & Zaky, A. B. (2020). Deep learning based resources allocation for internet-of-things deployment underlaying cellular networks. Mobile Networks and Applications, 25, 1833–1841.
Fattahi, M., Keyvanshokooh, E., Kannan, D., & Govindan, K. (2022). Resource planning strategies for healthcare systems during a pandemic. European Journal of Operational Research, 304(1), 192–206.
Ferguson, A. R., & Dantzig, G. B. (1956). The allocation of aircraft to routes-an example of linear programming under uncertain demand. Management Science, 3(1), 45–73.
Gopalakrishnan, B., & Johnson, E. L. (2005). Airline crew scheduling: State-of-the-art. Annals of Operations Research, 140, 305–337.
Govindan, K., Mina, H., & Alavi, B. (2020). A decision support system for demand management in healthcare supply chains considering the epidemic outbreaks: A case study of coronavirus disease 2019 (COVID-19). Transportation Research Part E: Logistics and Transportation Review, 138, 101967.
Green, L. V., & Kolesar, P. J. (2004). Anniversary article: Improving emergency responsiveness with management science. Management Science, 50(8), 1001–1014.
Gupta, S., Starr, M. K., Farahani, R. Z., & Matinrad, N. (2016). Disaster management from a POM perspective: Mapping a new domain. Production and Operations Management, 25(10), 1611–1637.
Huang, Z., van der Aalst, W. M., Lu, X., & Duan, H. (2011). Reinforcement learning based resource allocation in business process management. Data & Knowledge Engineering, 70(1), 127–145.
Huh, W. T., Liu, N., & Truong, V.-A. (2013). Multiresource allocation scheduling in dynamic environments. Manufacturing & Service Operations Management, 15(2), 280–291.
IFRC. (n.d.). Disaster management. https://www.ifrc.org/en/what-we-do/disaster-management/about-disasters/what-is-a-disaster/
IHME. (2020). COVID-19 projections. Institute for Health Metrics and Evaluation, University of Washington. https://covid19.healthdata.org/projections
Johnson, E. L. (1967). Optimality and computation of (σ, S) policies in the multi-item infinite horizon inventory problem. Management Science, 13(7), 475–491.
Johnson, E. L., Nemhauser, G. L., & Savelsbergh, M. W. (2000). Progress in linear programming-based algorithms for integer programming: An exposition. Informs Journal on Computing, 12(1), 2–23.
Kim, S.-H., Cohen, M. A., Netessine, S., & Veeraraghavan, S. (2010). Contracting for infrequent restoration and recovery of mission-critical systems. Management Science, 56(9), 1551–1567.
Kwasinski, A., Wang, W., & Mohammadi, F. S. (2020). Reinforcement learning for resource allocation in cognitive radio networks. In F.-L. Luo (Ed.), Machine learning for future wireless communications (pp. 27–44). Wiley. https://doi.org/10.1002/9781119562306.ch2
Lee, J., Bharosa, N., Yang, J., Janssen, M., & Rao, H. R. (2011). Group value and intention to use-a study of multi-agency disaster management information systems for public safety. Decision Support Systems, 50(2), 404–414.
Li, D., Ding, L., & Connor, S. (2020). When to switch? Index policies for resource scheduling in emergency response. Production and Operations Management, 29(2), 241–262.
Lin, R.-C., Sir, M. Y., Sisikoglu, E., Pasupathy, K., & Steege, L. M. (2013). Optimal nurse scheduling based on quantitative models of work-related fatigue. IIE Transactions on Healthcare Systems Engineering, 3(1), 23–38.
Lodree, E. J., Jr., & Taskin, S. (2008). An insurance risk management framework for disaster relief and supply chain disruption inventory planning. Journal of the Operational Research Society, 59(5), 674–684.
Lu, M., Shahn, Z., Sow, D., Doshi-Velez, F., & Lehman, L.-W. H. (2020). Is deep reinforcement learning ready for practical applications in healthcare? A sensitivity analysis of duel-DDQN for sepsis treatment. arXiv preprint arXiv:2005.04301
Luong, N. C., Hoang, D. T., Gong, S., Niyato, D., Wang, P., Liang, Y.-C., & Kim, D. I. (2019). Applications of deep reinforcement learning in communications and networking: A survey. IEEE Communications Surveys & Tutorials, 21(4), 3133–3174.
Mao, H., Alizadeh, M., Menache, I., & Kandula, S. (2016). Resource management with deep reinforcement learning. In Proceedings of the 15th ACM workshop on hot topics in networks (pp. 50–56).
Mills, A. F., Helm, J. E., & Wang, Y. (2021). Surge capacity deployment in hospitals: Effectiveness of response and mitigation strategies. Manufacturing & Service Operations Management, 23(2), 367–387.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. In NIPS deep learning workshop. https://doi.org/10.48550/arXiv.1312.5602
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., & Petersen, S. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533.
Mosavi, A., Ghamisi, P., Faghan, Y., & Duan, P. (2020). Comprehensive review of deep reinforcement learning methods and applications in economics. arXiv preprint arXiv:2004.01509
Ordu, M., Demir, E., Tofallis, C., & Gunal, M. M. (2021). A novel healthcare resource allocation decision support tool: A forecasting-simulation-optimization approach. Journal of the Operational Research Society, 72(3), 485–500.
Parker, F., Sawczuk, H., Ganjkhanloo, F., Ahmadi, F., & Ghobadi, K. (2020). Optimal resource and demand redistribution for healthcare systems under stress from COVID-19. arXiv preprint arXiv:2011.03528
Physiopedia. (n.d.). Disaster management. https://www.physio-pedia.com/Disaster_Management#cite_note-p1-2
Poon, L. L. M., & Peiris, M. (2020). Emergence of a novel human coronavirus threatening human health. Nature Medicine, 26, 317–319.
Stauffer, J. M., Pedraza-Martinez, A. J., Yan, L. L., & Van Wassenhove, L. N. (2018). Asset supply networks in humanitarian operations: A combined empirical-simulation approach. Journal of Operations Management, 63, 44–58.
Sundaramoorthi, D., Chen, V. C., Rosenberger, J. M., Kim, S. B., & Buckley-Behan, D. F. (2009). A data-integrated simulation model to evaluate nurse-patient assignments. Health Care Management Science, 12(3), 252–268.
Sundaramoorthi, D., Chen, V. C., Rosenberger, J. M., Kim, S. B., & Buckley-Behan, D. F. (2010). A data-integrated simulation-based optimization for assigning nurses to patient admissions. Health Care Management Science, 13(3), 210–221.
Sutton, R. S., & Barto, A. G. (1999). Reinforcement learning. Journal of Cognitive Neuroscience, 11(1), 126–134.
Thrun, S. & Schwartz, A. (1993). Issues in using function approximation for reinforcement learning. In Proceedings of the 1993 connectionist models summer school (vol. 6). Lawrence Erlbaum, Hillsdale, NJ.
Tinsley, C. H., Dillon, R. L., & Cronin, M. A. (2012). How near-miss events amplify or attenuate risky decision making. Management Science, 58(9), 1596–1613.
van der Laan, E., van Dalen, J., Rohrmoser, M., & Simpson, R. (2016). Demand forecasting and order planning for humanitarian logistics: An empirical assessment. Journal of Operations Management, 45, 114–122.
van Hasselt, H. (2010). Double Q-learning. Advances in Neural Information Processing Systems, 23, 2613–2621.
van Hasselt, H., Guez, A., & Silver, D. (2015). Deep reinforcement learning with double Q-learning. arXiv preprint arXiv:1509.06461
Watkins, C. J. C. H. (1989). Learning from delayed rewards. Doctoral dissertation, King's College, University of Cambridge.
WHO. (2023). Coronavirus disease (COVID-19) outbreak situation. https://covid19.who.int/
Wolsey, L. A., & Nemhauser, G. L. (1999). Integer and combinatorial optimization (Vol. 55). John Wiley & Sons. https://doi.org/10.1002/9781118627372
Wu, J., Chen, S., & Liu, X. (2020). Efficient hyperparameter optimization through model-based reinforcement learning. Neurocomputing, 409, 381–393.
Yang, T., Hu, Y., Gursoy, M. C., Schmeink, A., & Mathar, R. (2018). Deep reinforcement learning based resource allocation in low latency edge computing networks. In 2018 15th international symposium on wireless communication systems (ISWCS) (pp. 1–5). IEEE.
Ye, H., Li, G. Y., & Juang, B.-H.F. (2019). Deep reinforcement learning based resource allocation for V2V communications. IEEE Transactions on Vehicular Technology, 68(4), 3163–3173.
Ye, Y., Jiao, W., & Yan, H. (2020). Managing relief inventories responding to natural disasters: Gaps between practice and literature. Production and Operations Management, 29(4), 807–832.
Zhang, C., Atasu, A., Ayer, T., & Toktay, B. L. (2020). Truthful mechanisms for medical surplus product allocation. Manufacturing & Service Operations Management, 22(4), 735–753.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to report, and this work involved no human participants or animals.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, J., Tutun, S., Anvaryazdi, S.F. et al. Management of resource sharing in emergency response using data-driven analytics. Ann Oper Res 339, 663–692 (2024). https://doi.org/10.1007/s10479-023-05702-x