
Towards Food Handling Robots for Automated Meal Preparation in Healthcare Facilities

  • Conference paper
  • In: Computer Vision Systems (ICVS 2023)

Abstract

Meal preparation in healthcare facilities is an area of work severely affected by the shortage of qualified personnel. Recent advances in automation technology have enabled the use of picking robots to fill the gaps created by changes in the labor market. Building on these advances, we present a robotic system designed to handle packaged food for automated meal preparation in healthcare facilities. To address the challenge of grasping the large variety of packaged foods, we propose a novel technique for model-free grasping pose detection based on geometric features that is optimized for the given scenario. We provide a comprehensive system overview and conduct evaluations of the grasping success on a real robot. The high grasping success rate of \(94\%\) with a processing time of \(\sim \!\!280\) ms indicates the suitability of the proposed approach for automating meal preparation tasks in healthcare facilities.



Acknowledgments

This work has received funding from the German Federal Ministry of Education and Research (BMBF) under grant agreement No 01IS21061A for the Sim4Dexterity project and from the Baden-Württemberg Ministry of Economic Affairs, Labour and Tourism for the AI Innovation Center “Learning Systems and Cognitive Robotics”.

Author information


Corresponding author

Correspondence to Lukas Knak.


Appendix

We provide an ablation and hyperparameter study for the components of the grasping pose detection method. The experimental conditions and setup are identical to the evaluations reported in Sect. 5. A reduced number of 100 grasping cycles is performed per experiment, and we evaluate the grasping success rate as well as the mean \(\mu \) and standard deviation \(\sigma \) of the processing time per cycle, including image retrieval and grasp computation. All grasp pose detection computations are performed on an NVIDIA RTX 3090 GPU. During each cycle, the grasp with the highest score is executed first. If it is unsuccessful, up to two more grasp attempts are made without recomputing the grasps, so that both the top-1 and top-3 accuracy can be measured.
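As a rough illustration of this protocol, a minimal Python sketch of the evaluation loop is given below. It is not the authors' code: capture_depth_image(), detect_grasps(), and execute_grasp() are hypothetical placeholders for image retrieval, grasp pose detection, and robot execution.

```python
import statistics

def run_evaluation(num_cycles=100, max_attempts=3):
    """Illustrative sketch of the evaluation protocol described above.
    The three helper functions are hypothetical placeholders, not the authors' interfaces."""
    top1, top3, cycle_times = 0, 0, []

    for _ in range(num_cycles):
        depth_image, t_capture = capture_depth_image()   # hypothetical: image retrieval
        grasps, t_detect = detect_grasps(depth_image)     # hypothetical: grasp pose detection
        cycle_times.append(t_capture + t_detect)          # processing time per cycle

        # Execute the best-scored grasp first; on failure, retry with the next two
        # candidates without recomputing the grasp set (top-1 vs. top-3 accounting).
        ranked = sorted(grasps, key=lambda g: g.score, reverse=True)[:max_attempts]
        for attempt, grasp in enumerate(ranked):
            if execute_grasp(grasp):                      # hypothetical: robot execution
                top1 += int(attempt == 0)
                top3 += 1
                break

    return (top1 / num_cycles, top3 / num_cycles,
            statistics.mean(cycle_times), statistics.pstdev(cycle_times))
```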

For the component ablation study, the algorithm is evaluated without grasp rejection via the proposed ray-casting approach. In addition, the top-3 success rate is measured without pruning of the grasp set, i.e., all grasping poses whose distance to other candidates is smaller than the suction cup diameter are retained. The results are shown in Table 2.

Table 2. Grasping evaluation results for the component ablation study.

We observe that without the ray-casting rejection approach, the top-1 grasping success rate drops to \(77\%\), with a reduced processing time of \(\sim \!\!170\) ms. The failed grasps are attributed to noise in the camera image producing invalid grasp candidates, which are otherwise rejected by checking the local area against the suction cup in the ray-casting process. Without pruning of the grasp set, the top-3 performance declines slightly to \(97\%\), with a negligible processing time difference in the range of microseconds. The ablation study thus confirms the positive impact of both algorithm components.
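To make the two ablated components more concrete, the following sketch shows one plausible form of the distance-based pruning and a simplified local check of the suction cup footprint on the depth image. It is an illustration under our own assumptions (function names, greedy pruning, a square depth patch as a stand-in for ray casting), not the implementation used in the paper.

```python
import numpy as np

def prune_grasp_set(positions, scores, suction_cup_diameter):
    """Illustrative pruning: keep only the best-scored grasp within each
    suction-cup-diameter neighborhood (greedy, highest score first).
    positions: (N, 3) array of grasp positions, scores: (N,) array."""
    order = np.argsort(scores)[::-1]
    kept = []
    for i in order:
        if all(np.linalg.norm(positions[i] - positions[j]) >= suction_cup_diameter
               for j in kept):
            kept.append(i)
    return kept

def local_suction_check(depth, u, v, radius_px, max_depth_deviation):
    """Illustrative stand-in for the ray-casting rejection: accept a grasp only if the
    depth values under the suction cup footprint stay close to the contact depth, so
    that noise-induced spikes do not yield invalid grasps."""
    patch = depth[v - radius_px:v + radius_px + 1, u - radius_px:u + radius_px + 1]
    return float(np.nanmax(np.abs(patch - depth[v, u]))) <= max_depth_deviation
```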

For the hyperparameter study of the grasp detection process, we investigate the impact of changes in individual parameters on the success rate and processing time, with all other parameters kept unchanged. The optimal hyperparameters used for the results in Sect. 5 are: neighborhood size \(A=5\times 5\), threshold \(c=0.9\), and number of iterations \(T=5\). The results for the varied parameters \(A\), \(c\), and \(T\) are shown in Table 3.

Table 3. Grasping evaluation results for the hyperparameter study.

We observe that when varying the neighborhood size \(A\), the grasping results show negligible differences in both success rate and speed, indicating that any of the tested neighborhood sizes can be chosen for successful grasping. For the threshold parameter \(c\), on the other hand, we find a strong difference in success rates: for the smaller threshold \(c=0.7\), the top-1 success rate drops to \(83\%\). This is because with smaller thresholds, the object edges are smoothed by the heuristic and the proposed grasps are predicted closer to the edges, leading to more failed grasps. For the larger threshold \(c=0.95\), small irregularities, e.g., on the lids of objects, produce strongly varying grasp quality values with many local maxima that are not centered on the objects and can therefore result in unstable grasps. Finally, for the number of iterations \(T\) for which the heuristic is applied to the scene, we find that the grasping success declines for both higher and lower numbers of iterations. We assume that this hyperparameter, which depends on the object sizes in pixels, must be tuned individually for the given camera intrinsics and the distance of the camera from the objects.
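The interplay of \(A\), \(c\), and \(T\) can be pictured as an iterative neighborhood filter over a normalized grasp-quality map. The sketch below is only one plausible reading of these hyperparameters, not the paper's heuristic; the function name, the use of SciPy's uniform_filter, and the assumption of a quality map in \([0,1]\) are ours.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def iterate_quality_heuristic(quality, neighborhood=5, threshold=0.9, iterations=5):
    """Illustrative reading of the hyperparameters: average a grasp-quality map over an
    A x A neighborhood and zero out values below the threshold c, repeated for T
    iterations. More iterations smooth over larger pixel regions, which is why T has to
    match the object sizes in pixels (camera intrinsics and camera-to-object distance)."""
    q = np.clip(quality.astype(np.float32), 0.0, 1.0)   # quality assumed normalized to [0, 1]
    for _ in range(iterations):
        q = uniform_filter(q, size=neighborhood)        # neighborhood size A
        q = np.where(q >= threshold, q, 0.0)            # threshold c suppresses weak regions
    return q
```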


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Knak, L., Jordan, F., Nickel, T., Kraus, W., Bormann, R. (2023). Towards Food Handling Robots for Automated Meal Preparation in Healthcare Facilities. In: Christensen, H.I., Corke, P., Detry, R., Weibel, JB., Vincze, M. (eds) Computer Vision Systems. ICVS 2023. Lecture Notes in Computer Science, vol 14253. Springer, Cham. https://doi.org/10.1007/978-3-031-44137-0_26


  • DOI: https://doi.org/10.1007/978-3-031-44137-0_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44136-3

  • Online ISBN: 978-3-031-44137-0

  • eBook Packages: Computer Science, Computer Science (R0)
