
It’s Just Semantics: How to Get Robots to Understand the World the Way We Do

  • Conference paper
  • In: Robotics Research (ISRR 2022)
  • Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 27)


Abstract

Increasing robotic perception and action capabilities promise to bring us closer to agents that are effective for automating complex operations in human-centered environments. However, to achieve the degree of flexibility and ease of use needed to apply such agents to new and diverse tasks, representations are required for generalizable reasoning about conditions and effects of interactions, and as common ground for communicating with non-expert human users. To advance the discussion on how to meet these challenges, we characterize open problems and point out promising research directions.



Author information

Corresponding author

Correspondence to Jen Jen Chung.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Chung, J.J., Förster, J., Wulkop, P., Ott, L., Lawrance, N., Siegwart, R. (2023). It’s Just Semantics: How to Get Robots to Understand the World the Way We Do. In: Billard, A., Asfour, T., Khatib, O. (eds) Robotics Research. ISRR 2022. Springer Proceedings in Advanced Robotics, vol 27. Springer, Cham. https://doi.org/10.1007/978-3-031-25555-7_1
