Audio-Visual Hybrid Approach for Filling Mass Estimation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12668)

Abstract

Object handover is a fundamental capability for robots that interact with humans in applications such as household chores. In this challenge, we estimate physical properties of a variety of containers and their fillings, namely the container capacity and the type and percentage of the content, to enable collaborative physical handover between humans and robots. We introduce multi-modal prediction models using the audio-visual datasets, distributed by CORSMAL, of people interacting with containers.
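
As a rough illustration of how the three estimated properties could combine into the filling mass named in the title, the sketch below multiplies capacity, fill fraction, and a per-type density. The relation, the function name `filling_mass_g`, and the density table are assumptions made for this example (only the density of water is a known constant); none of it is taken from the authors' implementation.

```python
# Illustrative densities in g/mL. These values are assumptions for this sketch,
# not figures reported in the paper.
ASSUMED_DENSITY_G_PER_ML = {
    "empty": 0.0,
    "water": 1.0,
    "rice": 0.85,   # placeholder value
    "pasta": 0.41,  # placeholder value
}


def filling_mass_g(capacity_ml: float, filling_type: str, fill_percent: float) -> float:
    """Combine estimated capacity, filling type and fill percentage into a mass in grams."""
    if capacity_ml < 0:
        raise ValueError("capacity_ml must be non-negative")
    if not 0.0 <= fill_percent <= 100.0:
        raise ValueError("fill_percent must be between 0 and 100")
    density = ASSUMED_DENSITY_G_PER_ML[filling_type]  # KeyError for unknown types
    return capacity_ml * (fill_percent / 100.0) * density


if __name__ == "__main__":
    # A 500 mL container estimated to be half full of water.
    print(filling_mass_g(500.0, "water", 50.0))
```

Under these assumptions, an error in any one of the three estimates scales the final mass proportionally, which is one reason to predict each property with the modality best suited to it.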

R. Ishikawa and Y. Nagao—Equal contribution.

Notes

  1. http://corsmal.eecs.qmul.ac.uk/ICPR2020challenge.html

  2. https://github.com/YuichiNAGAO/ICPRchallenge2020

Author information

Correspondence to Reina Ishikawa.

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this paper

Ishikawa, R., Nagao, Y., Hachiuma, R., Saito, H. (2021). Audio-Visual Hybrid Approach for Filling Mass Estimation. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12668. Springer, Cham. https://doi.org/10.1007/978-3-030-68793-9_32

  • DOI: https://doi.org/10.1007/978-3-030-68793-9_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68792-2

  • Online ISBN: 978-3-030-68793-9

  • eBook Packages: Computer Science, Computer Science (R0)
