Affordance learning and inference based on vision-speech association in human-robot interactions | IEEE Conference Publication | IEEE Xplore