Learning relationship between speech and image | IEEE Conference Publication | IEEE Xplore