Deep learning based audio and video cross-modal recommendation | IEEE Conference Publication | IEEE Xplore