Efficient Audio-Visual Speaker Recognition Via Deep Multi-Modal Feature Fusion | IEEE Conference Publication | IEEE Xplore