Trilingual Semantic Embeddings of Visually Grounded Speech with Self-Attention Mechanisms | IEEE Conference Publication | IEEE Xplore