End-to-end training image-text matching network | IEEE Conference Publication | IEEE Xplore