Improving Medical Speech-to-Text Accuracy using Vision-Language Pre-training Models | IEEE Journals & Magazine | IEEE Xplore