Learning Contextual Representation with Convolution Bank and Multi-head Self-attention for Speech Emphasis Detection | IEEE Conference Publication | IEEE Xplore