A prosody inspired RNN approach for punctuation of machine produced speech transcripts to improve human readability

Abstract:

Human-machine speech interfaces rely on automatic speech recognition for speech-to-text conversion. So far, however, little effort has been devoted to adding punctuation marks to the recognized word chain after speech recognition, which impairs human readability and makes interpretation hard. This paper presents an effort to restore punctuation marks while keeping the latency introduced by this post-processing step low. The approach exploits prosodic structure and proposes a sequential modelling paradigm based on recurrent neural networks. Results show satisfactory punctuation restoration abilities, especially considering that sentence boundaries are reliably detected. Even if the predicted punctuation sequence is not error-free with respect to writing standards, human readers are expected to “repair” these errors more easily than when no punctuation is given at all and the reader is left confused about the basic segmentation of the word chain.
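
The abstract does not specify the network architecture, the prosodic features, or the punctuation inventory, so the following is only an illustrative sketch of the general idea: a recurrent tagger reads one prosodic feature vector per recognized word and emits one punctuation label per word. Everything named here is an assumption made for illustration, including the label set `LABELS`, the three features per word (pause length, pitch change, energy change), the layer sizes, and the untrained random weights. The recurrence runs left to right only, which is one plausible way to keep latency low; the paper may do this differently.

```python
import math
import random

# Hypothetical label set: the mark placed after each word ("" means none).
LABELS = ["", ",", ".", "?"]

def rand_matrix(rows, cols, rng):
    """Small random matrix standing in for trained weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def init_params(n_feat, n_hidden, seed=0):
    """Untrained parameters of a simple (Elman-style) recurrent tagger."""
    rng = random.Random(seed)
    return {
        "W_in": rand_matrix(n_hidden, n_feat, rng),
        "W_rec": rand_matrix(n_hidden, n_hidden, rng),
        "W_out": rand_matrix(len(LABELS), n_hidden, rng),
    }

def predict_labels(feats, p):
    """Left-to-right pass: one label index per word, using past context only."""
    h = [0.0] * len(p["W_rec"])
    label_ids = []
    for x in feats:
        pre = [a + b for a, b in zip(matvec(p["W_in"], x), matvec(p["W_rec"], h))]
        h = [math.tanh(v) for v in pre]
        scores = matvec(p["W_out"], h)
        label_ids.append(max(range(len(LABELS)), key=scores.__getitem__))
    return label_ids

def punctuate(words, label_ids):
    """Attach the predicted mark to each word and join into a string."""
    return " ".join(w + LABELS[i] for w, i in zip(words, label_ids))

# Made-up per-word features: [pause after word (s), pitch change, energy change].
words = ["hello", "how", "are", "you"]
feats = [
    [0.40, -0.2, -0.1],
    [0.02, 0.1, 0.0],
    [0.03, 0.0, 0.1],
    [0.60, 0.3, -0.2],
]
params = init_params(n_feat=3, n_hidden=8)
label_ids = predict_labels(feats, params)
text = punctuate(words, label_ids)
```

Because the weights are random, the predicted marks carry no meaning; the sketch only shows the data flow. Since each prediction depends only on the words seen so far, a label can be emitted as soon as its word and the features following it are available, without waiting for the end of the utterance.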

Published in: 2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)

Date of Conference: 11-14 September 2017

Date Added to IEEE Xplore: 25 January 2018

DOI: 10.1109/CogInfoCom.2017.8268246

Conference Location: Debrecen, Hungary