Design and Development of a Human-Machine Dialog Corpus for the Automated Assessment of Conversational English Proficiency

Ramanarayanan, Vikram

doi:10.21437/Interspeech.2020-1988

This paper presents a carefully designed corpus of scored spoken conversations between English language learners and a dialog system to facilitate research and development of both human and machine scoring of dialog interactions. We collected speech, demographic, and user experience data from non-native speakers of English who interacted with a virtual boss as part of a workplace pragmatics skill-building application. Expert raters then scored the dialogs on a custom rubric encompassing 12 aspects of conversational proficiency as well as an overall holistic performance score. We analyze key corpus statistics and discuss the advantages of such a corpus for both human and machine scoring.

Cite as: Ramanarayanan, V. (2020) Design and Development of a Human-Machine Dialog Corpus for the Automated Assessment of Conversational English Proficiency. Proc. Interspeech 2020, 419-423, doi: 10.21437/Interspeech.2020-1988

@inproceedings{ramanarayanan20_interspeech,
  author={Vikram Ramanarayanan},
  title={{Design and Development of a Human-Machine Dialog Corpus for the Automated Assessment of Conversational English Proficiency}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={419--423},
  doi={10.21437/Interspeech.2020-1988}
}