ISCA Archive SLaTE 2019

Tagging child-adult interactions in naturalistic, noisy, daylong school environments using i-vector based diarization system

Prasanna V. Kothalkar, Dwight Irvin, Ying Luo, Joanne Rojas, John Nash, Beth Rous, John H. L. Hansen

Child growth in speech and language is a crucial indicator of long-term learning ability and lifelong progress. Because the preschool classroom offers a valuable opportunity to monitor growth in young children's interactions, the analysis of such data has gained prominence among early childhood researchers. The first task in any analysis of these naturalistic recordings is to parse and tag the interactions between adults and young children. An automated tagging system would provide child interaction metrics and serve as the basis for any further processing. This study investigates the language environment of 3-5 year old children using a CRSS-based diarization strategy with an i-vector-based baseline that captures rapid adult-to-child and child-to-child conversational turns in a naturalistic, noisy early childhood setting. We analyze various loss functions and learning algorithms for Deep Neural Networks that separate child speech from adult speech. Performance is measured in terms of diarization error rate and Jaccard error rate, and the system shows good results for tagging adult vs. children's speech. Distinguishing the primary child from secondary children would be useful for monitoring a given child, and an analysis of this distinction is also provided. Our diarization system offers insight into how to pre-process and analyze challenging naturalistic daylong child speech recordings.
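
For readers unfamiliar with the two metrics named above, the following is a minimal frame-level sketch of how diarization error rate (DER) and Jaccard error rate (JER) are commonly defined. It is not the authors' scoring code. It assumes one label per frame with no overlapping speech, no scoring collar, and hypothesis labels that have already been mapped to the reference labels; the function names, the "sil" non-speech label, and the toy label sequences are all hypothetical.

```python
from typing import Sequence


def frame_der(ref: Sequence[str], hyp: Sequence[str], nonspeech: str = "sil") -> float:
    """Frame-level DER: (missed + false alarm + confusion) / reference speech frames.

    Assumes one label per frame and hypothesis labels already mapped to
    reference labels (an illustrative simplification).
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    pairs = list(zip(ref, hyp))
    # Reference speech that the system labeled as non-speech.
    missed = sum(r != nonspeech and h == nonspeech for r, h in pairs)
    # Reference non-speech that the system labeled as speech.
    false_alarm = sum(r == nonspeech and h != nonspeech for r, h in pairs)
    # Both say speech, but the speaker labels disagree.
    confusion = sum(r != nonspeech and h != nonspeech and r != h for r, h in pairs)
    ref_speech = sum(r != nonspeech for r in ref)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / ref_speech


def frame_jer(ref: Sequence[str], hyp: Sequence[str], nonspeech: str = "sil") -> float:
    """Frame-level JER: mean over reference speakers of 1 - |ref AND hyp| / |ref OR hyp|."""
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    speakers = sorted({r for r in ref if r != nonspeech})
    if not speakers:
        raise ValueError("reference contains no speech frames")
    pairs = list(zip(ref, hyp))
    errors = []
    for spk in speakers:
        intersection = sum(r == spk and h == spk for r, h in pairs)
        union = sum(r == spk or h == spk for r, h in pairs)
        errors.append(1.0 - intersection / union)
    return sum(errors) / len(errors)


# Toy example with hypothetical labels, one per frame.
ref = ["adult", "adult", "child", "child", "sil", "child"]
hyp = ["adult", "child", "child", "sil", "sil", "child"]
print(frame_der(ref, hyp))  # 0.4 (1 missed + 1 confused over 5 speech frames)
print(frame_jer(ref, hyp))  # 0.5 (adult: 1 - 1/2, child: 1 - 2/4)
```

DER pools all errors over total reference speech, so a dominant speaker can mask errors on a rarely speaking one; JER averages a per-speaker error, so each speaker class (here adult and child) carries equal weight.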


doi: 10.21437/SLaTE.2019-17

Cite as: Kothalkar, P.V., Irvin, D., Luo, Y., Rojas, J., Nash, J., Rous, B., Hansen, J.H.L. (2019) Tagging child-adult interactions in naturalistic, noisy, daylong school environments using i-vector based diarization system. Proc. 8th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2019), 89-93, doi: 10.21437/SLaTE.2019-17

@inproceedings{kothalkar19_slate,
  author={Prasanna V. Kothalkar and Dwight Irvin and Ying Luo and Joanne Rojas and John Nash and Beth Rous and John H. L. Hansen},
  title={{Tagging child-adult interactions in naturalistic, noisy, daylong school environments using i-vector based diarization system}},
  year=2019,
  booktitle={Proc. 8th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2019)},
  pages={89--93},
  doi={10.21437/SLaTE.2019-17}
}