March 24, 2021
March 26, 2021
Speech and language development in early childhood is crucial to a child’s long-term learning ability and lifelong educational journey. A child’s vocabulary size at kindergarten entry is an early indicator of reading acquisition and of potential long-term success in school. The preschool classroom is thus a promising venue for monitoring young children’s growth by measuring their interactions with teachers and classmates. Automatic Speech Recognition (ASR) technologies give early childhood researchers the ability to analyze naturalistic recordings in these settings automatically. For this purpose, data were collected in a high-quality childcare learning center in the United States using Language Environment Analysis (LENA) devices worn by the preschool children. A preliminary task for ASR of daylong audio recordings is diarization, i.e., segmenting the audio into smaller parts to identify ‘who spoke when’. This study investigates different deep learning-based diarization systems for classroom interactions of 3-5 year old children. The focus, however, is on ‘speaker group’ diarization, which classifies speech segments as coming from adults or from children, across multiple classrooms. SincNet-based diarization systems achieve an utterance-level Diarization Error Rate of 21.6%. Utterance-level speaker group confusion matrices also show promising, balanced results. These diarization systems have potential applications in developing metrics for rapid adult-to-child or child-to-child conversational turns in a naturalistic, noisy early childhood setting. Such technical advancements will also help teachers quantify and understand their interactions with children more efficiently, make changes as needed, and monitor the impact of those changes.
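For readers unfamiliar with the two evaluation measures named in the abstract, the following is a minimal sketch, not the authors' evaluation code. It assumes the conventional definition of Diarization Error Rate (false alarm, missed speech, and speaker confusion time divided by total reference speech time) and a simple count-based confusion matrix over utterances that are already aligned one-to-one. The function names `der` and `group_confusion`, the labels, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def der(false_alarm, missed, confusion, total_reference):
    """Conventional DER: summed error time over total reference speech time.

    All arguments are durations in the same unit (e.g. seconds).
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (false_alarm + missed + confusion) / total_reference


def group_confusion(reference, hypothesis):
    """Count (reference_label, hypothesis_label) pairs over aligned utterances."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    return Counter(zip(reference, hypothesis))


# Hypothetical toy data: five utterances labeled by speaker group.
ref = ["adult", "child", "child", "adult", "child"]
hyp = ["adult", "child", "adult", "adult", "child"]
matrix = group_confusion(ref, hyp)

# Hypothetical durations in seconds, not taken from the paper.
toy_der = der(false_alarm=2.0, missed=1.0, confusion=3.0, total_reference=60.0)

for (ref_label, hyp_label), count in sorted(matrix.items()):
    print(f"reference={ref_label:5s} hypothesis={hyp_label:5s} count={count}")
print(f"toy DER = {toy_der:.1%}")
```

A balanced confusion matrix in this sense means that adult and child utterances are misclassified at similar rates, rather than one group absorbing most of the errors.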
Kothalkar, P. V., Buzhardt, J., Hansen, J. H. L., Irvin, D., & Rous, B. S. (2021, March). Child vs Adult Speaker Diarization of naturalistic audio recordings in preschool environment using Deep Neural Networks. Paper presented at ASEE 2021 Gulf-Southwest Annual Conference, Waco, Texas. https://peer.asee.org/36365
ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2021 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference.