Enhanced Speech Recognition via A TensorFlow-Powered Lip Reading Model for Educational Applications

Mourya Teja Kunuku; Nasrin Dehbozorgi

Conference: 2024 South East Section Meeting
Location: Marietta, Georgia
Publication Date: March 10, 2024
Start Date: March 10, 2024
End Date: March 12, 2024
Tagged Topic: Diversity
Page Count: 9
DOI: 10.18260/1-2--45521
Permanent URL: https://peer.asee.org/45521

Paper Authors

Mourya Teja Kunuku, Kennesaw State University

Ph.D. student at Kennesaw State University. Research interests include deep learning, generative AI, and LLMs.

Nasrin Dehbozorgi, Kennesaw State University, orcid.org/0009-0004-2748-0654

I’m an Assistant Professor of Software Engineering and the director of the AIET lab in the College of Computing and Software Engineering at Kennesaw State University. With a Ph.D. in Computer Science and prior experience as a software engineer in the industry, my interest in both academic and research activities has laid the foundation to work on advancing educational technologies and pedagogical interventions.

Abstract

Speech recognition is a widely used technology with many applications in the academic domain and beyond. In educational research, AI-based speech recognition serves purposes such as analyzing students’ team discussions and classroom discourse, and assisting students with disabilities and hearing impairments through transcription. However, auditory speech recognition faces challenges such as environmental noise, poor audio quality, and speaker identification in discourse analysis. This paper proposes an approach to address these challenges by introducing an AI model for lip reading built with TensorFlow. Our proposed model eliminates the need for auditory input in speech recognition by using artificial intelligence to analyze speech through the visual cues of lip movement, an approach known as Visual Speech Recognition (VSR). The application of this method can significantly impact pedagogical practices. By providing real-time transcription of speech from lip reading into text, it offers an assistive learning tool for students with disabilities and enhances knowledge accessibility. Furthermore, it enables educational researchers to analyze video content even in environments with degraded audio quality, especially in remote learning settings.

Citation

Kunuku, M. T., & Dehbozorgi, N. (2024, March), Enhanced Speech Recognition via A TensorFlow-Powered Lip Reading Model for Educational Applications. Paper presented at 2024 South East Section Meeting, Marietta, Georgia. 10.18260/1-2--45521