Enhanced Speech Recognition via A TensorFlow-Powered Lip Reading Model for Educational Applications

Conference

2024 South East Section Meeting

Location

Marietta, Georgia

Publication Date

March 10, 2024

Start Date

March 10, 2024

End Date

March 12, 2024

Tagged Topic

Diversity

Page Count

9

DOI

10.18260/1-2--45521

Permanent URL

https://peer.asee.org/45521

Download Count

127

Paper Authors

Mourya Teja Kunuku, Kennesaw State University

Ph.D. student at Kennesaw State University. Research interests include deep learning, generative AI, and LLMs.

Nasrin Dehbozorgi, Kennesaw State University, ORCID: orcid.org/0009-0004-2748-0654

I’m an Assistant Professor of Software Engineering and the director of the AIET lab in the College of Computing and Software Engineering at Kennesaw State University. With a Ph.D. in Computer Science and prior experience as a software engineer in industry, my interest in both academic and research activities has laid the foundation for my work on advancing educational technologies and pedagogical interventions.

Abstract

Speech recognition is a widely used technology with many applications in the academic domain and beyond. In educational research, AI-based speech recognition serves purposes such as analyzing students’ team discussions and classroom discourse, and providing transcriptions to assist students with disabilities and hearing impairments. However, auditory speech recognition faces challenges such as environmental noise, poor audio quality, and speaker identification in discourse analysis. This paper proposes an approach to address these challenges by introducing an AI model for lip reading built with TensorFlow. The proposed model eliminates the need for auditory input in speech recognition by using artificial intelligence to analyze speech through the visual cues of lip movement, a task also known as Visual Speech Recognition (VSR). Applying this method can significantly impact pedagogical practices. By transcribing speech from lip reading into text in real time, it offers an assistive learning tool for students with disabilities and improves knowledge accessibility. Furthermore, it enables educational researchers to analyze video content even where audio quality is degraded, especially in remote learning settings.

Kunuku, M. T., & Dehbozorgi, N. (2024, March). Enhanced Speech Recognition via A TensorFlow-Powered Lip Reading Model for Educational Applications. Paper presented at 2024 South East Section Meeting, Marietta, Georgia. 10.18260/1-2--45521

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2024 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015