Using Deep Learning and Augmented Reality to Improve Accessibility: Inclusive Conversations Using Diarization, Captions, and Visualization

Conference

2023 ASEE Annual Conference & Exposition

Location

Baltimore, Maryland

Publication Date

June 25, 2023

Start Date

June 25, 2023

End Date

June 28, 2023

Conference Session

Design in Engineering Education Division (DEED) Technical Session 1

Tagged Division

Design in Engineering Education Division (DEED)

Tagged Topic

Diversity

Page Count

13

DOI

10.18260/1-2--44572

Permanent URL

https://peer.asee.org/44572

Download Count

254

Paper Authors

Yun Wang Undergraduate at University of Illinois Urbana-Champaign

Yun Wang is an undergraduate at the University of Illinois Urbana-Champaign. Research interests include using technology and algorithms to improve accessibility and inclusive education.

Colin P. Lualdi University of Illinois Urbana-Champaign (ORCID: orcid.org/0000-0003-2309-4807)

Lawrence Angrave University of Illinois Urbana-Champaign (ORCID: orcid.org/0000-0001-9762-7181)

Dr. Lawrence Angrave is an award-winning computer science Teaching Professor at the University of Illinois Urbana-Champaign. He creates and researches new opportunities for accessible, inclusive, and equitable education.

Guru Nanma Purushotam

Abstract

The problem of diarization - identifying different speakers in a conversation stream - has not been sufficiently addressed for deaf and hard-of-hearing students in learning communities such as student design teams in engineering and related STEM disciplines. Though the accuracy of the latest automated real-time speech-to-text systems is now approaching usably low word error rates, the generated text output is an incomplete representation of a multi-party conversation: in short, it solves the “what” but not the “who.” This creates barriers to our ideal of an inclusive and equitable learning community. Students who are deaf or hard of hearing are thus further marginalized and excluded from multi-party peer discussions with non-deaf participants because it is hard to visually follow who is speaking. To address these communication barriers, we used the Human Centered Engineering Design framework to identify a set of features that overcome them. This paper explores computerized diarization techniques that use a wide set of algorithms and audio metrics to assist in speaker identification. These techniques include mel-frequency cepstral coefficients (MFCC), volume, fundamental frequency identification, and deep learning of voice prints. For the goals described in this paper, a subset of existing algorithms that respected privacy and legal constraints was selected and evaluated for identifying speakers in a live audio stream. Several visualization methods were also designed and evaluated. These included embedding visualizations of the mel-frequency cepstrum, speaker identifier, pitch, volume, and other voice characteristics into a live caption stream. Both diarization and visualization were integrated into a live captioning tool, ScribeAR, previously introduced in ASEE regional proceedings, and rendered using a lightweight Augmented Reality display.
To facilitate captioning services in areas with limited network connectivity, whisper.cpp, a derivative of OpenAI’s Whisper project, was also incorporated into the application. Links to the open-source project are included so that other educators may adopt this inclusive practice. Some accessibility-related opportunities that could be used as motivating design projects for engineering students are described.
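
As an illustration of two of the audio metrics named in the abstract (volume and fundamental frequency), the following is a minimal Python sketch. It is not taken from the paper or from ScribeAR: the function names `rms_volume` and `estimate_f0`, the 80-400 Hz search range, and the use of a plain autocorrelation peak are all assumptions chosen for brevity. The abstract does not say which pitch-estimation method the authors used.

```python
import math


def rms_volume(frame):
    """Root-mean-square amplitude of one audio frame (a list of floats)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    Hypothetical helper: searches lags corresponding to f_max..f_min and
    returns None for a silent frame or when no positive peak is found.
    """
    if sum(x * x for x in frame) == 0.0:
        return None  # silent frame: pitch is undefined
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation: similarity of the frame to a copy
        # of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    return sample_rate / best_lag


# Synthetic test signal (an assumption, standing in for a microphone frame):
# 100 ms of a 200 Hz sine wave sampled at 8 kHz with amplitude 0.5.
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]

print("volume (RMS):", rms_volume(frame))
print("estimated F0 (Hz):", estimate_f0(frame, SAMPLE_RATE))
```

Per-frame values like these could be attached to each caption segment as simple cues for distinguishing speakers. A practical diarization system would combine them with richer features such as MFCCs or learned voice prints, as the abstract describes.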

Wang, Y., Lualdi, C. P., Angrave, L., & Purushotam, G. N. (2023, June), Using Deep Learning and Augmented Reality to Improve Accessibility: Inclusive Conversations Using Diarization, Captions, and Visualization. Paper presented at 2023 ASEE Annual Conference & Exposition, Baltimore, Maryland. 10.18260/1-2--44572

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2023 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015