A MATLAB Tool for Speech Processing, Analysis and Recognition: SAR-LAB

Veton Kepuska; Mihir Patal; Nicholas Rogers

Conference: 2006 Annual Conference & Exposition
Location: Chicago, Illinois
Publication Date: June 18, 2006
Start Date: June 18, 2006
End Date: June 21, 2006
ISSN: 2153-5965
Conference Session: Innovative and Computer-Assisted Lab Studies
Tagged Division: Division Experimentation & Lab-Oriented Studies
Page Count: 19
Page Numbers: 11.67.1 - 11.67.19
DOI: 10.18260/1-2--263
Permanent URL: https://peer.asee.org/263

Paper Authors

Veton Kepuska, Florida Tech

Kepuska joined FIT in 2003 after 12 years of R&D experience in the high-tech industry in the Boston area, developing speech recognition technologies. The presented work is partly the result of the belief that cutting-edge research can only be conducted with appropriate supporting software tools. To bring that cutting-edge research to the undergraduate level, the software tools have to be not only easy to use but also intuitive. The SAR-LAB software was therefore designed and developed with a clear goal in mind: to evolve into a standard educational and research tool in the area of speech processing, analysis, and recognition.

Mihir Patal, Florida Tech

As an undergraduate student, Mihir was part of the NSF-funded team developing a MATLAB tool for Speech Processing, Analysis and Recognition: SAR-LAB. Mihir served a crucial role in the design and execution phases of the project.

Nicholas Rogers, Florida Tech

Nicholas was an undergraduate student when he became involved in partially funded research in machine learning. The MATLAB-based research tool SAR-LAB emerged from this research, and Nicholas played a crucial role in its development.

Abstract
NOTE: The first page of text has been automatically extracted and included below in lieu of an abstract

A MATLAB Tool for Speech Processing, Analysis and Recognition: SAR-LAB

Abstract

The presented work is related to research on developing a “smart-room.” A smart-room can sense all voice activity within the room and pinpoint the source of each audio signal (the speaker). The purpose of this audio sensing is twofold: to monitor for key words, sentences, or phrases that the monitoring entity has flagged as relevant, and to separate all acoustic sources from one another (e.g., background noise from the speaker's voice). A crucial requirement for successfully creating such a smart-room is speech recognition that is accurate (in terms of recognition performance), efficient (in CPU and memory use), and consistent (it must work equally well for all speakers, i.e., be speaker independent, and in all acoustic environments). To achieve this goal, it becomes necessary to develop tools that enable advanced research in the area of speech processing, analysis, and recognition, specifically in this case wake-up-word (WUW) recognition. Developing such a system requires numerous tests of various system models. Modules for audio signal processing, feature extraction, voice activity detection, pattern classification, scoring, and so on must be combined in order to perform speech recognition. Thus, a major hurdle in this area of research is the analysis, testing, verification, and integration of the individual functions required for speech recognition. To address the analysis and testing issue, an appropriate software tool was developed in the MATLAB environment that provides a unified framework for tracking the performance of all necessary functions of a WUW recognition system. This framework can also be used for testing algorithms and other software components that perform speech analysis and recognition tasks.
In addition to integrating all of the various components, the testing environment can produce additional analysis data, all appropriately presented as graphs, charts, or images (e.g., spectrograms), that are useful when analyzing and/or troubleshooting the components under research. This testing environment has proven very useful in aiding research on the development of “wake-up word” recognition technology. The tool has thus made the research process much more efficient, accurate, and productive.

Introduction

The primary objective of the presented work was to develop an analysis and testing environment for a speech recognition engine in MATLAB. A problem encountered when working on speech recognition projects is that the processed data comes in the form of a large collection of vectors (e.g., a matrix) that typically represent the energies of speech sounds in various frequency bands [1]. The developed testing utility is extremely useful because it provides a visual representation of various complex parameters, represented as patterns, vectors, or scalars, extracted from the time-dependent speech signal. In addition, there are various specific features, original or derived, that are traditionally difficult to analyze because of their interdependency and time dependency. It is envisioned that the testing utility will greatly benefit future speech application developers because of its versatility and ease of extensibility. Other future uses include possible integration of the tool into newer versions of MATLAB.
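
To make the "collection of vectors" concrete, the following is a minimal sketch of how a frame-by-band energy matrix might be computed from a sampled signal. It is written in Python with only the standard library and is not the SAR-LAB MATLAB code. The function name band_energy_matrix, the frame length, hop size, number of bands, Hamming window, and the synthetic 1500 Hz test tone are all assumptions chosen for illustration.

```python
import cmath
import math


def band_energy_matrix(signal, frame_len=64, hop=32, n_bands=4):
    """Return one row per frame; each row holds the energy in
    n_bands equal-width frequency bands (hypothetical helper)."""
    half = frame_len // 2              # non-redundant DFT bins, Nyquist excluded
    bins_per_band = half // n_bands
    matrix = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Hamming window to reduce spectral leakage between bins
        windowed = [
            x * (0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)))
            for n, x in enumerate(frame)
        ]
        # Naive DFT power spectrum for bins 0 .. half-1 (slow but dependency-free)
        power = []
        for k in range(half):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(windowed)
            )
            power.append(abs(acc) ** 2)
        # Sum bin powers into equal-width bands to get one feature vector
        row = [
            sum(power[b * bins_per_band:(b + 1) * bins_per_band])
            for b in range(n_bands)
        ]
        matrix.append(row)
    return matrix


# Synthetic input (assumed values): a 1500 Hz tone sampled at 8000 Hz
sample_rate = 8000
tone_hz = 1500
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]

features = band_energy_matrix(signal)
for row in features:
    print(["%.1f" % e for e in row])
```

With these assumed settings each band spans 1000 Hz, so the tone's energy concentrates in the second band (1000 to 2000 Hz) of every frame. Even for this tiny input the result is a matrix of numbers that is hard to read as text, which is the kind of data that benefits from being plotted as an image such as a spectrogram.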

Citation

Kepuska, V., Patal, M., & Rogers, N. (2006, June). A MATLAB Tool for Speech Processing, Analysis and Recognition: SAR-LAB. Paper presented at 2006 Annual Conference & Exposition, Chicago, Illinois. 10.18260/1-2--263