An Automated Approach for Finding Course-specific Vocabulary


2013 ASEE Annual Conference & Exposition


Atlanta, Georgia

Publication Date

June 23, 2013

Start Date

June 23, 2013

End Date

June 26, 2013



Conference Session

First-Year Programs (FPD) Poster Session

Tagged Division

First-Year Programs

Page Numbers

23.155.1 - 23.155.14



Paper Authors


Chirag Variawa University of Toronto

Chirag Variawa is a Ph.D. candidate in Industrial Engineering at the University of Toronto. His research is in using artificial intelligence to maximize the accessibility of language used in engineering education instructional materials. His work on the Board of Governors at the University of Toronto further serves to improve accessibility for all members of the university community.


Susan McCahan University of Toronto

Dr. Susan McCahan is vice-dean, Undergraduate, and is a professor in the Department of Mechanical and Industrial Engineering in the Faculty of Applied Science and Engineering at the University of Toronto.


Mark Chignell University of Toronto

Mark Chignell is a professor of Mechanical and Industrial Engineering at the University of Toronto where he has been on the faculty since 1990. Prior to that he was an assistant professor in Industrial and Systems Engineering at the University of Southern California from 1984 to 1990. He earned the Ph.D. in Psychology from the University of Canterbury in New Zealand in 1981, and an M.S. in Industrial and Systems Engineering from Ohio State in 1984. Mark is currently president of Vocalage Inc., a University of Toronto spinoff company, director of the Interactive Media Lab, and a visiting scientist at both the IBM Centre for Advanced Studies and Keio University in Japan.


Evaluating a Computational Approach to Identify Domain-specific Vocabulary

Understanding domain-specific vocabulary is often a learning objective in the engineering curriculum. This vocabulary includes “technical jargon” in an engineering discipline, and learning the vocabulary of the field is an important technical competency. Final exams, for instance, include such terms as they attempt to evaluate a student’s mastery of course concepts. These terms may be commonly used in a course, but are unfamiliar particularly to a student new to the field, i.e., a freshman. The corpus of language common to both the instructor and student converges as the student masters the domain vocabulary. However, at the freshman level the difference in vocabulary between the student and instructor is greatest. Language also plays a key role in creating an accessible and inclusive learning environment. Research suggests that new students may experience a sense of alienation in the classroom due in part to the difference in the way language is used in the learning environment relative to their home community. This gap in common vocabulary forms a communication barrier that prohibits access to learning and undermines valid assessment.

In this study, the authors attempt to systematically identify domain-specific vocabulary using a computational and statistical approach to analyze the language used on engineering exams. The goal is to create an automated system to identify domain-specific vocabulary on exams or other teaching materials to help both the instructor teach, and the students navigate, the language of the field more effectively.

The authors employed strategies from the fields of higher education, industrial engineering, computational linguistics, and statistics to create an algorithmic approach to identify and categorize domain-specific language on engineering examinations. A critical component of this study is the ability to distinguish domain-specific vocabulary from “everyday” language. The goal is to preserve the integrity of the exam and help faculty and students navigate the learning of technical language more effectively. Specifically, the authors are developing a computer program which automatically identifies domain-specific terms on any document. The program has been tested on a databank of over 2800 exams (developed from 2000 to present) from a large North American university. Each word from each exam is examined using a Term-Frequency Inverse Document-Frequency (TF-IDF) algorithm to generate lists of characteristic terms. Then, these lists are compared using IBM SPSS statistics software to further isolate discipline-specific vocabulary.

This study analyses the effectiveness of this program across several engineering disciplines to see whether the language is categorized accurately. To date, the authors have analyzed the vocabulary of 15 exams in detail, and results indicate that the program is able to accurately identify discipline-specific words. Results of this work will be presented and an analysis of the data will be included in the paper. Going forward, the authors anticipate that this method can also help flag low-frequency non-domain-specific language on engineering exams and eventually lead to software that can highlight potentially inaccessible language: both technical vocabulary and, separately, non-technical but linguistically or culturally difficult vocabulary that may require explanation.
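
The abstract names TF-IDF but does not say which weighting variant the authors' program uses, so the following is only a minimal Python sketch of the general technique, not the authors' implementation. It assumes a plain term-frequency weight (count divided by document length) and an unsmoothed inverse document frequency, log(N / df). The three sample "exams", the function names `tokenize`, `tf_idf`, and `top_terms`, and the tokenization rule are all hypothetical and chosen for illustration. The later SPSS comparison step is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumed weighting (not taken from the paper):
      tf  = occurrences of the term in the document / tokens in the document
      idf = log(number of documents / documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against an empty document
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(score, key=lambda term: (-score[term], term))[:k]


# Hypothetical miniature "exams" from three disciplines (invented data).
exams = [
    "Calculate the torque on the beam and the stress in the beam.",
    "Calculate the voltage across the resistor and the current in the circuit.",
    "Calculate the enthalpy of the reaction and the entropy of the mixture.",
]

scores = tf_idf(exams)
for index, score in enumerate(scores):
    print(index, top_terms(score))
```

Under this weighting, words that occur in every document (such as "the" and "calculate" here) receive a score of zero, which is how the approach separates "everyday" language from terms characteristic of one exam. A real corpus would likely need additional handling, such as stemming and multi-word terms, which this sketch leaves out.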

Variawa, C., McCahan, S., & Chignell, M. (2013, June). An Automated Approach for Finding Course-specific Vocabulary. Paper presented at 2013 ASEE Annual Conference & Exposition, Atlanta, Georgia. 10.18260/1-2--19169

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2013 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015