
Is Natural Language Processing Effective in Education Research? A case study in student perceptions of TA support


Conference

2023 ASEE Annual Conference & Exposition

Location

Baltimore, Maryland

Publication Date

June 25, 2023

Start Date

June 25, 2023

End Date

June 28, 2023

Conference Session

Research Methodologies – Session 2

Tagged Division

Educational Research and Methods Division (ERM)

Page Count

17

DOI

10.18260/1-2--43887

Permanent URL

https://peer.asee.org/43887



Paper Authors


Neha Kardam University of Washington


Neha Kardam is a third-year Ph.D. student in Electrical and Computer Engineering at the University of Washington, Seattle.



Shruti Misra University of Washington


Shruti Misra is a graduate student in Electrical and Computer Engineering at the University of Washington, Seattle. Her research interest is broadly focused on studying innovation in university-industry partnerships.



Denise Wilson University of Washington


Denise Wilson is a professor of electrical engineering at the University of Washington, Seattle. Her research interests in engineering education focus on the role of self-efficacy, belonging, and other non-cognitive aspects of the student experience.



Abstract

Natural language processing (NLP) techniques are widely used in linguistic analysis and have shown promising results in areas such as text summarization, text classification, autocorrection, chatbot conversation management, and many other applications. In education, NLP has primarily been applied to automated essay or open-ended question grading, semantic evaluation of student work, or the generation of feedback for intelligent tutoring-based student interaction. However, what is notably missing from NLP work to date is a robust automated framework for accurately analyzing text-based educational survey data. To address this gap, this case study uses NLP models to generate codes for thematic analysis of student needs for teaching assistant (TA) support and then compares NLP-generated code assignments with those assigned by an expert researcher.

Student responses to short-answer questions regarding preferences for TA support were collected from an instructional support survey conducted in a broad range of electrical, computer, and mechanical engineering courses (N>1400) at a large public research institution between 2016 and 2021. The resulting dataset was randomly split into training (60%), validation (20%), and test (20%) sets. A popular NLP topic modeling approach (Latent Dirichlet Allocation, LDA) was applied to the training dataset, which determined the optimal number of topics (codes) represented in the dataset to be four. These four topics were labeled as: (1) examples, where students expressed a need for TAs to illustrate additional problem-solving and applied content in engineering courses; (2) questions and answers, where students desired more opportunities to pose questions to TAs and obtain timely answers to those questions; (3) office hours, encompassing additional availability outside of formally scheduled class times; and (4) lab support. For the validation and test datasets, an experienced researcher then used these four labels as codes to identify the ground truth for each student's response. Ground truth was then compared to NLP model predictions to gauge the accuracy of the model. For the validation dataset, the accuracy with which NLP identified each response as containing or not containing each code ranged from 79.4% to 91.1%, while for the test dataset, such accuracies ranged from 81.1% to 92.2%. The codes identified by NLP were then combined into themes by a human researcher, resulting in three themes (problem-solving, interactions, and active/experiential learning). Conclusions reached regarding the three themes were identical whether the NLP codes or the (human) researcher codes were used for data interpretation.
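The paper does not include its modeling code, but the LDA pipeline described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions: the response texts below are hypothetical stand-ins for the survey data, the preprocessing is simplified, and the topic count is fixed at the four the study arrived at rather than selected by model comparison.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical stand-ins for short-answer survey responses.
responses = [
    "more worked examples of circuit problems",
    "TA should answer questions quickly after class",
    "extra office hours before exams would help",
    "more help debugging code during lab sessions",
    "walk through example problems step by step",
    "office hours conflict with my schedule",
    "answer my questions in discussion section",
    "lab equipment demos from the TA",
]

# Bag-of-words representation; a real study would typically add
# lemmatization and minimum-frequency filtering as well.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(responses)

# Fit LDA with k=4 topics (the count the study selected; in practice
# k is chosen by comparing perplexity or coherence across candidates).
lda = LatentDirichletAllocation(n_components=4, random_state=0)
doc_topic = lda.fit_transform(X)  # rows are per-response topic mixtures

# The highest-weight topic serves as the predicted code for a response.
predicted_codes = doc_topic.argmax(axis=1)
```

Because LDA topic indices are arbitrary, a human must first inspect each topic's top words and map it to a code label (as the study's researcher did) before per-code accuracy against ground truth can be computed.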
Short-answer questions, despite their value in providing deeper insight into the student experience, are infrequently used in educational research because the resulting data often requires prohibitive human resources to analyze. This study has demonstrated, in a case study of student preferences for TA support, the value of NLP in understanding large numbers of textual, short-answer responses from students. The fact that NLP models can deliver the same bottom line in minutes compared to the hours that traditional thematic analysis methods consume is promising for expanding the use of more nuanced, richer text-based data in survey-based education research.

Kardam, N., & Misra, S., & Wilson, D. (2023, June), Is Natural Language Processing Effective in Education Research? A case study in student perceptions of TA support Paper presented at 2023 ASEE Annual Conference & Exposition, Baltimore, Maryland. 10.18260/1-2--43887

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2023 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference.