
Board 408: Toward Building a Human-Computer Coding Partnership: Using Machine Learning to Analyze Short-Answer Explanations to Conceptually Challenging Questions

Conference

2024 ASEE Annual Conference & Exposition

Location

Portland, Oregon

Publication Date

June 23, 2024

Start Date

June 23, 2024

End Date

July 12, 2024

Conference Session

NSF Grantees Poster Session

Tagged Topic

NSF Grantees Poster Session

Permanent URL

https://peer.asee.org/46996

Paper Authors

Harpreet Auby, Tufts University (ORCID: orcid.org/0000-0002-0117-6097)

Harpreet is a graduate student in Chemical Engineering and STEM Education. He works with Dr. Milo Koretsky and helps study the role of learning assistants in the classroom as well as machine learning applications within educational research and evaluation. He is also involved in projects studying the uptake of the Concept Warehouse. His research interests include chemical engineering education, learning sciences, and social justice.

Namrata Shivagunde, University of Massachusetts, Lowell

Anna Rumshisky, University of Massachusetts, Lowell

Milo Koretsky, Tufts University

Milo Koretsky is the McDonnell Family Bridge Professor in the Department of Chemical and Biological Engineering and in the Department of Education at Tufts University. He is also co-Director of the Institute for Research on Learning and Instruction (IRLI). He received his B.S. and M.S. degrees from UC San Diego and his Ph.D. from UC Berkeley, all in chemical engineering.

Abstract

In this paper, we report on the progress of a collaboration between engineering education and machine learning researchers that uses machine learning to analyze student thinking in written short-answer responses to conceptually challenging questions. Eliciting short-answer explanations that ask students to justify their answer choice to conceptually challenging multiple-choice questions has been shown to improve students’ answer choices, engagement, and overall conceptual understanding [1], [2]. These short-answer responses also provide valuable information for instructors and researchers seeking insight into student thinking [3]; however, analyzing them manually is cumbersome. Previous work applying natural language processing (NLP) in education research has shown that large language models (LLMs) such as T5 [4] and GPT-3 [5] can code student responses, reaching F1 scores of up to 73% when fine-tuned on labeled examples or prompted with in-context examples, respectively [4]. Thus, using NLP to qualitatively code short-answer responses can help researchers and instructors learn more about student thinking. We have the following goals:

- For instructors, we want to create a tool that reveals patterns of student reasoning and sense-making in short-answer responses. This information can help them shift their instructional practices.

- For education researchers, we want to create a tool that helps them understand and code aspects of student thinking in short-answer responses and develop codes or themes for future study.

- For machine learning researchers, we aim to develop language models and a set of prompting strategies for coding student answers. The models should identify and annotate the key concept and the reasoning behind the answer choice in a given text.

We hope to develop language models and prompting strategies that generalize to new science questions, so that instructors can use them as a tool to gain deeper insight into students’ understanding of a concept.
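To make the in-context approach concrete, the following minimal Python sketch shows few-shot prompting of a sequence-to-sequence LLM (here a public T5 variant as a stand-in for the models in [4]) to assign a qualitative code to a student explanation. The codebook labels, example explanations, and prompt wording are hypothetical illustrations, not the study's actual codebook.

# Minimal sketch: few-shot prompting a seq2seq LLM to code a short-answer
# explanation. Labels and examples below are hypothetical placeholders.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/flan-t5-base"  # public stand-in for the T5 models in [4]
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# In-context examples: (student explanation, human-assigned code)
FEW_SHOT = [
    ("The beam doesn't move, so the net moment about the pivot is zero.",
     "correct-reasoning"),
    ("I picked B because it looked like the biggest force.",
     "surface-feature"),
]

def code_response(explanation: str) -> str:
    """Build a few-shot prompt and return the model's predicted code."""
    prompt = "Assign a code to each student explanation.\n\n"
    for text, code in FEW_SHOT:
        prompt += f"Explanation: {text}\nCode: {code}\n\n"
    prompt += f"Explanation: {explanation}\nCode:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=8)
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()

print(code_response("The moments balance, so the force must act at the centroid."))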

At the 2022 ASEE Annual Meeting [6], we described preliminary results from applying large pre-trained generative sequence-to-sequence language models [4], [5] to automate qualitative coding of short-answer explanations to a statics concept question. At the 2023 Annual Meeting [7], we began to conceptualize a human-computer partnership in which human coding and computer coding influence one another to better analyze student narratives of understanding. We also began working to promote linguistic justice in our coding processes so that all narratives of understanding are attended to. This paper describes our progress in improving our prompting strategies for GPT-4, fine-tuning open-source LLMs such as Llama-2 [8] on the manually coded answers, extending the qualitative and machine learning analysis to another engineering context, and conceptualizing a human-machine partnership for understanding student thinking in written short-answer responses.
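In such a human-computer partnership, machine-assigned codes are scored against human codes; the F1 metric the abstract cites (up to 73%) is the standard measure. A hedged scikit-learn sketch, with illustrative labels rather than project data:

# Sketch: scoring machine-assigned codes against human codes with a
# macro-averaged F1. The label lists below are illustrative only.
from sklearn.metrics import f1_score

human_codes   = ["correct-reasoning", "surface-feature", "correct-reasoning",
                 "misconception", "surface-feature"]
machine_codes = ["correct-reasoning", "surface-feature", "misconception",
                 "misconception", "correct-reasoning"]

# Macro averaging weights each code equally, so rare codes are not drowned out.
macro_f1 = f1_score(human_codes, machine_codes, average="macro")
print(f"Macro F1: {macro_f1:.2f}")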

Auby, H., Shivagunde, N., Rumshisky, A., & Koretsky, M. (2024, June). Board 408: Toward Building a Human-Computer Coding Partnership: Using Machine Learning to Analyze Short-Answer Explanations to Conceptually Challenging Questions. Paper presented at the 2024 ASEE Annual Conference & Exposition, Portland, Oregon. https://peer.asee.org/46996

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2024 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference.