Mechanical Engineering Department at Louisiana Tech University. She is also the Director of the Office for Women in Science and Engineering at Louisiana Tech. William C. Long, Louisiana Tech University ©American Society for Engineering Education, 2025

WIP: Evaluating Programming Skills in the Age of LLMs: A Hybrid Approach to Student Assessment

Abstract

The advent of large language models (LLMs), such as OpenAI's ChatGPT, has heightened the challenge of assessing student understanding and maintaining academic integrity on homework assignments. In a course with a heavy focus on programming, it is common for a significant portion of the grade to be determined by such assignments. When an LLM is prompted with the
Intelligence (AI) is no longer a subject of science fiction or a niche for specialized industries. AI permeates everyday life, impacting how people work, communicate, and solve problems locally and globally [1]. AI applications in higher education have grown significantly in recent years, as evidenced by the adoption of AI-driven instructional design tools and applications (e.g., Khan Academy's Khanmigo, ChatGPT for Education, MagicSchool), AI-enabled scientific literature search engines (e.g., Semantic Scholar, Consensus), collaborative applications (e.g., MS Teams), smart AI features in learning management systems (e.g., Canvas), and AI-based assistants (e.g., Grammarly, Canva). The widespread infusion of generative AI (GenAI) specifically marked a new
is struggling and resorts to outside assistance to complete the work.

Introduction

Student cheating on programming homework assignments in introductory computer science courses is a long-standing trend [1-4], a problem that widespread access to large language models such as ChatGPT has substantially exacerbated. A 2023 survey found that 30% of students frequently used GenAI tools to complete assignments [5]. Many academics have expressed concern that this may largely undermine learning processes and erode academic integrity [6]. Now that advanced LLMs can generate content that is relatively indistinguishable from human-created content [7-11], cheating detection has become much more difficult. Research
for teaching 6th-grade Missouri math standards, incorporating project-based learning with LEGO Mindstorm cars and coordinate plane activities. These activities provide hands-on, engaging ways to connect coding with math standards, fostering both computational thinking and mastery of grade-level concepts. This provided a framework for implementing advanced ML knowledge in STEM education by developing practical methodologies to teach complex ML principles through easily accessible tools. For the high school students, an intentional choice was made to use Scratch rather than Python for programming, given the influence of AI tools (such as ChatGPT) that can produce an entire Python code script. Scratch seemed a little more foolproof, as
educators. By removing technical barriers while maintaining pedagogical quality, we aim to support more efficient and effective assessment creation processes across engineering disciplines. Future work will focus on measuring this impact through detailed evaluation of system adoption patterns and educational outcomes.

References

[1] J. Hassell, "Best Practices for Using Generative AI to Create Quiz Content for the Canvas LMS," 2024 ASEE Midwest Section Conference, ASEE, 2024.
[2] S. Willison, "Things we learned about LLMs in 2024," SimonWillison.net, Dec. 31, 2024. [Online]. Available: https://simonwillison.net/2024/Dec/31/llms-in-2024/.
[3] J. Yang et al., "Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond," ACM Transactions on Knowledge
limitation was that we used a general-purpose GPT-4 model without any fine-tuning on human annotations. Fine-tuning GPT-4 with human-annotated LO evaluations based on the SMART criteria may improve the LLM's performance. The third limitation was that, although we used the SMART criteria, those criteria need to be refined and evaluated by educational experts. This process will help us design better guidelines for evaluating learning objectives. Lastly, we used only one LLM (GPT-4) to evaluate LOs; exploring the efficacy of other LLMs and comparing their ability to assess LOs is therefore necessary.

References:
[1] E. Kasneci et al., "ChatGPT for good? On opportunities and challenges of large language models for education," Learn. Individ
. R. Adapa, and Y. E. V. P. K. Kuchi, "The Power of Generative AI: A Review of Requirements, Models, Input–Output Formats, Evaluation Metrics, and Challenges," Future Internet, vol. 15, p. 260, 2023, doi: 10.3390/fi15080260.
[15] A. K. Y. Chan and W. Hu, "Students' voices on generative AI: perceptions, benefits, and challenges in higher education," International Journal of Educational Technology in Higher Education, vol. 20, no. 1, p. 43, 2023, doi: 10.1186/s41239-023-00411-8.
[16] Tlili et al., "What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education," Smart Learning Environments, vol. 10, no. 1, p. 15, 2023, doi: 10.1186/s40561-023-00237-x.
[17] M. S
ChatGPT prompt engineering method for automatic question generation in English education," Education and Information Technologies, vol. 29, no. 9, pp. 11483–11515, Oct. 2023, doi: 10.1007/s10639-023-12249-8.
[10] I. Coffee, "Anki FSRS Explained," Anki-decks.com, 2024. [Online]. Available: https://anki-decks.com/blog/post/anki-fsrs-explained/ (accessed Apr. 29, 2025).
[11] B. C. Figueras and R. Agerri, "Critical Questions Generation: Motivation and Challenges," arXiv.org, 2024. [Online]. Available: https://arxiv.org/abs/2410.14335 (accessed Apr. 29, 2025).
[12] S. Mucciaccia, T. Paixão, F. Mutz, A. De Souza, C. Badue, and T. Oliveira-Santos, "Automatic Multiple-Choice Question Generation and Evaluation Systems Based on LLM: A Study Case With University Resolutions
lives. This is especially true now, as the world is in the midst of a number of controversies involving biased datasets for training neural networks, unfair uses of ChatGPT, and Elon Musk's call for a moratorium on AI development. Results from this research will be used as preliminary findings while planning large-scale regional research activities related to AI that could be supported by NSF, Amazon Machine Learning University, or the Department of Education. A collaborative network consisting of local schoolteachers interested in AI and AI-active university professors will be created to further promote and implement AI in the K-12 curriculum. Partnership modalities with the AI4K12 organization will be investigated to improve AI literacy
Systematic Literature Review,” in Frontiers in Education 2024, Washington DC, Oct. 2024.[5] L. Labadze, M. Grigolia, and L. Machaidze, “Role of AI chatbots in education: systematic literature review,” Int J Educ Technol High Educ, vol. 20, no. 1, p. 56, Oct. 2023, doi: 10.1186/s41239-023-00426-1.[6] B. Freeman and K. Aoki, “ChatGPT in education: A comparative study of media framing in Japan and Malaysia,” in Proceedings of the 2023 7th International Conference on Education and E-Learning, in ICEEL ’23. New York, NY, USA: Association for Computing Machinery, May 2024, pp. 26–32. doi: 10.1145/3637989.3638020.[7] S. Hadjerrouit, “Learning Management Systems Learnability: Requirements from Learning Theories,” presented at the
students were provided with an example essay generated by ChatGPT on the topic of the engineering design process. As a class, we reviewed the essay, analyzing its strengths and identifying areas for correction or improvement. We also explored ways to refine the prompt and discussed potential biases in ChatGPT responses. The pre-survey and post-survey questions are detailed in Figure 1.

Figure 1: The questions administered on the pre-survey and post-survey aligned with project learning goals for the freshman-level project class.

Data Analysis for Freshman-Level Project Class

The data analysis involved examining the pre-survey and post-survey data and conducting a final analysis to compare both surveys to determine growth in each
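A minimal sketch of the pre/post growth comparison described above: for each survey question, the mean paired gain across students is computed. The question labels and Likert-scale responses here are invented purely for illustration, not taken from the study's data.

```python
# Hypothetical pre/post Likert responses (1-5 scale), one list per
# survey question; entries are paired by student. All values invented.
from statistics import mean

pre = {
    "Q1 design process":  [2, 3, 2, 3, 2],
    "Q2 prompt refining": [1, 2, 2, 1, 3],
}
post = {
    "Q1 design process":  [4, 4, 3, 5, 4],
    "Q2 prompt refining": [3, 4, 3, 2, 4],
}

def growth(pre_scores, post_scores):
    """Mean paired gain (post minus pre) for one survey question."""
    return mean(b - a for a, b in zip(pre_scores, post_scores))

for q in pre:
    print(f"{q}: mean gain = {growth(pre[q], post[q]):+.2f}")
```

A positive mean gain indicates growth on that learning goal between surveys; in practice a paired significance test would accompany the raw gains.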
.[28] Kapor Center. Culturally responsive-sustaining computer science education: A framework, 2021. URL https://kaporfoundation.org/publications/.[29] OpenAI. ChatGPT, 2024. URL https://chatgpt.com.[30] Ryan L Boyd, Ashwini Ashokkumar, Sarah Seraj, and James W Pennebaker. The development and psychometric properties of LIWC-22. Technical report, University of Texas at Austin, 2022. URL https://www.liwc.app/.[31] Matthew L. Newman, James W. Pennebaker, Diane S. Berry, and Jane M. Richards. Lying Words: Predicting Deception from Linguistic Styles. Personality and Social Psychology Bulletin, 29(5):665–675, May 2003. ISSN 0146-1672, 1552-7433. doi: 10.1177/0146167203029005010. URL http://journals.sagepub.com/doi/10.1177
. In particular, natural language processing (NLP), a subset of gen-AI, enables computers to quickly parse and understand text by identifying the meaningful parts of sentences [34]. Since the release of ChatGPT and similar chatbots, engineering education researchers have explored diverse use cases of NLP, including analyzing student writing and assignments, examining curricula, processing research data, supporting students, and assessment [35], [36], [37]. Recent work by our research group [38] has also demonstrated the potential for NLP to aid qualitative thematic analysis by expediting the codebook generation process. Importantly, these efforts take advantage of how NLP handles semantically and syntactically different text by identifying patterns between word
or health applications, on-device inference means data does not need to be transmitted to a server for processing, thus preserving user privacy. This also saves bandwidth and battery life [28], as transmitting and receiving are among the most energy-intensive tasks for IoT devices. Local ML models alleviate this burden, mitigate the risk of man-in-the-middle attacks, and enable customization, allowing the model to adapt to individual user needs. While highlighting the benefits of ML, we also addressed its challenges and limitations, such as adversarial attacks, fairness concerns, and the need for explainable AI (XAI). Many students, having interacted with AI technologies like ChatGPT, were already familiar with AI's potential for error. However
published an ASEE conference paper last year on the effects of ChatGPT on student learning in programming courses. With over seven years of experience teaching computer science courses, she is currently a faculty member in Embry-Riddle Aeronautical University's Department of Computer, Electrical, and Software Engineering, where she teaches computer science courses.

Dr. Luis Felipe Zapata-Rivera, Embry-Riddle Aeronautical University

Dr. Luis Felipe Zapata-Rivera is an Assistant Professor at Embry-Riddle Aeronautical University. He earned a Ph.D. in Computer Engineering at Florida Atlantic University and previously worked as an assistant researcher in the Educational Technologies group at Eafit University in Medellin
instance, a study by Escalante et al. (2023) [6] examined the learning outcomes of university students receiving feedback from ChatGPT (GPT-4) versus human tutors. These students shared the common feature of being English as a New Language (ENL) learners. The results indicated no significant difference in learning outcomes between the two groups, suggesting that AI-generated feedback can be effectively incorporated into writing instruction. Other studies provide similar results within STEM learning environments. A recent systematic literature review [7] identified six common categories of AI methods used in education from 2011 to 2021. This work highlights the complexity and opportunities of the rapidly evolving technology and how it can be integrated into learning environments