
Integrating Data-Driven and Career Development Theory-Driven Approaches to Study High School Student Persistence in STEM Career Aspirations

Conference

2024 ASEE Annual Conference & Exposition

Location

Portland, Oregon

Publication Date

June 23, 2024

Start Date

June 23, 2024

End Date

June 26, 2024

Conference Session

DSA Technical Session 6

Tagged Topic

Data Science & Analytics Constituent Committee (DSA)

Page Count

13

DOI

10.18260/1-2--47651

Permanent URL

https://peer.asee.org/47651

Download Count

71

Paper Authors

Tonghui Xu University of Massachusetts, Lowell

Hsien-Yuan Hsu University of Massachusetts, Lowell orcid.org/0000-0003-2155-2093

Dr. Hsien-Yuan Hsu is an Assistant Professor in Research and Evaluation in the College of Education at the University of Massachusetts Lowell. Dr. Hsu received his PhD in Educational Psychology from Texas A&M University and has a background of statistics

Abstract

High school students’ aspirations for STEM occupations can significantly influence their decisions to pursue a STEM track in college or as a career. Existing large-scale datasets, such as the Education Longitudinal Study of 2002 (ELS:2002), allow a comprehensive investigation of the factors that contribute to high school students’ persistence in STEM career aspirations. Prior research on this topic often relies on a theory-driven approach to identify predictors and form hypotheses for statistical tests. Commonly used theories explaining persistence in STEM career aspirations include Social Cognitive Career Theory (SCCT), Expectancy-Value Theory (EVT), and Expectation States Theory (EST). However, challenges emerge when the theory-driven approach is applied to large-scale datasets. Many studies rely on a single theory to identify predictors, potentially missing the rich insights these datasets offer, yet employing multiple theories for predictor identification can yield an overwhelming number of predictors. This is where a data-driven approach becomes beneficial: a feature selection model can reduce the number of predictors identified from multiple theories. Notably, the predictors selected with this data-driven method remain interpretable because they are originally sourced from established theories. This study proposes a blended approach that integrates theory-driven and data-driven methods. We demonstrate this approach by analyzing the ELS:2002 dataset to construct a model explaining high school students’ persistence in STEM career aspirations. We first use three theory-driven approaches, following the SCCT, EVT, and EST frameworks, to identify candidate predictors from ELS:2002 and maximize data utilization. We then use the Boruta algorithm, a data-driven method based on random forest classification, to streamline predictor selection from this extensive list and construct the final model.
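The shadow-feature comparison at the heart of Boruta can be illustrated with a minimal sketch. This is not the authors' pipeline: the study ran Boruta in R with random forest importance, whereas the sketch below substitutes a simple absolute-correlation score as a stand-in importance measure and uses fixed cut-offs instead of Boruta's statistical test. The function names, the toy predictors (`math_score`, `coin_flip`), the thresholds, and the data are all assumptions made for illustration.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score
    (an assumption; Boruta itself uses random forest importance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def shadow_feature_selection(features, outcome, n_iter=50, seed=0):
    """Label each feature by how often it beats the best shuffled 'shadow' copy."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        # Shadow features: each column shuffled, which breaks any link to the outcome.
        shadow_scores = []
        for values in features.values():
            shuffled = list(values)
            rng.shuffle(shuffled)
            shadow_scores.append(abs_corr(shuffled, outcome))
        best_shadow = max(shadow_scores)
        # A real feature scores a hit only if it outperforms the best shadow.
        for name, values in features.items():
            if abs_corr(values, outcome) > best_shadow:
                hits[name] += 1
    labels = {}
    for name, count in hits.items():
        share = count / n_iter
        # Hypothetical cut-offs; Boruta decides with a statistical test instead.
        if share >= 0.9:
            labels[name] = "important"
        elif share <= 0.1:
            labels[name] = "unimportant"
        else:
            labels[name] = "tentative"
    return labels


# Toy data (hypothetical): the outcome depends on math_score, not on coin_flip.
outcome = [1 if i >= 100 else 0 for i in range(200)]
features = {
    "math_score": [float(i) for i in range(200)],
    "coin_flip": [float(i % 2) for i in range(200)],
}
labels = shadow_feature_selection(features, outcome)
```

Here `labels` maps each toy predictor to one of the three categories that Boruta also reports: important, tentative, or unimportant.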
The analytic sample comprises 2,741 9th-graders from 361 high schools who aspired to hold a STEM career at age 30. The binary outcome variable is whether these students still aspired, in 12th grade, to a STEM career at age 30. The procedure for implementing the approach includes (a) using three theory-driven approaches to identify potential predictor variables, (b) using Boruta in R 4.13 to distinguish important, tentative, and unimportant variables, and (c) clustering the important variables into subgroups and fitting several multilevel models to determine the best model and investigate the relationship between student persistence in STEM career aspirations and the predictors, while also exploring variability across schools. Of the 81 candidate predictors chosen through the three theory-driven approaches, Boruta identified 17 as important. These predictors were linked to parental expectations, math performance, student success expectations, student educational expectations, SES, self-efficacy, and gender. The significant variables were self-efficacy, student educational expectations, student success expectations, and gender. The odds ratios show that (a) students with strong math self-efficacy are more likely to persist in STEM career aspirations, whereas students with strong English self-efficacy are less likely to persist, (b) high educational expectations or high learning success expectations are associated with greater persistence in STEM career aspirations, and (c) female students are more likely than male students to persist in STEM career aspirations.
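To make the odds-ratio interpretation concrete, the sketch below shows how a logistic-regression coefficient converts to an odds ratio and how an odds ratio shifts a baseline probability. The coefficient and baseline values are hypothetical and are not taken from the paper; the function names are likewise illustrative.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic-regression coefficient (log-odds scale)."""
    return math.exp(coefficient)


def apply_odds_ratio(baseline_prob, ratio):
    """Probability obtained after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * ratio
    return new_odds / (1 + new_odds)


# Hypothetical values for illustration only (not estimates from the study).
hypothetical_coef = 0.40      # assumed coefficient for a predictor
hypothetical_baseline = 0.50  # assumed baseline probability of persistence

ratio = odds_ratio(hypothetical_coef)
shifted = apply_odds_ratio(hypothetical_baseline, ratio)
print(f"odds ratio = {ratio:.2f}, shifted probability = {shifted:.2f}")
```

An odds ratio above 1 raises the probability of persistence relative to the baseline, and an odds ratio below 1 lowers it, which is the sense in which the abstract describes predictors as making persistence more or less likely.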

Xu, T., & Hsu, H. (2024, June), Integrating Data-Driven and Career Development Theory-Driven Approaches to Study High School Student Persistence in STEM Career Aspirations. Paper presented at 2024 ASEE Annual Conference & Exposition, Portland, Oregon. 10.18260/1-2--47651

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2024 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015