Uncovering Students’ Social Networks: Entity Resolution Methods for Ambiguous Interaction Data

Conference

2023 ASEE Annual Conference & Exposition

Location

Baltimore, Maryland

Publication Date

June 25, 2023

Start Date

June 25, 2023

End Date

June 28, 2023

Conference Session

COED Modulus Topics

Tagged Division

Computers in Education Division (COED)

Page Count

12

DOI

10.18260/1-2--44526

Permanent URL

https://peer.asee.org/44526

Download Count

222

Paper Authors

Adam Steven Weaver Utah State University

Adam Weaver is a B.S. Mechanical Engineering student at Utah State University. His research focuses on developing explicit and efficient disambiguation methods for large-scale social network studies. In addition, he works with applications of Particle Image Velocimetry (PIV) and wrote a curriculum that uses PIV to teach energy conservation to high school students.

Jack Elliott Utah State University ORCID: orcid.org/0009-0006-5833-0557

Jack Elliott is a concurrent M.S. (Mechanical Engineering) and Ph.D. (Engineering Education) graduate student at Utah State University. His M.S. research is in experimental fluid dynamics, his Ph.D. work examines student social support networks in engineering education, and his other research activities include developing low-cost technology-based tools for improving fluid dynamics education.

Abstract

The purpose of this paper is to describe our research group’s motivations and methods for resolving ambiguity in engineering students’ interaction data. Students’ social interactions can be an integral part of their success in schooling. For example, researchers have identified positive correlations between students’ interaction levels and students’ sense of belonging, access to knowledge, and ability to work in a team. To identify these relationships, engineering educators often employ a method called Social Network Analysis (SNA). SNA enables a quantitative look at inherently qualitative social interactions via mathematical representations of real-world social networks. With such mathematical representations, researchers can extract network measures, like centrality, which estimates how connected a student is to other students. Such measures, when compared to students’ outcomes of interest (e.g., academic success or belonging), provide insights into which network characteristics relate to desired outcomes.
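As a small illustration of the kind of network measure described above, the sketch below computes normalized degree centrality (the fraction of other students each student is connected to) from an undirected list of interactions. The function name `degree_centrality`, the student labels, and the example edges are made up for illustration; the paper does not state which centrality measure or tooling the authors use.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalized degree centrality for an undirected edge list.

    Each score is (number of distinct neighbors) / (number of nodes - 1),
    so a student connected to every other student scores 1.0.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-reported links to oneself
        neighbors[a].add(b)
        neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical interaction data: each pair is one reported interaction.
example_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
scores = degree_centrality(example_edges)
print(scores["C"])  # 1.0, because C is connected to all three other students
```

Scores like these could then be compared against an outcome of interest, which is the comparison the abstract describes.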

However, the conclusions drawn from SNA are limited to the study’s scope. For example, online interaction data limits study conclusions to online networks. In face-to-face (f2f) networks, data collection and consolidation requirements (e.g., removing name variances and reducing missing network information) scale with study scope, making large-scale f2f SNA difficult. To balance authentic interactions with a manageable study scope, researchers often conduct SNA in small settings like single classrooms. Further, such studies often ask students to identify connections from a roster of student enrollment, which reduces the number of potential interactions. To analyze more authentic student networks, our research group is conducting an open-response, large-scale (1,000+ nodes) study of a full first- and second-year cohort of engineering students’ f2f and online social networks over two years.

During this study, our research group found that the primary issue in data consolidation is reference ambiguity (i.e., differences in response spelling, formatting, etc.). Entity resolution (the process of assigning ambiguous connections to real-life entities) is a valuable method that researchers have applied in some f2f network contexts, but that we have not observed in engineering education studies. To make entity resolution more available to engineering education researchers, this paper presents our development and deployment of a Python-based entity resolution module: EntityRAID (Entity Resolution for Ambiguous Interaction Data).

EntityRAID begins by initializing a key with high-confidence names (i.e., self-reported and/or registry names). After initializing the key, EntityRAID compares resolved names (names in the key) to remaining ambiguous names through the Levenshtein distance (literal string similarity) and Double Metaphone (phonetic similarity) algorithms and consolidates similar pairs of names via user-defined thresholds. Lastly, EntityRAID resolves the remaining ambiguous names using a low-confidence key of non-participant full names.
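The string-similarity step described above can be sketched roughly as follows. This is not EntityRAID's code: the function names (`levenshtein`, `normalize`, `resolve`), the `max_distance` threshold, and the example names are all assumptions made for illustration. The sketch covers only the Levenshtein comparison against a key of resolved names; the Double Metaphone phonetic comparison is omitted because it needs a separate implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or substitutions
    needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not count."""
    return " ".join(name.lower().split())


def resolve(ambiguous: str, key: list[str], max_distance: int = 2):
    """Return the closest name in the key within max_distance edits, else None.

    max_distance plays the role of a user-defined threshold (hypothetical default).
    """
    target = normalize(ambiguous)
    best_name, best_distance = None, max_distance + 1
    for candidate in key:
        distance = levenshtein(target, normalize(candidate))
        if distance < best_distance:
            best_name, best_distance = candidate, distance
    return best_name


# Hypothetical high-confidence key (e.g., self-reported or registry names).
key = ["Jordan Smith", "Taylor Nguyen"]
print(resolve("Jordon Smith", key))  # Jordan Smith (one substitution away)
print(resolve("Chris Lee", key))     # None (no key name within the threshold)
```

Names left unresolved by this step (those returning `None`) would then go on to the later stages described above, such as the phonetic comparison or the low-confidence key.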

EntityRAID is posted in a public GitHub repository and will enable engineering educators to perform large-scale SNA at a reduced resource cost. This improvement should allow researchers to focus on more authentic student networks than previously studied, generating broader and more generalizable conclusions about which social practices help students succeed.

Weaver, A. S., & Elliott, J. (2023, June). Uncovering Students’ Social Networks: Entity Resolution Methods for Ambiguous Interaction Data. Paper presented at 2023 ASEE Annual Conference & Exposition, Baltimore, Maryland. 10.18260/1-2--44526

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2023 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015