Synthesis of clustering techniques in educational data mining

Conference

2017 ASEE Annual Conference & Exposition

Location

Columbus, Ohio

Publication Date

June 24, 2017

Start Date

June 24, 2017

End Date

June 28, 2017

Conference Session

Computing Technology Session 1

Tagged Division

Computers in Education

Tagged Topic

Diversity

Page Count

20

DOI

10.18260/1-2--28897

Permanent URL

https://peer.asee.org/28897

Download Count

520

Paper Authors

Doipayan Roy Purdue University

Peter Bermel Purdue University, West Lafayette (College of Engineering) ORCID: orcid.org/0000-0001-7140-0667

Dr. Peter Bermel is an assistant professor of Electrical and Computer Engineering at Purdue University. His research focuses on improving the performance of photovoltaic, thermophotovoltaic, and nonlinear systems using the principles of nanophotonics. Key enabling techniques for his work include electromagnetic and electronic theory, modeling, simulation, fabrication, and characterization.

Dr. Bermel is widely published in both peer-reviewed scientific journals and publications geared towards the general public. His work, which has been cited over 4400 times and has an h-index of 24, includes the following topics:
* Understanding and optimizing the detailed mechanisms of light trapping in thin-film photovoltaics
* Fabricating and characterizing 3D inverse opal photonic crystals made from silicon for photovoltaics, and comparing to theoretical predictions
* Explaining key physical effects influencing selective thermal emitters in order to achieve high performance thermophotovoltaic systems

Kerrie A. Douglas Purdue University, West Lafayette (College of Engineering) ORCID: orcid.org/0000-0002-2693-5272

Dr. Douglas is an Assistant Professor in the Purdue School of Engineering Education. Her research is focused on methods of assessment and evaluation unique to engineering learning contexts.

Heidi A. Diefes-Dux Purdue University, West Lafayette (College of Engineering) ORCID: orcid.org/0000-0003-3635-1825

Heidi A. Diefes-Dux is a Professor in the School of Engineering Education at Purdue University. She received her B.S. and M.S. in Food Science from Cornell University and her Ph.D. in Food Process Engineering from the Department of Agricultural and Biological Engineering at Purdue University. She is a member of Purdue’s Teaching Academy. Since 1999, she has been a faculty member within the First-Year Engineering Program, teaching and guiding the design of one of the required first-year engineering courses that engages students in open-ended problem solving and design. Her research focuses on the development, implementation, and assessment of modeling and design activities with authentic engineering contexts.

Michael Richey The Boeing Company

Michael Richey is an Associate Technical Fellow currently assigned to support workforce development and engineering education research. Michael is responsible for leading learning science research, which focuses on learning ecologies, complex adaptive social systems and learning curves. Michael pursues this research agenda with the goal of understanding the interplay between innovation, knowledge transfer and economies of scale as they are manifested in questions of growth, evolvability, adaptability and sustainability.

Additional responsibilities include providing business leadership for engineering technical and professional educational programs, covering topics such as advanced aircraft construction, composite structures, and product lifecycle management. Michael is responsible for leading cross-organizational teams from academia and government that focus on how engineering education must acknowledge and incorporate this new information and knowledge to build new methodologies and paradigms that engage these developments in practice. The objective of this research is to achieve continuous improvement and sustainable excellence in engineering education.

Krishna Madhavan Purdue University, West Lafayette (College of Engineering)

Dr. Krishna Madhavan is an Associate Professor in the School of Engineering Education. In 2008 he was awarded an NSF CAREER award for learner-centric, adaptive cyber-tools and cyber-environments using learning analytics. He leads a major NSF-funded project called Deep Insights Anytime, Anywhere (http://www.dia2.org) to characterize the impact of NSF and other federal investments in the area of STEM education. He also serves as co-PI for the Network for Computational Nanotechnology (nanoHUB.org) that serves hundreds of thousands of researchers and learners worldwide. Dr. Madhavan served as a Visiting Researcher at Microsoft Research (Redmond) focusing on big data analytics using large-scale cloud environments and search engines. His work on big data and learning analytics is also supported by industry partners such as The Boeing Company. He interacts regularly with many startups and large industrial partners on big data and visual analytics problems.

Siddharth Shah

Abstract

Problem Statement:

With the increasing demand for high-quality education, coupled with the geographic and logistical limitations of the traditional in-class education system, educational institutions are resorting to alternative forms of knowledge dissemination through online learning environments such as Massive Open Online Courses (MOOCs). These learning environments now produce a tremendous amount of data that can provide deep insights into learning processes and learner behaviors, but the data require careful processing to be converted into actionable insights. Educational data mining (EDM) is concerned with developing methods for exploring data that come from educational settings, and using those methods to better understand students and the settings in which they learn. Online educational platforms differ from classroom settings in that they allow students to register regardless of geographic location or financial and academic status, and to participate in and drop out of courses whenever they want with very few consequences. Naturally, learner behavior and motivation on such platforms are more diverse than, and very different from, those in a traditional educational setup. Investigating types of learners and their behavioral traits is therefore very important for devising effective pedagogical strategies for online learning.

Researchers have applied many traditional data mining techniques to study the behavioral patterns of online learners. Romero et al. (1, 3) study the application of traditional data mining techniques to EDM, particularly on web-based and adaptive platforms. Baker et al. (2) review past trends in EDM and the kinds of research questions that researchers have tried to answer over the years. Merceron et al. (4) and Baker et al. (5) provide case studies of the various machine learning and visualization tools that have been applied to EDM. Castro et al. (7) provide a detailed study of the clustering techniques that have been widely used in EDM. This paper focuses on one such data mining technique, namely clustering, for understanding the learner types that are typical in an online learning environment. Our goal is to provide a deep synthesis of clustering techniques in educational data mining.

In this paper, we investigate the use of clustering techniques for identifying learner types in MOOCs. We discuss some of the challenges presented by such a study and compare different clustering techniques in the context of educational data. Following that, we describe and demonstrate the use of a popular clustering algorithm, the K-means algorithm, for learner classification. The final section of the paper focuses on the use of K-means clustering for learner identification within the more constrained context of a highly technical and advanced engineering MOOC. We investigate the different types of learner behavior that emerge from this clustering and the ways in which each cluster differs from the rest. We also discuss some of the technical implications of using K-means clustering for learner identification in MOOCs, such as deciding the optimal number of clusters. We provide a methodology for identifying appropriate labels for each user group according to its dominant behavioral characteristics, and we use the Kruskal-Wallis test to show that the difference in learner behavior across clusters is statistically significant.
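As a rough illustration of K-means clustering and of choosing the number of clusters, the sketch below implements Lloyd's algorithm in plain Python. It is not the paper's implementation. The two learner features (videos watched, quizzes attempted), the toy data, and the deterministic farthest-first seeding are all assumptions made for this example. The within-cluster sum of squares (inertia) is computed for several values of k, which is one common heuristic (the "elbow" method) for picking k.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all chosen seeds."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels, inertia)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each learner joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical learners: (videos watched, quizzes attempted).
learners = [
    (1, 0), (2, 1), (1, 1),        # barely active
    (10, 2), (11, 1), (12, 2),     # watch videos, skip quizzes
    (11, 10), (12, 11), (10, 12),  # fully engaged
]

# Elbow heuristic: inertia as a function of the number of clusters.
inertia_by_k = {k: kmeans(learners, k)[2] for k in range(1, 5)}
for k, value in inertia_by_k.items():
    print(k, round(value, 2))

centroids, labels, inertia = kmeans(learners, 3)
print(labels)
```

On this toy data the inertia falls steeply up to k = 3 and only slightly afterwards, which is the pattern the elbow heuristic looks for. Real MOOC data rarely separates this cleanly, so the choice of k usually needs additional judgment.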

Specific Goals:

The primary goal of this work is twofold: first, to undertake a literature survey of clustering techniques that have been applied in EDM for learner identification and classification; second, to use the insights gained from the literature synthesis to inform educators of the appropriate choice of clustering techniques. We demonstrate the use of one such clustering technique (the K-means algorithm) for identifying learner types in a highly technical and advanced MOOC on nanotechnology. Based on the literature survey, we provide justifications for why the K-means clustering algorithm seems to function more efficiently in the context of classifying learner characteristics. We provide a detailed description of the K-means algorithm and the technical implications of applying it to identify learner types. We demonstrate the algorithm in action by classifying the learner population of a MOOC into distinct categories and studying the characteristic behavioral traits of each category. We also discuss the use of the Kruskal-Wallis test to show that the difference in user behavior across clusters is statistically significant, along with the distinct learner categories that result from clustering and the characteristic traits of each type.
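The Kruskal-Wallis comparison mentioned above can be sketched as follows. This is a minimal plain-Python version of the H statistic with the standard tie correction, not the authors' code. The cluster names and the per-learner activity counts are hypothetical, and the p-value shortcut exp(-H/2) applies only to the three-group case, where the chi-square approximation has 2 degrees of freedom.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (tie-corrected) for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Hypothetical videos-watched counts for learners in three clusters.
clusters = {
    "barely active": [2, 3, 5, 4],
    "video only": [10, 12, 11, 9],
    "fully engaged": [20, 18, 25, 22],
}

h = kruskal_wallis_h(list(clusters.values()))
# With three groups, H is compared to a chi-square distribution with 2 degrees
# of freedom, whose survival function is exactly exp(-x / 2).
p_value = math.exp(-h / 2)
print(round(h, 3), round(p_value, 4))
```

A small p-value suggests that at least one cluster differs from the others on the chosen behavior. With samples this small the chi-square approximation is coarse, so in practice an exact or permutation-based p-value, or a statistics library, would be preferable.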

Roy, D., Bermel, P., Douglas, K. A., Diefes-Dux, H. A., Richey, M., Madhavan, K., & Shah, S. (2017, June). Synthesis of clustering techniques in educational data mining. Paper presented at 2017 ASEE Annual Conference & Exposition, Columbus, Ohio. 10.18260/1-2--28897

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2017 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015