Quantitative Analysis Of Programs: Comparing Open Source Software With Student Projects

Evan Zelkowitz; Mark C Johnson; Yung-Hsiang Lu

Conference: 2006 Annual Conference & Exposition
Location: Chicago, Illinois
Publication Date: June 18, 2006
Start Date: June 18, 2006
End Date: June 21, 2006
ISSN: 2153-5965
Conference Session: Tools and Support for Software Education
Tagged Division: Software Engineering Constituent Committee
Page Count: 20
Page Numbers: 11.1057.1 - 11.1057.20
DOI: 10.18260/1-2--710
Permanent URL: https://peer.asee.org/710
Download Count: 801

Abstract
NOTE: The first page of text has been automatically extracted and included below in lieu of an abstract

The lack of quantitative measures is a common problem in programming courses. Even though most students understand the importance of comments and good program structure, there is no quantitative “rule of thumb” to guide students in determining whether their programs have sufficient comments or are well structured. For example, an instructor may require one line of comment for every ten lines of code. Such numbers are chosen without sufficient scientific support; hence, students may resist the requirements and treat them as burdens. Open-source programs are widely used today and can serve as samples for teaching programming. We analyze 6 open-source software projects with 6233 files and 3.27 million lines of code to discover their commonalities. The projects are python, gdb, emacs, httpd, kde, and doxygen. These open-source programs are used and contributed to by many programmers. These particular programs are selected as examples of high-quality code by virtue of their extensive and successful use in industry and academia. They are also used because it is difficult to obtain large-scale, non-trivial programs from companies, and sample programs from textbooks are usually very small. Because quality measures are often subjective, we focus on quantitative measures that are objective and can be obtained by software tools. In our analysis of open-source software, we find that the average length of code between comments is fewer than one hundred characters, or only a few lines. Most comments are short, only one or two lines. While instructors often consider global variables detrimental to program organization, global variables are actually used frequently in open-source programs maintained by multiple programmers. Hence, instructors should not use the presence of global variables as the sole indication of poor program structure. The 6 projects are written in C or C++, and functions are the fundamental unit of C/C++. In these projects, most functions call only a few other functions. This study shows strong similarities among these different projects and suggests the possibility of using a quantitative approach to teaching programming. We compare the results with programs written by students in a senior-level software engineering course. We discover that their programs have properties similar to those of open-source programs. Hence, we hypothesize that students may benefit from using these quantitative measures from open-source programs as samples and learn better programming skills and styles.
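
As an illustration of how a measure such as the length of code between comments might be obtained by a software tool, the following Python sketch scans C/C++ source text. It is not the tool used in this study. The function name `comment_stats`, the choice to count only non-whitespace characters, and the rule that comments separated only by whitespace are merged into one comment are all assumptions made for this example.

```python
import re

# Matches block comments, line comments, string literals, and character
# literals. Literals are matched so that comment markers inside them
# (for example "/* text */" in a string) are not mistaken for comments.
TOKEN = re.compile(
    r'/\*.*?\*/'            # block comment
    r'|//[^\n]*'            # line comment
    r'|"(?:\\.|[^"\\])*"'   # string literal
    r"|'(?:\\.|[^'\\])*'",  # character literal
    re.DOTALL,
)


def non_ws(text):
    """Count non-whitespace characters (assumed definition of code length)."""
    return sum(1 for ch in text if not ch.isspace())


def comment_stats(source):
    """Return (comment_lines, gaps) for C/C++ source text.

    comment_lines: number of lines in each comment; comments separated
                   only by whitespace are merged (an assumption).
    gaps:          non-whitespace code characters preceding each comment
                   since the previous comment or the start of the file.
                   Code after the last comment is not reported.
    """
    comment_lines = []
    gaps = []
    code_chars = 0
    pos = 0
    for match in TOKEN.finditer(source):
        token = match.group(0)
        code_chars += non_ws(source[pos:match.start()])
        pos = match.end()
        if token.startswith(("/*", "//")):
            lines = token.count("\n") + 1
            if code_chars == 0 and comment_lines:
                # No code since the previous comment: treat as one comment.
                comment_lines[-1] += lines
            else:
                comment_lines.append(lines)
                if code_chars > 0:
                    gaps.append(code_chars)
            code_chars = 0
        else:
            # String and character literals count as code.
            code_chars += non_ws(token)
    return comment_lines, gaps


# Hypothetical input, written only for this example.
SAMPLE = '''/* Add two numbers.
   Returns the sum. */
int add(int a, int b) {
    return a + b; // simple
}
// helper
// for printing
void show(void) { puts("/* not a comment */"); }
'''

if __name__ == "__main__":
    lines, gaps = comment_stats(SAMPLE)
    print("lines per comment:", lines)
    print("code characters between comments:", gaps)
```

Averaging the two returned lists over a set of files would give rough counterparts to the comment-length and code-between-comments figures discussed above, subject to the stated assumptions. The sketch does not handle preprocessor tricks or line continuations inside `//` comments.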

Introduction

Open-source software provides abundant opportunities to study the properties of successful software projects. These projects are considered successful because they enjoy a large pop-

Citation

Lu, Y., Zelkowitz, E., & Johnson, M. C. (2006, June). Quantitative Analysis Of Programs: Comparing Open Source Software With Student Projects. Paper presented at 2006 Annual Conference & Exposition, Chicago, Illinois. 10.18260/1-2--710

Paper Authors

Yung-Hsiang Lu Purdue University

Evan Zelkowitz Purdue University

Mark C Johnson Purdue University
