Quantitative Analysis Of Programs: Comparing Open Source Software With Student Projects


2006 Annual Conference & Exposition


Chicago, Illinois

Publication Date

June 18, 2006

Start Date

June 18, 2006

End Date

June 21, 2006



Conference Session

Tools and Support for Software Education

Tagged Division

Software Engineering Constituent Committee

Page Numbers

11.1057.1 - 11.1057.20



Paper Authors

Yung-Hsiang Lu Purdue University

Evan Zelkowitz Purdue University

Mark C Johnson Purdue University

NOTE: The first page of text has been automatically extracted and included below in lieu of an abstract

Quantitative Analysis of Programs: Comparing Open-Source Software with Student Projects


The lack of quantitative measures is a common problem in programming courses. Even though most students understand the importance of comments and good program structure, there is no quantitative “rule of thumb” to guide them in determining whether their programs have sufficient comments or are well structured. For example, an instructor may require one line of comment for every ten lines of code. Such numbers are set without sufficient scientific support; hence, students may resist the requirements and treat them as burdens. Open-source programs are widely used today and can serve as samples for teaching programming. We analyze 6 open-source software projects with 6233 files and 3.27 million lines of code to discover their commonalities. The projects are python, gdb, emacs, httpd, kde, and doxygen. These programs are used by, and receive contributions from, many programmers. They are selected as examples of high-quality code by virtue of their extensive and successful use in industry and academia. They are also used because it is difficult to obtain large-scale, non-trivial programs from companies, and sample programs from textbooks are usually very small. Because quality measures are often subjective, we focus on quantitative measures that are objective and can be obtained by software tools. In our analysis of open-source software, we find that the average length of code between comments is fewer than one hundred characters, or only a few lines. Most comments are short, only one or two lines. While instructors often consider global variables detrimental to program organization, global variables are in fact frequently used in open-source programs maintained by multiple programmers. Hence, instructors should not use the presence of global variables as the sole indication of poor program structure. The 6 projects are written in C or C++, and functions are the fundamental unit of C/C++. In these projects, most functions call only a few other functions. This study shows strong similarities among these different projects and suggests the possibility of using a quantitative approach to teaching programming. We compare the results with programs written by students in a senior-level software engineering course. We discover that their programs have properties similar to those of the open-source programs. Hence, we hypothesize that students may benefit from using these quantitative measures from open-source programs as samples and thereby learn better programming skills and styles.
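The excerpt does not show the tools used to obtain these measures. As an illustration only, the following Python sketch shows one way the two comment-related measures (characters of code between comments, and lines per comment) might be computed for a single C/C++ source string. The function name `comment_stats`, the returned keys, and the choice to ignore whitespace when counting code characters are assumptions of this sketch, not the authors' definitions.

```python
import re

# Matches C block comments (/* ... */) and C++ line comments (// ...).
# Simplifying assumption of this sketch: comment markers that appear inside
# string or character literals are not treated specially.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def comment_stats(source: str) -> dict:
    """Return simple comment metrics for one C/C++ source string."""
    comment_lines = []  # number of lines spanned by each comment
    code_runs = []      # non-whitespace code characters between comments
    pos = 0
    for match in COMMENT_RE.finditer(source):
        gap = source[pos:match.start()]
        code_runs.append(len("".join(gap.split())))
        comment_lines.append(match.group().count("\n") + 1)
        pos = match.end()
    # Code that follows the last comment (or the whole file if none).
    code_runs.append(len("".join(source[pos:].split())))

    # Drop empty runs, e.g. two comments separated only by whitespace.
    code_runs = [n for n in code_runs if n > 0]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "comment_count": len(comment_lines),
        "avg_comment_lines": mean(comment_lines),
        "avg_code_chars_between_comments": mean(code_runs),
    }


if __name__ == "__main__":
    sample = (
        "/* add two integers */\n"
        "int add(int a, int b) {\n"
        "    return a + b;  // sum\n"
        "}\n"
    )
    print(comment_stats(sample))
```

Averaging such per-file values over a project would give figures comparable in spirit to those reported above, although the exact numbers depend on how code length and comment boundaries are defined.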


Open-source software provides abundant opportunities to study the properties of successful software projects. These projects are considered successful because they enjoy a large pop-

Lu, Y., Zelkowitz, E., & Johnson, M. C. (2006, June). Quantitative Analysis Of Programs: Comparing Open Source Software With Student Projects. Paper presented at 2006 Annual Conference & Exposition, Chicago, Illinois. 10.18260/1-2--710

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2006 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015