Data Warehousing From The Web

Conference

2004 Annual Conference

Location

Salt Lake City, Utah

Publication Date

June 20, 2004

Start Date

June 20, 2004

End Date

June 23, 2004

ISSN

2153-5965

Conference Session

Computers in Education Poster Session

Page Count

11

Page Numbers

9.368.1 - 9.368.11

DOI

10.18260/1-2--13935

Permanent URL

https://peer.asee.org/13935

Download Count

371

Paper Authors

Michael Whalen

Chris Fernandes

Abstract
NOTE: The first page of text has been automatically extracted and included below in lieu of an abstract

Session 1520

Data Warehousing from the Web

Chris Fernandes and Michael Whalen
Department of Computer Science
Union College
Schenectady, NY 12308

Abstract

Data warehousing is the practice of collecting information from various data repositories and combining it into a single structured repository that can be queried for new information such as performance trends, decision models, predictions, and association rules. Internet web sites are data repositories containing useful but unstructured data. In this paper, we describe a data warehouse, developed from the registration web pages at Union College, which gives faculty and students online access to course enrollment trends, classroom availability, student class schedules, and other pertinent information. The project was so successful at surfacing information that the administration became concerned about student privacy.

Introduction

Traditional database systems, such as those used by bank tellers, librarians, and airline reservation assistants, are often characterized as online transaction processing (OLTP) systems. They are required to process frequent queries, usually in real time, that request information about the current status of specific objects and events, such as a bank account balance or the availability of a library book. As the status of these entities changes, an OLTP system must update the database to reflect the changes so that the database always represents a snapshot of the current state of the world. An online analytical processing (OLAP) system, on the other hand, is a database that keeps track of historical data and processes more complicated queries involving summaries and trends rather than individual entities. Table 1 summarizes the two systems.
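The contrast between the two kinds of query can be sketched in a few lines. The example below is only an illustration, not the system described in this paper: the table names (`account`, `enrollment_history`), the sample rows, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for both kinds of system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up sample data (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
    conn.execute(
        "CREATE TABLE enrollment_history (term TEXT, course TEXT, enrolled INTEGER)"
    )
    conn.execute("INSERT INTO account VALUES (1, 100.0)")
    conn.executemany(
        "INSERT INTO enrollment_history VALUES (?, ?, ?)",
        [
            ("2002F", "CS101", 30),
            ("2003F", "CS101", 36),
            ("2002F", "CS201", 18),
            ("2003F", "CS201", 15),
        ],
    )
    conn.commit()
    return conn


def oltp_withdraw(conn, account_id, amount):
    """OLTP style: update one entity in place, then read its current state."""
    with conn:  # commits the update as a single transaction
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


def olap_enrollment_by_term(conn):
    """OLAP style: summarize historical rows instead of reading one entity."""
    return conn.execute(
        "SELECT term, SUM(enrolled) FROM enrollment_history "
        "GROUP BY term ORDER BY term"
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(oltp_withdraw(db, 1, 25.0))
    print(olap_enrollment_by_term(db))
```

The first function touches a single row and overwrites its previous value, so the database only ever holds the current balance. The second never modifies anything; it aggregates over rows that are kept precisely because they are historical.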

A data warehouse is a common OLAP system in use today. Retail stores use data warehouses to keep track of buying trends, which enables them to stock inventory more accurately. The National Basketball Association uses a system called Advanced Scout to record details about games and extract patterns such as the effectiveness of certain players when on the court with another given player [1]. In general, the creation of the data warehouse is a crucial first step in data mining, the process of extracting useful associations to facilitate managerial decision-making.

In recent years, the Web has become a popular source from which to form a data warehouse. It contains a great deal of easily accessible raw data; its main drawback is that the data is unstructured. Creating a warehouse from a subset of the Web would solve that problem and permit analytical queries to be issued against it.

Proceedings of the 2004 American Society for Engineering Education Annual Conference & Exposition
Copyright © 2004, American Society for Engineering Education

Whalen, M., & Fernandes, C. (2004, June). Data Warehousing From The Web. Paper presented at 2004 Annual Conference, Salt Lake City, Utah. 10.18260/1-2--13935

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2004 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015