Getting Tired of Massive Journal Usage Statistics: A Case Study on Engineering Journal Usage Analysis Using K-Means Clustering

Qianjin Zhang

Conference: 2020 ASEE Virtual Annual Conference Content Access
Location: Virtual On line
Publication Date: June 22, 2020
Start Date: June 22, 2020
End Date: June 26, 2021
Conference Session: Improving and Understanding Engineering Collections and Publication
Tagged Division: Engineering Libraries
Page Count: 12
DOI: 10.18260/1-2--34706
Permanent URL: https://peer.asee.org/34706
Download Count: 539

Paper Authors

Qianjin Zhang University of Iowa orcid.org/0000-0003-0738-9357

Qianjin (Marina) Zhang is the Engineering and Informatics Librarian at the Lichtenberger Engineering Library, the University of Iowa. As a subject librarian, she manages collections and provides instruction, reference and consultation services for engineering faculty and students. Her work also focuses on data management education and outreach to engineering students. She holds an MA in Information Resources & Library Science from the University of Arizona, and a BS in Biotechnology from Jiangsu University of Science and Technology (China).

Abstract

In 2018-2019, due to increases in the costs of information resources and flat collection budgets, the University of Iowa Libraries experienced a large-scale journal cancellation. As part of the University Libraries system, the Engineering Library went through a difficult process of identifying a list of journals with low usage and high cost, gathering feedback from our users and finalizing a list for cancellation. Since such a difficult situation may occur again in the future, we see the importance of continuously monitoring and evaluating collections in a proactive manner.

However, it is challenging for engineering librarians who are responsible for both collection management and public service to review massive usage statistics on a regular basis. To tackle this challenge, we initiated a case study that measures engineering journal usage with an alternative approach. The dataset was extracted from a data analytics company’s journal usage statistics report prepared for the University Libraries. We decided to reuse data from their report because it would save us time in data consolidation. The dataset contained journal titles, subfields and three key indicators: the number of publications per journal by authors of our institution, the number of citations to each journal made by our authors, and the number of downloads. Since the downloads were only available for the most recent four years (2015 to 2018), we selected the same period for the number of publications and the number of citations. We segmented a total of 821 journal titles into four clusters using the K-Means clustering technique: the first cluster contained 38 titles with a high number of publications, citations and downloads; the second contained 142 titles with a low number of publications but a moderate number of citations and a high number of downloads; the third contained titles with a low number of publications and citations but a moderate number of downloads; and the fourth contained titles with a low number of publications, citations and downloads.
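
The abstract does not say which software or preprocessing was used, so the following is only a minimal sketch of the general technique, not the author's implementation. It assumes each journal is described by the three indicators named above (publications, citations, downloads), standardizes them so that the much larger download counts do not dominate the distance calculation, and runs plain Lloyd's K-Means with k = 4. The journal names and numbers are invented for illustration, and the helper names `standardize` and `kmeans` are hypothetical.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid division by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, then update centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented example data: (publications, citations, downloads) per journal.
journals = {
    "Journal A": (40, 900, 12000),
    "Journal B": (35, 850, 11000),
    "Journal C": (3, 200, 9000),
    "Journal D": (2, 180, 8500),
    "Journal E": (1, 20, 3000),
    "Journal F": (2, 25, 2800),
    "Journal G": (0, 2, 100),
    "Journal H": (1, 1, 150),
}

scaled = standardize(list(journals.values()))
labels, centroids = kmeans(scaled, k=4)

# Group journal titles by their assigned cluster label.
clusters = {}
for title, label in zip(journals, labels):
    clusters.setdefault(label, []).append(title)

for label in sorted(clusters):
    print(f"Cluster {label}: {', '.join(clusters[label])}")
```

Cluster numbers produced this way are arbitrary labels. Describing a cluster as "high" or "low" on an indicator, as the abstract does, would require inspecting each cluster's average publications, citations and downloads after the run. K-Means results also depend on the initial centroids, so a real analysis would typically try several seeds.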

In conclusion, our case study of measuring engineering journal usage converted massive journal usage statistics into four clusters of journal titles in a straightforward format. The clusters of journal titles also provided us with a comprehensive view of how engineering journals had been used by both authors and users of our institution in the most recent four years. Last but not least, this case study showed the possibility of implementing data analytics in academic libraries.

Citation

Zhang, Q. (2020, June). Getting Tired of Massive Journal Usage Statistics: A Case Study on Engineering Journal Usage Analysis Using K-Means Clustering. Paper presented at 2020 ASEE Virtual Annual Conference Content Access, Virtual On line. 10.18260/1-2--34706

TY  - CPAPER
AU  - Qianjin Zhang
CY  - Virtual On line 
DA  - 2020/06/22
PB  - ASEE Conferences
TI  - Getting Tired of Massive Journal Usage Statistics: A Case Study on Engineering Journal Usage Analysis Using K-Means Clustering
UR  - https://peer.asee.org/34706
DO  - 10.18260/1-2--34706
ER  -