|Title:||The NASA Astrophysics Data System: Sociology, Bibliometrics, and Impact|
|Authors:||Kurtz, Michael J.; Eichhorn, Guenther; Accomazzi, Alberto; Grant, Carolyn S.; Demleitner, Markus; Murray, Stephen S.; Martimbeau, Nathalie; Elwell, Barbara|
|Affiliation:||AA(Harvard-Smithsonian Center for Astrophysics,Cambridge, MA 02138 USA, |
|Journal:||Submitted to The Journal of the American Society for Information Science and Technology.|
The NASA Astrophysics Data System (ADS), along with astronomy's journals and data centers, has developed a distributed on-line digital library which has become the dominant means by which astronomers search, access and read their technical literature. By combining data from the text, citation, and reference databases with data from the ADS readership logs we have been able to create Second Order Bibliometric Operators, a customizable class of collaborative filters which permits substantially improved accuracy in literature queries. Using the ADS usage logs along with membership statistics from the International Astronomical Union and data on the population and gross domestic product (GDP) we develop an accurate model for world-wide basic research where the number of scientists in a country is proportional to the GDP of that country, and the amount of basic research done by a country is proportional to the number of scientists in that country times that country's per capita GDP.
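The two proportionalities stated above can be combined in a short sketch. The function name and the input values are hypothetical, and the unknown constants of proportionality are set to 1, so only ratios between countries carry meaning here.

```python
def relative_research_output(gdp: float, population: float) -> float:
    """Relative basic-research index under the model stated in the abstract.

    Assumptions (not from the source): both constants of proportionality
    are set to 1, so the returned value is only meaningful as a ratio
    between countries.
    """
    scientists = gdp                    # number of scientists ~ GDP
    gdp_per_capita = gdp / population   # per capita GDP
    return scientists * gdp_per_capita  # research ~ scientists * per capita GDP


# Two hypothetical countries with equal population; one has twice the GDP.
country_a = relative_research_output(gdp=2.0, population=1.0)
country_b = relative_research_output(gdp=1.0, population=1.0)
ratio = country_a / country_b
```

Under these assumptions the index scales as GDP squared divided by population, so doubling GDP at fixed population quadruples the modeled research output.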
We examine the obsolescence function as measured by actual reads, and show that it can be well fit by the sum of four exponentials with very different time constants. We compare the obsolescence function as measured by readership with the obsolescence function as measured by citations. We find that the citation function is proportional to the sum of two of the components of the readership function. This proves that the normative theory of citation is true in the mean. We further examine in detail the similarities and differences between the citation rate, the readership rate and the total citations for individual articles, and discuss some of the causes.
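As a rough illustration of the functional form described above, the sketch below assumes each component is a simple decaying exponential in article age. The amplitudes, time constants, and the choice of which two components feed the citation function are placeholders, not values from the paper.

```python
import math

# Placeholder parameters (assumptions, NOT fitted values from the paper).
AMPLITUDES = (1.0, 0.5, 0.25, 0.125)
TIME_CONSTANTS_YEARS = (0.1, 1.0, 10.0, 100.0)


def component(age_years: float, amplitude: float, tau_years: float) -> float:
    """One decaying exponential term: amplitude * exp(-age / tau)."""
    return amplitude * math.exp(-age_years / tau_years)


def readership(age_years: float) -> float:
    """Readership obsolescence modeled as the sum of four exponentials."""
    return sum(
        component(age_years, a, tau)
        for a, tau in zip(AMPLITUDES, TIME_CONSTANTS_YEARS)
    )


def citations(age_years: float, scale: float = 1.0, used=(1, 2)) -> float:
    """Citation obsolescence modeled as proportional to two readership terms.

    The indices in `used` are hypothetical; the abstract does not say
    which two components enter the citation function.
    """
    return scale * sum(
        component(age_years, AMPLITUDES[i], TIME_CONSTANTS_YEARS[i])
        for i in used
    )
```

With widely separated time constants, each term dominates over a different range of article ages, which is what lets a sum of this kind follow a curve that no single exponential fits.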
Using the number of reads as a bibliometric measure for individuals, we introduce the read-cite diagram to provide a two-dimensional view of an individual's scientific productivity. We develop a simple model to account for an individual's reads and cites and use it to show that the position of a person in the read-cite diagram is a function of age, innate productivity, and work history. We show the age biases of both reads and cites, and develop two new bibliometric measures which have substantially less age bias than citations: SumProd, a weighted sum of total citations and the readership rate, intended to show the total productivity of an individual; and Read10, the readership rate for papers published in the last ten years, intended to show an individual's current productivity. We also discuss the effect of normalization (dividing by the number of authors on a paper) on these statistics.
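A minimal sketch of how a Read10-style statistic might be computed follows. The `Paper` record, the field names, and the exact window boundary are assumptions made for illustration; the abstract specifies only that Read10 is the readership rate for papers published in the last ten years and that normalization divides by the number of authors.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    year: int              # publication year
    n_authors: int         # number of authors on the paper
    reads_per_year: float  # current readership rate (hypothetical field)


def read10(papers, current_year: int, normalize: bool = True) -> float:
    """Sum the readership rate over papers from the last ten years.

    Assumption: "last ten years" means 0 <= current_year - year < 10.
    With normalize=True each paper's rate is divided by its author count.
    """
    total = 0.0
    for p in papers:
        if 0 <= current_year - p.year < 10:
            weight = 1.0 / p.n_authors if normalize else 1.0
            total += weight * p.reads_per_year
    return total


# Made-up publication record, for illustration only.
record = [
    Paper(year=2000, n_authors=2, reads_per_year=10.0),
    Paper(year=1995, n_authors=4, reads_per_year=8.0),
    Paper(year=1985, n_authors=1, reads_per_year=50.0),  # outside the window
]
normalized = read10(record, current_year=2002)
unnormalized = read10(record, current_year=2002, normalize=False)
```

In this made-up record the heavily read 1985 paper does not contribute, which reflects the stated intent of Read10 as a measure of current rather than total productivity.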
We apply SumProd and Read10 using new, non-parametric techniques to rank and compare different astronomical research organizations. We then introduce the concept of utility time to measure the impact of the ADS and the electronic astronomical library on astronomical research. We find that in 2002 this impact amounted to the equivalent of 736 full-time-equivalent (FTE) researchers, or $250 million, or the astronomical research done in France.
keywords - digital libraries; bibliometrics; sociology of science; information retrieval