Basic Methodology

We use the following method to identify and classify the papers used to measure the impact and productivity of HST and its various instruments. First, we search the full text, in PDF format, of the major astronomical journals with Adobe Reader, using the following search string:


Any paper containing any of the keywords is flagged for further inspection. Papers that do not contain any of the keywords are not processed further. Although NASA's ADS system provides a convenient interface for keyword searches in the paper abstracts, the full-text search we carry out on the downloaded papers ensures that publications are correctly identified even when the abstract alone does not contain enough information to do so (see Grothkopf & Treumann 2003).

Next, each paper that contains any of our search keywords is carefully examined, and false hits are discarded. The most obvious false hits include papers referring to "Hubble time" or "Hubble constant," but many papers that actually refer to HST data, or even use HST data, are also not counted as HST papers.
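The flagging step and the most obvious false hits can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the procedure actually used, which relies on Adobe Reader and manual inspection. The function names are hypothetical, the keyword list is passed in by the caller rather than taken from our search string, and matching is plain case-insensitive substring search.

```python
import re

# The two phrases named in the text as the most obvious false hits.
FALSE_HIT_PHRASES = ("Hubble time", "Hubble constant")


def find_keyword_hits(full_text, keywords):
    """Return the keywords that occur in the text (case-insensitive substring match).

    `keywords` is supplied by the caller; it is a stand-in for the real
    search string, which is not reproduced here.
    """
    lowered = full_text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def only_false_hit_phrases(full_text, keyword="Hubble"):
    """Return True if `keyword` occurs, but only inside a known false-hit phrase."""
    if keyword.lower() not in full_text.lower():
        return False
    # Remove the false-hit phrases, then check whether the keyword survives.
    stripped = full_text
    for phrase in FALSE_HIT_PHRASES:
        stripped = re.sub(re.escape(phrase), "", stripped, flags=re.IGNORECASE)
    return keyword.lower() not in stripped.lower()
```

A paper for which `only_false_hit_phrases` returns True could be set aside early. Every other flagged paper would still need the manual examination described above, since many papers that mention HST are not HST papers.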

HST and Non-HST Papers

The philosophy of our paper classification is to include those and only those papers that present analysis of HST data to reach a scientific conclusion. These criteria are easily applied to most papers and usually lead to a clear classification of the publication as an HST or a non-HST paper.

A fraction of papers use HST images only as a visual reference, in the form of an overlay image. If the details of the HST image are not discussed and it does not contribute to the scientific results of the paper, we do not count the paper as an HST publication. Frequently, one or more HST data sets are re-reduced and re-published. Regardless of whether or not an HST data set has been published previously, we count the papers as HST papers if analysis of HST data is presented.

A subset of papers presents ground-based follow-up observations of targets identified through HST observations; unless these papers include the analysis of actual HST data, they are not counted as HST publications. As a general rule, if a paper cites previously published HST data but this data has little impact on the conclusions of the paper, it is not counted as an HST paper. Papers that discuss future observations and/or capabilities are also not considered to be HST papers. Moreover, we do not include papers about HST-related engineering, instrument design, or software.

Experience shows that the status of a small fraction of papers remains uncertain even after careful inspection by the library or scientific staff because they do not fit unambiguously in any of the categories. These are classified as uncertain.

Assigning Programs, Program Types and Instruments to Publications

Once a paper has been classified as an HST publication, we proceed with determining: 1) the specific data sets and proposal types that provided the data; 2) the instruments used in the program; 3) the list of authors of the paper and the list of investigators on the original proposals. Based on this information we further classify the HST papers according to whether the data was obtained from general observing (GO) programs or from the HST archive.

All entries in the HST publication database are individually checked in order to link them to the observing programs that generated the data used in the publications. Authors often specify the program or data set identifiers, in which case these are adopted. Frequently, however, the paper provides insufficient information. In these cases our library and archive teams query the archive database and attempt to identify the observations that were used in the paper. This step requires significant effort, but often leads to an unambiguous identification of the data set, which is valuable information for linking data products to publications and for evaluating the performance of observing programs. Once the program and data set identifiers have been established, an automatic classification is carried out to decide whether the paper in question is classified as general observing, archival, or partly archival using the following criteria:

GO paper: At least one author was an investigator on the GO proposal that obtained the data.

AR paper: No overlap between the paper authors and investigators on the GO proposal that obtained the data.

GO+AR: Combination of GO data sets with AR data sets.
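The three criteria above can be expressed as a short sketch. It is an illustration under stated assumptions, not our production code. The function name `classify_paper` is hypothetical. Author and investigator names are assumed to be already normalized so that they can be compared as exact strings. A data set is treated as GO when the paper's authors overlap with the investigators of the program that obtained it, and as AR otherwise.

```python
def classify_paper(authors, program_investigators):
    """Classify an HST paper as "GO", "AR", or "GO+AR".

    authors: set of author names on the paper.
    program_investigators: list of sets, one per program that supplied data,
        each holding the investigator names on that program's proposal.
    Assumption: names are pre-normalized, so exact string equality is enough.
    """
    if not program_investigators:
        raise ValueError("at least one program must be linked to the paper")

    labels = set()
    for investigators in program_investigators:
        # Any shared name means the data came from the authors' own program.
        labels.add("GO" if authors & investigators else "AR")

    if labels == {"GO"}:
        return "GO"
    if labels == {"AR"}:
        return "AR"
    return "GO+AR"  # mix of own-program and archival data sets


# Usage with made-up names:
paper_authors = {"A. Smith", "B. Jones"}
print(classify_paper(paper_authors, [{"A. Smith", "C. Lee"}]))              # GO
print(classify_paper(paper_authors, [{"C. Lee"}]))                          # AR
print(classify_paper(paper_authors, [{"A. Smith"}, {"C. Lee", "D. Kim"}]))  # GO+AR
```

In practice the name comparison is the fragile part: initials, spelling variants and name changes would all need to be reconciled before a set intersection like this could be trusted.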

Questions? Comments?

Please email us with any questions or comments about our publications database.