In the beginning was the Hubble Data Archive (HDA), which contained data from HST. Users searched the catalog, sometimes called the DADS catalog, using StarView, a software package developed at STScI. Data were retrieved through the Data Archiving and Distribution System (DADS) and written to tapes, which were mailed to the user. As networks developed and became robust, the demand for web-based search and retrieval grew. A web interface was first made available in 1994. A few years later, the Multimission Archive at STScI (MAST), funded under a separate contract, was created to host data from optical, UV and IR space-based missions. At about the same time, on-the-fly calibration (OTFC), later replaced by on-the-fly recalibration (OTFR), was developed for some HST data. At STScI a user could find data from many missions, some available for immediate download, some delayed by OTFR processing. StarView could be used to search for data from some missions, while the web interface, sometimes called MAST web, could be used to search all missions. Some complicated searches of the HST catalog, which StarView supports, are not supported by the web. The alphabet soup of acronyms continued to grow as more missions and tools were added to the archive. The distinction between HDA and MAST lessened except in funding.
In this document the holdings at STScI, whether archived and maintained under the Hubble contract or the MAST contract, are treated as one archive, called the archive, as that's what it looks like to an outside user. No distinction is made between datasets stored on spinning disk and those stored in archive appliances. Where it is important, the distinctions between HDA and MAST are delineated. For example, there are some differences when retrieving HST or FUSE data, and some acknowledgement distinctions are made, as required by the funding sources.
While this document was in revision, work on the Hubble Legacy Archive (HLA) began. As the HLA is under development, it will not be discussed in this version of the Archive Manual.
1. Introduction to the Archive
1.1 Data in the Archive
As noted above, the archive primarily contains data from UV, optical and IR space-based missions, some of which are still active (e.g., HST and GALEX). A complete listing of the missions is available on the MAST web site, http://archive.stsci.edu/missions.html.
Table 1.1 lists the archive holdings as of November 2007. Data from the Kepler Mission, scheduled for launch in 2008, and the EPOCh (Extrasolar Planet Observation and Characterization) portion of the EPOXI mission will be part of the archive. See the EPOXI site, http://epoxi.astro.umd.edu/, for information on the mission.
| Mission | Instrument | Description |
|---|---|---|
| ASTRO | - | ASTRO Observatory |
| ASTRO | HUT | Hopkins Ultraviolet Telescope |
| ASTRO | UIT | Ultraviolet Imaging Telescope |
| ASTRO | WUPPE | Wisconsin Ultraviolet Photo-Polarimeter Experiment |
| Copernicus | - | Copernicus |
| DSS | - | Digitized Sky Survey |
| EUVE | - | Extreme Ultraviolet Explorer |
| FUSE | - | Far Ultraviolet Spectroscopic Explorer |
| GALEX | - | Galaxy Evolution Explorer |
| GSC | - | Guide Star Catalogs |
| HPOL | - | Halfwave Spectropolarimeter |
| HST | - | Hubble Space Telescope |
| IUE | - | International Ultraviolet Explorer |
| ORFEUS | - | Orbiting Retrievable Far and Extreme Ultraviolet Spectrometers-SPAS |
| ORFEUS | BEFS | Berkeley Extreme and Far-UV Spectrometer |
| ORFEUS | IMAPS | Interstellar Medium Absorption Profile Spectrograph |
| ORFEUS | TUES | Tübingen Ultraviolet Echelle Spectrometer |
| VLA-FIRST | - | Very Large Array - Faint Images of the Radio Sky at Twenty-cm |
| XMM-OM | - | X-ray Multi-Mirror Telescope - Optical Monitor data |
For HST and FUSE, the archive contains the calibration reference files, such as flat fields.
For HST, the archive contains engineering files (aka observation logs or jitter files) that may be useful for diagnosing some questions about observations, and the spacecraft ephemeris.
Retrieved data from all missions are primarily in FITS format.
The archive has searchable catalogs of the data, consisting of information about the observations and targets. For HST data, the catalog is populated from the header keywords of the data files and is quite extensive.
The archive also holds community-contributed high level science products (HLSPs). These are fully processed images and spectra that are ready for scientific analysis. A complete list is available at http://archive.stsci.edu/hlsp.
Copies of the HST data and the archive catalog are maintained at the Space Telescope European Coordinating Facility (ST-ECF) in Garching, Germany (http://ecf.hq.eso.org/) and at the Canadian Astronomy Data Centre (CADC) in Victoria, Canada (http://cadcwww.dao.nrc.ca/). There is a significant amount of collaboration and coordination between STScI, ST-ECF and CADC to ensure that the data held in common are identical and the basic services provided are similar. However, the archives are not identical. Therefore, European and Canadian astronomers should consult the ST-ECF and CADC Web pages or contact the ST-ECF (stdesk@eso.org) or CADC (cadc@hia.nrc.ca) for information about using their archive systems.
HST and FUSE data become available to the astronomical community upon the expiration of a proprietary period. For HST, most general observer (GO) and guaranteed time observer (GTO) observations have proprietary periods of a year, but some observations have shorter or longer proprietary periods. For FUSE, the normal proprietary period is 6 months. Nearly all calibration observations are made public immediately upon receipt. The archive catalog contains information on these proprietary observations.
For GALEX Guest Investigator (GI) data, the usual proprietary period is 6 months. Extensions of the proprietary period may occur. In general, the archive stores the non-proprietary GALEX GI data in what is called the GII catalog. See the GALEX chapter for more information.
Proprietary datasets may be retrieved only by GOs and GTOs with the appropriate authorization (contact the archive hotseat at archive@stsci.edu). See the section on registration.
All publications based on HST archival data should carry the following footnote:
"Based on observations made with the NASA/ESA Hubble Space Telescope, obtained from the data archive at the Space Telescope Science Institute. STScI is operated by the Association of Universities for Research in Astronomy, Inc. under NASA contract NAS 5-26555."
In addition, if the archival research was supported by a grant from STScI, the publication should also carry the following acknowledgment at the end of the text:
"Support for this work was provided by NASA through grant number ________ from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS 5-26555."
Please send one preprint or reprint of each refereed publication based on HST archival research to the following address:
Librarian
Publications based on all other data from the archive should carry the following acknowledgment:
"Some/all of the data presented in this paper were obtained from the Multimission Archive at the Space Telescope Science Institute (MAST). STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5-26555. Support for MAST for non-HST data is provided by the NASA Office of Space Science via grant NAG5-7584 and by other grants and contract."
See the MAST Data Use Policy for the current MAST grant number.
Registration is not required to search the Archive and retrieve public data. An anonymous user has full access to the archive catalog and previews of public data. Registration and authorization are required to retrieve proprietary data, even for the Principal Investigator (PI). You can register as an Archive user on-line by using the Web form at
http://archive.stsci.edu/registration,
which is essentially instantaneous. Alternatively, you can send e-mail to archive@stsci.edu. Within
two working days of the receipt of your e-mail, you will be notified by e-mail of your registration as an archive user and will be provided with a username and a password. (Note: The password can be changed from the registration page or via the Commands pull-down menu in StarView.)
PIs, GOs and GTOs can retrieve their proprietary data from the archive. To do so, they must be registered and authorized users, and must therefore register with the archive (see above). PIs should request authorization for themselves when they register for their account. Only PIs may authorize others to access their data. If a co-I wishes to access the data, the PI on the proposal must send e-mail to archive@stsci.edu stating the proposal ID number and the identities of everyone who should be able to retrieve the data.
There are many ways of searching the archive for data of interest. These include, but are not limited to, StarView, MAST (web based), SpecView, Aladin and many Virtual Observatory (VO) services. There are also special interfaces or tools, such as the CASJobs implementation for searching GALEX data. Some of these interfaces, applications and tools are described here, others are described in the MAST chapter. Still others are not discussed at all. See the MAST chapter and the following links for on-line discussions of interfaces, tools, and search hints.
STScI developed StarView to allow general access to the HST archive and the tables in the HST database. Using StarView, one can ask common questions about the data in the archive, determine whether the data are public, view a preview of the data and retrieve the data of interest. StarView incorporates the Java-based tool SpecView, thus giving access to an extremely useful plotting and analysis tool. Access is also provided to the Digitized Sky Survey (DSS) and VizieR. With StarView, HST and FUSE data may be searched and retrieved.
The current version of StarView, 7.3, is Java based. It runs on various platforms and can be installed at one's home site. See the StarView webpage for download and installation information.
For HST, StarView is also used for observation planning, duplication checking, calibration file review and proprietary status checking. It may be used to investigate the On-The-Fly Reprocessing (OTFR) flags, although most users simply check the trailer files that are distributed with their data.
The archive holdings can be accessed via the web interface at
http://archive.stsci.edu. Most users will find the Web interface more convenient than StarView to use, as it can be accessed by any Web browser, and requires no special software. Note: while the Web interface does not provide the special purpose functions of StarView as regards HST data, it does provide access to additional functions, including previews and links to references.
Through the web interface, MAST provides a separate search page for each mission. These pages are accessible via the main MAST webpage at
http://archive.stsci.edu.
In addition, MAST provides several tools (cross-mission search, catalog cross-correlation (VizieR), spectral co-plotters, etc.) to give a more global look at the archive holdings.
The web interfaces can also be accessed using HTTP GET requests. The GET request allows search parameters to be included in the URL. As such, they can be called from within programs to automate data searches. The results can be returned in a variety of formats including HTML, VOTable XML format, Excel spreadsheet format and comma-separated values (CSV), which can simplify utilization by user-written programs. See
http://archive.stsci.edu/VO/mast_services.html for more information. Also see the MAST chapter for more tools and services.
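As a rough illustration of calling the web interface from a program, the sketch below builds a GET URL from search parameters and parses a CSV response. The endpoint path and the parameter names (`target`, `outputformat`, `action`) are assumptions made for the example, as is the sample response; the services page cited above lists what each mission's search actually accepts.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint and parameter names, for illustration only; check
# http://archive.stsci.edu/VO/mast_services.html for the supported ones.
BASE_URL = "http://archive.stsci.edu/hst/search.php"


def build_search_url(base_url, **params):
    """Encode search parameters into the query string of a GET URL."""
    return base_url + "?" + urlencode(params)


def parse_csv_results(text):
    """Turn a CSV response body into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


url = build_search_url(BASE_URL, target="M101", outputformat="CSV", action="Search")
print(url)

# To run the query for real (requires network access):
#   from urllib.request import urlopen
#   with urlopen(url) as response:
#       rows = parse_csv_results(response.read().decode("utf-8"))

# Offline demonstration with a made-up two-column response.
sample = "Dataset,Target Name\nEXAMPLE01,M101\n"
rows = parse_csv_results(sample)
print(rows[0]["Dataset"])
```

Requesting CSV keeps the parsing step within the standard library; a VOTable response would need an XML or VOTable reader instead.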
The Virtual Observatory (VO) project is creating standards to facilitate the discovery and joining of data among astronomical archives. Participating projects include the US National Virtual Observatory (NVO), which is a member of the International Virtual Observatory Alliance (IVOA). MAST is working to make the archive data compliant with the current VO-defined standards and protocols.
An example of a VO interface is the VO DataScope Data Inventory Service, commonly known as DataScope. Using this interface, the archive holdings can be searched simultaneously with data from many other surveys and missions. DataScope was developed for the NVO and is hosted at HEASARC. The survey and mission data accessed include SDSS, 2MASS, RASS, HST, Chandra and EGRET.
In addition to a standard web search form, for GALEX (GALaxy Evolution eXplorer), a separate interface, called CASJobs, exists. CASJobs is a Batch Query Service that allows SQL access to the GALEX databases. Registered CASJobs users receive local storage on the database server, where tables may be created using the "select into" statement. This storage is called MYDB. Tables created in MYDB may be extracted to FITS, VOTable, or CSV using the extract page. Each user controls their own MYDB, which means the user can drop tables in their MYDB to make more space. MYDB is a proper database: tables in MYDB may be joined with tables in any GALEX target database. For more details on CASJobs, go to http://galex.stsci.edu/casjobs/. See
the GALEX chapter for more information on GALEX.
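To make the "select into" idea concrete, here is a minimal sketch that assembles such a statement as a string, ready to paste into the CASJobs query page. The table and column names (`photoobj`, `objid`, `ra`, `dec`, `nuv_mag`) are hypothetical, and the `mydb.` prefix on the destination table is an assumption about the usual CASJobs convention; check the CASJobs schema browser and help pages for the real names and syntax.

```python
def select_into_mydb(columns, source_table, dest_table, where=None):
    """Build a 'SELECT ... INTO mydb.<table>' statement as a string."""
    sql = "SELECT {} INTO mydb.{} FROM {}".format(
        ", ".join(columns), dest_table, source_table
    )
    if where:
        sql += " WHERE " + where
    return sql


# Hypothetical table and column names, for illustration only.
query = select_into_mydb(
    columns=["objid", "ra", "dec"],
    source_table="photoobj",
    dest_table="bright_sources",
    where="nuv_mag < 18",
)
print(query)
```

The helper only concatenates text and does no validation, so it is suited to composing queries by hand, not to handling untrusted input.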
For most missions in the archive, retrieval can be as simple as clicking on the dataset name. However for HST and FUSE data the procedure is different.
As noted above, public data may be retrieved anonymously. Only registered and authorized archive users (see section 1.5, above) can retrieve proprietary data. Once you have an archive password, you can retrieve your proprietary data by using StarView or MAST to select your datasets and choose the delivery mode.
Whatever means is used to search the archive, when retrieval is requested for HST or FUSE data the Retrieval Options page is displayed. You are required to select a delivery option for the data. Current options are
For every request you will be asked for an archive username and password. For anonymous retrieval, enter anonymous for the user name and your e-mail address as your password. For an SFTP/FTP retrieval, you will also be required to give your home username and password so that the retrieved data can be written to your disk.
When retrieving a large number of datasets, it is better to submit several smaller requests than one large one. To avoid delay, please keep the number of datasets in each request under 100 - 200.
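One way to follow this advice is to split a long list of dataset names into batches before submitting them. The sketch below is a generic helper, not part of any archive tool; the batch size of 100 reflects the lower end of the range quoted above, and the dataset names are placeholders.

```python
def split_into_requests(dataset_names, max_per_request=100):
    """Yield successive batches of at most max_per_request dataset names."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(dataset_names), max_per_request):
        yield dataset_names[start:start + max_per_request]


# Placeholder dataset names, for illustration only.
datasets = ["DATASET{:04d}".format(i) for i in range(250)]
batches = list(split_into_requests(datasets))
print([len(batch) for batch in batches])  # [100, 100, 50]
```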
System resources required for OTFR may significantly delay availability of the data to programs requesting large volumes of data. Even smaller requests may sometimes encounter delays because competing requests fluctuate greatly. If you are making a large request (greater than 350 ACS, 700 STIS, 700 NICMOS, or 1500 WFPC2 datasets at one time), please submit the requests early on a Friday (Eastern Time) for weekend processing to avoid peak processing times.
Large requests can only be started by Archive staff. Operations is not staffed after hours or during the weekend, so make sure to submit large requests during business hours. To avoid a logjam of multiple large requests during a given weekend, please contact the Archive Hotseat (410-338-4547 or archive@stsci.edu) prior to submitting your request (weekend time will be granted on a first come, first served basis).
As a guide to the system capabilities, Table 1.2 gives the maximum number of datasets that should be submitted in a single large request and the maximum number of datasets that can be processed per weekend (above the "normal" load). See the large request information page for current guidelines on what is a large or very large request.
1.2 Other Archives Containing HST Data
1.3 Proprietary Data
1.4 Publication of Archival Research Results
The results of investigations with archival data are generally published in the scientific literature. Archive staff examine the literature for such publications in order to include them in the reference database, and display them as part of the search results. Clearly indicating what datasets are used in your publication will improve the accuracy of this task. Some publishers require authors to use the ADS (Astrophysics Data System) naming convention for datasets. The archive participates in this effort and provides a link, http://archive.stsci.edu/pub_dsn.html, on its main web page to information on the naming convention and verification tools.
1.4.1 HST Data
Space Telescope Science Institute
3700 San Martin Drive
Baltimore, MD 21218 USA
1.4.2 All Other Data
1.5 Registration
1.6 Searching for Data
1.6.1 StarView
1.6.2 MAST
1.6.3 Virtual Observatory Services
1.6.4 CASJobs
1.7 Retrieving Data
1.7.1 HST and FUSE Datasets
| Instrument | Datasets per Request | Maximum Datasets per Weekend |
|---|---|---|
| ACS | 500 | 1000 |
| NICMOS | 500 | 2000 |
| WFPC2 | 750 | 4500 |
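As a planning aid, the table values can be turned into a quick estimate of how many requests and how many weekends a large retrieval might need. This is only a sketch under a simple reading of the table (ceiling division against each limit); actual scheduling is arranged with the archive hotseat, and the function name is made up for the example.

```python
# Values copied from Table 1.2: (datasets per request, maximum per weekend).
LARGE_REQUEST_LIMITS = {
    "ACS": (500, 1000),
    "NICMOS": (500, 2000),
    "WFPC2": (750, 4500),
}


def plan_large_request(instrument, n_datasets):
    """Return (number of requests, number of weekends) for n_datasets."""
    per_request, per_weekend = LARGE_REQUEST_LIMITS[instrument]
    n_requests = -(-n_datasets // per_request)   # ceiling division
    n_weekends = -(-n_datasets // per_weekend)
    return n_requests, n_weekends


print(plan_large_request("ACS", 1800))  # (4, 2)
```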
After a request is submitted, the archive system processes it. An e-mail notification is sent immediately when the system has accepted the request and again when the system has completed processing the request, indicating whether or not the transfer was successful. The messages will go to the e-mail address specified on the Retrieval Options page (for anonymous retrieval) or, if an archive account was used, to the e-mail address on the archive account registration form.
How long a retrieval takes depends on a variety of factors, including the type of data in the request, the size of the request, the number of requests in the system at the time, and the destination of the request (the internet connections between STScI and some sites, especially those overseas, are sometimes a significant source of delay). If everything is running smoothly, one should expect a median turnaround time of an hour. If it takes more than one day, and you do not think any of the factors listed above are playing a significant role, please contact us at archive@stsci.edu.
If the data are retrieved to the staging disk (STAGE option), the data will be written to a subdirectory. Each data retrieval request will be in its own subdirectory, identified by the request ID number, which will be included in the notification message you will receive. To find your data, from your home account type:
% ftp archive.stsci.edu
% login: archive user name
% password: ******
ftp> binary
ftp> cd /stage/username/request_ID_number
ftp> ls
Note the use of binary in the above example. Not all FTP clients set the transfer mode automatically. Transferring FITS files in ASCII mode corrupts the data files, and the FTP client reports no error.
For anonymous retrievals, use anonymous for archive user name and your e-mail address for the password. The subdirectory will be /stage/anonymous/request_ID_number.
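The same session can be scripted. The following is a minimal sketch using Python's standard `ftplib`, assuming the staging layout described above; the function names are made up for the example, and the request ID and e-mail address in the usage comment are placeholders.

```python
import os
import posixpath
from ftplib import FTP


def staging_dir(username, request_id):
    """Return the staging subdirectory for one retrieval request."""
    return posixpath.join("/stage", username, str(request_id))


def fetch_staged_files(ftp, username, request_id, dest_dir="."):
    """Download every file in a staging directory using binary transfers."""
    ftp.cwd(staging_dir(username, request_id))
    names = ftp.nlst()
    for name in names:
        local_path = os.path.join(dest_dir, name)
        with open(local_path, "wb") as handle:
            # retrbinary always transfers in binary (image) mode, so FITS
            # files are not altered in transit.
            ftp.retrbinary("RETR " + name, handle.write)
    return names


# Usage (requires network access and a staged request; the request ID
# and e-mail address below are placeholders):
#   with FTP("archive.stsci.edu") as ftp:
#       ftp.login("anonymous", "you@example.edu")
#       fetch_staged_files(ftp, "anonymous", "12345")
print(staging_dir("anonymous", "12345"))
```

The sketch assumes the staging directory contains only plain files; if a listing includes subdirectories, those entries would need to be handled separately.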
After locating your data on the archive host, transfer them across the Internet using FTP as described in a later section. Because the disk space available to each user within the data directory is limited, the files created for you are temporary and are deleted automatically after a few days. Please transfer your data promptly and delete your data from the staging area after the transfer has completed. When disk space is tight, we would appreciate notification that you have completed copying over your retrieved files so that we may delete them. (Send email to archive@stsci.edu.)
All non-HST/FUSE data retrieved through the MAST web search interface are downloaded directly to the user's system as a tar or zip file. The data are also all available via anonymous ftp (archive.stsci.edu, cd pub/mission/data). Users will need to know the desired dataset names when using this option. More information on MAST retrievals is provided in the MAST Chapter of this manual.
The Operations and Engineering Division (OED) at STScI is committed to providing outstanding and timely support to archive researchers. We provide assistance and advice on methods and strategies for finding information in the archives and provide a hotseat staff for researchers who have specific problems or questions about using the archive. Archive researchers who need extensive advice on search strategies or help analyzing their astronomical data can visit STScI.
The Operations and Engineering Division (OED) at STScI is responsible for the management, scientific and technical oversight, and operation of the STScI archives. OED staff also support astronomers who wish to use public data from the archives for their own research. To assist archive researchers, the OED staff includes archive specialists (with bachelor's- or master's-level degrees in physics or astronomy) and archive scientists (Ph.D. astronomers). The support provided by the OED includes:
We welcome your comments and questions about the archive in general or about archive user support. As discussed above, communication regarding all aspects of the archive should normally be directed to the archive hotseat (e-mail: archive@stsci.edu, or telephone (410) 338-4547). This will allow Archive Branch staff to respond to your requests even when individual members of the group are away. If you feel your needs are not being adequately addressed through the hotseat, place a message in the Suggestion Box located on the main archive page, http://archive.stsci.edu.
Occasionally, a retrieval fails because of a network timeout, disk space inadequacy, or other reasons. The staff at archive@stsci.edu are available for any questions about the request.
Documentation is available on-line for all archive holdings. The main archive page provides links to a MAST tutorial, a general introduction to MAST and a "getting started" page. Each mission page has links to mission specific information (About ...), a mission specific "getting started" page and to the MAST tutorial. MAST's HST page contains similar links. Under the About HST link are links to documentation on HST, its instruments and their calibration, proposal instructions, the Archive Manual and much more.
The Archive Manual (i.e., this document) is also available, either as a postscript or pdf file, from the archive via anonymous ftp. To get a copy of the postscript version, follow these instructions. For the pdf version, substitute pdf for ps in the file name.
>ftp archive.stsci.edu
Those files that end in .gz were compressed with the utility gzip and can be uncompressed with the utility gunzip.
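Where gunzip is not installed, the same step can be done with Python's standard `gzip` module. This is a small sketch; the function name is made up for the example and the file name in the usage comment is a placeholder.

```python
import gzip
import shutil


def gunzip_file(gz_path):
    """Decompress gz_path (ending in .gz) next to itself; return new path."""
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a file name ending in .gz")
    out_path = gz_path[:-3]
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


# Usage (placeholder file name):
#   gunzip_file("archive_manual.ps.gz")
```

Unlike the gunzip utility, this sketch leaves the compressed file in place.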
If you have used the STAGE option to retrieve your HST or FUSE data to the archive host computer (archive.stsci.edu) you must transfer your files to your local computer via ftp. See Table 1.3 for examples of ftp sessions.
Datasets from other MAST missions are stored online and can be retrieved
via anonymous ftp or through the browser using wget. See the MAST chapter
and the MAST tutorial for more information.
Final calibrated data for the HST Instruments STIS, GHRS, FOS and FOC are
available on disk in the hstonline area. See archive.stsci.edu/hstonline/ for details.
1.7.2 MAST Data
1.8 User Support
1.8.1 Support for Archival Research
OED staff will not normally do an astronomer's archive search, generate requests for data, or reanalyze data from the archive. OED staff will provide assistance and documentation so that archive researchers can perform these tasks.
1.8.2 MAST Newsletter
Archive users should consult the Web page at http://archive.stsci.edu for up-to-date archive information. The MAST newsletter, another source of archive information, is accessible from this web site. The newsletter is also distributed electronically via a mailing list. STScI archive users are encouraged to subscribe by sending an e-mail message to archive_news-request@stsci.edu. The single word SUBSCRIBE should be included in the body of the message.
1.8.4 Questions and Comments
1.8.5 When a Retrieval Fails
1.8.6 Documentation
>login: anonymous
>passwd: your e-mail address
ftp> cd pub
ftp> cd manuals
ftp> dir [lists the available files]
ftp> binary
ftp> mget archive_manual*.ps.gz
ftp> bye
1.9 Using FTP
| Sample Function | Unix Commands |
|---|---|
| Retrieve HST data from the data directory as named in the acknowledgement e-mail, in this case "dir0129" (e.g., data retrieval was requested using an archive user account) | % ftp archive.stsci.edu (or stdatu.stsci.edu) (login as "anonymous"); ftp> cd staging; ftp> dir; ftp> binary; ftp> prompt; ftp> mget x*; ftp> bye |
| Retrieve PostScript and/or text versions of manual, abstracts, catalogs, general information. Files that end in .gz can be uncompressed by using the command "gunzip". The uncompressed files are also available. | % ftp archive.stsci.edu (login as "anonymous"); ftp> ls; ftp> mget *.ps.Z (or whatever) |

1 As appropriate. In the case of the Hubble Deep Field images, "binary" would be the appropriate datatype.
You can analyze or recalibrate HST datasets using STSDAS, which can be installed under IRAF. A comprehensive discussion of STSDAS and IRAF features or a tutorial on how to use the software is beyond the scope of this manual. Contact the STSDAS hotseat (help@stsci.edu) for answers to any STSDAS-related questions that you may have. To learn how to use this data analysis package, you can request copies of the documentation by sending e-mail to help@stsci.edu. Various STSDAS and software related documentation is also available on-line at
http://www.stsci.edu/hst/HST_overview/documents.
1.10 Using STSDAS/IRAF to Analyze Your Data