As the archive has expanded, some users need to perform searches and make requests for large amounts of HST data. We present here some information to explain the best methods and some additional guidelines for ensuring a smooth and complete finish to your large requests.
You should be aware that while DADS can handle fairly large requests, there are some limitations. These limits are discussed in the request section below. The limits may influence how you wish to search and request data. You may wish to make many small searches and requests rather than one or two extremely large searches.
The HST web-based search interface provides a means to search the HST archive. Currently, the page limits the number of rows to be returned. The default is 100 rows, but the limit can be increased to a maximum of 1500 rows. This limit is imposed because of the time needed to render the search results in HTML; even 1500 rows will take some time to render. However, if you wish to make a number of searches and requests, the web interface should serve you well. You may wish to perform the same basic search several times, limiting the results each time with a filter and/or an observation time range.
In addition to the web form-based interface, you may search using HTTP GET requests. A GET request allows the search parameters to be included in a URL in your browser. Such URLs can also be called from within programs to automate data searches. The results can be returned in a variety of formats, including VOTable XML, Excel spreadsheet, and comma-separated values (CSV), which can simplify ingesting results into user-written programs. In addition, submitting GET requests can bypass restrictions currently placed on the web search forms (e.g., restrictions on the max_records value).
In general, mission searches are specified in the form:
http://archive.stsci.edu/[data set]/search.php?action=Search&params
Any parameter listed on the HTML search form can generally be specified in a GET request. If you are parsing the results, it is easier, and probably faster, to request the search results in CSV or VOTable format using the output format parameter.
Below is an example of a search for ACS observations taken with the F814W filter within the specified radius (in arcminutes), centered on a given RA and Dec. In this example, the RA and Dec are in decimal degrees, but other standard formats may also be used. Note that some instruments (including ACS) allow more than one filter to be used, so if you are looking for a specific filter, surround the filter name with wildcards. The example uses the max_records parameter to specify that up to 5000 records be displayed. In this example the default output columns will be displayed. The output format specified is CSV. If you expect more than 1500 rows to be displayed, we do NOT recommend specifying the HTML format, because the time to render such a long web page is prohibitive. The other output choices are VOTable and EXCEL.
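A request like the one described above could be assembled in a short script. The following is only a sketch: `action=Search` and `max_records` come from the text above, but the mission name `hst`, the field names `sci_instrume`, `sci_spec_1234`, `ra`, `dec`, `radius` and `outputformat`, and the coordinates themselves are assumptions made for illustration. Check the Field Descriptions page for the actual column names before using it.

```python
from urllib.parse import urlencode

BASE_URL = "http://archive.stsci.edu"


def build_search_url(mission, **params):
    """Return a mission-search GET URL for the given search parameters."""
    query = {"action": "Search"}
    query.update(params)
    # safe="*" leaves the wildcard characters unescaped in the URL
    return f"{BASE_URL}/{mission}/search.php?{urlencode(query, safe='*')}"


url = build_search_url(
    "hst",                    # assumed value for [data set]
    sci_instrume="ACS",       # assumed field name for the instrument
    sci_spec_1234="*F814W*",  # assumed field name for the filter, with wildcards
    ra=53.084,                # hypothetical position, decimal degrees
    dec=-27.873,
    radius=1.0,               # arcminutes
    max_records=5000,
    outputformat="CSV",       # assumed name of the output format parameter
)
print(url)
```

Building the query with `urlencode` rather than by hand keeps any special characters in the values properly escaped.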
You may specify in the URL any parameter that can be used in the web form. The example above uses the instrument and filter fields. The search parameters that you may use are documented in a web page. This page is also available from the web search form by clicking on the Field Descriptions form heading. To specify a parameter, use the column name found in the first column of the table. The page also lists valid values and ranges that may be useful as you formulate your search.
You may choose to specify particular columns as output using the selectedColumnsCsv parameter. Use the same parameters that are used to specify the output columns, and separate your columns with commas. In the example below, the dataset name, RA and Dec are displayed as output.
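As a sketch of how the column selection might look in a script: `selectedColumnsCsv`, `max_records` and `action=Search` are taken from the text above, while the column names `sci_data_set_name`, `sci_ra_v1` and `sci_dec_v1`, the instrument field `sci_instrume`, the `outputformat` parameter and the mission name `hst` are assumptions. Substitute the names listed on the Field Descriptions page.

```python
from urllib.parse import urlencode

# Assumed column names for the dataset name, RA and Dec
columns = ["sci_data_set_name", "sci_ra_v1", "sci_dec_v1"]

query = {
    "action": "Search",
    "sci_instrume": "ACS",   # assumed field name for the instrument
    "max_records": 10,
    "outputformat": "CSV",   # assumed name of the output format parameter
    "selectedColumnsCsv": ",".join(columns),
}

# safe="," keeps the commas between column names unescaped
url = "http://archive.stsci.edu/hst/search.php?" + urlencode(query, safe=",")
print(url)
```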
If you are using a script to run GET requests, please make sure that you have only a few searches running at a time. Many searches tend to overrun the capacity of our current webserver.
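One simple way to follow this guideline is to run the searches strictly one after another, with a pause between them. The sketch below assumes a list of search URLs has already been built; the function names, the timeout and the five-second pause are illustrative choices, not archive requirements.

```python
import time
from urllib.request import urlopen


def fetch_text(url, timeout=300):
    """Download one search result and return it as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def run_searches(urls, fetch=fetch_text, pause_seconds=5.0):
    """Run the searches one at a time, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(pause_seconds)  # give the web server a rest
        results.append(fetch(url))
    return results
```

Calling `run_searches(list_of_urls)` returns the result texts in the same order as the URLs. Because only one request is ever in flight, the script cannot pile up simultaneous searches on the server.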
There are a number of MAST web services that may be useful. They are described more fully on the mast_services web page.
An example of the type of query these tables will allow is: How many high galactic latitude observations taken with WFPC2 exist [for RA between 9 - 18 hours in the northern hemisphere (dec > -20) observable from the northern hemisphere in the early spring] where there have been at least two observations, and where at least one of those observations was in the I-band, and at least one of those observations was in any other filter?
Choose the following options:
Galactic Latitude Above & below plane +/- > 20.0 degrees
Exposures in Band I >0
Number of Unique Bands >1
Total Number of Exposures >1
RA 09..18
Dec >-20.0
You will get a page listing the information found for the pointings or sky regions that meet the criteria you specified. At the top of the page is a summary of all the pointings. The numbers on the clickable buttons indicate the number of observations included in the pointing for that bandpass. If you click on a button, you will get a list of the observations included in that bandpass/pointing. You can choose to retrieve data from the page listing the observations, but please observe the guidelines for the size of a submitted request.
The pointings search can also be run as a Web service (GET request).
StarView has a problem returning large numbers (e.g. thousands) of records. There is a problem with the database connection from Java that occasionally hangs the query. StarView can also run out of allocated memory for very large returns. Because of these limitations, StarView is probably best used in a support role to examine information held in tables that are not accessible from the web interfaces.
StarView also has an interface to the pointings table.
System resources required for On-The-Fly Reprocessing may significantly delay
availability of the data to programs requiring large volumes of data.
Even smaller requests may sometimes encounter delays as the number of requests in the system can fluctuate widely.
Requests larger than 2000 datasets may experience delays to allow for sufficient resources. We recommend breaking large OTFR requests into smaller requests of around 500 datasets each for maximum efficiency.
Optimal non-OTFR requests should be for 5000 datasets or less.
Large requests can only be started by Archive staff. Operations is not staffed after hours or during the weekend, so requests submitted then will be delayed.
To avoid a logjam of multiple large requests, please contact the Archive Hotseat (410-338-4547 or archive@stsci.edu) prior to submitting your request.
If you are searching and submitting in small enough batches, the standard web and StarView request submission
pages are adequate.
You may wish to perform a search that will gather all the datasets you will need
and then break them into smaller groups for submission.
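Splitting a long list of dataset names into groups is easy to script. The sketch below uses the batch size of about 500 datasets recommended above for OTFR requests; the function name and the placeholder dataset names are made up for illustration.

```python
def split_into_batches(dataset_names, batch_size=500):
    """Split dataset names into batches, dropping blanks and duplicates."""
    cleaned = (name.strip() for name in dataset_names)
    # dict.fromkeys removes duplicates while preserving the original order
    unique = list(dict.fromkeys(name for name in cleaned if name))
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]


# Placeholder names standing in for the results of a search
names = [f"DATASET{i:04d}" for i in range(1200)]
batches = split_into_batches(names)
print([len(batch) for batch in batches])  # [500, 500, 200]
```

Each batch can then be written out as a block of text, one name per line, ready to be submitted as a separate request.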
If you know the dataset names, then you can submit batches via the
dataset input web page.
Paste in the list of dataset names and click on the "Submit datasets for retrieval" button.
This brings up a page that lists all the submitted datasets, already marked.
Just click the button "Submit marked data for retrieval from STDADS".
This brings up the retrieval options page.
Fill in the retrieval options form as usual, observing the following caveats:
After filling in and submitting the retrieval options form, wait until you get the page
that tells you that the request was sent to ST-DADS.
For large requests you may need to wait for several minutes.
The request will not be completed if you change the page before you get this response.
For smaller requests, the page will list the request id and the dataset names in the request.
For larger requests this information may not be listed due to a database connection timeout.
However, the request will have been submitted. You should get a confirming email.
For very large requests you may not get the confirming email for half an hour or so. In one case
we had a report of mail not arriving for 5 hours, but we think that this is rare.
If you do not receive an email, please contact the Archive Hotseat.
Requests