
What is the Hubble Source Catalog (HSC)? What data does it contain?
The Hubble Source Catalog (HSC) combines the tens of thousands of visit-based, general-purpose source lists in the Hubble Legacy Archive (HLA) into a single master catalog.
In the current Beta 0.2 release, the HSC contains members of the ACS/WFC and WFPC2 Source Extractor source lists from HLA version 7.1 that have valid detections (i.e., data with quality flag values less than 5; see HLA Source List FAQ for a definition of the flagging system for HLA source lists). Data from the ACS/HRC (High Resolution Camera) are NOT included in the HSC at the current time. Approximately 1/3 of the ACS/WFC and WFPC2 images in HLA DR7.1 are not included in the HSC due to image quality and other issues.
This is a Beta version of the HSC; you can expect some rough edges and a larger fraction of artifacts than will be present in the Version 1 release. In addition, the database may change without warning since we are working to make improvements. For the upcoming HSC Version 1 release (tentatively planned for fall 2013), we plan to include sources from WFC3 images and may also incorporate the DAOPHOT source lists.
What are five things you should know about the HSC?
Click here for a 1-page, graphic description of all five of the following.
All HLA magnitudes are in the ABMAG system. Here is a discussion of the ABMAG, VEGAMAG and STMAG systems. A handy, though not exact, conversion for ACS is provided in Sirianni et al. (2005). The Synphot package provides a more generic conversion mechanism for all HST instruments.
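As a rough illustration of what the ABMAG system means, the sketch below converts between a flux density and an AB magnitude using the standard AB zero point of approximately 3631 Jy. The function names are illustrative only and are not part of any HLA, HSC, or Synphot interface. Conversions to VEGAMAG or STMAG need filter-dependent zero points and are not attempted here.

```python
import math

# Flux density of a zero-magnitude source in the AB system (approximate value).
AB_ZERO_POINT_JY = 3631.0


def flux_jy_to_abmag(flux_jy):
    """Convert a flux density in janskys to an AB magnitude."""
    if flux_jy <= 0:
        raise ValueError("flux density must be positive")
    return -2.5 * math.log10(flux_jy / AB_ZERO_POINT_JY)


def abmag_to_flux_jy(abmag):
    """Invert the relation above: AB magnitude to flux density in janskys."""
    return AB_ZERO_POINT_JY * 10 ** (-0.4 * abmag)
```

Because the AB zero point is the same flux density at every wavelength, the same two functions apply to any filter.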
Where can I find examples of good and bad regions of the HSC?
Nearby galaxy: M101 (10918_01), N > 1 version
To get the image below, click on "Advanced HSC controls", then check the box for "Require NumImages > 1", then click on HSC (beta), then wait a while (i.e., there are 23,759 HSC sources).
Star field: M31 (10265_01), N > 5 version
To get the image below, click on "Advanced HSC controls", then check the box for "Require NumImages > 5", then click on HSC (beta). There are 1914 HSC sources.
Field of faint galaxies: HDF (10189_e1), N > 2 version
To get the image below, click on "Advanced HSC controls", then check the box for "Require NumImages > 5", then click on HSC (beta). There are 901 HSC sources.
Galaxy cluster: Gal-Clus-002352+042307 (06000_01), N > 0 version
To get the image below, click on "Advanced HSC controls", then click on HSC (beta). There are 409 HSC sources.
Nearby galaxy with high background: M83 (8234_01), N > 1 version
To get the image below, click on "Advanced HSC controls", then check the box for "Require NumImages > 1", then click on HSC (beta). There are 1356 HSC sources.
NOTE: This image has some "doubling" and the sources do not line up with the circles well.
Merging galaxy: Antennae (10188_10), N > 1 version
To get the image below, click on "Advanced HSC controls", then check the box for "Require NumImages > 1", then click on HSC (beta), then wait a while (i.e., there are 4311 HSC sources).
NOTE: This image has severe "doubling" and is missing many objects in regions with high background.
Galaxy cluster: CLJ1226.9+3332 (9033_01_e1), N > 0 version
To get the image below, click on "Advanced HSC controls", then click on HSC (beta). There are 758 HSC sources.
NOTE: There are missing sources in a large part of the image since no SExtractor catalog was made for this image. The HSC objects circled are based on a different image.
Parallel field: (8013_41) N > 0
To get the image below, click on "Advanced HSC controls", then click on HSC (beta). There are 7769 HSC sources.
NOTE: Using N > 1 cleans up the edge effects pretty well, but the numerous detections along the diffraction spikes, bleeding columns, and saturated centers of the stars remain.
What are the primary "known problems" with HSC Beta 0.2?
MagAper2 magnitudes for objects on the PC (Planetary Camera) portion of the WFPC2 may differ when compared to the same object when it is on the WF (Wide Field) portion of the WFPC2. The primary reason for this is that the Point Spread Function (PSF) on the PC is different from that on the WF. While the rebinning of the PC to match the pixel size of the WF largely takes care of this effect, differences still remain. We are currently evaluating how large this difference is for point sources and will include documentation on the effect HERE in the near future.
How are HLA images and source lists constructed?
The Hubble Source Catalog (HSC) is based on HLA Source Extractor (Bertin & Arnouts 1996) source lists. To build these source lists, the HLA first constructs a "white light" or "detection" image by combining the different filter observations within each visit for each detector. This filter-combined drizzled image provides added depth. Source Extractor is run on the white light detection image to identify each source and determine its position.
Next, the combined drizzled image for each filter used in the detection image is checked for sources at the positions indicated by the detection image. If a valid source is detected at a given position, then its properties are entered into the HLA source list appropriate for the visit, detector, and filter, with a flag value less than 5.
Sources that are found in the white light detection image, but not in a particular filter used to make the white light image, are regarded as "filter-based nondetections". These can be examined by asking for level = 1 under Detection Options on the HSC Detailed Search form.
More details about how HLA images are constructed can be found at the HLA Images FAQ. More details about how HLA source lists are constructed can be found at the HLA Source List FAQ.
Note that corrections for Charge Transfer Efficiency (CTE) problems have been made to the WFPC2 HLA source lists, but not the ACS source lists. See the HLA CTE FAQ.
How are "matches" defined in the HSC? What is the algorithm that combines the sources?
The detections (and nondetections) that correspond to the same physical object (as determined by the algorithm defined in Budavari & Lubow 2012) are given a unique MatchID number and an associated position (MatchRA, MatchDec). Each member of the match, including nondetections, also has an assigned MemID value and a source position (SourceRA, SourceDec). The procedure dramatically improves the relative astrometry between visits, to a level of about 11 mas or better for the majority of the sources. Each source detection and nondetection has some separation distance, D, from the match position. The DSigma value is the standard deviation of the D values. Visit-based nondetections are listed at the end of the results and have blank MatchID and MemID values.
The plot shows the distribution of the DSigma values, the standard deviations of the source positions in matches. The red curve is the distribution for the current HLA astrometry. The blue curve is for the astrometry that has been corrected for the HSC based on the method described in Budavari & Lubow (2012). The median value before astrometric correction is 56 mas (the distribution extends to larger values than shown), while after astrometric correction the median value is 11 mas.
Many of the detections in the Detailed Search form are in matches that involve a single visit and detector. These cases have D=0 and DSigma=0. Searches on the Detailed Search Form for only crossmatched detections are made by specifying the constraint D > 0 using the User-specified field and the Field Descriptions options.
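To make the relationship between the match position, D, and DSigma concrete, here is a minimal sketch that computes them from a list of source positions. It is not the HSC pipeline code: the function names are hypothetical, the small-angle separation formula ignores RA wrap-around at 0/360 degrees, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import math
from statistics import pstdev

MAS_PER_DEG = 3600.0 * 1000.0  # milliarcseconds per degree


def separation_mas(ra1, dec1, ra2, dec2):
    """Small-angle separation in mas between two positions given in degrees."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    dra = (ra1 - ra2) * math.cos(mean_dec)  # scale RA offset by cos(Dec)
    ddec = dec1 - dec2
    return math.hypot(dra, ddec) * MAS_PER_DEG


def match_scatter(match_ra, match_dec, sources):
    """Return (D values, DSigma) for sources given as (SourceRA, SourceDec)."""
    d_values = [separation_mas(ra, dec, match_ra, match_dec)
                for ra, dec in sources]
    # Assumption: population standard deviation of the D values.
    d_sigma = pstdev(d_values) if d_values else 0.0
    return d_values, d_sigma
```

With a single member located at the match position, this returns D = 0 and DSigma = 0, which is the single visit and detector case described above.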
How are HSC Levels 0, 1, and 2 for the Detection Options defined?
The output of the Detailed Search Form has the following options available via the Detection Options field:
Level 0 - Includes all detections (no nondetections). Such detections have a Det value of Y.
Level 1 - Includes detections and filter-based nondetections. Sources that are found in the white light detection image, but not in one or more of the filters included in the visit, are called "filter-based nondetections". Filter-based nondetections have a Det value of N and an assigned MatchID value.
Only certain properties for filter-based nondetections are reported, including the image name, exposure time, and filter. Certain properties such as fluxes and magnitudes are indeterminate and are left blank on the search form.
Level 2 - Includes detections, filter-based nondetections, and visit-based nondetections. Visit-based nondetections are cases where an image overlaps with the specified positional search constraints, but no sources are detected there. Visit-based nondetections have a Det value of N and no assigned MatchID value.
These nondetections have image information, but have blank source positions, fluxes, and magnitudes.
Nondetections have "N" in the "Det" column, and are listed last on the form.
Note: Users studying nondetections may want to use the HLA image cutout capability (see the Cutout FAQ) to confirm them.
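The three levels can be summarized as a small classification rule on the Det and MatchID values. The sketch below assumes each result row is available as a Python dictionary with "Det" and "MatchID" keys; that row layout and the function names are assumptions made for illustration, not a description of the actual search service.

```python
def classify(row):
    """Label a result row using its Det and MatchID values."""
    if row["Det"] == "Y":
        return "detection"
    if row.get("MatchID") not in (None, ""):
        return "filter-based nondetection"
    return "visit-based nondetection"


# Row types included at each Detection Options level.
ALLOWED = {
    0: {"detection"},
    1: {"detection", "filter-based nondetection"},
    2: {"detection", "filter-based nondetection", "visit-based nondetection"},
}


def select_level(rows, level):
    """Keep rows allowed at the given level, with nondetections listed last."""
    kept = [r for r in rows if classify(r) in ALLOWED[level]]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(kept, key=lambda r: r["Det"] != "Y")
```

For example, a row with Det = N and a blank MatchID is dropped at levels 0 and 1 and appears only at level 2.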
What is the difference between the HSC Summary and Detailed Search forms?
The HSC Summary Search Form includes results for all detections for a given object on a single row. The magnitudes for different visits are averaged together.
The HSC Detailed Search Form shows each individual detection for a given object on a single row.
The HSC Summary Search Form lists aggregate properties of each match such as the number of visits and filters. In addition, the average magnitude values and their associated RMS scatter (sigma) are listed for the most frequently used filters. More information about a match can be obtained by searching the Detailed Search Form for the appropriate MatchID value (under the User-specified field), or by clicking on the blue MatchID value in the first column when displaying the html version of the requested query. Frequently used filters are listed first; other filters are listed next using a different format.
The HSC Detailed Search Form displays an entry for each individual detection in a match on a single row. Each object has a unique MatchID value. Each member of the match, including nondetections, also has an assigned MemID value, a unique position (MatchRA, MatchDec), and a separation distance D from the match position. See FAQ for a more detailed discussion of how matches are defined.
What filters are listed in the Summary Search form?
By default, the most frequently used filters across all HSC sources are listed in the HSC Summary Search form:
A_F814W W2_F606W W2_F814W A_F775W W2_F300W
where A stands for ACS/WFC and W2 stands for WFPC2.
Note that values for the same filter, but different instruments (e.g., A_F814W and W2_F814W), are NOT averaged together, since the same aperture size has not been used for different instruments.
Following the globally most frequently used filters, two other filters within the match are listed using the format:
Filter1 Mag1 Mag_Sigma1 Filter2 Mag2 Mag_Sigma2 ...
Users can customize the list of filters by selecting from all available for ACS and WFPC2 (see 4 below).
This hybrid system of including the most frequently used filters in a standard fixed-column format, followed by a format with the appropriate filter listed in the preceding column, is a result of the great diversity of the HST filter-instrument combinations.
You can use the "Output Columns" feature to include only the filters you request, as described in this FAQ and demonstrated in Hubble Source Catalog Walkthrough.
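The per-filter averaging can be pictured as a group-by on the instrument and filter pair, so that A_F814W and W2_F814W never share a bin. The sketch below is only an illustration: the input tuple layout and function name are hypothetical, and averaging directly in magnitudes with a population standard deviation as the scatter is an assumption, since the text above does not specify either detail.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_magnitudes(detections):
    """Group (instrument, filter, magnitude) tuples by instrument-filter key.

    Returns {key: (mean magnitude, scatter, number of detections)}.
    """
    groups = defaultdict(list)
    for instrument, filt, mag in detections:
        # Key on instrument AND filter so different instruments stay separate.
        groups[f"{instrument}_{filt}"].append(mag)
    return {key: (mean(mags), pstdev(mags), len(mags))
            for key, mags in groups.items()}
```

A match with two ACS/WFC F814W detections and one WFPC2 F814W detection therefore yields two separate entries, one per instrument.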
Where can a definition of what all the fields mean be found?
Click on "Field Descriptions" in the upper right of the HSC Summary or Detailed Search forms. This provides a short description of the various parameters that can be used in a query. For more detailed information (for example, how the concentration index, CI, is defined), go to the HLA FAQ.
How can I customize the HSC output?
The output definition portions of the HSC Detailed and Summary Search forms can be used to change the output from the default values.
The "Output Columns" feature can be used to modify the order in which the columns appear, remove a column, or add a new column (using the "add" button at the bottom left - the parameter will then show up in the list of columns included under "Output Columns").
The "Sort By" feature allows you to sort using a wide range of parameters. There are three pull-down menus to allow you to sort sequentially - e.g., first on angular separation, then on Match ID, then on CI (concentration index).
The "Output Coordinates" allow you to change from sexagesimal (e.g., 13 37 00.874 -29 51 55.40), to degrees (204.2536415 -29.8653877), to hours (for Right Ascension: 13.6169094 -29.8653877).
The "Output Format" can be changed from the default (HTML_Table - which is displayed in real time) to a large number of alternative formats that can be downloaded.
The "Maximum Records" and "Records per Page" can also be defined.
In the future, it may be possible to save your custom-made catalog definitions for future use.
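The three "Output Coordinates" representations are related by simple arithmetic, which the sketch below reproduces for the example position quoted above. The helper names are illustrative and are not part of the search forms. Small differences in the last digits relative to the quoted values are expected because the sexagesimal strings are rounded.

```python
def sexagesimal_to_decimal(text):
    """Convert a space-separated 'DD MM SS.s' or 'HH MM SS.s' string to decimal."""
    parts = text.split()
    # Take the sign from the string so that values like '-00 30 00' work.
    sign = -1.0 if parts[0].startswith("-") else 1.0
    first, minutes, seconds = (abs(float(p)) for p in parts)
    return sign * (first + minutes / 60.0 + seconds / 3600.0)


def ra_hours_to_degrees(hours):
    """Right Ascension: 24 hours correspond to 360 degrees."""
    return hours * 15.0


ra_hours = sexagesimal_to_decimal("13 37 00.874")   # RA in decimal hours
ra_deg = ra_hours_to_degrees(ra_hours)              # RA in decimal degrees
dec_deg = sexagesimal_to_decimal("-29 51 55.40")    # Dec in decimal degrees
```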
Can a list of targets be used for an HSC search?
A list of targets can be searched using the "File Upload Form" button, which is located near the top right of either the HSC Summary or Detailed Search forms. The "Local File Name" box (or "Browse" pull-down menu) allows you to provide the name of a file listing the targets you would like to include in the search. A number of different format options for the input file are allowed, as defined in the target field definition portion of the form.
How can I overlay the HSC objects on an image?
The HLA Interactive Display facility can be used to overlay the HSC catalog (click on the pink "Hubble (Beta)" in the upper right part of the display). Here is an example for a part of M101.
Nearby Galaxy: M101 (10918_01), N > 1 version To get the image below, click on "Advanced HSC controls", then check the box for "Require NumImages > 1", then click on HSC (beta), then wait a while (i.e., there are 23,759 HSC sources).
You can get to the HLA from your browser using hla.stsci.edu.
Other catalogs that can be overlaid, if data exist for that portion of the sky, are HLA DAOphot, HLA SExtractor, SDSS (Sloan Digital Sky Survey), 2MASS (Two Micron All Sky Survey), GSC2 (Guide Star Catalog 2), FIRST (Faint Images of the Radio Sky at Twenty Centimeters), and GALEX (the Galaxy Evolution Explorer ultraviolet survey).
Information for individual objects can be obtained by clicking on the object. See Basic Hubble Source Catalog Walkthrough for an example. More detailed information about the options provided within the interactive display (e.g., how to see all the columns in a catalog view) is available at Interactive Display Help.
What are specific limitations and artifacts that HSC users should be aware of?
The Hubble Source Catalog is composed of visit/detector-based, general-purpose source lists from the Hubble Legacy Archive (hla.stsci.edu). While the catalog may be sufficient to accomplish the science goals for certain projects, in other cases astronomers may need to make their own catalogs to achieve the optimal performance that is possible with the data (e.g., to go deeper). In addition, the Hubble observations are inherently different from large-field surveys such as SDSS, due to the pointed, small field-of-view nature of the observations, and the wide range of instruments, filters, and detectors. Here are some of the primary limitations that users should keep in mind when using the HSC.
- Uniformity - Coverage can be very non-uniform (unlike surveys like SDSS), since a wide range of HST instruments, filters, and exposure times have been combined. We recommend that users pan out to see the full HSC field when using the Interactive Display in order to have a better feel for the uniformity of a particular dataset.
- Depth - The HSC does not go as deep as it is possible to go. This is due to a number of different reasons, ranging from using an early version of a WFPC2 catalog (see FAQ), to the use of visit-based source lists rather than a deep mosaic image where a large number of images have been added together.
- Completeness and nondetections - Information about limiting magnitudes is not currently included in the HSC.
In addition, the current generation of HLA WFPC2 and ACS Source Extractor source lists have problems finding sources in regions with high background. The WFC3 source lists are much better in this regard, but are not currently included in the HSC. The next generation of WFPC2 and ACS source lists, which will use the improved WFC3 algorithms, will be incorporated into the HSC when available (tentatively planned for fall 2013).
- False detections - Uncorrected cosmic rays are a common cause of blank sources. Users may want to use NumImages > 1 to help filter out this and other artifacts.
Another common cause of "false detections" is the attempt by the detection software to find large, diffuse sources. In some cases this is due to the algorithm being too aggressive when looking for these objects and finding noise. In other cases the objects are real, but not obvious unless observed with the right contrast stretch and field-of-view. It is not easy to filter out these potential artifacts without losing real objects. One technique users might try is to use a size criterion (e.g., concentration index = CI) to distinguish real vs. false sources for a particular dataset. This would be particularly reasonable in cases where the user is only interested in compact sources rather than extended objects.
- Doubling - The HLA pipeline is designed to correct the inherent ~2" astrometric uncertainty (due to uncertainties in guide star positions) by comparing with SDSS and GSC2 positions. While this works in the vast majority of cases, it does not work for a few percent of the data for a variety of reasons (e.g., very crowded fields, high backgrounds, few or no SDSS or GSC2 sources in a field).
One of the steps in the HSC matching algorithm is to search for objects within a radius of 0.3" and only include these detections as potential matches. Hence, cases where the HLA has not been able to correct the positions to fit within a 0.3" radius are left out of the potential pool of detections to be matched. This results in what is called "doubling" (though tripling or higher numbers are also possible if there are several detections with poor absolute astrometry). These cases can generally be identified by looking for a common pattern, i.e., double detections with the same separation and orientation. Here is an example for the Antennae galaxies.
Procedures for removing doubling are currently being developed for the HSC. This artifact should be largely eliminated by summer, 2013.
- Mismatched sources - The HSC matching algorithm uses a friends-of-friends algorithm, together with a Bayesian method to break up long chains (see Budavari & Lubow 2012) to match detections from different images. In some cases the algorithm has been too aggressive and two very close, but physically separate objects, have been matched together.
- Bad images (and hence bad source lists) - Images taken when Hubble has lost lock on guide stars (generally after an Earth occultation) are the primary cause of bad images. We attempt to remove these images from the HLA, but occasionally a bad image is missed and a corresponding bad source list is generated. A document showing these and other examples of potential bad images can be found at HLA Images FAQ. If you come across what you believe is a bad image, please inform us at archive@stsci.edu.
- Miscellaneous bad information - There is occasionally incorrect information about an object in the catalog due to a variety of errors ranging from incorrect information in the image header to undetected errors in the pipeline processing. If you suspect a problem, please send an e-mail describing the issue to archive@stsci.edu so we can track it down.
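The "Doubling" item above notes that doubled detections tend to share the same separation and orientation. One way a user might look for that signature in a downloaded list of positions is to histogram the offset vectors between close pairs and check whether one offset dominates. The sketch below is a simple brute-force illustration, not the procedure being developed for the HSC: the function name, the 2" pair limit, and the 0.05" bin size are all assumptions, and positions are taken to be (x, y) offsets in arcseconds in a local tangent plane.

```python
import math
from collections import Counter


def dominant_offset(points, max_sep=2.0, bin_size=0.05):
    """Find the most common offset vector among close pairs of points.

    points: list of (x, y) positions in arcsec in a local tangent plane.
    Returns ((dx, dy), count) for the most populated offset bin, or None.
    """
    counts = Counter()
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            dx, dy = x2 - x1, y2 - y1
            # Orient every offset the same way so A->B and B->A share a bin.
            if dx < 0 or (dx == 0 and dy < 0):
                dx, dy = -dx, -dy
            if 0 < math.hypot(dx, dy) <= max_sep:
                counts[(round(dx / bin_size), round(dy / bin_size))] += 1
    if not counts:
        return None
    (bx, by), n = counts.most_common(1)[0]
    return (bx * bin_size, by * bin_size), n
```

A count that is a large fraction of the number of sources suggests a systematic shift between two images. The pair loop scales with the square of the number of sources, so for large fields it would be sensible to work on a subregion, and offsets that fall near a bin edge may be split between neighboring bins.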
Is there a summary of known anomalies in the ACS and WFPC2 data?
Yes - HLA Images FAQ. Here is a figure from the document showing a variety of artifacts associated with very bright objects.
How good is the photometry for the HSC?
Due to the great diversity of the Hubble data, this is a hard question to answer. Our approach is to make detailed comparisons for a large number of HSC fields, and present them as "use cases" (e.g., a comparison of the HSC with Brown et al. 2006). By comparing flux differences between sources in the same match and having the same detector and filter, Fig. 5 (see Budavari & Lubow 2012) indicates flux uncertainties of about 2.5% for ACS and 5% for WFPC2, based on the width of the distributions at half the maximum values. In addition, we will produce a PASP-type article describing a wide range of HSC quality comparisons. A draft of the article will be available for Version 1.
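Since the catalog reports magnitudes, it can help to express the fractional flux uncertainties quoted above in magnitude units. The short sketch below applies the standard relation between a fractional flux change and a magnitude offset; the function name is illustrative. The results are roughly 0.03 mag for the ACS figure and 0.05 mag for the WFPC2 figure.

```python
import math


def flux_fraction_to_mag(fraction):
    """Magnitude offset corresponding to a fractional change in flux."""
    return 2.5 * math.log10(1.0 + fraction)


acs_mag = flux_fraction_to_mag(0.025)    # 2.5% flux uncertainty (ACS)
wfpc2_mag = flux_fraction_to_mag(0.05)   # 5% flux uncertainty (WFPC2)
```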
Send a note to archive@stsci.edu. Please include enough information (e.g., a screen save of the problem) to make it possible to diagnose the issue.
What are the future plans for the HSC?