Introducing the new MAST Forum

The MAST Forum provides a new way of interacting with the archive users in the astronomical community.

Jonathan Hargis

An important aim of the MAST project is to enhance our communication with the astronomical community. As part of this effort, we have recently launched the MAST Forum at https://forum.stsci.edu/. We hope this will be a vibrant, welcoming, and productive environment for MAST users to connect with each other and the MAST staff. If you have questions about how to search and retrieve data from MAST, how the data are processed in the backend pipeline, or what the resulting data products are, the forum is a great resource for getting answers.

The forum has a number of categories where questions and discussion threads can be posted. We have sections related specifically to the missions supported by STScI, including the Hubble Space Telescope (and Hubble Legacy Archive and Hubble Source Catalog projects), Kepler/K2, and GALEX. In addition, MAST will be archiving the data from a number of upcoming missions, including Pan-STARRS, Gaia, JWST, and TESS, and discussion categories for each of these projects are available now. Lastly, we have categories for many of the search interfaces and tools that MAST provides, for example, the Discovery Portal and CasJobs.

The MAST Forum is available now for anyone with an STScI Single Sign-On account (called MyST). To create an account, click the "Register" button on the forum homepage. This will prompt you to create a MyST login and password. After completing registration, continue to the MAST Forum site and log in with your new MyST username and password. Please note that although anonymous users cannot post on the MAST Forum, forum contents are indexed and searchable via web search engines.

We welcome your feedback and suggestions for how MAST can improve the process of doing archival science with NASA/STScI data and the other missions we support. Email us at archive@stsci.edu or post your thoughts for discussion on the forum under "Help Improve MAST".

Hosting your High Level Science Product at MAST

Your science-ready data products can have a high-visibility, permanent home at MAST

Anton Koekemoer

A widely used feature of MAST is our collection of High Level Science Products (HLSPs): science-ready, fully reduced, publication-quality products, which we receive from teams in the community, for data related to any of the large number of missions that we support at MAST. Our curated HLSP collection (accessible to the entire community at http://archive.stsci.edu/hlsp) contains a wide variety of products, including full-depth mosaics from large surveys, photometric catalogs, image atlases, spectral atlases, time series lightcurves, spectral linelists, model simulations, and more! These include data on planets, stars, ISM, galaxies, AGN, clusters, and many other types of objects, across a wide range of wavelengths from radio/IR through to optical, UV and X-ray, related to projects with HST, GALEX, Kepler/K2, SWIFT/UVOT, XMM/OM, IUE, and many other facilities.

We are always happy to receive new HLSPs that are related to observations obtained with any of the missions that we host at MAST. Benefits of hosting your HLSPs at MAST include free, permanent, highly visible locations for your data, as well as our ability to tie your data directly into searches made using our archive interfaces (for example the MAST Portal or the Hubble Legacy Archive). We also generally associate a given HLSP dataset with refereed papers provided by the contributing team, which provides additional visibility and citation opportunities for the work done by that team. Large HST programs and Archival Treasury programs already commit to delivering HLSPs to MAST, but we also welcome such products from others in the community. If you have HLSPs that you think would be relevant to host at MAST, please feel free to contact us by sending email to archive@stsci.edu. Our Archive Scientists will work directly with you to discuss the options for providing your HLSPs with a high-visibility, permanent home at MAST.

Big Data and the MAST Archives

A new study explores how STScI will meet the opportunities and challenges of Big Data in astronomy.

Joshua Peek

Pan-STARRS on a forklift
STScI staff members (from left to right) Jeff Valenti, Andrew Fruchter, Rick White, and Armin Rest celebrate the safe arrival of the Pan-STARRS storage hardware at the Institute.
Big data is everywhere, and astronomy is no exception. The data we handle at STScI are arriving faster and in larger volumes: JWST, TESS, and WFIRST will collect much more data than our current observatories, and MAST is already the home of the massive 2000-terabyte Pan-STARRS database, recently brought into the Institute with a forklift (see photo).

A team of scientists, engineers, and IT experts at the Institute recently completed a study of the technological and conceptual impacts of Big Data in astronomy, focusing on our current and future data holdings and addressing how astronomers can achieve the maximum scientific potential with this wealth of data. A key theme of the report is that we should see the raw volume and velocity of the data not as a limitation, but as a great opportunity for scientific discovery.

The Institute’s big-data study centered around a number of science cases, which pushed up against the limits of our current science computing capabilities. Supporting these advanced science cases will require growth and development in at least four key areas: image classification algorithms (including Machine Learning and Deep Learning methodologies); time domain methods; parallel and high performance computing; and advanced database methods. The future of astronomical data science involves not only taking advantage of the growth in computational processing power, but also implementing clever methodologies and tools that will provide us a deeper understanding of the physics of the cosmos from these huge data sets.

MAST is enabling the next generation of archival science methods in a number of ways. The development of application programming interfaces – allowing users within and beyond the Institute to programmatically access specific computationally intensive tasks – is one area of active work. As an example, the GALEX gPhoton database (a database of over 1 trillion time-tagged photon events) allows users to extract data cubes and light curves rather than only co-added images of GALEX targets.
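To make the gPhoton example concrete, the sketch below shows how a list of time-tagged photon events can be turned into a light curve by counting events in fixed-width time bins. It is a minimal illustration only and does not use or reproduce the gPhoton interface; the function name `bin_photons` and its parameters are hypothetical, and a real reduction would also handle aperture selection, background, and exposure corrections.

```python
def bin_photons(event_times, t_start, t_stop, bin_width):
    """Bin photon arrival times into a light curve.

    Hypothetical helper for illustration; not part of gPhoton.
    Returns (bin_centers, count_rates), with rates in counts per unit time.
    Events outside [t_start, t_start + n_bins * bin_width) are ignored.
    """
    if bin_width <= 0 or t_stop <= t_start:
        raise ValueError("need bin_width > 0 and t_stop > t_start")

    # Only whole bins are kept; a partial bin at the end is dropped.
    n_bins = int((t_stop - t_start) // bin_width)
    counts = [0] * n_bins

    for t in event_times:
        idx = int((t - t_start) // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1

    centers = [t_start + (i + 0.5) * bin_width for i in range(n_bins)]
    rates = [c / bin_width for c in counts]
    return centers, rates


if __name__ == "__main__":
    # Made-up arrival times in seconds, purely for demonstration.
    times = [0.1, 0.5, 2.5, 3.9, 9.99]
    for center, rate in zip(*bin_photons(times, 0.0, 10.0, 2.0)):
        print(f"t = {center:4.1f} s  rate = {rate:.2f} counts/s")
```

Because the individual events are retained rather than a co-added image, the same event list can be re-binned at any time resolution the science case requires.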

The full big-data report is available at http://archive.stsci.edu/reports/BigDataSDTReport_Final.pdf. MAST welcomes your feedback and suggestions for how we can better enable archival data science in the big-data era.

The Hubble Catalogue of Variables

ESA and STScI scientists are producing a catalog of variable objects chosen from the 80 million sources in the Hubble Source Catalog.

HCV light curve
Fig 1: Phased lightcurve of the new Cepheid variable HSCv1 27271321 found in M31. Blue and red points are observations in the F606W and F814W filters, respectively.
The Hubble Catalogue of Variables (HCV) is a three-year project funded by the European Space Agency (ESA) in collaboration with a research group at the National Observatory of Athens, Greece, led by Alceste Bonanos. The goal is to produce a catalogue of variable sources chosen from the 80 million sources in Version 2 of the Hubble Source Catalog (HSC). As part of the project the group will validate the candidates using a wide variety of different algorithms (Sokolovsky et al. 2016, submitted) and make them available in a catalogue.

Over its 26 years of operation, Hubble has visited many regions of the sky multiple times, providing an opportunity to conduct a systematic search for variable objects among the sources in the HSC. The HCV is expected to contain one of the largest collections of variable point sources and extended objects available, spanning a baseline which is more than 20 years long, and reaching unprecedented magnitude depths.

The HSC provides the backbone for the HCV project. It combines source lists generated from over 50,000 individual Hubble images into a single master catalogue. This is done by computing astrometric corrections for each Hubble image based on reference stars from the Pan-STARRS, 2MASS, and SDSS catalogues within the field of view of each image. In the future, Gaia will be used as the primary astrometric backbone, improving the absolute astrometric accuracy of the HSC from 100 mas to about 10 mas. The astrometrically corrected lists are matched using a technique suggested by Budavari & Lubow (2012). The first version of the HSC (Whitmore et al. 2016) was released in February 2015 and contains photometry for 30 million sources, based on images obtained with the WFPC2, ACS/WFC, WFC3/UVIS and WFC3/IR cameras. Version 2 is scheduled to be released in the fall of 2016.
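As a rough illustration of what an astrometric correction based on reference stars involves, the sketch below estimates a simple shift between star positions measured in one image and their positions in a reference catalogue. It is not the HSC procedure: it assumes the stars have already been paired, it models only a translation (no rotation or scale term), and the function name `median_offset_mas` is hypothetical.

```python
import math
from statistics import median

MAS_PER_DEG = 3.6e6  # milliarcseconds per degree


def median_offset_mas(image_stars, reference_stars):
    """Estimate the (RA, Dec) shift that maps image positions onto reference positions.

    Hypothetical helper for illustration. Both inputs are equal-length lists of
    (ra_deg, dec_deg) tuples, already paired star by star. The RA offset is scaled
    by cos(Dec) so that it is an angle on the sky. Returns offsets in mas.
    """
    if len(image_stars) != len(reference_stars) or not image_stars:
        raise ValueError("need two non-empty lists of equal length")

    d_ra, d_dec = [], []
    for (ra_i, dec_i), (ra_r, dec_r) in zip(image_stars, reference_stars):
        d_ra.append((ra_r - ra_i) * math.cos(math.radians(dec_r)) * MAS_PER_DEG)
        d_dec.append((dec_r - dec_i) * MAS_PER_DEG)

    # The median keeps a few mismatched pairs from dominating the estimate.
    return median(d_ra), median(d_dec)


if __name__ == "__main__":
    # Made-up positions: two stars shifted consistently, one bad match.
    image = [(10.0, 0.0), (10.1, 0.0), (10.2, 0.0)]
    reference = [(10.00001, -0.00002), (10.10001, -0.00002), (10.201, 0.01)]
    print(median_offset_mas(image, reference))
```

The median is used here only because it tolerates occasional bad pairings; a production solution would fit a fuller transformation and weight stars by their positional uncertainties.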

The task undertaken by the HCV team is to define algorithms that will detect and validate a candidate variable star within the HSC. As the Hubble data were collected with different instruments, filters and observing strategies, the photometric accuracy and data quality vary greatly across the HSC, making the detection of variable objects non-trivial. The team is developing algorithms to reliably detect a broad range of light-curve features and variability types, combining several methods to eliminate outliers and reliably detect changes in brightness. An example light curve of a Cepheid variable discovered in M31 is shown in Figure 1.
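Two generic building blocks of this kind of analysis can be sketched briefly: folding observation times on a trial period to produce a phased light curve like the one in Figure 1, and testing whether a set of magnitudes is consistent with constant brightness. This is an illustrative sketch only, not one of the HCV team's algorithms; the reduced chi-squared statistic is a standard textbook choice substituted here for demonstration, and the function names are hypothetical.

```python
def fold(times, period, epoch=0.0):
    """Return the phase in [0, 1) of each observation time for a trial period."""
    if period <= 0:
        raise ValueError("period must be positive")
    return [((t - epoch) / period) % 1.0 for t in times]


def reduced_chi2(mags, errs):
    """Reduced chi-squared of magnitudes against a constant (weighted-mean) model.

    Values near 1 are consistent with a non-variable source given the quoted
    errors; values much larger than 1 suggest real changes in brightness
    (or underestimated errors).
    """
    if len(mags) != len(errs) or len(mags) < 2:
        raise ValueError("need at least two measurements with matching errors")

    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


if __name__ == "__main__":
    # Made-up measurements, purely for demonstration.
    times = [0.0, 2.5, 5.0, 6.25]
    mags = [20.0, 20.6, 20.0, 20.3]
    errs = [0.05, 0.05, 0.05, 0.05]
    print("phases:", fold(times, period=5.0))
    print("reduced chi2:", reduced_chi2(mags, errs))
```

A single statistic like this is sensitive to outliers and to misestimated errors, which is one reason a survey with heterogeneous photometry benefits from combining several complementary methods.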

The HCV will be ingested into the MAST archives in the spring of 2018. The HCV pipeline will be deployed at STScI, where it will be used to produce new, updated versions of the HCV following future releases of the HSC.

Inside the Archive: Searching the Virtual Observatory with the MAST Portal

Learn how to expand your archive searches beyond STScI. Dive into the entire Virtual Observatory database using the MAST Portal.

Geoff Wallace

The MAST Data Discovery Portal allows you to query the Virtual Observatory (VO), performing cone searches on tens of thousands of data collections simultaneously. While many of these collections are relatively small, large-scale surveys such as WISE, SDSS, and GSC-II are also available, as are all published VizieR catalogs. Using the Portal to search the VO allows astronomers to quickly explore and discover new data of all types, all from a single interface (see note 1).

VO Search drop-down
Fig 1: To search the VO, select the "All Virtual Observatory Collections" drop-down from the list of collections.
To perform a VO search, first select “All Virtual Observatory Collections” from the collection dropdown at the top-left of the page (see Fig 1). In the search box, enter a target object or coordinates (the supported coordinate formats are shown by clicking “Show Examples…” below the box). Optionally specify a radius, then press Enter or click the search button.

After a few seconds, initial search results will begin to return. As more resources respond, a refresh button will appear in the upper left corner of the result grid indicating how many new results are available to load. You may filter on the available metadata by using the columns on the left. This is useful for specifying that you only want image resources or optical data, for example.

Upon finding a resource in which you are interested, there are a number of actions you may perform. First, basic information about a resource can be viewed by using the blue info button for that resource’s row. Clicking the save icon will download a VOTable of the cone search results to your computer. Alternatively, you may click the binoculars icon to load the data in another tab within the Portal. While there are a few resources the software recognizes and can display with a custom view (e.g. the Hubble Legacy Archive), the vast majority of results will be generic VOTables. In these cases, the Portal will attempt to identify columns of interest such as positions or footprints to plot in AstroView. With the data loaded in the grid, the Portal’s generic tools will be available to do filtering, charting, column arithmetic and exporting.
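For readers unfamiliar with the term, a cone search simply returns every catalogue source whose angular distance from the search position is within the chosen radius. The sketch below shows that selection on a small in-memory list. It is a conceptual illustration, not the Portal's implementation or a VO service client; the function names and the `ra`/`dec` dictionary keys are assumptions made for this example.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = (math.radians(x) for x in (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(sources, ra_center, dec_center, radius_deg):
    """Return the sources lying within radius_deg of the search position.

    Each source is a dict with 'ra' and 'dec' keys in degrees (hypothetical layout).
    """
    return [
        s for s in sources
        if angular_separation_deg(ra_center, dec_center, s["ra"], s["dec"]) <= radius_deg
    ]


if __name__ == "__main__":
    # Made-up catalogue rows, purely for demonstration.
    catalogue = [
        {"name": "a", "ra": 10.0, "dec": 0.0},
        {"name": "b", "ra": 10.5, "dec": 0.0},
        {"name": "c", "ra": 12.0, "dec": 0.0},
    ]
    for row in cone_search(catalogue, 10.0, 0.0, radius_deg=1.0):
        print(row["name"])
```

Using the great-circle separation rather than simple coordinate differences keeps the selection correct near the poles and across the RA = 0/360 degree boundary.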


1 Curious about what resources are in the Virtual Observatory? Search the VO directory here to find out.

High Level Science Products: EVEREST

MAST is hosting a new set of detrended K2 light curves from Campaigns 0-7.

Scott Fleming

Example EVEREST light curve
Fig 1: Example detrended K2 light curve from the EVEREST project.
EVEREST (“EPIC Variability Extraction and Removal for Exoplanet Science Targets”, Luger et al. 2016, AJ, accepted) is a new High Level Science Product (HLSP) at MAST that offers some of the best detrended K2 light curves from Campaigns 0-7. The team uses a combination of pixel-level decorrelations to remove instrument systematics and Gaussian processes to capture intrinsic stellar variability (see Figure 1 for an example light curve). The team is able to achieve photometric precision comparable to the original Kepler mission for targets brighter than 13th magnitude in the Kepler bandpass, and within a factor of two of the original mission for fainter targets. For more details and a summary of available data products, visit the MAST HLSP page: https://archive.stsci.edu/prepds/everest/