{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Hubble Catalog of Variables Notebook (API version)\n", "### September 2019, Rick White & Steve Lubow\n", "\n", "This notebook shows how to access the [Hubble Catalogs of Variables (HCV)](http://archive.stsci.edu/hlsp/hcv/). The HCV is a large catalog of faint variable objects extracted from version 3 of the [Hubble Source Catalog](https://archive.stsci.edu/hst/hsc). The HCV project at the National Observatory of Athens was funded by the European Space Agency (PI: Alceste Bonanos). The data products for the HCV are available both at the [ESA Hubble Archive](http://archives.esac.esa.int/ehst) at [ESAC](https://www.cosmos.esa.int/web/esdc) through the [HCV Explorer](http://archives.esac.esa.int/hcv-explorer) interface and at STScI.\n", "\n", "This notebook uses the [MAST HSC catalog interface](https://catalogs.mast.stsci.edu/hsc), which supports queries to the current and previous versions of the [Hubble Source Catalog](https://archive.stsci.edu/hst/hsc). It allows searches of several different tables including the HCV summary and detailed tables. It also has an associated [API](https://catalogs.mast.stsci.edu/docs/hsc.html) that is used for data access in this notebook.\n", "\n", "For similar examples using the [MAST CasJobs](https://mastweb.stsci.edu/hcasjobs) interface,\n", "a SQL database query interface that is more complex to use but more powerful than the API, see the [HCV_CasJobs_demo notebook](HCV_casjobs_demo.html).\n", "\n", "This notebook is available for [download](HCV_API_demo.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Instructions: \n", "* Complete the initialization steps [described below](#initialization).\n", "* Run the notebook to completion. \n", "* Modify and rerun any sections of the Table of Contents below.\n", "\n", "Running the notebook from top to bottom takes less than 1 minute (depending on the speed of your computer and network connection).\n", "\n", "# Table of Contents\n", "* [Intialization](#initialization)\n", "* [Variable objects in IC 1613](#ic1613)\n", " * [Name resolver](#resolver)\n", " * [Select objects from HCV](#summary)\n", " * [Information on HCV variable classification](#classification)\n", " * [Sky coverage](#positions)\n", " * [Properties of variable objects](#variability)\n", " * [Color magnitude diagram](#cmd)\n", "* [Light curve for a nova in M87](#m87)\n", " * [Extract and plot light curve for the nova](#lightcurve)\n", " * [HLA cutout images for selected measurements](#cutouts)\n", "* [Compare the HCV automatic classification to expert validations](#expert)\n", " * [Plot MAD variability index distribution](#mad_expert)\n", " * [Plot fraction of artifacts vs. MAD](#artifacts)\n", "* [Plot light curve for most variable high-quality candidate in the HCV](#most_variable)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Initialization " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Install Python modules\n", "\n", "_This notebook requires the use of **Python 3**._\n", "\n", "This needs some special modules in addition to the common requirements of `astropy`, `numpy` and `scipy`. For anaconda versions of Python the installation commands are:\n", "\n", "
\n",
    "conda install requests\n",
    "conda install pillow\n",
    "
\n", "\n", "Run the commands one at a time since conda may ask for confirmation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import astropy, pylab, time, sys, os, requests, json\n", "import numpy as np\n", "\n", "from PIL import Image\n", "from io import BytesIO\n", "\n", "from astropy.table import Table, join\n", "from astropy.io import ascii\n", "\n", "# Set page width to fill browser for longer output lines\n", "from IPython.core.display import display, HTML\n", "display(HTML(\"\"))\n", "# set width for pprint\n", "astropy.conf.max_width = 150" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Useful functions\n", "\n", "* The `hcvcone(ra,dec,radius [,keywords])` function searches the HCV catalog near a position.\n", "* The `hcvsearch()` function performs general non-positional queries.\n", "* The `hcvmetadata()` function gives information about the columns available in a table. \n", "* The `resolve(name)` function uses the MAST Name Resolver (which relies on both SIMBAD and NED) to get the RA,Dec position for an object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hscapiurl = \"https://catalogs.mast.stsci.edu/api/v0.1/hsc\"\n", "\n", "def hcvcone(ra,dec,radius,table=\"hcvsummary\",release=\"v3\",format=\"csv\",magtype=\"magaper2\",\n", " columns=None, baseurl=hscapiurl, verbose=False,\n", " **kw):\n", " \"\"\"Do a cone search of the HSC catalog (including the HCV)\n", " \n", " Parameters\n", " ----------\n", " ra (float): (degrees) J2000 Right Ascension\n", " dec (float): (degrees) J2000 Declination\n", " radius (float): (degrees) Search radius (<= 0.5 degrees)\n", " table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions\n", " release (string): v3 or v2\n", " magtype (string): magaper2 or magauto (only applies to summary table)\n", " format: csv, votable, json\n", " columns: list of column names to include (None means use defaults)\n", " baseurl: base URL for the request\n", " verbose: print info about request\n", " **kw: other parameters (e.g., 'numimages.gte':2)\n", " \"\"\"\n", " \n", " data = kw.copy()\n", " data['ra'] = ra\n", " data['dec'] = dec\n", " data['radius'] = radius\n", " return hcvsearch(table=table,release=release,format=format,magtype=magtype,\n", " columns=columns,baseurl=baseurl,verbose=verbose,**data)\n", "\n", "\n", "def hcvsearch(table=\"hcvsummary\",release=\"v3\",magtype=\"magaper2\",format=\"csv\",\n", " columns=None, baseurl=hscapiurl, verbose=False,\n", " **kw):\n", " \"\"\"Do a general search of the HSC catalog (possibly without ra/dec/radius)\n", " \n", " Parameters\n", " ----------\n", " table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions\n", " release (string): v3 or v2\n", " magtype (string): magaper2 or magauto (only applies to summary table)\n", " format: csv, votable, json\n", " columns: list of column names to include (None means use defaults)\n", " baseurl: base URL for the request\n", " verbose: print info about request\n", " **kw: other parameters (e.g., 'numimages.gte':2). Note this is required!\n", " \"\"\"\n", " \n", " data = kw.copy()\n", " if not data:\n", " raise ValueError(\"You must specify some parameters for search\")\n", " if format not in (\"csv\",\"votable\",\"json\"):\n", " raise ValueError(\"Bad value for format\")\n", " url = \"{}.{}\".format(cat2url(table,release,magtype,baseurl=baseurl),format)\n", " if columns:\n", " # check that column values are legal\n", " # create a dictionary to speed this up\n", " dcols = {}\n", " for col in hcvmetadata(table,release,magtype)['name']:\n", " dcols[col.lower()] = 1\n", " badcols = []\n", " for col in columns:\n", " if col.lower().strip() not in dcols:\n", " badcols.append(col)\n", " if badcols:\n", " raise ValueError('Some columns not found in table: {}'.format(', '.join(badcols)))\n", " # two different ways to specify a list of column values in the API\n", " # data['columns'] = columns\n", " data['columns'] = '[{}]'.format(','.join(columns))\n", "\n", " # either get or post works\n", " # r = requests.post(url, data=data)\n", " r = requests.get(url, params=data)\n", "\n", " if verbose:\n", " print(r.url)\n", " r.raise_for_status()\n", " if format == \"json\":\n", " return r.json()\n", " else:\n", " return r.text\n", "\n", "\n", "def hcvmetadata(table=\"hcvsummary\",release=\"v3\",magtype=\"magaper2\",baseurl=hscapiurl):\n", " \"\"\"Return metadata for the specified catalog and table\n", " \n", " Parameters\n", " ----------\n", " table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions\n", " release (string): v3 or v2\n", " magtype (string): magaper2 or magauto (only applies to summary table)\n", " baseurl: base URL for the request\n", " \n", " Returns an astropy table with columns name, type, description\n", " \"\"\"\n", " url = \"{}/metadata\".format(cat2url(table,release,magtype,baseurl=baseurl))\n", " r = requests.get(url)\n", " r.raise_for_status()\n", " v = r.json()\n", " # convert to astropy table\n", " tab = Table(rows=[(x['name'],x['type'],x['description']) for x in v],\n", " names=('name','type','description'))\n", " return tab\n", "\n", "\n", "def cat2url(table=\"hcvsummary\",release=\"v3\",magtype=\"magaper2\",baseurl=hscapiurl):\n", " \"\"\"Return URL for the specified catalog and table\n", " \n", " Parameters\n", " ----------\n", " table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions\n", " release (string): v3 or v2\n", " magtype (string): magaper2 or magauto (only applies to summary table)\n", " baseurl: base URL for the request\n", " \n", " Returns a string with the base URL for this request\n", " \"\"\"\n", " checklegal(table,release,magtype)\n", " if table == \"summary\":\n", " url = \"{baseurl}/{release}/{table}/{magtype}\".format(**locals())\n", " else:\n", " url = \"{baseurl}/{release}/{table}\".format(**locals())\n", " return url\n", "\n", "\n", "def checklegal(table,release,magtype):\n", " \"\"\"Checks if this combination of table, release and magtype is acceptable\n", " \n", " Raises a ValueError exception if there is problem\n", " \"\"\"\n", " \n", " releaselist = (\"v2\", \"v3\")\n", " if release not in releaselist:\n", " raise ValueError(\"Bad value for release (must be one of {})\".format(\n", " ', '.join(releaselist)))\n", " if release==\"v2\":\n", " tablelist = (\"summary\", \"detailed\")\n", " else:\n", " tablelist = (\"summary\", \"detailed\", \"propermotions\", \"sourcepositions\",\n", " \"hcvsummary\", \"hcv\")\n", " if table not in tablelist:\n", " raise ValueError(\"Bad value for table (for {} must be one of {})\".format(\n", " release, \", \".join(tablelist)))\n", " if table == \"summary\":\n", " magtypelist = (\"magaper2\", \"magauto\")\n", " if magtype not in magtypelist:\n", " raise ValueError(\"Bad value for magtype (must be one of {})\".format(\n", " \", \".join(magtypelist)))\n", "\n", "\n", "def resolve(name):\n", " \"\"\"Get the RA and Dec for an object using the MAST name resolver\n", " \n", " Parameters\n", " ----------\n", " name (str): Name of object\n", "\n", " Returns RA, Dec tuple with position\n", " \"\"\"\n", "\n", " resolverRequest = {'service':'Mast.Name.Lookup',\n", " 'params':{'input':name,\n", " 'format':'json'\n", " },\n", " }\n", " resolvedObjectString = mastQuery(resolverRequest)\n", " resolvedObject = json.loads(resolvedObjectString)\n", " # The resolver returns a variety of information about the resolved object, \n", " # however for our purposes all we need are the RA and Dec\n", " try:\n", " objRa = resolvedObject['resolvedCoordinate'][0]['ra']\n", " objDec = resolvedObject['resolvedCoordinate'][0]['decl']\n", " except IndexError as e:\n", " raise ValueError(\"Unknown object '{}'\".format(name))\n", " return (objRa, objDec)\n", "\n", "\n", "def mastQuery(request, url='https://mast.stsci.edu/api/v0/invoke'):\n", " \"\"\"Perform a MAST query.\n", "\n", " Parameters\n", " ----------\n", " request (dictionary): The MAST request json object\n", " url (string): The service URL\n", "\n", " Returns the returned data content\n", " \"\"\"\n", " \n", " # Encoding the request as a json string\n", " requestString = json.dumps(request)\n", " r = requests.post(url, data={'request': requestString})\n", " r.raise_for_status()\n", " return r.text" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variable objects near IC 1613 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use MAST name resolver to get position of IC 1613 " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "target = 'IC 1613'\n", "ra, dec = resolve(target)\n", "print(target,ra,dec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select objects near IC 1613 from HCV \n", "\n", "This searches the HCV summary table for objects within 0.5 degrees of the galaxy center. Note that this returns both variable and non-variable objects." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "radius = 0.5 # degrees\n", "t0 = time.time()\n", "tab = ascii.read(hcvcone(ra,dec,radius,table=\"hcvsummary\"))\n", "print(\"Completed in {:.1f} sec\".format(time.time()-t0))\n", "\n", "# clean up the output format\n", "tab['MeanMag'].format = \"{:.3f}\"\n", "tab['MeanCorrMag'].format = \"{:.3f}\"\n", "tab['MAD'].format = \"{:.4f}\"\n", "tab['Chi2'].format = \"{:.4f}\"\n", "tab['RA'].format = \"{:.6f}\"\n", "tab['Dec'].format = \"{:.6f}\"\n", "\n", "# show some of the variable sources\n", "tab[tab['AutoClass']>0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Description of the variable classification columns \n", "\n", "Several of the table columns have information on the variability.\n", "\n", "* The columns `AutoClass` and `ExpertClass` have summary information on the variability for a given `MatchID` object.\n", " * `AutoClass`: Classification as provided by the system: 0=constant 1=single filter variable candidate (SFVC) 2=multi-filter variable candidate (MFVC)\n", " * `ExpertClass`: Classification as provided by expert: 0=not classified by expert, 1=high confidence variable, 2=probable variable, 4=possible artifact\n", "* The columns `MAD` and `Chi2` are variability indices using the median absolute deviation and the $\\chi^2$ parameter for the given filter.\n", "* The column `VarQualFlag` is a variability quality flag (see Section 5 of the paper). The five letters correspond to CI, D, MagerrAper2, MagAper2-MagAuto, p2p; AAAAA corresponds to the highest quality flag.\n", "* The column `FilterDetFlag` is the filter detection flag: 1=source is variable in this filter, 0=source is not variable in this filter.\n", "\n", "See the HCV paper by [Bonanos et al. (2019, AAp)](https://www.aanda.org/component/article?access=doi&doi=10.1051/0004-6361/201936026) for more details on the computation and meaning of these quantities." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Find objects with measurements in both F475W and F814W\n", "\n", "This could be done in a SQL query through the CasJobs interface. Here we use the `Astropy.table.join` function instead." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "w475 = np.where(tab['Filter']=='ACS_F475W')\n", "w814 = np.where(tab['Filter']=='ACS_F814W')\n", "\n", "# the only key needed to do the join is MatchID, but we include other common columns\n", "# so that join includes only one copy of them\n", "jtab = join(tab[w475],tab[w814],\n", " keys=['MatchID','GroupID','SubGroupID','RA','Dec','AutoClass','ExpertClass'],\n", " table_names=['f475','f814'])\n", "print(len(jtab),\"matched F475W+F814W objects\")\n", "jtab[jtab['AutoClass']>0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot object positions on the sky \n", "\n", "We mark the galaxy center as well. Note that this field is in the outskirts of IC 1613. The 0.5 degree search radius (which is the maximum allowed in the API) allows finding these objects." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pylab.rcParams.update({'font.size': 16})\n", "pylab.figure(1,(10,10))\n", "pylab.plot(jtab['RA'], jtab['Dec'], 'bo', markersize=1,\n", " label='{:,} HCV measurements'.format(len(tab)))\n", "pylab.plot(ra,dec,'rx',label=target,markersize=10)\n", "pylab.gca().invert_xaxis()\n", "pylab.gca().set_aspect('equal')\n", "pylab.xlabel('RA [deg]')\n", "pylab.ylabel('Dec [deg]')\n", "pylab.legend(loc='best')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot HCV MAD variability index versus magnitude in F475W \n", "\n", "The median absolute deviation variability index is used by the HCV to identify variables. It measures the scatter among the multi-epoch measurements. Some scatter is expected from noise (which increases for fainter objects). Objects with MAD values that are high are likely to be variable.\n", "\n", "This plots single-filter and multi-filter variable candidates (SFVC and MFVC) in different colors. Note that variable objects with low F475W MAD values are variable in a different filter (typically F814W in this field).\n", "\n", "This plot is similar to the upper panel of Figure 4 in [Bonanos et al. (2019, AAp)](https://www.aanda.org/component/article?access=doi&doi=10.1051/0004-6361/201936026)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wnovar = np.where(jtab['AutoClass']==0)[0]\n", "wsfvc = np.where(jtab['AutoClass']==1)[0]\n", "wmfvc = np.where(jtab['AutoClass']==2)[0]\n", "x = jtab['MeanCorrMag_f475']\n", "y = jtab['MAD_f475']\n", "\n", "pylab.rcParams.update({'font.size': 16})\n", "pylab.figure(1,(15,10))\n", "pylab.plot(x[wnovar], y[wnovar], 'x', markersize=4, color='silver',\n", " label='{:,} non-variable'.format(len(wnovar)))\n", "pylab.plot(x[wsfvc], y[wsfvc], 'o', markersize=5, color='blue',\n", " label='{:,} single-filter variable candidates'.format(len(wsfvc)))\n", "pylab.plot(x[wmfvc], y[wmfvc], 'o', markersize=5, color='tab:cyan',\n", " label='{:,} multi-filter variable candidates'.format(len(wmfvc)))\n", "\n", "pylab.xlabel('ACS_F475W [mag]')\n", "pylab.ylabel('ACS_F475W MAD [mag]')\n", "pylab.legend(loc='best', title='{} HSC measurements near {}'.format(len(jtab),target))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot variables in color-magnitude diagram \n", "\n", "Many of the candidate variables lie on the instability strip.\n", "\n", "This plot is similar to the lower panel of Figure 4 in [Bonanos et al. (2019, AAp)](https://www.aanda.org/component/article?access=doi&doi=10.1051/0004-6361/201936026)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wnovar = np.where(jtab['AutoClass']==0)[0]\n", "wsfvc = np.where(jtab['AutoClass']==1)[0]\n", "wmfvc = np.where(jtab['AutoClass']==2)[0]\n", "x = jtab['MeanCorrMag_f475'] - jtab['MeanCorrMag_f814']\n", "y = jtab['MeanCorrMag_f475']\n", "\n", "pylab.rcParams.update({'font.size': 16})\n", "pylab.figure(1,(15,10))\n", "pylab.plot(x[wnovar], y[wnovar], 'x', markersize=4, color='silver',\n", " label='{:,} non-variable'.format(len(wnovar)))\n", "pylab.plot(x[wsfvc], y[wsfvc], 'o', markersize=5, color='blue',\n", " label='{:,} single-filter variable candidates'.format(len(wsfvc)))\n", "pylab.plot(x[wmfvc], y[wmfvc], 'o', markersize=5, color='tab:cyan',\n", " label='{:,} multi-filter variable candidates'.format(len(wmfvc)))\n", "pylab.gca().invert_yaxis()\n", "pylab.xlim(-1.1, 4)\n", "pylab.xlabel('ACS F475W - F814W [mag]')\n", "pylab.ylabel('ACS F475W [mag]')\n", "pylab.legend(loc='best', title='{} HSC measurements near {}'.format(len(jtab),target))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get a light curve for a nova in M87 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extract light curve for a given MatchID \n", "\n", "Note that the `MatchID` could be determined by positional searches, filtering the catalog, etc. This object comes from the top left panel of Figure 9 in [Bonanos et al. (2019, AAp)](https://www.aanda.org/component/article?access=doi&doi=10.1051/0004-6361/201936026)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "matchid = 1905457\n", "\n", "t0 = time.time()\n", "\n", "# get light curves for F606W and F814W\n", "nova_606 = ascii.read(hcvsearch(table='hcv',MatchID=matchid,Filter='ACS_F606W'))\n", "print(\"{:.1f} sec: retrieved {} F606W measurements\".format(time.time()-t0,len(nova_606)))\n", "\n", "nova_814 = ascii.read(hcvsearch(table='hcv',MatchID=matchid,Filter='ACS_F814W'))\n", "print(\"{:.1f} sec: retrieved {} F814W measurements\".format(time.time()-t0,len(nova_814)))\n", "\n", "# get the object RA and Dec as well\n", "nova_tab = ascii.read(hcvsearch(table='hcvsummary',MatchID=matchid,Filter='ACS_F814W'))\n", "print(\"{:.1f} sec: retrieved object info\".format(time.time()-t0))\n", "\n", "nova_606" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pylab.rcParams.update({'font.size': 16})\n", "pylab.figure(1,(15,10))\n", "\n", "x = nova_606['MJD']\n", "y = nova_606['CorrMag']\n", "e = nova_606['MagErr']\n", "pylab.errorbar(x,y,yerr=e,fmt='ob',ecolor='k',elinewidth=1,markersize=8,label='ACS F606W')\n", "\n", "x = nova_814['MJD']\n", "y = nova_814['CorrMag']\n", "e = nova_814['MagErr']\n", "pylab.errorbar(x,y,yerr=e,fmt='or',ecolor='k',elinewidth=1,markersize=8,label='ACS F814W')\n", "\n", "pylab.gca().invert_yaxis()\n", "pylab.xlabel('MJD [days]')\n", "pylab.ylabel('magnitude')\n", "pylab.legend(loc='best', title='Nova in M87 (MatchID: {})'.format(matchid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get HLA image cutouts for the nova \n", "\n", "The [Hubble Legacy Archive (HLA)](https://hla.stsci.edu) images were the source of the measurements in the HSC and HCV, and it can be useful to look at the images. Examination of the images can be useful to identified cosmic-ray contamination and other possible image artifacts. In this case, no issues are seen, so the light curve is reliable.\n", "\n", "Note that the ACS F606W images of M87 have only a single exposure, so they do have cosmic ray contamination. The accompanying F814W images have multiple exposures, allowing CRs to be removed. In this case the F814W combined image is used to find objects, while the F606W exposure is used only for photometry. That reduces the effects of F606W CRs on the catalog but it is still a good idea to confirm the quality of the images.\n", "\n", "The `get_hla_cutout` function reads a single cutout image (as a JPEG grayscale image) and returns a PIL image object. See the documentation on the [fitscut image cutout service](http://hla.stsci.edu/fitscutcgi_interface.html) for more information on the web service being used." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_hla_cutout(imagename,ra,dec,size=33,autoscale=99.5,asinh=True,zoom=1):\n", " \n", " \"\"\"Get JPEG cutout for an image\"\"\"\n", " \n", " url = \"https://hla.stsci.edu/cgi-bin/fitscut.cgi\"\n", " r = requests.get(url, params=dict(ra=ra, dec=dec, size=size, \n", " format=\"jpeg\", red=imagename, autoscale=autoscale, asinh=asinh, zoom=zoom))\n", " im = Image.open(BytesIO(r.content))\n", " return im\n", "\n", "# sort images by magnitude from brightest to faintest\n", "phot = nova_606\n", "isort = np.argsort(phot['CorrMag'])\n", "# select the brightest, median and faintest magnitudes\n", "ind = [isort[0], isort[len(isort)//2], isort[-1]]\n", "\n", "# we plot zoomed-in and zoomed-out views side-by-side for each selected image\n", "nim = len(ind)*2\n", "ncols = 2 # images per row\n", "nrows = (nim+ncols-1)//ncols\n", "\n", "imsize1 = 19\n", "imsize2 = 101\n", "mra = nova_tab['RA'][0]\n", "mdec = nova_tab['Dec'][0]\n", "\n", "pylab.rcParams.update({\"font.size\":16})\n", "pylab.figure(1,(12, (12/ncols)*nrows))\n", "t0 = time.time()\n", "ip = 0\n", "for k in ind:\n", " im1 = get_hla_cutout(phot['ImageName'][k],mra,mdec,size=imsize1)\n", " ip += 1\n", " pylab.subplot(nrows,ncols,ip)\n", " pylab.imshow(im1,origin=\"upper\",cmap=\"gray\")\n", " pylab.title('{} m={:.3f}'.format(phot['ImageName'][k],phot['CorrMag'][k]),fontsize=14)\n", " im2 = get_hla_cutout(phot['ImageName'][k],mra,mdec,size=imsize2)\n", " ip += 1\n", " pylab.subplot(nrows,ncols,ip)\n", " pylab.imshow(im2,origin=\"upper\",cmap=\"gray\")\n", " xbox = np.array([-1,1])*imsize1/2 + (imsize2-1)//2\n", " pylab.plot(xbox[[0,1,1,0,0]],xbox[[0,0,1,1,0]],'r-',linewidth=1)\n", " pylab.title('{} m={:.3f}'.format(phot['ImageName'][k],phot['CorrMag'][k]),fontsize=14)\n", "pylab.tight_layout()\n", "print(\"{:.1f} s: got {} cutouts\".format(time.time()-t0,ip))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare the HCV automatic classification to expert validations \n", "\n", "The HCV includes an automatic classification `AutoClass` for candidate variables as well as an expert validation for some fields that were selected for visual examination. For this example, we select all the objects in the HCV that have expert classification information." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t0 = time.time()\n", "\n", "#get data for objects with an expert validation\n", "constraints = {\"ExpertClass.gte\": 1}\n", "tab = ascii.read(hcvsearch(table=\"hcvsummary\",**constraints))\n", "print(\"Retrieved {} rows in {:.1f} sec\".format(len(tab),time.time()-t0))\n", "\n", "# clean up the output format\n", "tab['MeanMag'].format = \"{:.3f}\"\n", "tab['MeanCorrMag'].format = \"{:.3f}\"\n", "tab['MAD'].format = \"{:.4f}\"\n", "tab['Chi2'].format = \"{:.4f}\"\n", "tab['RA'].format = \"{:.6f}\"\n", "tab['Dec'].format = \"{:.6f}\"\n", "\n", "# tab includes 1 row for each filter (so multiple rows for objects with multiple filters)\n", "# get an array that has only one row per object\n", "mval, uindex = np.unique(tab['MatchID'],return_index=True)\n", "utab = tab[uindex]\n", "print(\"{} unique MatchIDs in table\".format(len(utab)))\n", "\n", "tab" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An `ExpertClass` value of 1 indicates that the object is confidently confirmed to be a variable; 2 means that the measurements do not have apparent problems and so the object is likely to be variable (usually the variability is too small to be obvious in the image); 4 means that the variability is likely to be the result of artifacts in the image (e.g., residual cosmic rays or diffraction spikes from nearby bright stars).\n", "\n", "Compare the distributions for single-filter variable candidates (SFVC, `AutoClass`=1) and multi-filter variable candidates (MFVC, `AutoClass`=2). The fraction of artifacts is lower in the MFVC sample." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sfcount = np.bincount(utab['ExpertClass'][utab['AutoClass']==1])\n", "mfcount = np.bincount(utab['ExpertClass'][utab['AutoClass']==2])\n", "sfrat = sfcount/sfcount.sum()\n", "mfrat = mfcount/mfcount.sum()\n", "\n", "print(\"Type Variable Likely Artifact Total\")\n", "print(\"SFVC {:8d} {:6d} {:8d} {:5d} counts\".format(sfcount[1],sfcount[2],sfcount[4],sfcount.sum()))\n", "print(\"MFVC {:8d} {:6d} {:8d} {:5d} counts\".format(mfcount[1],mfcount[2],mfcount[4],mfcount.sum()))\n", "print(\"SFVC {:8.3f} {:6.3f} {:8.3f} {:5.3f} fraction\".format(sfrat[1],sfrat[2],sfrat[4],sfrat.sum()))\n", "print(\"MFVC {:8.3f} {:6.3f} {:8.3f} {:5.3f} fraction\".format(mfrat[1],mfrat[2],mfrat[4],mfrat.sum()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot the MAD variability index distribution with expert classifications \n", "\n", "Note that only the filters identified as variable (`FilterDetFlag` > 0) are included here.\n", "\n", "This version of the plot shows the distributions for the various `ExpertClass` values along with, for comparison, the distribution for all objects in gray (which is identical in each panel). Most objects are classified as confident or likely variables. Objects with lower MAD values (indicating a lower amplitude of variability) are less likely to be identified as confident variables because low-level variability is more difficult to confirm via visual examination." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "w = np.where(tab['FilterDetFlag']>0)[0]\n", "mad = tab['MAD'][w]\n", "e = tab['ExpertClass'][w]\n", "\n", "xrange = [7.e-3, 2.0]\n", "bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0,1.0,50)\n", "\n", "pylab.rcParams.update({'font.size':16})\n", "pylab.figure(1,(12,12))\n", "\n", "pylab.subplot(311)\n", "pylab.hist(mad,bins=bins,log=True,color='lightgray',label='All')\n", "wp = np.where(e==1)[0]\n", "pylab.hist(mad[wp],bins=bins,log=True,label='Confident',color='C2')\n", "pylab.xscale('log')\n", "pylab.ylabel('Count')\n", "pylab.legend(loc='upper left')\n", "pylab.title('HCV Expert Validation')\n", "\n", "pylab.subplot(312)\n", "pylab.hist(mad,bins=bins,log=True,color='lightgray',label='All')\n", "wp = np.where(e==2)[0]\n", "pylab.hist(mad[wp],bins=bins,log=True,label='Likely',color='C1')\n", "pylab.xscale('log')\n", "pylab.ylabel('Count')\n", "pylab.legend(loc='upper left')\n", "\n", "pylab.subplot(313)\n", "pylab.hist(mad,bins=bins,log=True,color='lightgray',label='All')\n", "wp = np.where(e==4)[0]\n", "pylab.hist(mad[wp],bins=bins,log=True,label='Artifact',color='C0')\n", "pylab.xscale('log')\n", "pylab.ylabel('Count')\n", "pylab.legend(loc='upper left')\n", "\n", "pylab.xlabel('MAD Variability Index [mag]')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot below shows the same distributions, but plotted as stacked histograms. The top panel uses a linear scale on the y-axis and the bottom panel uses a log y scale." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "w = np.where(tab['FilterDetFlag']>0)[0]\n", "mad = tab['MAD'][w]\n", "e = tab['ExpertClass'][w]\n", "\n", "xrange = [7.e-3, 2.0]\n", "bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0,1.0,50)\n", "\n", "pylab.rcParams.update({'font.size':16})\n", "pylab.figure(1,(15,12))\n", "\n", "pylab.subplot(211)\n", "hlog = False\n", "pylab.hist(mad,bins=bins,log=hlog,label='Artifact')\n", "\n", "wp = np.where(e<4)[0]\n", "pylab.hist(mad[wp],bins=bins,log=hlog,label='Likely Variable')\n", "\n", "wp = np.where(e==1)[0]\n", "pylab.hist(mad[wp],bins=bins,log=hlog,label='Confident Variable')\n", "pylab.xscale('log')\n", "pylab.xlabel('MAD Variability Index [mag]')\n", "pylab.ylabel('Count')\n", "pylab.legend(loc='upper right',title='HCV Expert Validation')\n", "\n", "pylab.subplot(212)\n", "hlog = True\n", "pylab.hist(mad,bins=bins,log=hlog,label='Artifact')\n", "\n", "wp = np.where(e<4)[0]\n", "pylab.hist(mad[wp],bins=bins,log=hlog,label='Likely Variable')\n", "\n", "wp = np.where(e==1)[0]\n", "pylab.hist(mad[wp],bins=bins,log=hlog,label='Confident Variable')\n", "pylab.xscale('log')\n", "pylab.xlabel('MAD Variability Index [mag]')\n", "pylab.ylabel('Count')\n", "pylab.legend(loc='upper right',title='HCV Expert Validation')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot the fraction of artifacts as a function of MAD variability index \n", "\n", "This shows how the fraction of artifacts varies with the MAD value. For larger MAD values the fraction decreases sharply, presumably because such large values are less likely to result from the usual artifacts. Interestingly, the artifact fraction also declines for smaller MAD values (MAD < 0.1 mag). Probably that happens because typical artifacts are more likely to produce strong signals than the weaker signals indicated by a low MAD value. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "w = np.where(tab['FilterDetFlag']>0)[0]\n", "mad = tab['MAD'][w]\n", "e = tab['ExpertClass'][w]\n", "\n", "xrange = [7.e-3, 2.0]\n", "bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0,1.0,30)\n", "\n", "all_count, bin_edges = np.histogram(mad,bins=bins)\n", "artifact_count, bin_edges = np.histogram(mad[e==4],bins=bins)\n", "wnz = np.where(all_count>0)[0]\n", "nnz = len(wnz)\n", "\n", "artifact_count = artifact_count[wnz]\n", "all_count = all_count[wnz]\n", "xerr = np.empty((2,nnz),dtype=float)\n", "xerr[0] = bin_edges[wnz]\n", "xerr[1] = bin_edges[wnz+1]\n", "\n", "# combine bins at edge into one big bin to improve the statistics there\n", "iz = np.where(all_count.cumsum()>10)[0][0]\n", "if iz > 0:\n", " all_count[iz] += all_count[:iz].sum()\n", " artifact_count[iz] += artifact_count[:iz].sum()\n", " xerr[0,iz] = xerr[0,0]\n", " all_count = all_count[iz:]\n", " artifact_count = artifact_count[iz:]\n", " xerr = xerr[:,iz:]\n", "iz = np.where(all_count[::-1].cumsum()>40)[0][0]\n", "if iz > 0:\n", " all_count[-iz-1] += all_count[-iz:].sum()\n", " artifact_count[-iz-1] = artifact_count[-iz:].sum()\n", " xerr[1,-iz-1] = xerr[1,-1]\n", " all_count = all_count[:-iz]\n", " artifact_count = artifact_count[:-iz]\n", " xerr = xerr[:,:-iz]\n", "\n", "x = np.sqrt(xerr[0]*xerr[1])\n", "xerr[0] = x - xerr[0]\n", "xerr[1] = xerr[1] - x\n", "\n", "frac = artifact_count/all_count\n", "# error on fraction using binomial distribution (approximate)\n", "ferr = np.sqrt(frac*(1-frac)/all_count)\n", "\n", "pylab.rcParams.update({'font.size':16})\n", "pylab.figure(1,(12,12))\n", "\n", "pylab.errorbar(x,frac,xerr=xerr,yerr=ferr,fmt='ob',\n", " markersize=5,label='Artifact fraction')\n", "\n", "pylab.xscale('log')\n", "pylab.xlabel('MAD Variability Index [mag]')\n", "pylab.ylabel('Artifact Fraction')\n", "pylab.legend(loc='upper right',title='HCV Expert Validation')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot light curve for the most variable high quality candidate in the HCV \n", "\n", "Select the candidate variable with the largest MAD value and `VarQualFlag` = 'AAAAA'. To find the highest MAD value, we sort by MAD in descending order and select the first result." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "constraints = {'VarQualFlag': 'AAAAA', 'sort_by': 'MAD.desc', 'pagesize': 1}\n", "\n", "t0 = time.time()\n", "tab = ascii.read(hcvsearch(table='hcvsummary',**constraints))\n", "print(\"Completed in {:.1f} sec\".format(time.time()-t0))\n", "\n", "# clean up the output format\n", "tab['MeanMag'].format = \"{:.3f}\"\n", "tab['MeanCorrMag'].format = \"{:.3f}\"\n", "tab['MAD'].format = \"{:.4f}\"\n", "tab['Chi2'].format = \"{:.4f}\"\n", "tab['RA'].format = \"{:.6f}\"\n", "tab['Dec'].format = \"{:.6f}\"\n", "\n", "print(\"MatchID {} has largest MAD value = {:.2f}\".format(\n", " tab['MatchID'][0],tab['MAD'][0]))\n", "tab" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get and plot the light curve." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "matchid = tab['MatchID'][0]\n", "mfilter = tab['Filter'][0]\n", "\n", "t0 = time.time()\n", "lc = ascii.read(hcvsearch(table=\"hcv\",MatchID=matchid,Filter=mfilter))\n", "print(\"{:.1f} sec: retrieved {} {} measurements\".format(time.time()-t0,len(lc),mfilter))\n", "\n", "pylab.rcParams.update({'font.size': 16})\n", "pylab.figure(1,(15,10))\n", "\n", "x = lc['MJD']\n", "y = lc['CorrMag']\n", "e = lc['MagErr']\n", "pylab.errorbar(x,y,yerr=e,fmt='ob',ecolor='k',elinewidth=1,markersize=8,label=mfilter)\n", "\n", "pylab.gca().invert_yaxis()\n", "pylab.xlabel('MJD [days]')\n", "pylab.ylabel('magnitude')\n", "pylab.legend(loc='best', title='MatchID: {} MAD={:.2f}'.format(matchid, tab['MAD'][0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Extract cutout images for the entire light curve (since it does not have many points)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# sort images in MJD order\n", "ind = np.argsort(lc['MJD'])\n", "\n", "# we plot zoomed-in and zoomed-out views side-by-side for each selected image\n", "nim = len(ind)*2\n", "ncols = 2 # images per row\n", "nrows = (nim+ncols-1)//ncols\n", "\n", "imsize1 = 19\n", "imsize2 = 101\n", "mra = tab['RA'][0]\n", "mdec = tab['Dec'][0]\n", "\n", "pylab.rcParams.update({\"font.size\":14})\n", "pylab.figure(1,(12, (12/ncols)*nrows))\n", "t0 = time.time()\n", "ip = 0\n", "for k in ind:\n", " im1 = get_hla_cutout(lc['ImageName'][k],mra,mdec,size=imsize1)\n", " ip += 1\n", " pylab.subplot(nrows,ncols,ip)\n", " pylab.imshow(im1,origin=\"upper\",cmap=\"gray\")\n", " pylab.title(lc['ImageName'][k],fontsize=14)\n", " im2 = get_hla_cutout(lc['ImageName'][k],mra,mdec,size=imsize2)\n", " ip += 1\n", " pylab.subplot(nrows,ncols,ip)\n", " pylab.imshow(im2,origin=\"upper\",cmap=\"gray\")\n", " xbox = np.array([-1,1])*imsize1/2 + (imsize2-1)//2\n", " pylab.plot(xbox[[0,1,1,0,0]],xbox[[0,0,1,1,0]],'r-',linewidth=1)\n", " pylab.title('m={:.3f} MJD={:.2f}'.format(lc['CorrMag'][k],lc['MJD'][k]),fontsize=14)\n", " print(\"{:.1f} s: finished {} of {} epochs\".format(time.time()-t0,ip//2,len(ind)))\n", "pylab.tight_layout()\n", "print(\"{:.1f} s: got {} cutouts\".format(time.time()-t0,ip))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }