Hubble Catalog of Variables Notebook (CasJobs version)

September 2019, Rick White & Steve Lubow

This notebook shows how to access the Hubble Catalog of Variables (HCV). The HCV is a large catalog of faint variable objects extracted from version 3 of the Hubble Source Catalog (HSC). The HCV project at the National Observatory of Athens was funded by the European Space Agency (PI: Alceste Bonanos). The HCV data products are available both at the ESA Hubble Archive at ESAC, through the HCV Explorer interface, and at STScI.

Data tables in MAST CasJobs are queried from Python using the mastcasjobs module. For similar examples using the MAST API, which is easier to use but less powerful than CasJobs, see the HCV_API_demo notebook.

Initialization

Install Python modules

This notebook requires the use of Python 3.

The notebook needs a few modules in addition to the common requirements of astropy, numpy and scipy. For Anaconda versions of Python, the installation commands are:

conda install requests
conda install pillow
pip install git+https://github.com/dfm/casjobs@master
pip install git+https://github.com/rlwastro/mastcasjobs@master

Run the commands one at a time since conda may ask for confirmation.

If you already have an older version of the mastcasjobs module, you may need to update it:

pip install --upgrade git+https://github.com/rlwastro/mastcasjobs@master

Set up your CasJobs account information

You must have a MAST CasJobs account (see https://mastweb.stsci.edu/hcasjobs to create one). Note that MAST CasJobs accounts are independent of SDSS CasJobs accounts.

For easy startup, you can optionally set the environment variables CASJOBS_USERID and/or CASJOBS_PW with your CasJobs account information. The CasJobs user ID and password are what you enter when logging into CasJobs.

This notebook prompts for your CasJobs user ID and password during initialization if the environment variables are not defined.

In [1]:
HSCContext= "HSCv3"

%matplotlib inline
import astropy, pylab, time, sys, os, requests, json
import numpy as np

from PIL import Image
from io import BytesIO

from astropy.table import Table, join

# check that version of mastcasjobs is new enough
# we are using some features not in version 0.0.1
# (importlib.metadata and packaging replace pkg_resources, which is deprecated,
# and distutils, which was removed from the standard library in Python 3.12)
from importlib.metadata import version
from packaging.version import Version as V
assert V(version("mastcasjobs")) >= V('0.0.2'), """
A newer version of mastcasjobs is required.
Update mastcasjobs to current version using this command:
pip install --upgrade git+https://github.com/rlwastro/mastcasjobs@master
"""

import mastcasjobs

# Set page width to fill browser for longer output lines
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
# set width for pprint
astropy.conf.max_width = 150

Set up the CasJobs environment.

In [2]:
import getpass
if not os.environ.get('CASJOBS_USERID'):
    os.environ['CASJOBS_USERID'] = input('Enter Casjobs UserID:')
if not os.environ.get('CASJOBS_PW'):
    os.environ['CASJOBS_PW'] = getpass.getpass('Enter Casjobs password:')

Useful functions

The resolve(name) function uses the MAST Name Resolver (which relies on both SIMBAD and NED) to get the RA,Dec position for an object.

In [3]:
def resolve(name):
    """Get the RA and Dec for an object using the MAST name resolver
    
    Parameters
    ----------
    name (str): Name of object

    Returns RA, Dec tuple with position
    """

    resolverRequest = {'service':'Mast.Name.Lookup',
                       'params':{'input':name,
                                 'format':'json'
                                },
                      }
    resolvedObjectString = mastQuery(resolverRequest)
    resolvedObject = json.loads(resolvedObjectString)
    # The resolver returns a variety of information about the resolved object, 
    # however for our purposes all we need are the RA and Dec
    try:
        objRa = resolvedObject['resolvedCoordinate'][0]['ra']
        objDec = resolvedObject['resolvedCoordinate'][0]['decl']
    except (IndexError, KeyError):
        raise ValueError("Unknown object '{}'".format(name))
    return (objRa, objDec)


def mastQuery(request, url='https://mast.stsci.edu/api/v0/invoke'):
    """Perform a MAST query.

    Parameters
    ----------
    request (dictionary): The MAST request json object
    url (string): The service URL

    Returns the returned data content
    """
    
    # Encoding the request as a json string
    requestString = json.dumps(request)
    r = requests.post(url, data={'request': requestString})
    r.raise_for_status()
    return r.text
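
To see what resolve extracts without calling the service, here is a minimal offline sketch. The sample JSON is a hand-written stand-in that mimics only the fields the function reads (resolvedCoordinate, ra, decl); a real response carries more information. The helper name parse_resolver_response is hypothetical and is not part of any MAST library.

```python
import json

# Hand-written stand-in for a resolver response (an assumption for illustration);
# the coordinates are the IC 1613 values that this notebook prints.
sample_response = json.dumps(
    {"resolvedCoordinate": [{"ra": 16.19913, "decl": 2.11778}]}
)

def parse_resolver_response(text, name="object"):
    """Return (ra, dec) from a resolver JSON string, or raise ValueError."""
    resolved = json.loads(text)
    coords = resolved.get("resolvedCoordinate") or []
    if not coords:
        # an empty or missing list means the name was not resolved
        raise ValueError("Unknown object '{}'".format(name))
    first = coords[0]
    return first["ra"], first["decl"]

print(parse_resolver_response(sample_response, "IC 1613"))
```

Checking for an empty list up front gives the same "Unknown object" error as the notebook function, but without relying on an exception from the indexing step.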

Variable objects near IC 1613

Use the MAST name resolver to get the position of IC 1613

In [4]:
target = 'IC 1613'
ra, dec = resolve(target)
print(target,ra,dec)
IC 1613 16.19913 2.11778

Select objects near IC 1613 with ACS F475W and F814W measurements from HCV

This searches the HCV summary table for objects within 0.5 degrees of the galaxy center. Note that this returns both variable and non-variable objects. We restrict the sample to objects with measurements in the two filters of interest. This uses the SearchHCVMatchID function to do the cone search.

In [5]:
DBtable = "HCV_demo"
jobs = mastcasjobs.MastCasJobs(context="MyDB")

# drop table if it already exists
jobs.drop_table_if_exists(DBtable)

#get main information
radius = 1800.0 # arcsec
query = """
select m.MatchID, m.GroupID, m.SubGroupID, m.RA, m.Dec,
   m.AutoClass, m.ExpertClass, m.NumFilters,
   f.Filter, f.FilterDetFlag, f.VarQualFlag, f.NumLC, 
   f.MeanMag, f.MeanCorrMag, f.MAD, f.Chi2
into mydb.{DBtable}
from SearchHCVMatchID({ra},{dec},{radius}) s
join HCVmatch m on m.MatchID=s.MatchID
join HCVfilter f on f.MatchID=s.MatchID and (f.Filter='ACS_F475W' or f.Filter='ACS_F814W')
""".format(**locals())

t0 = time.time()
results = jobs.quick(query, task_name="HCV demo", context=HSCContext)

print("Completed in {:.1f} sec".format(time.time()-t0))
print(results)

# fast retrieval using special MAST Casjobs service
tab = jobs.fast_table(DBtable, verbose=True)

# clean up the output format
tab['MeanMag'].format = "{:.3f}"
tab['MeanCorrMag'].format = "{:.3f}"
tab['MAD'].format = "{:.4f}"
tab['Chi2'].format = "{:.4f}"
tab['RA'].format = "{:.6f}"
tab['Dec'].format = "{:.6f}"

# show some of the variable sources
tab[tab['AutoClass']>0]
Completed in 1.5 sec
Rows Affected
-------------
        40196
3.1 s: Retrieved 6.34MB table MyDB.HCV_demo
3.6 s: Converted to 40196 row table
Out[5]:
Table length=522
  MatchID GroupID SubGroupID        RA      Dec AutoClass ExpertClass NumFilters    Filter FilterDetFlag VarQualFlag NumLC MeanMag MeanCorrMag     MAD      Chi2
    int64   int64      int64   float64  float64     int64       int64      int64      str9         int64        str5 int64 float64     float64 float64   float64
 12131288   69810         -5 16.095659 2.161246         1           1          2 ACS_F475W             1       BACAA    12  25.248      25.250  0.1296   11.8544
 12131288   69810         -5 16.095659 2.161246         1           1          2 ACS_F814W             0       AAAAC    12  25.215      25.217  0.0611    2.3024
 82653722   69810         -5 16.097239 2.162066         2           1          2 ACS_F475W             1       AACAA    12  25.237      25.238  0.2312   64.6089
 82653722   69810         -5 16.097239 2.162066         2           1          2 ACS_F814W             1       BACAA    12  25.069      25.070  0.1416   20.4986
100437228   69810         -5 16.111309 2.160908         2           1          2 ACS_F475W             1       ABCAA    12  23.249      23.249  0.1215 1231.5608
100437228   69810         -5 16.111309 2.160908         2           1          2 ACS_F814W             1       ABCAA    12  22.850      22.850  0.0719  329.2444
  4358745   69810         -5 16.111961 2.157844         1           2          1 ACS_F475W             1       AACAA     7  26.108      26.111  0.1011   44.0548
 78217904   69810         -5 16.110355 2.162592         2           1          2 ACS_F475W             1       AACAA     9  25.374      25.375  0.4343   44.1083
 78217904   69810         -5 16.110355 2.162592         2           1          2 ACS_F814W             1       AACAB     9  25.189      25.189  0.1804   11.1780
 24652319   69810         -5 16.107412 2.164411         1           2          2 ACS_F475W             0       BABAC     8  25.225      25.223  0.0485    3.9213
      ...     ...        ...       ...      ...       ...         ...        ...       ...           ...         ...   ...     ...         ...     ...       ...
 99563258   69810         -5 16.101940 2.130867         1           4          2 ACS_F475W             1       AAAAC    12  23.926      23.927  0.0434   13.8320
 99563258   69810         -5 16.101940 2.130867         1           4          2 ACS_F814W             0       ABAAC    11  22.550      22.551  0.0150   15.7979
 26648962   69810         -5 16.104919 2.131477         2           1          2 ACS_F475W             1       AACBA    11  25.461      25.461  0.1582   19.5376
 26648962   69810         -5 16.104919 2.131477         2           1          2 ACS_F814W             1       BAAAB    12  25.217      25.219  0.1019    5.4801
 31257941   69810         -5 16.107393 2.128815         2           1          2 ACS_F475W             1       AACAA    12  23.168      23.168  0.0664 3751.2152
 31257941   69810         -5 16.107393 2.128815         2           1          2 ACS_F814W             1       AACAA    12  22.885      22.887  0.0862  814.8062
 14005460   69810         -5 16.109657 2.127282         2           1          2 ACS_F475W             1       AACCA    12  25.470      25.470  0.1774   56.3907
 14005460   69810         -5 16.109657 2.127282         2           1          2 ACS_F814W             1       AACAA    10  25.248      25.250  0.1241   16.0616
 57557150   69810         -5 16.108461 2.128060         2           1          2 ACS_F475W             1       AACAA    12  25.319      25.319  0.1391   18.9906
 57557150   69810         -5 16.108461 2.128060         2           1          2 ACS_F814W             1       AABAC    12  25.319      25.320  0.0877    4.2321
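
As a sanity check on the cone search, the angular separation between the galaxy center and a returned object can be computed locally. The sketch below uses the haversine formula with only the standard library. The helper angular_separation_arcsec is a hypothetical name, not part of mastcasjobs, and the source position is taken from the first row of the table above.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # haversine formula, numerically stable for small separations
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

center = (16.19913, 2.11778)      # IC 1613 position from the name resolver
source = (16.095659, 2.161246)    # first object in the table above
sep = angular_separation_arcsec(*center, *source)
print("{:.1f} arcsec from center; inside 1800 arcsec: {}".format(sep, sep <= 1800.0))
```

The object lies a few hundred arcseconds from the resolved position, comfortably inside the 1800 arcsec search radius.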

Description of the variable classification columns

Several of the table columns have information on the variability.

  • The columns AutoClass and ExpertClass have summary information on the variability for a given MatchID object.
    • AutoClass: Classification as provided by the system: 0=constant, 1=single-filter variable candidate (SFVC), 2=multi-filter variable candidate (MFVC)
    • ExpertClass: Classification as provided by expert: 0=not classified by expert, 1=high confidence variable, 2=probable variable, 4=possible artifact
  • The columns MAD and Chi2 are variability indices using the median absolute deviation and the $\chi^2$ parameter for the given filter.
  • The column VarQualFlag is a variability quality flag (see Section 5 of the paper). The five letters correspond to CI, D, MagerrAper2, MagAper2-MagAuto, p2p; AAAAA corresponds to the highest quality flag.
  • The column FilterDetFlag is the filter detection flag: 1=source is variable in this filter, 0=source is not variable in this filter.

See the HCV paper by Bonanos et al. (2019, A&A) for more details on the computation and meaning of these quantities.
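
The column encodings listed above can be turned into small lookup helpers. This is only a sketch: the names QUALITY_NAMES, AUTO_CLASS, EXPERT_CLASS and decode_var_qual_flag are hypothetical, and the labels are copied from the descriptions in this section rather than from any official schema.

```python
# Labels copied from the column descriptions in this section
QUALITY_NAMES = ("CI", "D", "MagerrAper2", "MagAper2-MagAuto", "p2p")
AUTO_CLASS = {0: "constant",
              1: "single-filter variable candidate (SFVC)",
              2: "multi-filter variable candidate (MFVC)"}
EXPERT_CLASS = {0: "not classified by expert",
                1: "high confidence variable",
                2: "probable variable",
                4: "possible artifact"}

def decode_var_qual_flag(flag):
    """Map a five-letter VarQualFlag such as 'AACAA' to {parameter: grade}."""
    if len(flag) != len(QUALITY_NAMES):
        raise ValueError("expected a 5-letter flag, got {!r}".format(flag))
    return dict(zip(QUALITY_NAMES, flag))

print(decode_var_qual_flag("BACAA"))
print(AUTO_CLASS[2], "/", EXPERT_CLASS[4])
```

For the flag BACAA, for example, the CI grade is B and the MagerrAper2 grade is C, while the other three parameters have the top grade A.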

Find objects with measurements in both F475W and F814W

This could also be done in the SQL query. Here we use the astropy.table.join function instead.

In [6]:
w475 = np.where(tab['Filter']=='ACS_F475W')
w814 = np.where(tab['Filter']=='ACS_F814W')

# the only key needed to do the join is MatchID, but we include other common columns
# so that join includes only one copy of them
jtab = join(tab[w475],tab[w814],
            keys=['MatchID','GroupID','SubGroupID','RA','Dec','AutoClass','ExpertClass'],
            table_names=['f475','f814'])
print(len(jtab),"matched F475W+F814W objects")
jtab[jtab['AutoClass']>0]
17090 matched F475W+F814W objects
Out[6]:
Table length=258
  MatchID GroupID SubGroupID        RA      Dec AutoClass ExpertClass NumFilters_f475 Filter_f475 FilterDetFlag_f475 VarQualFlag_f475 NumLC_f475 MeanMag_f475 MeanCorrMag_f475 MAD_f475 Chi2_f475 NumFilters_f814 Filter_f814 FilterDetFlag_f814 VarQualFlag_f814 NumLC_f814 MeanMag_f814 MeanCorrMag_f814 MAD_f814 Chi2_f814
    int64   int64      int64   float64  float64     int64       int64           int64        str9              int64             str5      int64      float64          float64  float64   float64           int64        str9              int64             str5      int64      float64          float64  float64   float64
    96457   69810         -5 16.141516 2.177815         2           1               2   ACS_F475W                  1            AACAA         12       23.192           23.192   0.0686  424.7097               2   ACS_F814W                  1            AAAAC         12       22.946           22.947   0.0402   77.5733
   813653   69810         -5 16.128353 2.160147         1           4               2   ACS_F475W                  0            AAAAC         10       25.508           25.507   0.0243    0.2945               2   ACS_F814W                  1            BACAB         11       24.906           24.908   0.0795    8.5797
  1012692   69810         -5 16.134809 2.144720         2           1               2   ACS_F475W                  1            AACAA          7       25.439           25.438   0.1446   17.9867               2   ACS_F814W                  1            AABAB          8       25.208           25.210   0.1152    6.6471
  1085386   69810         -5 16.118544 2.160845         1           4               2   ACS_F475W                  0            AAAAC         11       24.351           24.352   0.0174    3.1311               2   ACS_F814W                  1            BABBB         11       23.476           23.476   0.0697   40.7630
  1286857   69810         -5 16.119205 2.184252         1           2               2   ACS_F475W                  1            AABAB         12       23.137           23.138   0.0477   53.2282               2   ACS_F814W                  0            AAAAC         12       21.073           21.075   0.0128   46.8982
  1309271   69810         -5 16.130571 2.152512         2           1               2   ACS_F475W                  1            AACAA         12       25.347           25.348   0.1399   20.0434               2   ACS_F814W                  1            AAAAC         12       25.043           25.044   0.0804    5.2954
  1479646   69810         -5 16.120852 2.152737         1           2               2   ACS_F475W                  0            CACAA         12       23.691           23.692   0.0332   41.9525               2   ACS_F814W                  1            AABAB         12       22.501           22.503   0.0362   68.2414
  1661315   69810         -5 16.110571 2.143526         1           4               2   ACS_F475W                  1            AACAB         12       25.220           25.221   0.0780    5.8648               2   ACS_F814W                  0            AAAAC         12       25.887           25.889   0.0595    0.8037
  1826474   69810         -5 16.101532 2.171855         1           2               2   ACS_F475W                  0            AAAAC         11       25.735           25.737   0.0259    0.5195               2   ACS_F814W                  1            BACAB         11       24.889           24.890   0.0822    9.2131
  1849621   69810         -5 16.147049 2.153722         2           1               2   ACS_F475W                  1            AACAA         11       25.501           25.502   0.2134   57.0725               2   ACS_F814W                  1            AACAA         11       25.393           25.394   0.1483   18.3074
      ...     ...        ...       ...      ...       ...         ...             ...         ...                ...              ...        ...          ...              ...      ...       ...             ...         ...                ...              ...        ...          ...              ...      ...       ...
102132800   69810         -5 16.126160 2.145442         2           1               2   ACS_F475W                  1            AACAA         12       23.691           23.692   0.0750  129.5717               2   ACS_F814W                  1            AACBA         12       24.503           24.505   0.1038   27.1660
102239423   69810         -5 16.138706 2.155652         1           2               2   ACS_F475W                  1            AABAA         12       22.810           22.811   0.0982  307.5226               2   ACS_F814W                  0            AABAA         12       22.735           22.737   0.0382   96.9001
103232694   69810         -5 16.107565 2.174399         1           4               2   ACS_F475W                  0            AAAAC         11       22.422           22.422   0.0060    1.7118               2   ACS_F814W                  1            AAAAC         11       22.425           22.426   0.0335   27.9930
104300195   69810         -5 16.124981 2.171772         2           1               2   ACS_F475W                  1            AACAA         12       25.540           25.541   0.1054   58.7587               2   ACS_F814W                  1            AACAA         12       25.382           25.384   0.1217    9.4811
105173757   69810         -5 16.133703 2.184506         2           1               2   ACS_F475W                  1            AAAAA         12       21.372           21.372   0.1589 2810.1572               2   ACS_F814W                  1            AACAA         12       22.240           22.242   0.1825  829.2399
106466795   69810         -5 16.127056 2.166390         1           1               2   ACS_F475W                  1            AACAA         12       25.357           25.358   0.1425   13.4065               2   ACS_F814W                  0            AAAAC         12       25.257           25.259   0.0462    2.8032
106640363   69810         -5 16.135796 2.149099         1           2               2   ACS_F475W                  1            CACAA          8       25.577           25.579   0.1161    5.8168               2   ACS_F814W                  0            AABAB          9       24.944           24.946   0.0661    8.3312
106843213   69810         -5 16.110342 2.150373         1           4               2   ACS_F475W                  0            CACCB         12       23.691           23.691   0.0269   30.7690               2   ACS_F814W                  1            CACBA         12       24.429           24.431   0.0710   25.8158
107834538   69810         -5 16.107105 2.160084         1           1               2   ACS_F475W                  1            AABAC         12       25.400           25.401   0.0818    3.4254               2   ACS_F814W                  0            AAAAC         12       25.128           25.129   0.0578    1.7916
108048053   69810         -5 16.150572 2.142590         1           2               2   ACS_F475W                  1            AACCA         10       25.297           25.297   0.0697   22.9085               2   ACS_F814W                  0            AAAAC         11       24.544           24.546   0.0273    1.6361
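
To make explicit what an inner join on MatchID does, here is a plain-Python sketch on three rows taken from the first table in this notebook. It is an illustration of the idea only, not how astropy implements join, and inner_join_by_matchid is a hypothetical helper.

```python
# (MatchID, Filter, MeanCorrMag) rows copied from the first table in this notebook
rows = [
    (12131288, "ACS_F475W", 25.250),
    (12131288, "ACS_F814W", 25.217),
    (4358745, "ACS_F475W", 26.111),   # measured in F475W only
]

def inner_join_by_matchid(rows, filt_a="ACS_F475W", filt_b="ACS_F814W"):
    """Keep objects present in both filters: {MatchID: (mag_a, mag_b)}."""
    a = {mid: mag for mid, filt, mag in rows if filt == filt_a}
    b = {mid: mag for mid, filt, mag in rows if filt == filt_b}
    # only MatchIDs found in both dictionaries survive the join
    return {mid: (a[mid], b[mid]) for mid in sorted(a.keys() & b.keys())}

joined = inner_join_by_matchid(rows)
print(joined)
```

The object with only an F475W measurement drops out, which is why the joined table has fewer rows than half of the per-filter table.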

Plot object positions on the sky

We mark the galaxy center as well. Note that this field is in the outskirts of IC 1613. The 0.5 degree search radius (which is the maximum allowed in the API) allows finding these objects.

In [7]:
pylab.rcParams.update({'font.size': 16})
pylab.figure(1,(10,10))
pylab.plot(jtab['RA'], jtab['Dec'], 'bo', markersize=1,
          label='{:,} HCV objects'.format(len(jtab)))
pylab.plot(ra,dec,'rx',label=target,markersize=10)
pylab.gca().invert_xaxis()
pylab.gca().set_aspect('equal')
pylab.xlabel('RA [deg]')
pylab.ylabel('Dec [deg]')
pylab.legend(loc='best')
Out[7]:
<matplotlib.legend.Legend at 0x815579550>

Plot HCV MAD variability index versus magnitude in F475W

The median absolute deviation (MAD) variability index is used by the HCV to identify variables. It measures the scatter among the multi-epoch measurements. Some scatter is expected from noise, which increases for fainter objects, so objects whose MAD values lie well above the typical value at their magnitude are likely to be variable.

This plots single-filter and multi-filter variable candidates (SFVC and MFVC) in different colors. Note that variable objects with low F475W MAD values are variable in a different filter (typically F814W in this field).

This plot is similar to the upper panel of Figure 4 in Bonanos et al. (2019, A&A).

In [8]:
wnovar = np.where(jtab['AutoClass']==0)[0]
wsfvc = np.where(jtab['AutoClass']==1)[0]
wmfvc = np.where(jtab['AutoClass']==2)[0]
x = jtab['MeanCorrMag_f475']
y = jtab['MAD_f475']

pylab.rcParams.update({'font.size': 16})
pylab.figure(1,(15,10))
pylab.plot(x[wnovar], y[wnovar], 'x', markersize=4, color='silver',
          label='{:,} non-variable'.format(len(wnovar)))
pylab.plot(x[wsfvc], y[wsfvc], 'o', markersize=5, color='blue',
          label='{:,} single-filter variable candidates'.format(len(wsfvc)))
pylab.plot(x[wmfvc], y[wmfvc], 'o', markersize=5, color='tab:cyan',
          label='{:,} multi-filter variable candidates'.format(len(wmfvc)))

pylab.xlabel('ACS_F475W [mag]')
pylab.ylabel('ACS_F475W MAD [mag]')
pylab.legend(loc='best', title='{} HSC measurements near {}'.format(len(jtab),target))
Out[8]:
<matplotlib.legend.Legend at 0x81ae3fcf8>
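
For intuition about the MAD index, the sketch below computes a plain median absolute deviation for two hand-made light curves. It assumes no normalization factor and no outlier rejection; the exact computation used by the HCV pipeline is described in the paper and may differ. The magnitudes are invented for illustration.

```python
import statistics

def median_absolute_deviation(mags):
    """Median of |m_i - median(m)|, with no normalization factor applied."""
    med = statistics.median(mags)
    return statistics.median(abs(m - med) for m in mags)

# Invented magnitudes, for illustration only
steady = [25.00, 25.02, 24.98, 25.01, 24.99]      # noise-level scatter
variable = [25.0, 24.6, 25.4, 24.7, 25.3]         # genuine variability
with_outlier = [25.00, 25.02, 24.98, 25.01, 20.0] # one cosmic-ray-like point

for name, mags in [("steady", steady), ("variable", variable),
                   ("with_outlier", with_outlier)]:
    print("{:12s} MAD = {:.3f}".format(name, median_absolute_deviation(mags)))
```

Because it is built from medians, the index barely responds to a single discrepant point, which is one reason a median-based statistic is attractive for data with occasional artifacts.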

Plot variables in color-magnitude diagram

Many of the candidate variables lie on the instability strip.

This plot is similar to the lower panel of Figure 4 in Bonanos et al. (2019, A&A).

In [9]:
wnovar = np.where(jtab['AutoClass']==0)[0]
wsfvc = np.where(jtab['AutoClass']==1)[0]
wmfvc = np.where(jtab['AutoClass']==2)[0]
x = jtab['MeanCorrMag_f475'] - jtab['MeanCorrMag_f814']
y = jtab['MeanCorrMag_f475']

pylab.rcParams.update({'font.size': 16})
pylab.figure(1,(15,10))
pylab.plot(x[wnovar], y[wnovar], 'x', markersize=4, color='silver',
          label='{:,} non-variable'.format(len(wnovar)))
pylab.plot(x[wsfvc], y[wsfvc], 'o', markersize=5, color='blue',
          label='{:,} single-filter variable candidates'.format(len(wsfvc)))
pylab.plot(x[wmfvc], y[wmfvc], 'o', markersize=5, color='tab:cyan',
          label='{:,} multi-filter variable candidates'.format(len(wmfvc)))
pylab.gca().invert_yaxis()
pylab.xlim(-1.1, 4)
pylab.xlabel('ACS F475W - F814W [mag]')
pylab.ylabel('ACS F475W [mag]')
pylab.legend(loc='best', title='{} HSC measurements near {}'.format(len(jtab),target))
Out[9]:
<matplotlib.legend.Legend at 0x81a39e5f8>

Get a light curve for a nova in M87

Extract light curve for a given MatchID

Note that the MatchID could be determined by positional searches, filtering the catalog, etc. This object comes from the top left panel of Figure 9 in Bonanos et al. (2019, A&A).

In [10]:
matchid = 1905457

jobs = mastcasjobs.MastCasJobs(context=HSCContext)
t0 = time.time()

# get light curves for F606W and F814W
nova_606 = jobs.quick("""select * from HCVdetailed
where MatchID={} and Filter='ACS_F606W'
""".format(matchid), task_name="HCV demo")
print("{:.1f} sec: retrieved {} F606W measurements".format(time.time()-t0,len(nova_606)))

nova_814 = jobs.quick("""select * from HCVdetailed
where MatchID={} and Filter='ACS_F814W'
""".format(matchid), task_name="HCV demo")
print("{:.1f} sec: retrieved {} F814W measurements".format(time.time()-t0,len(nova_814)))

# get the object RA and Dec as well
nova_tab = jobs.quick("""select MatchID, RA, Dec from HCVmatch
where MatchID={}
""".format(matchid), task_name="HCV demo")
print("{:.1f} sec: retrieved object info".format(time.time()-t0))

nova_606
0.5 sec: retrieved 21 F606W measurements
1.0 sec: retrieved 22 F814W measurements
1.5 sec: retrieved object info
Out[10]:
Table length=21
MatchID    Filter              MJD                  ImageName       Mag          CorrMag      MagErr                CI                D
  int64      str9          float64                      str26   float64          float64     float64           float64          float64
1905457 ACS_F606W 53767.4197952871 hst_10543_29_acs_wfc_f606w    25.327 25.3267132802993      0.1305 0.840648114681244  13.499119758606
1905457 ACS_F606W 53768.4190663833 hst_10543_30_acs_wfc_f606w   23.9694 23.9676109802196      0.0394  1.04379630088806 11.9895544052124
1905457 ACS_F606W 53769.3576541713 hst_10543_31_acs_wfc_f606w   23.6005 23.6000842124737      0.0306 0.974166631698608 9.30478286743164
1905457 ACS_F606W 53771.1017052617 hst_10543_33_acs_wfc_f606w   23.6105 23.6157510517543 0.030300001  0.96657407283783 2.96933674812317
1905457 ACS_F606W 53772.8098534283 hst_10543_35_acs_wfc_f606w 23.621799  23.602993885395      0.0328 0.992037057876587 4.45478820800781
1905457 ACS_F606W 53774.4746564087 hst_10543_37_acs_wfc_f606w 24.015499 24.0123868023881 0.042599998   1.0246297121048  3.4874792098999
1905457 ACS_F606W 53775.2799112014 hst_10543_38_acs_wfc_f606w   23.9037 23.9077547341058      0.0381 0.985185205936432 3.49740219116211
1905457 ACS_F606W 53776.0893441574 hst_10543_39_acs_wfc_f606w 24.015499 24.0182108840315      0.0414  1.05314815044403 8.44466972351074
1905457 ACS_F606W 53776.9439852673 hst_10543_40_acs_wfc_f606w 24.065701 24.0743889490328 0.044599999  1.24157404899597  7.6749701499939
1905457 ACS_F606W  53778.562434162 hst_10543_42_acs_wfc_f606w 24.430799 24.4274937224049 0.059500001  1.07425928115845 3.26662158966064
1905457 ACS_F606W 53779.4097260174 hst_10543_43_acs_wfc_f606w 24.415899 24.4150584573105 0.062600002 0.965277791023254 2.09669971466064
1905457 ACS_F606W 53780.2090778507 hst_10543_44_acs_wfc_f606w 24.697599 24.6944838119623 0.080499999 0.810648143291473  6.5630521774292
1905457 ACS_F606W   53782.61915897 hst_10543_47_acs_wfc_f606w 24.560801 24.5570748994592 0.065399997  1.09907412528992 2.88692712783813
1905457 ACS_F606W 53784.2176541714 hst_10543_73_acs_wfc_f606w   24.7827 24.7770110702267 0.079000004 0.888148188591003 3.16040587425232
1905457 ACS_F606W 53786.1563230369 hst_10543_86_acs_wfc_f606w   24.9091 24.8532684587691      0.0902  1.03777778148651 8.54914951324463
1905457 ACS_F606W 53787.0225386124 hst_10543_92_acs_wfc_f606w 25.114599 25.0878230960564      0.1127 0.924444437026978 4.14372444152832
1905457 ACS_F606W 53792.6850963715 hst_10543_49_acs_wfc_f606w 25.228001  25.231832236643      0.1147  1.14425933361053 11.9895496368408
1905457 ACS_F606W 53793.5260915786 hst_10543_a1_acs_wfc_f606w 25.153601 25.1604359861786      0.1071 0.925648152828217  7.5528678894043
1905457 ACS_F606W 53796.1486378878 hst_10543_b8_acs_wfc_f606w   24.0172 23.9985920051837      0.0436  1.51175928115845 10.1193161010742
1905457 ACS_F606W 53798.5892174062 hst_10543_50_acs_wfc_f606w 25.563601 25.5708965234736  0.15350001 0.953240752220154 12.9010400772095
1905457 ACS_F606W 53799.4608942058 hst_10543_c4_acs_wfc_f606w   25.5895 25.5908707417737      0.1602  0.92981481552124 5.97602415084839
In [11]:
pylab.rcParams.update({'font.size': 16})
pylab.figure(1,(15,10))

x = nova_606['MJD']
y = nova_606['CorrMag']
e = nova_606['MagErr']
pylab.errorbar(x,y,yerr=e,fmt='ob',ecolor='k',elinewidth=1,markersize=8,label='ACS F606W')

x = nova_814['MJD']
y = nova_814['CorrMag']
e = nova_814['MagErr']
pylab.errorbar(x,y,yerr=e,fmt='or',ecolor='k',elinewidth=1,markersize=8,label='ACS F814W')

pylab.gca().invert_yaxis()
pylab.xlabel('MJD [days]')
pylab.ylabel('magnitude')
pylab.legend(loc='best', title='Nova in M87 (MatchID: {})'.format(matchid))
Out[11]:
<matplotlib.legend.Legend at 0x81719bc18>
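
When the MatchID or filter name comes from user input rather than a literal, it is worth validating the values before formatting them into SQL. The sketch below builds the same HCVdetailed query text as the cell above without contacting CasJobs. The helper lightcurve_query is hypothetical, and the allowed-character rule for filter names is an assumption based on the names seen in this notebook.

```python
import re

def lightcurve_query(matchid, filter_name):
    """Build the HCVdetailed light-curve query text, with basic validation."""
    matchid = int(matchid)  # raises ValueError for non-numeric input
    # assumption: filter names contain only letters, digits and underscores
    if not re.fullmatch(r"[A-Za-z0-9_]+", filter_name):
        raise ValueError("unexpected filter name: {!r}".format(filter_name))
    return ("select * from HCVdetailed "
            "where MatchID={} and Filter='{}'".format(matchid, filter_name))

print(lightcurve_query(1905457, "ACS_F606W"))
```

The returned string could then be passed to jobs.quick in the same way as the literal queries above.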

Get HLA image cutouts for the nova

The Hubble Legacy Archive (HLA) images were the source of the measurements in the HSC and HCV, so it can be useful to look at them. Examining the images helps to identify cosmic-ray contamination and other possible image artifacts. In this case, no issues are seen, so the light curve is reliable.

Note that the ACS F606W images of M87 have only a single exposure, so they do have cosmic ray contamination. The accompanying F814W images have multiple exposures, allowing CRs to be removed. In this case the F814W combined image is used to find objects, while the F606W exposure is used only for photometry. That reduces the effects of F606W CRs on the catalog but it is still a good idea to confirm the quality of the images.

The get_hla_cutout function reads a single cutout image (as a JPEG grayscale image) and returns a PIL image object. See the documentation on the fitscut image cutout service for more information on the web service being used.

In [12]:
def get_hla_cutout(imagename,ra,dec,size=33,autoscale=99.5,asinh=True,zoom=1):
    
    """Get JPEG cutout for an image"""
    
    url = "https://hla.stsci.edu/cgi-bin/fitscut.cgi"
    r = requests.get(url, params=dict(ra=ra, dec=dec, size=size, 
            format="jpeg", red=imagename, autoscale=autoscale, asinh=asinh, zoom=zoom))
    # fail with a clear HTTP error instead of a confusing image-decoding error
    r.raise_for_status()
    im = Image.open(BytesIO(r.content))
    return im

# sort images by magnitude from brightest to faintest
phot = nova_606
isort = np.argsort(phot['CorrMag'])
# select the brightest, median and faintest magnitudes
ind = [isort[0], isort[len(isort)//2], isort[-1]]

# we plot zoomed-in and zoomed-out views side-by-side for each selected image
nim = len(ind)*2
ncols = 2 # images per row
nrows = (nim+ncols-1)//ncols

imsize1 = 19
imsize2 = 101
mra = nova_tab['RA'][0]
mdec = nova_tab['Dec'][0]

pylab.rcParams.update({"font.size":16})
pylab.figure(1,(12, (12/ncols)*nrows))
t0 = time.time()
ip = 0
for k in ind:
    im1 = get_hla_cutout(phot['ImageName'][k],mra,mdec,size=imsize1)
    ip += 1
    pylab.subplot(nrows,ncols,ip)
    pylab.imshow(im1,origin="upper",cmap="gray")
    pylab.title('{} m={:.3f}'.format(phot['ImageName'][k],phot['CorrMag'][k]),fontsize=14)
    im2 = get_hla_cutout(phot['ImageName'][k],mra,mdec,size=imsize2)
    ip += 1
    pylab.subplot(nrows,ncols,ip)
    pylab.imshow(im2,origin="upper",cmap="gray")
    xbox = np.array([-1,1])*imsize1/2 + (imsize2-1)//2
    pylab.plot(xbox[[0,1,1,0,0]],xbox[[0,0,1,1,0]],'r-',linewidth=1)
    pylab.title('{} m={:.3f}'.format(phot['ImageName'][k],phot['CorrMag'][k]),fontsize=14)
pylab.tight_layout()
print("{:.1f} s: got {} cutouts".format(time.time()-t0,ip))
7.8 s: got 6 cutouts
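
When a cutout looks wrong, it can help to inspect the exact request URL in a browser. The sketch below builds the URL that get_hla_cutout would request, using the same parameter names as the function above and no network access. The helper hla_cutout_url is hypothetical, and the coordinates are approximate values near the center of M87 used as placeholders, not the nova's measured position.

```python
from urllib.parse import urlencode

def hla_cutout_url(imagename, ra, dec, size=33, autoscale=99.5, asinh=True, zoom=1):
    """Return the fitscut URL for a JPEG cutout (no request is made)."""
    params = dict(ra=ra, dec=dec, size=size, format="jpeg", red=imagename,
                  autoscale=autoscale, asinh=asinh, zoom=zoom)
    return "https://hla.stsci.edu/cgi-bin/fitscut.cgi?" + urlencode(params)

# Placeholder coordinates (approximate M87 center), for illustration only
url = hla_cutout_url("hst_10543_29_acs_wfc_f606w", 187.7, 12.39, size=19)
print(url)
```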

Compare the HCV automatic classification to expert validations

The HCV includes an automatic classification AutoClass for candidate variables as well as an expert validation for some fields that were selected for visual examination. For this example, we select all the objects in the HCV that have expert classification information.

In [13]:
DBtable = "HCV_demo2"
jobs = mastcasjobs.MastCasJobs(context="MyDB")

# drop table if it already exists
jobs.drop_table_if_exists(DBtable)

#get data for objects with an expert validation
query = """
select m.MatchID, m.GroupID, m.SubGroupID, m.RA, m.Dec,
   m.AutoClass, m.ExpertClass, m.NumFilters,
   f.Filter, f.FilterDetFlag, f.VarQualFlag, f.NumLC, 
   f.MeanMag, f.MeanCorrMag, f.MAD, f.Chi2
into mydb.{DBtable}
from HCVmatch m
join HCVfilter f on m.MatchID=f.MatchID
where m.ExpertClass>0
""".format(**locals())

t0 = time.time()
results = jobs.quick(query, task_name="HCV demo", context=HSCContext)

print("Completed in {:.1f} sec".format(time.time()-t0))
print(results)

# fast retrieval using special MAST Casjobs service
tab = jobs.fast_table(DBtable, verbose=True)

# clean up the output format
tab['MeanMag'].format = "{:.3f}"
tab['MeanCorrMag'].format = "{:.3f}"
tab['MAD'].format = "{:.4f}"
tab['Chi2'].format = "{:.4f}"
tab['RA'].format = "{:.6f}"
tab['Dec'].format = "{:.6f}"

# tab includes 1 row for each filter (so multiple rows for objects with multiple filters)
# get an array that has only one row per object
mval, uindex = np.unique(tab['MatchID'],return_index=True)
utab = tab[uindex]
print("{} unique MatchIDs in table".format(len(utab)))

tab
Completed in 1.1 sec
Rows Affected
-------------
        31258
2.3 s: Retrieved 4.94MB table MyDB.HCV_demo2
2.8 s: Converted to 31258 row table
13533 unique MatchIDs in table
Out[13]:
Table length=31258
MatchIDGroupIDSubGroupIDRADecAutoClassExpertClassNumFiltersFilterFilterDetFlagVarQualFlagNumLCMeanMagMeanCorrMagMADChi2
int64int64int64float64float64int64int64int64str11int64str5int64float64float64float64float64
175633256019011.367288-73.211441243ACS_F475W1ABCAC622.79422.7930.035830.3918
175633256019011.367288-73.211441243ACS_F775W1CAAAA1722.72622.7270.025268.3289
175633256019011.367288-73.211441243ACS_F850LP1AAAAC522.87722.8770.03874.4088
175903810459048011.21039241.609066122ACS_F475W0AABAC525.20025.1990.02321.8665
175903810459048011.21039241.609066122ACS_F814W1AACAC523.59823.6000.083315.6776
1781246126111201.685883-47.475422226WFC3_F225W0AAAAC721.96121.9600.00300.4532
1781246126111201.685883-47.475422226WFC3_F275W0AAAAB1021.12721.1250.003114.8806
1781246126111201.685883-47.475422226WFC3_F336W0AAAAC620.14320.1420.00609.9915
1781246126111201.685883-47.475422226WFC3_F555W0AAACC618.61018.6080.014679.3761
1781246126111201.685883-47.475422226WFC3_F606W1AAABB7018.55418.5540.0323115.9390
................................................
229061081041481-5311.729706-12.854458212ACS_F475W1BACAA1224.11324.1140.1773136.6645
229061081041481-5311.729706-12.854458212ACS_F814W1BABAB1224.06224.0630.112565.7496
2291729369810-516.1172332.153573142ACS_F475W0BBACB1221.49121.4920.0205385.4723
2291729369810-516.1172332.153573142ACS_F814W1BBCCB1222.24922.2500.0611142.0765
22920404126116201.692520-47.492012226WFC3_F225W0AAAAC624.25424.2530.05260.4217
22920404126116201.692520-47.492012226WFC3_F275W0CAAAC823.04123.0380.030611.3090
22920404126116201.692520-47.492012226WFC3_F336W0AAABC621.65021.6490.00771.4146
22920404126116201.692520-47.492012226WFC3_F555W0AAABB520.01220.0090.019512.9443
22920404126116201.692520-47.492012226WFC3_F606W1AAAAC6319.78519.7860.040124.5835
22920404126116201.692520-47.492012226WFC3_F814W1CAACA719.43119.4310.071371.8553

An ExpertClass value of 1 indicates that the object is confidently confirmed to be a variable; 2 means that the measurements do not have apparent problems and so the object is likely to be variable (usually the variability is too small to be obvious in the image); 4 means that the variability is likely to be the result of artifacts in the image (e.g., residual cosmic rays or diffraction spikes from nearby bright stars).

Compare the distributions for single-filter variable candidates (SFVC, AutoClass=1) and multi-filter variable candidates (MFVC, AutoClass=2). The fraction of artifacts is lower in the MFVC sample.

In [14]:
sfcount = np.bincount(utab['ExpertClass'][utab['AutoClass']==1])
mfcount = np.bincount(utab['ExpertClass'][utab['AutoClass']==2])
sfrat = sfcount/sfcount.sum()
mfrat = mfcount/mfcount.sum()

print("Type Variable Likely Artifact Total")
print("SFVC {:8d} {:6d} {:8d} {:5d} counts".format(sfcount[1],sfcount[2],sfcount[4],sfcount.sum()))
print("MFVC {:8d} {:6d} {:8d} {:5d} counts".format(mfcount[1],mfcount[2],mfcount[4],mfcount.sum()))
print("SFVC {:8.3f} {:6.3f} {:8.3f} {:5.3f} fraction".format(sfrat[1],sfrat[2],sfrat[4],sfrat.sum()))
print("MFVC {:8.3f} {:6.3f} {:8.3f} {:5.3f} fraction".format(mfrat[1],mfrat[2],mfrat[4],mfrat.sum()))
Type Variable Likely Artifact Total
SFVC     3323   3055     1761  8139 counts
MFVC     2101   2442      851  5394 counts
SFVC    0.408  0.375    0.216 1.000 fraction
MFVC    0.390  0.453    0.158 1.000 fraction
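
To judge whether the difference in artifact fraction between the two samples is significant, one rough approach is to attach a binomial standard error to each fraction. The sketch below does this with the counts printed above; it assumes independent samples and the normal approximation, so the result is indicative only.

```python
import math

# Counts printed above: ExpertClass 1 (variable), 2 (likely), 4 (artifact)
counts = {"SFVC": {1: 3323, 2: 3055, 4: 1761},
          "MFVC": {1: 2101, 2: 2442, 4: 851}}

def artifact_fraction(c):
    """Return (fraction, binomial standard error) for the artifact class."""
    n = sum(c.values())
    f = c[4] / n
    return f, math.sqrt(f * (1 - f) / n)

f_sf, e_sf = artifact_fraction(counts["SFVC"])
f_mf, e_mf = artifact_fraction(counts["MFVC"])
# difference in units of its combined standard error
diff_sigma = (f_sf - f_mf) / math.hypot(e_sf, e_mf)
print("SFVC {:.3f}+-{:.3f}  MFVC {:.3f}+-{:.3f}  difference {:.1f} sigma".format(
    f_sf, e_sf, f_mf, e_mf, diff_sigma))
```

With thousands of objects in each sample the standard errors are well below the difference between the two fractions, so the lower artifact rate of the MFVC sample is not a statistical fluctuation under these assumptions.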

Plot the MAD variability index distribution with expert classifications

Note that only the filters identified as variable (FilterDetFlag > 0) are included here.

This version of the plot shows the distributions for the various ExpertClass values along with, for comparison, the distribution for all objects in gray (which is identical in each panel). Most objects are classified as confident or likely variables. Objects with lower MAD values (indicating a lower amplitude of variability) are less likely to be identified as confident variables because low-level variability is more difficult to confirm via visual examination.

In [15]:
w = np.where(tab['FilterDetFlag']>0)[0]
mad = tab['MAD'][w]
e = tab['ExpertClass'][w]

xrange = [7.e-3, 2.0]
bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0,1.0,50)

pylab.rcParams.update({'font.size':16})
pylab.figure(1,(12,12))

pylab.subplot(311)
pylab.hist(mad,bins=bins,log=True,color='lightgray',label='All')
wp = np.where(e==1)[0]
pylab.hist(mad[wp],bins=bins,log=True,label='Confident',color='C2')
pylab.xscale('log')
pylab.ylabel('Count')
pylab.legend(loc='upper left')
pylab.title('HCV Expert Validation')

pylab.subplot(312)
pylab.hist(mad,bins=bins,log=True,color='lightgray',label='All')
wp = np.where(e==2)[0]
pylab.hist(mad[wp],bins=bins,log=True,label='Likely',color='C1')
pylab.xscale('log')
pylab.ylabel('Count')
pylab.legend(loc='upper left')

pylab.subplot(313)
pylab.hist(mad,bins=bins,log=True,color='lightgray',label='All')
wp = np.where(e==4)[0]
pylab.hist(mad[wp],bins=bins,log=True,label='Artifact',color='C0')
pylab.xscale('log')
pylab.ylabel('Count')
pylab.legend(loc='upper left')

pylab.xlabel('MAD Variability Index [mag]')
Out[15]:
Text(0.5, 0, 'MAD Variability Index [mag]')
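
The bin edges in the cell above are logarithmically spaced, so that every bin has the same width on the log-scaled x-axis. A plain-Python sketch of the same expression is shown here; log_spaced_edges is a hypothetical helper name.

```python
import math

def log_spaced_edges(lo, hi, n):
    """Return n edges from lo to hi with a constant ratio between neighbours."""
    ratio = hi / lo
    # same form as lo * (hi/lo) ** linspace(0, 1, n)
    return [lo * ratio ** (i / (n - 1)) for i in range(n)]

edges = log_spaced_edges(7.0e-3, 2.0, 50)
step = edges[1] / edges[0]
print(len(edges), edges[0], math.isclose(edges[-1], 2.0))
print("constant ratio between edges: {:.4f}".format(step))
```

Fifty edges give 49 bins, each spanning the same factor in MAD.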

The plot below shows the same distributions, but plotted as stacked histograms. The top panel uses a linear scale on the y-axis and the bottom panel uses a log y scale.

In [16]:
w = np.where(tab['FilterDetFlag']>0)[0]
mad = tab['MAD'][w]
e = tab['ExpertClass'][w]

xrange = [7.e-3, 2.0]
bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0,1.0,50)

pylab.rcParams.update({'font.size':16})
pylab.figure(1,(15,12))

pylab.subplot(211)
hlog = False
pylab.hist(mad,bins=bins,log=hlog,label='Artifact')

wp = np.where(e<4)[0]
pylab.hist(mad[wp],bins=bins,log=hlog,label='Likely Variable')

wp = np.where(e==1)[0]
pylab.hist(mad[wp],bins=bins,log=hlog,label='Confident Variable')
pylab.xscale('log')
pylab.xlabel('MAD Variability Index [mag]')
pylab.ylabel('Count')
pylab.legend(loc='upper right',title='HCV Expert Validation')

pylab.subplot(212)
hlog = True
pylab.hist(mad,bins=bins,log=hlog,label='Artifact')

wp = np.where(e<4)[0]
pylab.hist(mad[wp],bins=bins,log=hlog,label='Likely Variable')

wp = np.where(e==1)[0]
pylab.hist(mad[wp],bins=bins,log=hlog,label='Confident Variable')
pylab.xscale('log')
pylab.xlabel('MAD Variability Index [mag]')
pylab.ylabel('Count')
pylab.legend(loc='upper right',title='HCV Expert Validation')
Out[16]:
<matplotlib.legend.Legend at 0x8157c60b8>

Plot the fraction of artifacts as a function of MAD variability index

This shows how the fraction of artifacts varies with the MAD value. For larger MAD values the fraction decreases sharply, presumably because such large values are less likely to result from the usual artifacts. Interestingly, the artifact fraction also declines for smaller MAD values (MAD < 0.1 mag). Probably that happens because typical artifacts are more likely to produce strong signals than the weaker signals indicated by a low MAD value.

In [17]:
w = np.where(tab['FilterDetFlag']>0)[0]
mad = tab['MAD'][w]
e = tab['ExpertClass'][w]

xrange = [7.e-3, 2.0]
bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0,1.0,30)

all_count, bin_edges = np.histogram(mad,bins=bins)
artifact_count, bin_edges = np.histogram(mad[e==4],bins=bins)
wnz = np.where(all_count>0)[0]
nnz = len(wnz)

artifact_count = artifact_count[wnz]
all_count = all_count[wnz]
xerr = np.empty((2,nnz),dtype=float)
xerr[0] = bin_edges[wnz]
xerr[1] = bin_edges[wnz+1]

# merge sparsely populated bins at each end of the range into one wider bin to improve the statistics there
iz = np.where(all_count.cumsum()>10)[0][0]
if iz > 0:
    all_count[iz] += all_count[:iz].sum()
    artifact_count[iz] += artifact_count[:iz].sum()
    xerr[0,iz] = xerr[0,0]
    all_count = all_count[iz:]
    artifact_count = artifact_count[iz:]
    xerr = xerr[:,iz:]
iz = np.where(all_count[::-1].cumsum()>40)[0][0]
if iz > 0:
    all_count[-iz-1] += all_count[-iz:].sum()
    artifact_count[-iz-1] += artifact_count[-iz:].sum()
    xerr[1,-iz-1] = xerr[1,-1]
    all_count = all_count[:-iz]
    artifact_count = artifact_count[:-iz]
    xerr = xerr[:,:-iz]

x = np.sqrt(xerr[0]*xerr[1])
xerr[0] = x - xerr[0]
xerr[1] = xerr[1] - x

frac = artifact_count/all_count
# error on fraction using binomial distribution (approximate)
ferr = np.sqrt(frac*(1-frac)/all_count)

pylab.rcParams.update({'font.size':16})
pylab.figure(1,(12,12))

pylab.errorbar(x,frac,xerr=xerr,yerr=ferr,fmt='ob',
               markersize=5,label='Artifact fraction')

pylab.xscale('log')
pylab.xlabel('MAD Variability Index [mag]')
pylab.ylabel('Artifact Fraction')
pylab.legend(loc='upper right',title='HCV Expert Validation')
Out[17]:
<matplotlib.legend.Legend at 0x817402358>
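
The error bars in the cell above use the normal approximation sqrt(f*(1-f)/N), which shrinks to zero in any bin where the artifact fraction is exactly 0 or 1. As a hedged alternative (it is not part of the original HCV analysis), the sketch below computes a Wilson score interval instead. The function name wilson_interval and the example counts are invented for illustration.

```python
import numpy as np

def wilson_interval(k, n, z=1.0):
    """Wilson score interval for k successes in n trials.

    z=1.0 corresponds roughly to a 68% (1-sigma) interval.
    Hypothetical helper, not taken from the HCV notebook.
    """
    k = np.asarray(k, dtype=float)
    n = np.asarray(n, dtype=float)
    p = k / n
    denom = 1.0 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# invented counts: artifacts per bin and total objects per bin
artifact_demo = np.array([0, 5, 10])
all_demo = np.array([10, 10, 10])

low, high = wilson_interval(artifact_demo, all_demo)
print(low, high)
```

Unlike the simple approximation, the interval keeps a non-zero width when the observed fraction is 0 or 1, which matters most for the sparsely populated bins at the edges.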

Plot light curve for the most variable high quality candidate in the HCV

Select the candidate variable with the largest MAD value and VarQualFlag = 'AAAAA'. To find the highest MAD value, we sort by MAD in descending order and select the first result.

In [18]:
jobs = mastcasjobs.MastCasJobs(context=HSCContext)

# join to the Groups table as well to get the target name

query = """
select top 1 m.MatchID, m.GroupID, m.SubGroupID, g.TargetName, m.RA, m.Dec,
   m.AutoClass, m.ExpertClass, m.NumFilters,
   f.Filter, f.FilterDetFlag, f.VarQualFlag, f.NumLC, 
   f.MeanMag, f.MeanCorrMag, f.MAD, f.Chi2
from HCVmatch m
join HCVfilter f on m.MatchID=f.MatchID
join Groups g on m.GroupID=g.GroupID
where f.VarQualFlag='AAAAA'
order by f.MAD desc
""".format(**locals())

t0 = time.time()
tab = jobs.quick(query, task_name="HCV demo", context=HSCContext)

print("Completed in {:.1f} sec".format(time.time()-t0))

# clean up the output format
tab['MeanMag'].format = "{:.3f}"
tab['MeanCorrMag'].format = "{:.3f}"
tab['MAD'].format = "{:.4f}"
tab['Chi2'].format = "{:.4f}"
tab['RA'].format = "{:.6f}"
tab['Dec'].format = "{:.6f}"

print("MatchID {} in group '{}' has largest MAD value = {:.2f}".format(
    tab['MatchID'][0],tab['TargetName'][0],tab['MAD'][0]))
tab
Completed in 0.5 sec
MatchID 5742711 in group 'M31' has largest MAD value = 0.86
Out[18]:
Table length=1
MatchID  GroupID  SubGroupID  TargetName  RA         Dec        AutoClass  ExpertClass  NumFilters  Filter     FilterDetFlag  VarQualFlag  NumLC  MeanMag  MeanCorrMag  MAD      Chi2
int64    int64    int64       str3        float64    float64    int64      int64        int64       str9       str4           str5         int64  float64  float64      float64  float64
5742711  1045904  16          M31         10.928601  41.164295  1          0            1           ACS_F814W  True           AAAAA        5      22.264   22.265       0.8581   6698.4430
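
The SQL query relies on "order by f.MAD desc" together with "top 1" to pick the most variable high-quality object. For a table that has already been downloaded, the same selection can be sketched with NumPy. The arrays below are invented stand-ins for the MatchID, MAD and VarQualFlag columns, not real HCV values.

```python
import numpy as np

# synthetic stand-ins for HCVfilter columns (not real HCV data)
match_id = np.array([101, 102, 103, 104])
mad = np.array([0.12, 0.95, 0.40, 0.31])
qual = np.array(['AAAAA', 'AABAA', 'AAAAA', 'AAAAA'])

# keep only rows with the best quality flag, as in the SQL where clause
good = np.where(qual == 'AAAAA')[0]

# index (into the full arrays) of the largest MAD among the good rows
best = good[np.argmax(mad[good])]

print("MatchID {} has largest MAD value = {:.2f}".format(match_id[best], mad[best]))
```

Doing the selection in the database, as the notebook does, avoids downloading the full table when only one row is needed.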

Get and plot the light curve.

In [19]:
matchid = tab['MatchID'][0]
mfilter = tab['Filter'][0]

jobs = mastcasjobs.MastCasJobs(context=HSCContext)
t0 = time.time()

# get the light curve for the selected filter only
lc = jobs.quick("""select * from HCVdetailed
where MatchID={} and Filter='{}'
""".format(matchid, mfilter), task_name="HCV demo")
print("{:.1f} sec: retrieved {} {} measurements".format(time.time()-t0,len(lc),mfilter))

pylab.rcParams.update({'font.size': 16})
pylab.figure(1,(15,10))

x = lc['MJD']
y = lc['CorrMag']
e = lc['MagErr']
pylab.errorbar(x,y,yerr=e,fmt='ob',ecolor='k',elinewidth=1,markersize=8,label=mfilter)

pylab.gca().invert_yaxis()
pylab.xlabel('MJD [days]')
pylab.ylabel('magnitude')
pylab.legend(loc='best', title='MatchID: {} in {} MAD={:.2f}'.format(matchid, tab['TargetName'][0], tab['MAD'][0]))
0.5 sec: retrieved 5 ACS_F814W measurements
Out[19]:
<matplotlib.legend.Legend at 0x8174ee6a0>
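
The MAD column is a median absolute deviation. As a rough illustration, assuming the index is simply the median of |m - median(m)| over the magnitudes in one filter (the HCV pipeline may apply additional cleaning or scaling before computing it), the value could be recomputed from a light curve as follows. The function name median_abs_dev and the magnitudes are invented, and the numbers are not the M31 measurements.

```python
import numpy as np

def median_abs_dev(mags):
    """Median absolute deviation of a set of magnitudes.

    Hypothetical helper: assumes the plain definition
    median(|m - median(m)|) with no extra scaling factor.
    """
    mags = np.asarray(mags, dtype=float)
    return np.median(np.abs(mags - np.median(mags)))

# invented five-point light curve with one strongly deviant epoch
demo_mags = [22.0, 22.1, 21.9, 23.5, 22.05]
demo_mad = median_abs_dev(demo_mags)
print("MAD = {:.3f} mag".format(demo_mad))
```

With the light curve retrieved above, the analogous call would be median_abs_dev(lc['CorrMag']). Whether that reproduces the catalog MAD exactly depends on details of the HCV processing that are not shown in this notebook.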

Extract cutout images for the entire light curve (since it does not have many points).

In [20]:
# sort images in MJD order
ind = np.argsort(lc['MJD'])

# we plot zoomed-in and zoomed-out views side-by-side for each selected image
nim = len(ind)*2
ncols = 2 # images per row
nrows = (nim+ncols-1)//ncols

imsize1 = 19
imsize2 = 101
mra = tab['RA'][0]
mdec = tab['Dec'][0]

pylab.rcParams.update({"font.size":14})
pylab.figure(1,(12, (12/ncols)*nrows))
t0 = time.time()
ip = 0
for k in ind:
    im1 = get_hla_cutout(lc['ImageName'][k],mra,mdec,size=imsize1)
    ip += 1
    pylab.subplot(nrows,ncols,ip)
    pylab.imshow(im1,origin="upper",cmap="gray")
    pylab.title(lc['ImageName'][k],fontsize=14)
    im2 = get_hla_cutout(lc['ImageName'][k],mra,mdec,size=imsize2)
    ip += 1
    pylab.subplot(nrows,ncols,ip)
    pylab.imshow(im2,origin="upper",cmap="gray")
    xbox = np.array([-1,1])*imsize1/2 + (imsize2-1)//2
    pylab.plot(xbox[[0,1,1,0,0]],xbox[[0,0,1,1,0]],'r-',linewidth=1)
    pylab.title('m={:.3f} MJD={:.2f}'.format(lc['CorrMag'][k],lc['MJD'][k]),fontsize=14)
    print("{:.1f} s: finished {} of {} epochs".format(time.time()-t0,ip//2,len(ind)))
pylab.tight_layout()
print("{:.1f} s: got {} cutouts".format(time.time()-t0,ip))
2.4 s: finished 1 of 5 epochs
4.9 s: finished 2 of 5 epochs
7.3 s: finished 3 of 5 epochs
10.2 s: finished 4 of 5 epochs
12.7 s: finished 5 of 5 epochs
12.9 s: got 10 cutouts
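
The layout arithmetic in the cell above is compact, so here is a short sketch of the two pieces: the number of subplot rows needed for a given number of images, and the corners of the red box that marks the zoomed-in region inside the wider cutout. The helper names grid_rows and inset_box are introduced only for this illustration; the formulas are copied from the cell.

```python
import numpy as np

def grid_rows(nimages, ncols):
    """Rows needed to hold nimages with ncols per row (ceiling division)."""
    return (nimages + ncols - 1) // ncols

def inset_box(inner, outer):
    """Pixel coordinates of a closed square of side `inner`
    centered in an image of side `outer`."""
    center = (outer - 1) // 2
    edges = np.array([-1, 1]) * inner / 2 + center
    # walk the four corners and return to the start to close the box
    xs = edges[[0, 1, 1, 0, 0]]
    ys = edges[[0, 0, 1, 1, 0]]
    return xs, ys

# 5 epochs x 2 views, 2 images per row
print(grid_rows(10, 2))

# 19-pixel cutout outlined inside the 101-pixel cutout
xs, ys = inset_box(19, 101)
print(xs, ys)
```

The returned xs and ys can be passed directly to pylab.plot(xs, ys, 'r-') to draw the outline on top of the larger image.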