DETECTING POINT SOURCES IN EUVE SURVEY SKYMAPS
J. W. Lewis
Center for EUV Astrophysics, 2150 Kittredge St.,
University of California, Berkeley, California 94720, USA
ABSTRACT
A major focus of the Extreme Ultraviolet Explorer all-sky survey is the
detection of new sources of extreme ultraviolet radiation. This paper
describes the challenges involved in producing a catalog containing as many
statistically significant detections as possible, with a small expected number
of spurious detections.
1. MAPPING THE EXTREME ULTRAVIOLET SKY
The Extreme Ultraviolet Explorer (EUVE) satellite, launched in June 1992,
has successfully completed the first phase of its mission, a six-month all-sky
survey over the entire EUV band [1]. One of the primary goals of the survey is
the detection and identification of objects that emit extreme ultraviolet (EUV)
radiation. Most of the project's data processing resources to date have been
devoted to the production and analysis of full-sky maps in each of the four EUV
spectral bands observed by our instrument.
1.1 From Telemetry to Skymaps
The EUVE End-to-End System (EES) software [2] is designed to extract science
and engineering data from the telemetry received from the satellite. The data
arrive not as images but as individual photon events,
characterized by the arrival time of the photon and the location of the event
in detector coordinates. When this information is combined with the orientation
of the spacecraft at that instant, it is possible to "remap" each event to its
proper place in the sky and add it to the skymap for the appropriate EUV
bandpass.
In addition to maps of EUV intensity over the sky, the EES produces maps of
effective exposure time. Owing to the geometry of the survey, the ecliptic
poles were observed during every orbit for the duration of the survey, whereas
areas near the ecliptic plane were scanned for only a few days as the plane of
the survey rotated in ecliptic longitude. The exposure maps reflect this
difference in view times as well as other effects, such as vignetting and
corrections for the finite telemetry bandwidth. (See fig. 1.)
1.2 The EUV Background
The major challenge in detecting EUV sources is that, by optical astronomy
standards, most objects of interest are exceedingly faint. An object with a
count rate of 1 event per second would be considered a very bright EUV source!
Furthermore, these sources must be detected against a relatively bright
background consisting of several independently varying components. Detection
of faint EUV objects depends strongly on our ability to model the EUV
background.
Relatively few of the photon events received are due to point sources of EUV
radiation. Many of the events are not photons at all: A significant portion
of the background consists of charged particles that register as photons when
they hit the detectors. Another important background component is the
geocoronal emission at 304 and 584 Å. A further complication is the
ultraviolet "leak" present in some bandpasses; objects (such as B-type stars)
that are bright in the far UV can be seen in the skymaps, even though the EUV
emission from these objects is negligible.
2. EUVE SOURCE DETECTION
The source detection problem can be cast as an exercise in hypothesis
testing: Given an observation of a patch of sky, how strongly can the null
hypothesis, that the observation is due to a random fluctuation in the
background, be rejected?
2.1 The Detection Method
The first step is to estimate the background intensity. An annulus around
the suspected source is used to estimate the background count rate:
beta = sum(n_i) / sum(t_i),
where beta is the background rate in counts per pixel per second, n_i is the
number of counts received in pixel i, and t_i is the effective exposure time
at pixel i. Both summations are computed for the set of pixels lying within
the background annulus, so there is no need to normalize for the pixel size
or the annulus area.
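As a minimal sketch of this estimate, the following Python function sums the
counts and the exposure over an annulus in a pair of small pixel maps. The
function name, the list-of-lists map layout, and the radii expressed in pixels
are assumptions made for illustration; they are not taken from the EES software.

```python
import math

def annulus_background_rate(counts, exposure, cx, cy, r_in, r_out):
    """Return beta = sum(n_i) / sum(t_i) over pixels with r_in <= r < r_out.

    counts and exposure are 2-D lists indexed as [y][x]; (cx, cy) is the
    suspected source position and the radii are given in pixels.
    """
    total_counts = 0.0
    total_time = 0.0
    for y, row in enumerate(counts):
        for x, n in enumerate(row):
            r = math.hypot(x - cx, y - cy)
            if r_in <= r < r_out:
                total_counts += n
                total_time += exposure[y][x]
    if total_time <= 0.0:
        raise ValueError("annulus contains no exposed pixels")
    return total_counts / total_time
```

Because both sums run over the same pixels, the result is already a rate per
pixel per second, as noted above.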
The expected number of counts per pixel in a small neighborhood around the
source is given by
e_i = beta * t_i + I * phi(r_i),
where I is the total number of counts received from a hypothetical source, r_i
is the arc distance between the source position and pixel i and phi is the
point spread function, expressed as the probability density of receiving a
source photon at radius r_i from the source position.
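The model can be sketched in Python as follows. This is an illustration under
stated assumptions, not the survey code: the point spread function is taken to
be a circular Gaussian of width sigma (the real EUVE PSF is not Gaussian),
distances are measured in pixels on a flat patch rather than as arc distances,
and all function names are hypothetical.

```python
import math

def gaussian_psf(r, sigma):
    """Assumed PSF: probability density per unit pixel area at radius r."""
    return math.exp(-0.5 * (r / sigma) ** 2) / (2.0 * math.pi * sigma ** 2)

def expected_counts(exposure, beta, intensity, src_x, src_y, sigma):
    """Return e_i = beta * t_i + I * phi(r_i) for every pixel of the map.

    exposure is a 2-D list indexed as [y][x]; beta is the background rate
    in counts per pixel per second; intensity is the total source counts I.
    """
    model = []
    for y, row in enumerate(exposure):
        model.append([
            beta * t + intensity * gaussian_psf(math.hypot(x - src_x, y - src_y), sigma)
            for x, t in enumerate(row)
        ])
    return model
```

With a PSF normalized in this way, the source term summed over a sufficiently
large neighborhood recovers the total intensity I.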
The next step is to vary the source intensity, I, and the source position
(perturbing the r_i's) until the following statistic is minimized:
C = 2 * sum(e_i - n_i * log(e_i))
In this step, the sum is computed over the set of pixels lying within a small
disk around the suspected source position. The parameter values that minimize
C, the "Cash statistic", are the maximum likelihood estimates of the source
position and intensity [3]. The increase of C above its minimum value, C_min,
follows a chi^2 distribution, which makes it useful for obtaining confidence
intervals for the parameters of interest. In particular, this property allows
us to decide whether a given confidence interval for the source intensity
includes the value I = 0. Setting I = 0 leaves only the background term:
C_0 = 2 * sum(beta * t_i - n_i * log(beta * t_i))
Delta-C = C_0 - C_min
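A simplified sketch of this fit is given below in Python. It uses the
convention C = 2 * sum(e_i - n_i * log(e_i)), which is smallest at the maximum
likelihood solution, and it holds the source position fixed so that only the
intensity I is varied. Because C is convex in I, a ternary search is enough
here; that choice of minimizer, the function names, and the flat-list inputs
are assumptions for illustration only.

```python
import math

def cash(counts, expected):
    """C = 2 * sum(e_i - n_i * log(e_i)); every e_i must be positive."""
    return 2.0 * sum(e - n * math.log(e) for n, e in zip(counts, expected))

def fit_intensity(counts, bkg, psf, i_max=1.0e4, tol=1.0e-6):
    """Minimize C over I in [0, i_max] at a fixed source position.

    bkg[i] holds beta * t_i and psf[i] holds phi(r_i) for each pixel in
    the fitting disk. Returns (I_best, Delta-C) with Delta-C = C_0 - C_min.
    """
    def c_of(intensity):
        return cash(counts, [b + intensity * p for b, p in zip(bkg, psf)])

    lo, hi = 0.0, i_max
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if c_of(m1) < c_of(m2):
            hi = m2  # the minimum lies to the left of m2
        else:
            lo = m1  # the minimum lies to the right of m1
    i_best = 0.5 * (lo + hi)
    return i_best, c_of(0.0) - c_of(i_best)
```

A full implementation would also perturb the source position, recomputing the
phi(r_i) values at each trial position.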
The quantity Delta-C is the basis for the decision rule. Given a maximum
acceptable probability P_bg that the hypothetical source is due to a background
fluctuation, set a threshold T so that
P(chi^2 > T) = P_bg.
If Delta-C > T, a source is reported at the maximum likelihood position. If
Delta-C <= T, the observation is assumed to consist of background only. Fig.
2 shows a typical EUVE source.
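The decision rule can be sketched in a few lines of Python. The sketch assumes
a chi^2 distribution with one degree of freedom (only the source intensity is
treated as the parameter of interest), for which the tail probability has the
closed form erfc(sqrt(T / 2)). The function names are hypothetical.

```python
import math

def chi2_tail_1dof(t):
    """P(chi^2 > t) for one degree of freedom (assumed here)."""
    return math.erfc(math.sqrt(max(t, 0.0) / 2.0))

def is_detection(delta_c, p_bg):
    """Report a source when Delta-C exceeds the threshold T implied by P_bg.

    Delta-C > T is equivalent to P(chi^2 > Delta-C) < P_bg, so the
    threshold never has to be computed explicitly.
    """
    return chi2_tail_1dof(delta_c) < p_bg
```

Under this one-degree-of-freedom assumption, Delta-C = 36 corresponds to a tail
probability of roughly 2e-9.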
2.2 Threshold Setting
The chi^2 threshold used by this method controls the quality of the catalog
produced. As the threshold is raised, the final catalog will contain fewer
spurious sources, but this will also cause more real (but faint) sources to be
rejected. The following rule of thumb can be used to estimate the expected
number of spurious detections:
N_spurious = P_bg * A_sky / A_psf.
P_bg is the probability threshold defined above, A_sky is the area of sky
covered by the survey, and A_psf is the area of the point spread function.
Using the EUVE point spread function and sky coverage, a threshold of T = 36
(i.e. a 6 sigma rejection of the background-only model) should result in a
catalog containing no more than one expected spurious detection due to
background fluctuations. In practice, a slightly lower threshold would be used
to increase significantly the number of faint sources detected, while accepting
on the order of ten spurious detections in the published catalog.
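The rule of thumb can be inverted to find the threshold that yields a chosen
number of expected spurious detections. The Python sketch below does this by
bisection, again assuming a chi^2 distribution with one degree of freedom. The
function names are hypothetical, and the areas in the usage example (full sky
in square degrees, and a PSF area of 0.01 square degrees) are placeholder
values, not EUVE parameters.

```python
import math

def expected_spurious(p_bg, a_sky, a_psf):
    """N_spurious = P_bg * A_sky / A_psf (areas in the same units)."""
    return p_bg * a_sky / a_psf

def threshold_for_spurious(n_target, a_sky, a_psf):
    """Find T such that P(chi^2 > T) * A_sky / A_psf = n_target.

    Assumes one degree of freedom, so P(chi^2 > T) = erfc(sqrt(T / 2)).
    """
    p_bg = n_target * a_psf / a_sky
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.erfc(math.sqrt(mid / 2.0)) > p_bg:
            lo = mid  # tail still too large: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Placeholder areas: full sky (~41253 sq. deg.) and a 0.01 sq. deg. PSF.
t_one = threshold_for_spurious(1.0, 41253.0, 0.01)
t_ten = threshold_for_spurious(10.0, 41253.0, 0.01)
```

Accepting ten expected spurious detections instead of one lowers the
threshold, which is the trade-off described above.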
3. PRELIMINARY RESULTS
3.1 Simulation and Testing
When the final catalog of EUV sources is released, it will be necessary
to document the purity of the catalog (expressed as the expected number of
spurious sources) and to provide a map of sky coverage showing the minimum
detectable count rate for each part of the sky. In principle, this information
could be computed from the theoretical behavior of the Cash statistic, but
extensive testing is necessary to confirm that the theory has been correctly
implemented in the software.
From the perspective of software quality assurance, it would be desirable to
use real data for testing. But distinguishing real sources from spurious
detections would require a trustworthy catalog, and since EUVE is the first
mission to survey the entire sky in the full EUV band, no such catalog
currently exists! We have therefore concentrated on validating the code via
large-scale simulations.
Our first major simulation involved generating simulated skymaps for
approximately two weeks' worth of data. The exposure map was taken directly
from the first two weeks of data. We then approximated the EUV background by
applying a flat count rate to the exposure map to yield an expected number of
counts per map pixel, and next we generated a sequence of Poisson pseudorandom
numbers which were stored in the background map. We then generated a catalog
of about 1000 simulated sources, uniformly distributed over the region of
nonzero exposure. Each simulated source was then assigned a count rate
between 0.01 and 10.0 counts/sec. The exposure maps were used to determine
the total number of counts expected from each simulated source. The source
intensity and point spread function were used to calculate the expected
contribution from the simulated source at each pixel in a small neighborhood
surrounding the source coordinates, and another set of random Poisson events
was generated according to this distribution and accumulated into the skymaps.
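The procedure can be illustrated with a small, self-contained Python sketch.
It is not the simulation code used for the survey: the Gaussian PSF, the
flat-patch pixel geometry, the function names, and the simple Poisson
generator (Knuth's multiplication method, suitable only for per-pixel means
well below about 700) are all assumptions made for illustration.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's method (small means only)."""
    if mean <= 0.0:
        return 0
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_skymap(exposure, bkg_rate, sources, psf_sigma, rng, half_width=6):
    """Flat Poisson background plus PSF-spread Poisson source counts.

    exposure is a 2-D list [y][x] of seconds; sources is a list of
    (x, y, rate) tuples with rate in counts per second.
    """
    ny, nx = len(exposure), len(exposure[0])
    # Background: expected counts are the flat rate times the exposure.
    counts = [[poisson(rng, bkg_rate * exposure[y][x]) for x in range(nx)]
              for y in range(ny)]
    norm = 1.0 / (2.0 * math.pi * psf_sigma ** 2)
    for sx, sy, rate in sources:
        # Total expected source counts from the exposure at the source.
        total = rate * exposure[int(round(sy))][int(round(sx))]
        for y in range(max(0, int(sy) - half_width), min(ny, int(sy) + half_width + 1)):
            for x in range(max(0, int(sx) - half_width), min(nx, int(sx) + half_width + 1)):
                r2 = (x - sx) ** 2 + (y - sy) ** 2
                mean = total * norm * math.exp(-r2 / (2.0 * psf_sigma ** 2))
                counts[y][x] += poisson(rng, mean)
    return counts
```

Passing a seeded `random.Random` instance as `rng` makes a simulated map
reproducible, which is convenient when comparing software revisions.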
The detection software was then run on the simulated skymap to produce a
catalog of detections. By comparing the detection catalog to the input
catalog, we were able to classify the detections as "real" or "spurious",
compare the detection significance to the theoretical prediction, and compare
the best-fit flux to the actual number of source events generated for each
simulated source.
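The classification step can be sketched as a positional cross-match between
the two catalogs. In the Python sketch below, a detection counts as "real"
if an input source lies within a chosen matching radius; the radius, the
(RA, Dec) tuple layout in degrees, and the function names are assumptions for
illustration.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))

def classify_detections(detections, truth, match_radius_deg):
    """Split detections into (real, spurious) lists of (ra, dec) tuples."""
    real, spurious = [], []
    for det in detections:
        nearest = min(
            (angular_sep_deg(det[0], det[1], src[0], src[1]) for src in truth),
            default=math.inf,
        )
        if nearest <= match_radius_deg:
            real.append(det)
        else:
            spurious.append(det)
    return real, spurious
```

For example, with a matching radius of 0.05 degrees, a detection 0.001 degrees
from an input source is classified as real, while one far from every input
source is classified as spurious.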
The results of this simulation were encouraging. The detection significance,
limiting count rate, flux estimates, and number of spurious detections all
behaved as expected.
3.2 Early Survey Results
At this writing, 446 strongly detected objects have been compiled for a
bright source list released internally at the Center for EUV Astrophysics. Of
these sources, many have been tentatively identified with the following types
of objects: 79 late-type stars (spectral classes F, G, K, or M); 39 white dwarfs;
7 cataclysmic variables; 4 A stars; 4 stars of unknown spectral type; 3 BL Lac
objects; 1 active galactic nucleus; 1 B star (after excluding many detections
of O and B stars thought to be artifacts of the far UV leak); 1 low-mass X-ray
binary; 1 central star of a planetary nebula; and 1 pulsar. It is expected
that both the total number of detections and the number of reliable
identifications will increase significantly when the latest revision of the
source detection software is applied to the reprocessed survey data.
4. ONGOING RESEARCH
4.1 Advanced Detection Techniques
Even though the detection method seems to perform well in simulations,
there are still opportunities for substantial improvement. The weak points
of the current system are the method of estimating the background, the point
spread function (PSF) model, and the use of coarsely binned data to estimate
the source position and intensity.
The background count rate is currently assumed to be locally flat in the
neighborhood of a source. However, some components of the background (for
example, the particle background and geocoronal emission) can vary rapidly
with time, and the observed background might show significant deviations from
the flat model. One way around this problem would be to build a separate
background map. The average number of telemetered counts in each band over
a time interval of a few minutes is a reliable estimate of the sum of all
relevant background components. With an accumulated map of minute-to-minute
background rates, it would no longer be necessary to resort to averaging annular
regions surrounding sources; one could simply look up the background values
for each pixel in the neighborhood of the source.
A background map could also be used to eliminate spurious detections of
O and B stars via the far UV leak. From preliminary survey results, we have
seen that O and B stars down to approximately 5th magnitude can be seen in
the skymaps. Given a catalog of such objects, the magnitude and spectral type
can be used to calculate the expected contribution due to the far ultraviolet
leak. By treating this as part of the background, we can be sure that any O
and B stars appearing in our final catalog are there because they are emitting
significant EUV radiation.
The current detection software uses a single "survey average" PSF for each
bandpass. This is not a good model of the real data, since the width of the
PSF varies by as much as a factor of 10 between the central regions of the
detector and the edges. By taking this variation into account, we expect a
modest improvement in sensitivity compared to the use of a survey average PSF.
Our full-sky maps have a resolution of approximately 1.2 arc minutes. At
this resolution, the core of the PSF is only a few pixels wide, which implies
that we may not be making the best possible use of the available data. Better
results can be obtained by applying the detection method to an alternate set
of data products, the "pigeonhole" files which contain the exact time, detector
coordinates, and celestial coordinates of each event received from a small
patch of sky around each pigeonhole location. This unbinned data
representation will allow us to give a more accurate position and confidence
level for each source detected.
4.2 Diffuse Feature Detection
We are investigating the application of the point source detection method
to look for extended features in the background. For example, we have already
seen the Cygnus Loop and Vela supernova remnants in the raw skymaps, but this
is only because we knew in advance where to look. A systematic search for
extended objects may yield scientifically interesting results.
Another possible extension of the detection method would be to search for
statistically significant depletions in the diffuse background. The very
existence of the diffuse EUV background is hotly debated, and if we can show
that objects such as molecular clouds are seen as shadows against a brighter
background, we will have made a very significant discovery.
For this type of work it becomes crucial to understand the behavior of each
component of the background. Our goal is to be able to subtract away all
sources of terrestrial and instrumental background in order to produce full-sky
maps of the astrophysical EUV background and identify any statistically
significant enhancements or depletions.
5. ACKNOWLEDGEMENTS
I thank Richard Lieu, Herman Marshall, and Vince Saba for their input during
the development of the detection software.
REFERENCES
1. Haisch, B., Bowyer, S., Malina, R.F., "Overview of and Initial Results from
the Extreme Ultraviolet Explorer Mission", in JBIS, this issue (1993).
2. Vedder, P.W. et al., "Overview of the Data Analysis System for the Extreme
Ultraviolet Explorer Project", ASP Conf. Series, 25, 496, (1992).
3. Cash, W., "Parameter Estimation In Astronomy Through Application of the
Likelihood Ratio", Ap.J., 228, 939-947, (1979).
Last modified 10/14/98