Notice: Around the end of September 2020, archive.stsci.edu will begin using HTTPS exclusively.
Download Options for Kepler Data and Catalogs
Besides the standard search interfaces (the
Portal or the Kepler data search
interfaces), Kepler light curves and Target Pixel Files may be
retrieved in a
number of ways, as described below. Most of these options are possible
because the files themselves
are all stored online in publicly accessible directories.
For a complete list of available data sets, see the Kepler
Data Products page.
FTP and HTTP
For individual files, both FTP and HTTP can be used to download Kepler data
and catalogs. For FTP, connect to archive.stsci.edu anonymously and cd to pub/kepler.
You can list the available directories using ls. For HTTP, just go to
Examples for the browser paths to light curves and target pixel files
are shown below, where KKKKKKKKK is the 9-digit, zero-padded KIC ID and XXXX is
its first four digits, including any leading zeros.
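The XXXX/KKKKKKKKK layout can be sketched in Python as follows. This is a minimal illustration, not an official tool: the function name kepler_dir_url is hypothetical, and the base URL and the directory names lightcurves and target_pixel_files are assumed from the wget examples elsewhere on this page.

```python
# Assumption: base URL and directory names are copied from the wget
# examples on this page; they are not guaranteed to stay valid.
BASE_URL = "https://archive.stsci.edu/pub/kepler"
DATA_TYPES = ("lightcurves", "target_pixel_files")


def kepler_dir_url(kic_id, data_type="lightcurves"):
    """Build the per-target directory URL (hypothetical helper).

    kic_id may be an int or a string of digits; it is zero-padded
    to nine digits (KKKKKKKKK) and its first four digits form XXXX.
    """
    if data_type not in DATA_TYPES:
        raise ValueError(f"data_type must be one of {DATA_TYPES}")
    kic = f"{int(kic_id):09d}"
    if len(kic) != 9:
        raise ValueError("KIC ID must have at most nine digits")
    return f"{BASE_URL}/{data_type}/{kic[:4]}/{kic}/"


if __name__ == "__main__":
    print(kepler_dir_url(1429092))
    print(kepler_dir_url("7730747", "target_pixel_files"))
```

The zero-padding step matters because the directory names keep the initial zeros of the KIC ID, so an ID typed without them would otherwise point to the wrong path.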
The light curves for each quarter are bundled into a set of tar files, each
between 3 and 5 GB in size. These smaller tar files replaced the large tar files
that held all the light curves, because the large files proved difficult to download.
There are separate tar files for long and short cadence. All light curves
for each quarter are stored in a subdirectory within the master tar file directory,
e.g., Q0_public or Q10_public.
Each subdirectory contains a README file that lists the tar
files for that quarter with their checksums and gives some suggestions for
downloading them.
There are also tar files for easily identified subsets of light curves (KOIs, Red Giants, Eclipsing Binaries), stored in subdirectories in the same location.
Tar Files per Target
In each light curve and target pixel file directory, there are also tar files that
bundle the data from all Quarters for that target. There are separate tar bundles
for long and short cadence data. The Q code in the tar file name is a
17-digit string that indicates how many epochs were observed in a given Quarter.
For long cadence data, this is
always zero or one. For short cadence data, there may be multiple epochs
(stretches of observations), so the numbers can range from zero to three.
Note that a target is considered observed if there is *either* an extracted
light curve or target pixel file in that Quarter.
As a quick example, Q0103... would mean this target was not observed in
Quarters 0 or 2, was observed once in Quarter 1, was observed three times in
Quarter 3, etc.
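The Q code described above can be decoded with a few lines of Python. This is a minimal sketch: the function name parse_q_code is hypothetical, and it assumes the code is the letter Q followed by one digit per Quarter, starting at Quarter 0, as in the example above.

```python
def parse_q_code(q_code):
    """Map each Quarter number to the number of epochs observed.

    Assumes the form 'Q' + one digit per Quarter, starting at
    Quarter 0 (hypothetical helper, not part of any archive tool).
    """
    digits = q_code[1:] if q_code[:1].upper() == "Q" else q_code
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a valid Q code: {q_code!r}")
    return {quarter: int(d) for quarter, d in enumerate(digits)}


def observed_quarters(q_code):
    """Return the Quarters with at least one observed epoch."""
    return [q for q, n in parse_q_code(q_code).items() if n > 0]


if __name__ == "__main__":
    # Truncated code from the example in the text (first four Quarters only).
    print(parse_q_code("Q0103"))
    print(observed_quarters("Q0103"))
```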
If your system supports Wget or Curl, there are several other options for retrieving
data, each of which creates a shell script on your desktop computer.
You may then run the script from the command line to copy the requested
files directly to your computer.
One advantage of using shell scripts
is that large requests can be retrieved in batches simply by dividing the script
into several smaller ones.
Note that these scripts are primarily intended for Linux,
Unix, and Mac users, but alternatives may exist for Windows users.
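Dividing a script into batches can be done by hand in a text editor, or with a short program. The Python sketch below is one possible approach, not a tool provided by the archive: the function names and the output file stem kepler_batch are hypothetical, and it assumes the script holds one wget or curl command per line.

```python
def split_into_batches(script_text, batch_size):
    """Split a download script into lists of at most batch_size commands.

    Assumes one wget/curl command per line; blank lines and comment
    lines (including a leading #!/bin/sh) are dropped.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    commands = [
        line for line in script_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return [commands[i:i + batch_size]
            for i in range(0, len(commands), batch_size)]


def write_batches(batches, stem="kepler_batch"):
    """Write each batch to <stem>_NN.sh and return the file names."""
    names = []
    for number, batch in enumerate(batches, start=1):
        name = f"{stem}_{number:02d}.sh"
        with open(name, "w") as handle:
            handle.write("#!/bin/sh\n" + "\n".join(batch) + "\n")
        names.append(name)
    return names
```

Each resulting file can then be run on its own (e.g., using the sh command), so an interrupted download only requires rerunning one batch.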
Download Existing WGET Scripts
Sets of wget scripts are available for public light curves and target pixel files.
These scripts are located in the wget_scripts subdirectory of the tar files directory. Please read
the README file.
Create your own CURL or WGET Scripts
We currently offer two methods for generating shell scripts of CURL or WGET
commands. Either method will create a script file on your desktop computer
that can be run to download the files found (e.g., using the sh command).
We also give examples of how to create your own WGET commands.
If you know what data you want, a quick way to create shell scripts is
to use one of our
available IDL or Python programs. These programs
accept several parameters for specifying ID numbers, cadence, dates, quarters,
data type, and command type. For example (assuming IDL is installed on your desktop
computer), to return all available long-cadence target pixel files for Kepler ID 7730747:
get_kepler, '7730747', data_type="target_pixel_file"
or to return all Quarter 7 and Quarter 8 short-cadence light curve files for this Kepler ID:
get_kepler, '7730747', cadence="short", quarters=['7','8']
A Python example to retrieve TPF files for ID 7730747:
python get_kepler.py '7730747' -t target_pixel_file
Type python get_kepler.py -h to see all the available Python arguments.
Here are some examples of creating your own wget commands
where instead of retrieving one file per command (as above), you
retrieve entire directories:
Download a whole directory of data using WGET
wget -q -nH --cut-dirs=6 -r -l0 -c -N -np -R 'index*' -erobots=off https://archive.stsci.edu/pub/kepler/lightcurves/0014/001429092/
wget -q -nH --cut-dirs=6 -r -l0 -c -N -np -R 'index*' -erobots=off https://archive.stsci.edu/pub/kepler/target_pixel_files/0014/001429092/
You could download larger amounts by going back up the tree:
wget -q -nH --cut-dirs=6 -r -l0 -c -N -np -R 'index*' -erobots=off https://archive.stsci.edu/pub/kepler/lightcurves/
would download all the lightcurves.
If you know exactly which datasets you want, another method for retrieving data
(and bypassing the search step)
is to use the dataset retrieval page at
The list of datasets can be entered with a space or a comma delimiter,
or as an uploaded file, but they must be specified with both the Kepler ID and
the timestamp of the observation in a form like: kplr000757076-2009131105131
(or with a cadence type like KPLR000757450-2011208035123;SC).
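Before submitting a long list, it can help to check that each entry has the expected form. The Python sketch below is illustrative only: the function names are hypothetical, the pattern (kplr, nine digits, a hyphen, thirteen digits) is inferred from the two examples above, and accepting LC alongside SC as a cadence suffix is an assumption.

```python
import re

# Pattern inferred from the examples on this page; the ';LC' suffix
# is an assumption (only ';SC' appears in the text).
DATASET_RE = re.compile(r"kplr(\d{9})-(\d{13})(?:;(SC|LC))?", re.IGNORECASE)


def split_dataset_list(text):
    """Split a space- or comma-delimited list into dataset names."""
    return [name for name in re.split(r"[,\s]+", text.strip()) if name]


def parse_dataset_name(name):
    """Return (kepler_id, timestamp, cadence or None) for one name."""
    match = DATASET_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected dataset name: {name!r}")
    kepler_id, timestamp, cadence = match.groups()
    return kepler_id, timestamp, cadence.upper() if cadence else None


if __name__ == "__main__":
    entries = "kplr000757076-2009131105131, KPLR000757450-2011208035123;SC"
    for entry in split_dataset_list(entries):
        print(parse_dataset_name(entry))
```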