A framework for testing and benchmarking machine learning methods on astronomical data

Hello Universe is a new project at MAST designed to help astronomers develop machine learning (ML) methods for astronomical discovery. ML will be an essential tool for analyzing the rich data sets of the upcoming decade, and Hello Universe provides a framework for testing ML algorithms and new techniques. Each entry in the Hello Universe collection includes: 

  • Data: a high-level science product (HLSP) data set for testing and benchmarking ML algorithms 
  • Code: a tutorial Jupyter notebook that provides step-by-step examples of how to apply an ML technique to the data

Although these data sets were designed with novice data science learners in mind, they are suitable for a wide range of tasks. Hello Universe entries include examples of:

  • analyzing 2D (image) and 1D (vector or light curve) data sets.
  • applying techniques for regression and for classification.
  • developing supervised and unsupervised learning models.
  • using best practices for training and optimizing models.
  • selecting metrics for assessing model performance.

Entries

Get Involved!

  • Contribute to Hello Universe

    Have an idea for a data set + notebook pair? We welcome your contributions to Hello Universe! Please contact archive@stsci.edu to get started.
  • Run Hello Universe on TIKE

    Want to interact with Hello Universe notebooks or come up with your own? Edit and run the existing notebooks, or build your own, on TIKE.