A framework for testing and benchmarking machine learning methods on astronomical data

Hello Universe is a new project at MAST designed to help astronomers develop machine learning (ML) methods for astronomical discovery. ML will be an essential tool for analyzing the rich data sets of the upcoming decade, and Hello Universe provides a framework for testing ML algorithms and new techniques. Each entry in the Hello Universe collection includes: 

  • Data: a high-level science product (HLSP) data set for testing and benchmarking ML algorithms 
  • Code: a tutorial Jupyter notebook that provides step-by-step examples of how to apply an ML technique to the data

Though these data sets were chosen with novice data science learners in mind, they support a wide range of tasks. Hello Universe entries include examples of:

  • analyzing 2D (image) and 1D (vector or light curve) data sets.
  • applying techniques for regression and for classification.
  • developing supervised and unsupervised learning models.
  • using best practices for training and optimizing models.
  • selecting metrics for assessing model performance.
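
To make the items above concrete, here is a minimal sketch of a supervised classification workflow on 1D data: build a labeled set, fit on a training split, and report metrics on a held-out split. It is not taken from any Hello Universe notebook and does not use an HLSP data set. The light curves are synthetic, and every name in it (make_light_curve, dip_feature, fit_threshold, evaluate) is hypothetical and chosen only for illustration. It uses only the Python standard library.

```python
import random
import statistics


def make_light_curve(has_dip, rng, n_points=100, noise=0.01, depth=0.05):
    """Return a synthetic normalized flux series, optionally with a box-shaped dip."""
    flux = [1.0 + rng.gauss(0.0, noise) for _ in range(n_points)]
    if has_dip:
        start = rng.randrange(10, n_points - 20)
        for i in range(start, start + 10):
            flux[i] -= depth
    return flux


def dip_feature(flux):
    """One hand-made feature: how far the 5 faintest points sit below the median."""
    faintest = sorted(flux)[:5]
    return statistics.median(flux) - statistics.mean(faintest)


def make_dataset(n, rng):
    """Build a balanced labeled set of features (1 = dip, 0 = no dip)."""
    labels = [i % 2 for i in range(n)]
    curves = [make_light_curve(bool(y), rng) for y in labels]
    return [dip_feature(c) for c in curves], labels


def fit_threshold(features, labels):
    """Pick the feature threshold that maximizes accuracy on the training set."""
    best_t, best_acc = None, -1.0
    for t in sorted(features):
        acc = sum((f >= t) == bool(y) for f, y in zip(features, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def evaluate(features, labels, threshold):
    """Compute accuracy, precision, and recall for the thresholded predictions."""
    preds = [int(f >= threshold) for f in features]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    return {
        "accuracy": sum(p == y for p, y in zip(preds, labels)) / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


rng = random.Random(0)                      # fixed seed so the run is repeatable
train_x, train_y = make_dataset(200, rng)   # used only to choose the threshold
test_x, test_y = make_dataset(100, rng)     # held out for the final metrics
threshold = fit_threshold(train_x, train_y)
metrics = evaluate(test_x, test_y, threshold)
print(metrics)
```

Reporting precision and recall alongside accuracy matters most when the classes are imbalanced, which is common in astronomical searches for rare objects. The classes here are balanced only to keep the sketch short.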

Entries

Contribute to Hello Universe!

Have an idea for a data set + notebook pair? We welcome your contributions to Hello Universe! Please contact archive@stsci.edu to get started.