Benchmarking for AI for Science at Exascale (BASE)

With ever-increasing volumes of data from large-scale experimental facilities and observatories, and the impending arrival of the world’s first exascale supercomputers, there is a clear need for scalable AI solutions as a powerful, modern technique for big data reduction and interpretation.

The performance and efficacy of machine learning systems, in particular deep learning systems, now exceed human-level capability on a rapidly expanding number of tasks, such as object recognition and classification, or anomaly detection across a range of complex engineering systems. At the same time, however, the rate at which the scientific community produces data, through large-scale experimental facilities and observatories, is growing exponentially, thanks to the latest developments in sensor and storage technologies. With this burgeoning growth in the volume of data that our science communities need to assimilate and mine, difficulties with the current generation of hardware/software ecosystems for big data analytics have become commonplace. Preparing AI technology for the exascale (and co-designing exascale hardware for the AI algorithms themselves) has therefore become an urgent requirement and an internationally recognised endeavour.

Although AI benchmarking is becoming a well-explored topic, several issues remain to be addressed, including, but not limited to:

  • There are currently no AI benchmarking efforts that target exascale hardware and capability, particularly for science.
  • A range of scientific problems involving real-world large-scale scientific datasets, such as those from experimental facilities or observatories, are largely ignored in benchmarking activities.
  • Gap analysis across UK science indicates that benchmarks are needed to serve as a catalogue of techniques offering template solutions to different types of scientific problems.

Whilst scoping the development of an AI benchmark suite, this working group aims to address these issues and opportunities. The benchmark initiative will focus upon removing noise from images – a common issue across multiple disciplines.

The working group is engaging in two parallel activities. The first is to build a community across multiple scientific disciplines to synthesise an overall scope for developing AI benchmarks. The second is to develop and evaluate an example benchmark, using the chosen noise filtering challenge as the example problem, across three scientific disciplines. The use cases and disciplines we have selected are removing noise from a) cryogenic electron microscopy (cryo-EM) datasets (life sciences), b) X-ray tomographic images (materials science), and c) weak lensing images (astronomy).
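To make the idea of a noise filtering benchmark concrete, here is a minimal sketch in Python. It is purely illustrative and is not the working group's benchmark: the synthetic test image, the Gaussian noise model, the 3x3 mean filter baseline, the PSNR metric, and the names `psnr`, `mean_filter` and `run_benchmark` are all assumptions chosen for this example. A real benchmark would substitute domain datasets (cryo-EM, X-ray tomography, weak lensing) and the denoising models under evaluation.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


def mean_filter(image, size=3):
    """Baseline denoiser (assumed for illustration): size x size moving average."""
    pad = size // 2
    padded = np.pad(image, pad, mode="reflect")
    out = np.zeros_like(image, dtype=float)
    # Sum the shifted copies of the image that cover each pixel's neighbourhood.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)


def run_benchmark(denoiser, shape=(64, 64), sigma=0.1, seed=0):
    """Score a denoiser on a synthetic image corrupted by Gaussian noise."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:shape[0], 0:shape[1]]
    # Smooth synthetic "ground truth" with values in [0, 1] (hypothetical data).
    clean = 0.5 + 0.5 * np.sin(2 * np.pi * x / shape[1]) * np.cos(2 * np.pi * y / shape[0])
    noisy = clean + rng.normal(0.0, sigma, size=shape)
    denoised = denoiser(noisy)
    return {
        "psnr_noisy": psnr(clean, noisy),
        "psnr_denoised": psnr(clean, denoised),
    }


scores = run_benchmark(mean_filter)
print(scores)
```

Any callable that maps a noisy 2D array to a denoised array of the same shape can be passed to `run_benchmark`, so the same harness could, in principle, compare a classical filter against a learned model on equal terms.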

Some of the challenges described above will be addressed through a set of community engagement activities:

  • A one-week study group on creating an example benchmark for noise filtering
  • A two-day, domain-specific workshop
  • A two-day, cross-domain workshop
  • A two-day, evaluation workshop.
