With ever-increasing volumes of data from large-scale experimental facilities and observatories, and the impending arrival of the world's first exascale supercomputers, there is a clear need for scalable AI solutions, as a powerful modern technique for big-data reduction and interpretation.
The performance and efficacy of machine learning systems, in particular deep learning systems, now exceed human-level capability on a rapidly expanding range of tasks, such as object recognition and classification, or anomaly detection across complex engineering systems. At the same time, however, the rate at which the scientific community produces data, through large-scale experimental facilities and observatories, is growing exponentially, thanks to the latest developments in sensor and storage technologies. With this burgeoning growth in the volume of data that our science communities need to assimilate and mine, difficulties with the current generation of hardware/software ecosystems for big data analytics have become commonplace. Preparing AI technology for the exascale (and co-designing exascale hardware for the AI algorithms themselves) has therefore become an urgent requirement and an internationally recognised endeavour.
Although AI benchmarking is becoming a well-explored topic, several issues are still to be addressed, including, but not limited to:
- The fact that there are currently no AI benchmarking efforts that target exascale hardware and capability, particularly for science.
- A range of scientific problems involving real-world large-scale scientific datasets, such as those from experimental facilities or observatories, are largely ignored in benchmarking activities.
- Gap analysis across UK science indicates that benchmarks are needed to serve as a catalogue of techniques offering template solutions to different types of scientific problems.
Whilst scoping the development of an AI benchmark suite, this working group aims to address these issues and opportunities. The benchmark initiative will focus on removing noise from images, a common challenge across multiple disciplines.
The working group is engaging in two parallel activities. The first is to build a community across multiple scientific disciplines for synthesising an overall scope for developing AI benchmarks. The second is to develop and evaluate an example benchmark, using the chosen noise-filtering challenge as the example problem, across three scientific disciplines. The use cases and disciplines we have selected are a) removing noise from cryo-electron microscopy (cryo-EM) datasets (life sciences), b) from X-ray tomographic images (material sciences), and c) from weak-lensing images (astronomy).
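To make the shape of such an example benchmark concrete, the core loop can be sketched as: take a ground-truth image, corrupt it with a known noise model, denoise it with a candidate method, and score the result against the ground truth. The sketch below is a minimal illustration under assumed choices, not the working group's actual design: the noise model (additive Gaussian), the baseline denoiser (a naive mean filter), and the metric (PSNR) are all placeholder assumptions.

```python
# Hypothetical sketch of a noise-filtering benchmark loop.
# The Gaussian noise model, mean-filter baseline, and PSNR metric
# are illustrative assumptions, not the benchmark's actual design.
import numpy as np


def psnr(clean, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((clean - estimate) ** 2)
    return 10.0 * np.log10(peak**2 / mse)


def mean_filter(img, k=3):
    """Naive k-by-k mean filter, standing in for a real denoising method."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out


# Synthetic smooth "ground truth" image in [0, 1] (a stand-in for a
# cryo-EM, tomography, or weak-lensing frame).
x = np.linspace(0.0, 4.0 * np.pi, 64)
clean = 0.5 + 0.5 * np.outer(np.sin(x), np.cos(x))

# Corrupt with additive Gaussian noise, clipped back into range.
rng = np.random.default_rng(0)
noisy = np.clip(clean + rng.normal(0.0, 0.1, clean.shape), 0.0, 1.0)

# Score the noisy input and the denoised output against the ground truth.
score_before = psnr(clean, noisy)
score_after = psnr(clean, mean_filter(noisy))
```

In a real benchmark the synthetic image would be replaced by curated discipline-specific datasets, and the mean filter by the AI models under evaluation, but the contract (corrupted input in, denoised output out, score against ground truth) stays the same across all three use cases.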
Some of the challenges described above will be addressed through a set of community engagement activities:
- A one-week study group on creating an example benchmark for noise filtering
- A two-day domain-specific workshop
- A two-day cross-domain workshop
- A two-day evaluation workshop