The Exascale Computing Algorithms and Infrastructures Benefiting UK Research (ExCALIBUR) programme
ExCALIBUR is a programme within the UK Research and Innovation (UKRI) Strategic Priorities Fund (SPF) with total funding of £46m from October 2019 to March 2025. It is delivered in partnership between UKRI Research Councils (led by the Engineering and Physical Sciences Research Council (EPSRC)) and the Department for Business, Energy and Industrial Strategy (BEIS) Public Sector Research Establishments (PSRE) (led by the Met Office).
ExCALIBUR is delivering research and innovative algorithmic development to redesign the UK’s high priority simulation codes so that they can fully harness the power of future supercomputers across scientific and engineering applications. (Such computers are referred to as Exascale because they aim to deliver a billion billion (10^18) floating point operations per second and to handle a similar amount of data.) The programme is committed to bringing together an unprecedented range of UK domain/subject-matter experts, mathematicians and computational scientists, who will identify common issues and opportunities and focus their combined scientific expertise and resources to accelerate progress toward interdisciplinary solutions.
The programme objectives have been designed to specifically address the benefits sought:
- Efficiency – The UK’s most important scientific simulation codes will be able to harness the power of the supercomputers of the mid-2020s resulting in an increase in scientific productivity for a given investment.
- Capability – Capitalising on this efficiency will enable the UK to continue to push the boundaries of science across a wide range of fields delivering transformational change in capability.
- Expertise – A new, forward-facing, interdisciplinary approach to Research Software Engineer (RSE) career development will position the next generation of UK software engineers at the cutting-edge of scientific supercomputing.
A key element of the delivery of the ExCALIBUR Programme is the development of cross-cutting (XC) themes and associated activities that apply across Use Cases. The XC themes have been developed in consultation with the community via a Market Engagement event which proved very informative. The Met Office and EPSRC have subsequently launched a series of calls and the grant awardees will be announced over the coming months once contracts have been signed.
Knowledge Exchange is a vital component in achieving the objectives of ExCALIBUR, so each of these XC activities will have a named Knowledge Exchange Co-ordinator (KEC). This role will enable integration across the Programme, where researchers are developing software and algorithms in preparation for future Exascale systems. It will foster links with beneficiaries from academia, Public Sector Research Establishments and industry, to collaborate on designs and disseminate the outcomes from ExCALIBUR.
The output of each activity under the XC themes will be applicable to at least two out of: the Weather & Climate Prediction Use Case; the Fusion Modelling Use Case; and any collection of the Design & Development Working Group Use Cases.
The themes are:
Cross Cutting “Common approaches and solutions”
The activities of this work package will explore and accelerate the development of solutions and approaches to problems that are common to more than one use case and that will thereby enhance the ability of the community to better exploit exascale computing power.
- I/O & Storage – Met Office led, awarded to NCAS, University of Reading in collaboration with the University of Cambridge
- Data Workflow – Met Office led, awarded to NCAS, University of Reading
- I/O Infrastructure investigations, awarded to Met Office
- Workflow Design and Analysis, awarded to Met Office
- Coupling (EPSRC led, bids under evaluation)
- Domain Specific Languages (EPSRC led, bids under evaluation)
Cross Cutting “Potential Disruptors”
The activities of this work package will explore and accelerate the development of new, potentially disruptive technologies that will enhance the ability of the community to better exploit exascale computing power. Each will deliver the following:
- Exposing parallelism: Parallel-in-Time – Met Office led, awarded to University of Exeter in collaboration with Imperial College London.
- Exposing parallelism: Task Parallelism – Met Office led, awarded to Durham University in collaboration with STFC Hartree
- Machine Learning: optimising numerical methods & augmenting physically based applications – Met Office led, awarded to EPCC, University of Edinburgh
- Containers – awarded to Met Office
- Future Computing Paradigms (EPSRC led, bids under evaluation)
- Verification, Validation and Uncertainty Quantification (EPSRC led, bids under evaluation)
Fusion modelling use case
Awards have been made under Project Neptune to the Universities of York, Exeter, Oxford and Warwick, to Imperial College and UCL in London, and also to STFC. These mostly small, short-duration grants form an initial research phase of Neptune designed to identify and select the models, algorithms and software engineering techniques most suitable for the use case at the Exascale. Two online project-wide events, a kick-off meeting and a workshop, have already taken place, and favourable feedback has been received from the grantees concerning the cost-effectiveness of both these events and the routine reporting procedures.
Grantees, when they have exercised the option, have also felt they benefited from making formal presentations of their work to others involved in the project.
Significant ‘self-organisation’ with regard to research collaboration has already taken place among the grantees, probably also assisting in the production of an increasing range of reports accessible to all participants, covering the background material, and addressing critical research questions.
The Y3 research plan was drawn up recognising the importance of such teamwork. It has been designed to implement the transition phase between research and the production of software designed for long-term use.
Weather and climate use case
Despite the challenging year that we have all had in adjusting to new working conditions and many taking on new caring responsibilities, an impressive amount of progress has been made in a number of ExCALIBUR activities. Two example highlights are:
- Work led by STFC has begun to investigate applying PSyclone to a range of the marine systems used in the Weather & Climate modelling system. PSyclone is a domain specific compiler being developed and used for the new atmosphere modelling system, GungHo and LFRic. It implements the principle of a separation of concerns: through automatic code generation, the science developers do not have to worry about the detail of how to make the algorithms operate in a massively parallel supercomputing environment. Applying this approach to the marine systems represents an exciting expansion of the scope of the separation of concerns.
- In the context of the Weather & Climate use case, verification is the process of comparing forecasts with observations to determine, in a statistical way, how accurate the predictions are. (In other areas this is sometimes referred to as validation.) The current system used by the Weather & Climate use case for weather verification is decades old and is becoming increasingly hard to migrate to ever newer supercomputers. The last month has seen a significant step towards the adoption of a modern, comprehensive verification system developed in the US called MET. This new system is written in C++ but with Python-based wrappers. It is a highly configurable system that gives access to a multitude of tools and will allow for significant future expansion of its capability. Initial comparisons have shown almost exact agreement between the output of the old and new systems. Work will continue on expanding the range of MET applications supported and also on optimising its performance.
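To make the idea of verification concrete, the following is a minimal sketch of how paired forecasts and observations might be summarised statistically. It is purely illustrative and does not use or represent the MET interface: the function name `verification_scores` and the sample temperature values are hypothetical, and the three scores shown (mean error, mean absolute error and root-mean-square error) are simply common choices, not a statement of which measures the Weather & Climate use case relies on.

```python
import math


def verification_scores(forecasts, observations):
    """Summarise forecast accuracy against matching observations.

    Hypothetical helper for illustration only; not part of MET.
    Returns the mean error (bias), mean absolute error and RMSE.
    """
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need equal-length, non-empty sequences")

    # Error is forecast minus observation, so a positive bias
    # means the forecasts are too high on average.
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)

    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"bias": bias, "mae": mae, "rmse": rmse}


# Made-up temperature forecasts and observations (degrees Celsius).
forecast = [12.0, 14.0, 9.0, 11.0]
observed = [11.0, 15.0, 9.0, 13.0]

scores = verification_scores(forecast, observed)
print(scores)
```

Comparing an old and a new verification system, as described above, would amount to checking that both produce near-identical scores from the same forecast and observation pairs.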
UKRI high priority use case
EPSRC have recently held a series of visits with the Design and Development Working Groups (DDWGs) to understand the progress and developments they have made so far, to reflect on the challenges encountered in starting a grant in the last year, and to interact with researchers at the heart of preparing software and algorithms for exascale.
Over the course of the last year the DDWGs have delivered workshops and training to support a combination of code design, community building and skills development. The understanding the DDWGs have gained will provide a foundation for the next phase of the UKRI High Priority Use Case (HPUC), which recently launched and will close in September 2021. This funding opportunity is open to the ten DDWGs, which can submit single proposals or reconfigure and apply together. Whilst the call is restricted to the ten groups, new partners can become involved in the next phase of applications to ensure the vision of the HPUCs can be delivered.
A few upcoming workshops and activities hosted by the DDWGs are listed below:
- Turbulence at the exascale podcast
- ELEMENT and ExCALIBUR SLE have recently held workshops discussing their ‘Vision paper and Strategic agenda for meshing’ and ‘Data Visualisation and Data Flows’ respectively.
- 26 June and 15 July, next sessions of the Performance Analysis Series for the ExCALIBUR Knowledge Integration activity.
One further phase of this funding opportunity will be launched by the programme to ensure there is a balance in the disciplines supported by the ExCALIBUR portfolio of High Priority Use Cases.
UKRI Hardware and Enabling Software (H&ES) Group
In the Autumn ExCALIBUR newsletter we reported on the ExCALIBUR Hardware and Enabling Software programme (H&ES), which is providing £4.5m of capital funding over 2020-2024 to set up pre-Exascale testbeds featuring novel hardware and supporting software such as compilers, debuggers and schedulers. The H&ES initiative has now supported several testbed projects, and these have begun to be taken up by researchers in the ExCALIBUR community and beyond. More information is available from the H&ES website.
H&ES testbeds are generally small systems that offer researchers the opportunity to test key aspects of their codes, such as portability and performance, on new processing architectures, accelerators, interconnects and toolsets. For example, Edinburgh Parallel Computing Centre (EPCC) were awarded £250,000 from the programme to set up a testbed based on the Cerebras CS-1 Wafer Scale Engine, which is the first of its kind in Europe. The CS-1 is the world’s largest processing unit with 1.2 trillion transistors, 400,000 processor cores, 18 gigabytes of SRAM, and an interconnect between processors capable of moving 100 million billion bits per second. Please see the EPCC website for more information.
A key activity of the H&ES programme is to work with the ExCALIBUR community, and more broadly with researchers in data intensive and compute intensive fields, to understand what is required in order to benchmark future Exascale systems. Initial work has focussed on collaboration with ExCALIBUR funded projects from UKRI, UKAEA and the Met Office. The goal here is to identify a core set of mini apps and proxy apps from conventional benchmarking suites that will permit the performance of diverse systems and architectures to be characterised. This work will be prototyped on the H&ES testbeds and go on to inform procurement decisions about future Exascale and pre-Exascale clusters. The H&ES Benchmarking team is also starting to explore how to devise ‘algorithmic benchmarks’ that are readily adaptable to a wide range of possible future architectures.
Research Software Engineer Knowledge Integration
An editorial board led by Professor Mark Parsons, with members drawn from the DDWGs, the Met Office, UKAEA, H&ES and SSI, has been working to review the current landscape of skills and training required by Research Software Engineers for the software and supercomputers of the mid-2020s. This landscape review will be launched in the coming weeks and will shape the direction of future RSE KI funding opportunities within ExCALIBUR.
Alongside this review, training has been developed and delivered by the DDWGs. The outputs of these courses will be available on the ExCALIBUR website in the coming month.
For more information on this activity please contact: Sarah King
ExCALIBUR programme website
A website for the ExCALIBUR programme is being created and will be launched in Summer 2021. This will provide key information on the progress of the programme and will be a repository for key documentation for dissemination of the ExCALIBUR benefits and outcomes.
International connections / the Exascale community
The programme, supported by the ExCALIBUR Steering Committee, has made initial contact with the US Exascale Computing Project and the Japanese RIKEN programme to support international connections with the Exascale community. Contact is also planned with other overseas programmes to support programme benefit realisation.