The potential of Exascale commonly leads to sophisticated target applications that are highly integrated and derive their function from the interplay between numerous physical processes. Different physical processes are commonly simulated using different numerical methods and different software libraries. To simulate integrated systems with multiple interacting physical processes, it is necessary to couple software codes and to support the efficient and accurate exchange of data between them. Challenges in coupling different simulation codes include the stability and accuracy of the simulation, and its computational efficiency. These challenges are amplified at Exascale by the special characteristics of the numerical methods that are suited to Exascale, and by the scale and complexity of Exascale hardware. A degradation of stability or accuracy, or suboptimal computational performance, can make Exascale computation intractable or, at best, fail to exploit the potential of Exascale.
Two clear points of consensus on Exascale computing have emerged:
- Where possible, high-order methods are the approach of choice at Exascale, as they allow the balance between floating-point operations and memory movement to be adjusted to match the characteristics of Exascale hardware; and
- Exascale hardware will be heterogeneous, frequently with GPU-type accelerators.
Stability and accuracy of coupled problems become increasingly challenging to maintain as the order of the methods increases, especially for physical models with multiple spatial and temporal scales. Moreover, the level of parallelism and the diversity of hardware in Exascale systems make efficient coupling a major challenge. Coupled simulations amplify complexities around communication overheads, load balancing, and the concurrent execution of different solvers on different types of hardware in a heterogeneous system. The distribution and dissemination of coupled workflows is also a challenge. Simply put, how do we reliably distribute software for coupled simulations that depend on numerous separate components, avoiding replication of code as far as possible while still providing a self-contained and portable package that can be deployed across different platforms?
The project research and software development programme has four parts:
- Mathematical analysis and stability of coupling technologies
- Software frameworks for coupling
- Challenges in reproducibility
- Applications