xDSL adds distributed memory support to MLIR

MLIR is a compiler technology first developed by Google and made open source around five years ago. The xDSL project have contributed support for distributed memory parallelism to MLIR via the MPI dialect, which they developed during ExCALIBUR.

MLIR aims to bridge the gap between programming language front ends and back ends. Traditionally, a programming language front end, such as one for Fortran, Python or C, will generate some Intermediate Representation (IR), such as LLVM-IR, which is understandable by a back end such as those targeting CPUs or GPUs. However, the major challenge is that this IR is rather low level, so the front end needs to undertake a significant amount of work to transform the programmer’s code into it, with numerous activities duplicated between front ends.

Instead, MLIR provides far more structure to the IR in the form of IR dialects. These dialects range from high level concepts, such as mathematical stencils or linear algebra operations, all the way to low level considerations such as memory allocation and handling. Dialects can be mixed, and transformations exist which lower from one dialect to another or rewrite the IR in other ways. Effectively, this means that a front end can transform a user’s code into a much higher level, common IR and then leverage shared infrastructure to lower it to a form understandable by the back ends. In addition to many built-in dialects, MLIR also provides a framework so that other people can develop their own dialects and transformations.
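
The idea of mixing dialects and lowering between them can be illustrated with a small, self-contained Python sketch. This is a toy model only: it uses neither MLIR nor xDSL, and the `Op` class, the `lower_toy_linalg` function, and every dialect and operation name (`toy_linalg`, `toy_loop`, `toy_arith`, `toy_mem`) are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """One IR operation; the part of the name before the dot is its dialect."""
    name: str            # hypothetical, e.g. "toy_linalg.add"
    operands: tuple = ()

    @property
    def dialect(self) -> str:
        return self.name.split(".", 1)[0]


def lower_toy_linalg(ops):
    """Replace each high level 'toy_linalg.add' with lower level ops.

    Operations from other dialects pass through unchanged, which is what
    allows dialects to be mixed within a single program.
    """
    lowered = []
    for op in ops:
        if op.name == "toy_linalg.add":
            lowered.append(Op("toy_loop.for", ("i",)))
            lowered.append(Op("toy_arith.addf", op.operands))
        else:
            lowered.append(op)
    return lowered


# A program that mixes a low level and a high level (toy) dialect.
program = [
    Op("toy_mem.alloc", ("a",)),
    Op("toy_linalg.add", ("a", "b")),
]

lowered = lower_toy_linalg(program)
dialects_before = sorted({op.dialect for op in program})
dialects_after = sorted({op.dialect for op in lowered})

print(dialects_before)
print(dialects_after)
```

In this sketch a front end would only need to emit the high level `toy_linalg.add`; the shared lowering step, written once, produces the loop and arithmetic operations that a back end could consume.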

MLIR has become very popular in recent years, with a wide variety of tools such as TensorFlow built on top of it and a large community growing up around it. However, the framework has a steep learning curve and is fairly esoteric, so working with it demands a significant investment of time. To this end, the xDSL project have been developing a Python compiler toolbox which is 1:1 compatible with MLIR. Initially started by Mathieu Fehr from the School of Informatics, who has continued to contribute heavily, xDSL enables compiler developers to design dialects and write transformations in Python, providing a much higher productivity environment and thus significantly reducing development time.

The focus of the xDSL project is leveraging MLIR concepts to provide a common HPC Domain Specific Language (DSL) framework. As part of this, xDSL have been developing dialects that are missing from MLIR but critical to supporting the HPC community. One of these is the MPI dialect. Using xDSL as a fast prototyping tool, this ExCALIBUR project worked closely with a variety of stakeholders, including the MPI Forum, to design, refine, and validate the dialect. It was then presented to the MLIR community and, after numerous discussions and feedback sessions, has now been merged into the main MLIR codebase.

Our work adds, for the first time, support for distributed memory programming to MLIR and this has the potential for a very large impact. It unlocks the ability for other tools built atop MLIR to now target large scale parallelism, and is also an important demonstration of xDSL being used as a fast prototyping tool so that underlying concepts can be explored and refined, before these are merged into MLIR.