BioDAC: Bio-image processing at exascale 

The aim of the project is to assess the potential of exascale computing for bio-imaging and bio-image informatics. The focus is on light sheet microscopy techniques, motivated by the challenging size of typical datasets (from terabytes to petabytes), but many outcomes should generalise readily to other data-rich microscopy modalities.

Progress on the software side now allows light sheet data to be processed time point by time point, but the analysis is typically time-consuming and does not scale well. It is exceedingly difficult to analyse enough data to attain statistical significance, or to optimise the analysis by hyperparameter search. Moreover, computational cost and complexity hold back the design of new algorithms. To address these challenges, we will build an ecosystem around the Exascale Data Testbed (Cambridge Data Accelerator – DAC), focusing on microscopy data. Streamlining the exploitation of the DAC by the biological community will unlock the potential of data-rich microscopy techniques.

Objectives: 

  1. Knowledge exchange. The project targets overlapping communities: biologists, applied mathematicians, computer scientists and the heterogeneous ExCALIBUR network, whose work spans the remits of EPSRC, BBSRC, MRC and STFC.
  1. Provide a novel approach to the reconstruction of light sheet microscopy data. Achieving superior image quality, and thereby improving downstream analysis and reproducibility, is crucial when preparing samples, optimising the imaging and acquiring the data is a long or expensive process. The envisaged solution would seamlessly integrate data acquisition, transfer to high performance clusters, and processing on them.
  1. Streamline deep learning for microscopy image segmentation. The fast-evolving field of artificial intelligence, and especially deep learning, represents a step change in the performance of 2D and 3D microscopy image segmentation tools. The community is well aware of software for 2D data and uses it extensively. However, 3D models are not as widely shared and are much more laborious to train, given the limited availability of labelled data. Applications range from the relatively easy case of nuclei segmentation to the more challenging case of cytoplasm- or membrane-labelled cells, which often exhibit complex morphologies, compounded by resolution limitations.
  1. Implement new requirements. The challenge is to gather best-practice advice on agile, robust and reliable ways to meet new requirements in bio-image analysis.

Latest news