BBSRC funds development by CCP-EM of machine learning techniques for cryoEM map interpretation
20 Feb 2020



Researchers from the Collaborative Computational Project for Electron cryo-Microscopy (CCP-EM) propose to build a tool combining deep learning with protein structure feature libraries to enrich the information extracted from cryo-electron microscopy maps.

Protein structural feature libraries


Principal Investigator Dr Martyn Winn and Co-Investigators Dr Agnel-Praveen Joseph and Dr Jeyarajan Thiyagalingam, Head of Scientific Machine Learning (SciML), were recently awarded ~£150,000 by the UKRI research council BBSRC for their proposal entitled “Intermediate-to-low resolution feature detection in cryoEM maps using cascaded neural networks”. The proposal was ranked 1st out of 71 proposals submitted to the Tools and Resources Development Fund, and the project is due to run from approximately early April 2020 to April 2021.

Dr Winn and team proposed applying artificial intelligence techniques, specifically deep neural networks, to recognize protein structural features in intermediate-to-low resolution cryoEM maps. The approach would extend the interpretability of protein structures at intermediate and low resolutions, and make better use of such data to gain insights into the mechanisms of biological function.

The proposed development will be implemented as a user-friendly tool, distributed to the scientific community and integrated into the CCP-EM software suite. It is anticipated that other scientific fields could also benefit from a machine learning architecture designed for this kind of multi-label 3D segmentation from noisy data.

Further Details
The work will exploit the hierarchical organization of protein structural features. It will use structural feature libraries of different sizes, ranging from secondary structures (e.g. alpha helices and beta sheets) and smaller motifs (e.g. turns of the protein chain) to sub-folds and folds. A specialized set of motifs or sub-folds covering intermediate-size features will be generated based on compactness (contacts). Deep neural network architectures will be designed to detect these 3D structural features in the map, with cascaded networks arranged to reflect the structural hierarchy. There are also plans to use the resulting networks to validate existing structure models derived from low-resolution data, and it is anticipated that future work could extend the approach to build structural models by assembling the detected features using additional sequence-based information.
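
As a rough illustration of what a cascade for multi-label 3D segmentation might look like, the Python sketch below runs two stages over a toy density volume. It is not the CCP-EM design: the way the stages are connected (each stage receives the map plus the per-voxel outputs of earlier stages) is one possible reading of "cascaded", the label counts are hypothetical, the weights are random rather than trained, and each stage is reduced to a per-voxel linear layer where a real network would use learned 3D convolutions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def voxel_layer(features, weights, bias):
    """Per-voxel linear map (a 1x1x1 convolution) followed by a sigmoid.

    features: (C_in, D, H, W), weights: (C_out, C_in), bias: (C_out,)
    A sigmoid per channel (not a softmax) lets one voxel carry several labels.
    """
    logits = np.einsum("oc,cdhw->odhw", weights, features)
    return sigmoid(logits + bias[:, None, None, None])


def cascade(density, stages):
    """Run stages in order; each sees the map plus all earlier stage outputs."""
    features = density[None, ...]  # add a channel axis: (1, D, H, W)
    outputs = []
    for weights, bias in stages:
        probs = voxel_layer(features, weights, bias)
        outputs.append(probs)
        features = np.concatenate([features, probs], axis=0)
    return outputs


rng = np.random.default_rng(0)
density = rng.normal(size=(8, 8, 8))  # stand-in for a noisy cryoEM map

# Hypothetical label counts: 2 secondary-structure classes, 3 motif classes.
n_secondary, n_motif = 2, 3
stages = [
    (rng.normal(size=(n_secondary, 1)), np.zeros(n_secondary)),
    (rng.normal(size=(n_motif, 1 + n_secondary)), np.zeros(n_motif)),
]

secondary, motifs = cascade(density, stages)
motif_labels = motifs > 0.5  # independent yes/no decision per label and voxel
print(secondary.shape, motifs.shape)
```

Each stage returns one probability volume per label, so the second stage can in principle use the secondary-structure predictions as context when scoring larger motifs, mirroring the structural hierarchy described above.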

CCP-EM information


Contact: Geatches, Dawn (STFC,DL,SC)