VP-Detector: A 3D multi-scale dense convolutional neural network for macromolecule localization and classification in cryo-electron tomograms

doi:10.1016/j.cmpb.2022.106871

Computer Methods and Programs in Biomedicine

Volume 221, June 2022, 106871

https://doi.org/10.1016/j.cmpb.2022.106871

Abstract

Background and objective: Cryo-electron tomography (cryo-ET) with subtomogram averaging (STA) is indispensable when studying macromolecule structures and functions in their native environments. Due to the low signal-to-noise ratio, the missing wedge artifacts in tomographic reconstructions, and multiple macromolecules of varied shapes and sizes, macromolecule localization and classification remain challenging. To tackle this bottleneck problem for structural determination by STA, we design an accurate macromolecule localization and classification method named voxelwise particle detector (VP-Detector).

Methods: VP-Detector is a two-stage particle detection method based on a 3D multiscale dense convolutional neural network (3D MSDNet). The proposed network uses 3D hybrid dilated convolution (3D HDC) to avoid the resolution loss caused by scaling operations. Meanwhile, it uses 3D dense connectivity to encourage the reuse of feature maps to reduce trainable parameters. In addition, the weighted focal loss is proposed to focus more attention on difficult samples and rare classes, which relieves the class imbalance caused by multiple particles of various sizes. The performance of VP-Detector is evaluated on both simulated and real-world tomograms, and it shows that VP-Detector outperforms state-of-the-art methods.
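The effect of stacking dilated convolutions and of dense connectivity can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' implementation: the dilation rates (1, 2, 5), the 3 x 3 x 3 kernel, and the channel counts are assumptions chosen for illustration, since this excerpt does not state the actual 3D MSDNet configuration. The helper names are hypothetical.

```python
from functools import reduce
from math import gcd


def receptive_field(kernel_size, dilations):
    """Per-axis receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def shares_common_factor(dilations):
    """True if all rates share a factor > 1, which tends to cause gridding artifacts."""
    return reduce(gcd, dilations) > 1


def conv3d_weights(in_channels, out_channels, kernel_size):
    """Weight count of one 3D convolution (bias ignored); independent of dilation."""
    return in_channels * out_channels * kernel_size ** 3


def dense_input_channels(in_channels, growth, num_layers):
    """Input channels of each layer when it receives all earlier feature maps."""
    return [in_channels + i * growth for i in range(num_layers)]


# Assumed hybrid rates versus an ordinary stack of 3x3x3 convolutions.
print(receptive_field(3, (1, 2, 5)))      # 17
print(receptive_field(3, (1, 1, 1)))      # 7
print(shares_common_factor((2, 4, 8)))    # True
print(shares_common_factor((1, 2, 5)))    # False
print(dense_input_channels(1, 4, 3))      # [1, 5, 9]
```

With these assumed settings, three dilated layers cover 17 voxels per axis instead of 7 at full resolution and with the same number of weights per layer, while each densely connected layer only adds a small, fixed number of new feature maps.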

Results: The experiments show that VP-Detector outperforms the state-of-the-art methods on particle localization with an F1-score of 0.951 and a precision of 0.978. In addition, VP-Detector can replace manual particle picking in experiments on real-world tomograms. Furthermore, it performs well in classifying large-, medium-, and small-weight proteins, with accuracies of 1, 0.95, and 0.82, respectively. Finally, ablation studies demonstrate the effectiveness of 3D HDC, 3D dense connectivity, weighted focal loss, and training on small training sets.
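As a reading aid, the recall implied by the two reported localization figures can be worked out. This is a derived estimate, not a value reported in this excerpt: it assumes that the F1-score and the precision refer to the same experiment and that F1 is the usual harmonic mean of precision $P$ and recall $R$.

```latex
F_1 = \frac{2PR}{P + R}
\quad\Longrightarrow\quad
R = \frac{F_1 \, P}{2P - F_1}
  = \frac{0.951 \times 0.978}{2 \times 0.978 - 0.951}
  = \frac{0.930078}{1.005}
  \approx 0.925
```

Under the same assumptions, the corresponding miss rate, $1 - R$, would be roughly 0.075.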

Conclusions: VP-Detector can achieve high accuracy in particle detection with few trainable parameters and support training on small datasets. It can also relieve the class imbalance caused by multiple particles with various shapes and sizes.

Introduction

Cryo-electron tomography (cryo-ET) is a promising imaging technique that allows three-dimensional (3D) visualization of protein or macromolecular complexes at molecular resolution in their native context. The tomographic reconstruction is generated from a tilt series imaged by rotating the biological sample. However, the resolution of tomographic reconstructions is limited by two factors: the low signal-to-noise ratio (SNR) due to the low electron dose and the missing wedge caused by the missing tilt angles. When numerous noisy copies of a complex of interest appear in a set of tomograms, subtomogram averaging (STA) can significantly increase their resolution by extracting, classifying, aligning, and averaging subtomograms in their native biological surroundings [17], [35], [36]. STA provides biological insights into the interaction and function of structures imaged under close-to-life conditions. Accurate subtomogram localization and classification, as the first two steps in the STA process, are critical for the subsequent subtomogram alignment and averaging to improve the final resolution of subtomograms. Nevertheless, extracting and distinguishing macromolecules from the complex and crowded native environment is challenging because of the low-resolution tomographic reconstruction and varied shapes and sizes of multiple macromolecules.

Traditional particle identification approaches such as template matching are employed to localize each type of macromolecule independently in tomographic reconstructions. However, due to the low SNR and missing wedge, template matching is only suitable for large complexes and ineffective for identifying small structures or macromolecular species with different states. Recently, deep learning has achieved remarkable success in the fields of computer vision and image processing. In particular, convolutional neural networks (CNNs) have been applied to object detection [24], classification [5], [10], [25], [28] and segmentation [8], [27], [31] in cryo-ET because they handle high-noise data well and run faster than template matching. However, the previously mentioned approaches for localizing and classifying subtomograms have the following deficiencies: (1) Downscaling and upscaling operations employed in CNNs can cause resolution loss, which makes particle detection in low-SNR cryo-ET images difficult. (2) It is difficult to obtain adequate training data because collecting a large amount of high-quality cryo-ET images and labeling the ground truth are both time-consuming. Yet these networks are typically designed with many trainable parameters, which require large amounts of training data to prevent overfitting. (3) The number of distinct macromolecules varies significantly, resulting in a class imbalance problem that has been overlooked in previous studies.

To overcome the above limitations, we propose an efficient and automatic two-stage particle detection method called VP-Detector for localizing and classifying macromolecules in cryo-electron tomograms. VP-Detector uses a 3D multiscale dense convolutional neural network (3D MSDNet), taking advantage of the following strategies: (1) To avoid the loss of the resolution caused by the scaling operations in traditional CNNs, the 3D hybrid dilated convolution (3D HDC) module is proposed in our network, which can capture multiscale features without sacrificing resolution. (2) To train on small datasets without overfitting, 3D dense connectivity is proposed in our network, which encourages the reuse of feature maps to reduce the number of trainable parameters. (3) To address the class imbalance problem caused by the varying size of multiple macromolecules, the weighted focal loss is proposed in our network, which can reweight the loss of different classes based on prior knowledge and focus on difficult samples and rare classes.
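To make strategy (3) concrete, the sketch below shows one common way to combine per-class weights with a focal term. It is an assumption, not the paper's formula, which is not given in this excerpt: the loss for one voxel is taken to be $-w_c (1 - p_t)^{\gamma} \log p_t$, where $p_t$ is the softmax probability of the true class $c$, $w_c$ is a class weight set from prior knowledge, and $\gamma$ is the focusing parameter. The function names and the default $\gamma = 2$ are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_focal_loss(logits, target, class_weights, gamma=2.0):
    """Assumed form for one voxel: -w[target] * (1 - p_t)**gamma * log(p_t)."""
    p_t = softmax(logits)[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


weights = [1.0, 1.0, 1.0]  # illustrative; rare classes would get larger weights

# A confidently correct voxel contributes far less than a misclassified one.
easy = weighted_focal_loss([4.0, 0.0, 0.0], target=0, class_weights=weights)
hard = weighted_focal_loss([0.0, 4.0, 0.0], target=0, class_weights=weights)
print(easy < hard)
```

With `gamma=0` and unit weights, this expression reduces to ordinary cross-entropy; raising `gamma` shrinks the contribution of well-classified voxels, and raising a class weight scales the loss of that class linearly.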

In the experiments, VP-Detector is evaluated on two simulated datasets and one real-world dataset, and the results demonstrate that VP-Detector is accurate and effective. First, particle localization is assessed using the precision, recall, miss rate, and F1-score metrics. The results show that VP-Detector outperforms the state-of-the-art methods on the simulated tomograms, with the highest precision and F1-score. Next, VP-Detector can replace manual particle picking on real-world tomograms. Then, the performance of particle classification is measured based on class accuracy and group accuracy. The results show that VP-Detector performs well in all groups of proteins. Finally, ablation studies are conducted to demonstrate the effectiveness of 3D HDC, 3D dense connectivity, weighted focal loss, and training on small training sets.
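The four localization metrics named above can be computed from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; how a predicted particle is matched to a ground-truth particle is not described in this excerpt, so the counts are taken as given, and the function name and example numbers are hypothetical.

```python
def localization_metrics(tp, fp, fn):
    """Standard detection metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    miss_rate = fn / (tp + fn) if tp + fn else 0.0  # equals 1 - recall when defined
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "miss_rate": miss_rate,
        "f1": f1,
    }


# Made-up counts for illustration only: 90 correct picks, 10 spurious, 30 missed.
metrics = localization_metrics(tp=90, fp=10, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because miss rate is the complement of recall, a method can reach high precision while still missing many particles, which is why F1-score is reported alongside precision.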

Section snippets

Traditional particle detection

Over the past few years, several software packages have been designed for cryo-ET together with STA, such as EMAN2 [7], RELION [2], Dynamo [4], and emClarity [15]. Most software packages support manual picking of particles on two-dimensional (2D) orthogonal slices. However, manual picking is labour-intensive and subjective. Automatic picking methods are more efficient than manual picking for dealing with thousands of macromolecules in cryo-ET images. There are two types of automatic picking

Methods

In this section, we will start with the overall workflow of the two-stage VP-Detector algorithm for particle localization and particle classification. Then, we will describe the architecture of the 3D multiscale dense convolutional neural network (3D MSDNet) used in the VP-Detector, including three key contributions. First, to detect particles in low SNR cryo-ET images, the 3D HDC module is proposed in our network. Second, to train on a limited number of cryo-ET images, the 3D dense

Dataset

To verify the effectiveness of our VP-Detector for biological particle localization and classification in cryo-electron tomograms, we carried out experiments on two simulated datasets and one real-world dataset.

Simulated datasets. Two simulated datasets are from the Shape Retrieval Challenge (SHREC) in 2019 and 2020 at the workshop on 3D object retrieval [11], [12]. Each dataset consists of nine sets of $512 \times 512 \times 512$ tomograms with a resolution of 1 nm/voxel. Only one tomogram without the ground truth

Conclusion

In this paper, we present an automatic and accurate learning-based approach, VP-Detector, for 3D bioparticle picking in cryo-electron tomograms. Our method consists of a particle localization stage and particle classification stage. The proposed network employs the 3D HDC to avoid the resolution loss, the 3D dense connectivity to encourage the reuse of feature maps to reduce trainable parameters, and the weighted focal loss to relieve the class imbalance. The support of training on small

Statements of ethical approval

The data used in this publication are publicly available.

Declaration of Competing Interest

The authors declare that they have no conflict of interest.

Acknowledgments

The research is supported by the National Key Research and Development Program of China (Nos. 2021YFF0704300, 2017YFA0504702), the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA16021400), and NSFC project grants (61932018, 62072441, 62072280 and 62072283).

References (44)

J.M. Bell et al. New software tools in Eman2 inspired by emdatabank map challenge. J. Struct. Biol. (2018)
D. Castaño-Díez et al. Dynamo catalogue: geometrical tools and data management for particle picking in subtomogram averaging of cryo-electron tomograms. J. Struct. Biol. (2017)
I. Gubins et al. Shrec 2020: classification in cryo-electron tomograms. Comput. Graph. (2020)
R. Han et al. Autom: a novel automatic platform for electron tomography reconstruction. J. Struct. Biol. (2017)
J.R. Kremer et al. Computer visualization of three-dimensional image data using IMOD. J. Struct. Biol. (1996)
F. Wang et al. Deeppicker: a deep learning approach for fully automated particle picking in cryo-em. J. Struct. Biol. (2016)
P. Wang et al. Understanding convolution for semantic segmentation. 2018 IEEE Winter Conference on Applications of Computer Vision (WACV) (2018)
X. Zhang et al. Dilated convolution neural network with Leakyrelu for environmental sound classification. 2017 22nd International Conference on Digital Signal Processing (DSP) (2017)
T.A. Bharat et al. Resolving macromolecular structures from electron cryo-tomography data using subtomogram averaging in RELION. Nat. Protoc. (2016)
J. Böhm et al. Toward detecting and identifying macromolecules in a cellular context: template matching applied to electron tomograms. Proc. Natl. Acad. Sci. (2000)
C. Che et al. Improved deep learning-based macromolecules structure classification from electron cryo-tomograms. Mach. Vis. Appl. (2018)
L.-C. Chen, G. Papandreou, F. Schroff, H. Adam, Rethinking atrous convolution for semantic image segmentation, arXiv...
M. Chen et al. A complete data processing workflow for cryo-et and subtomogram averaging. Nat. Methods (2019)
M. Chen et al. Convolutional neural networks for automated annotation of cellular cryo-electron tomograms. Nat. Methods (2017)
H.-S. Choi et al. Phase-aware speech enhancement with deep complex U-Net. International Conference on Learning Representations (2019)
S. Gao et al. Macromolecules structural classification with a 3D dilated dense network in cryo-electron tomography. IEEE/ACM Trans. Comput. Biol. Bioinf. (2021)
I. Gubins et al. Classification in cryo-electron tomograms. SHREC’19 Track (2019)
K. He et al. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
B.A. Himes et al. Emclarity: software for high-resolution cryo-electron tomography and subtomogram averaging. Nat. Methods (2018)
G. Huang et al. Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
J. Hutchings et al. Subtomogram averaging of COPII assemblies reveals how coat organization dictates membrane shape. Nat. Commun. (2018)
A. Jansson et al. Singing voice separation with deep u-net convolutional networks. 18th International Society for Music Information Retrieval Conference (2017)
