S2S-ARSNet: Sequence-to-Sequence automatic renal segmentation network

doi:10.1016/j.bspc.2022.104121

Biomedical Signal Processing and Control

Volume 79, Part 1, January 2023, 104121

https://doi.org/10.1016/j.bspc.2022.104121

Highlights

• A Sequence-to-Sequence Automatic Renal Segmentation Network (S2S-ARSNet) is proposed for diuretic renography.
• Through embedding the ConvLSTM into the Unet structure, inter-slice sequence information and intra-slice spatial information can be automatically learned. The spatial and temporal information is enriched and the segmentation results are further optimized.
• A pre-trained Unet was used to generate an auxiliary mask at the intermediate moment to supervise each frame, which eliminated the interference of displacement caused by patients in clinical diagnosis.
• Compared with the existing methods, the proposed method achieves a better result with a DSC of 0.9470 and an IOU of 0.9004.

Abstract

Accurate segmentation of kidney contours in diuretic renography has important implications for clinical diagnosis and treatments. However, the lack of clear boundaries and high-quality images makes automatic segmentation challenging. This paper proposes a novel automatic renal segmentation network, S2S-ARSNet, combining Convolutional Long Short-Term Memory (ConvLSTM) with the Unet structure. Unet is used to learn the spatial information of each sequence, and ConvLSTM is used to discover the temporal information between sequences and automatically update the temporal state of the sequence. Moreover, an additional pre-trained Unet is applied to generate coarse masks at different times to simulate the displacement that may occur during the detection process. In this way, the spatiotemporal information is modelled, and all the information of the 3D data is fully utilized to eliminate false positives and improve the segmentation accuracy. Extensive experiments were performed on the diuretic renography dataset. The experimental results show that the proposed method can significantly enhance the kidney segmentation performance compared with other single-image-based deep learning segmentation methods.

Introduction

Pelvicaliectasis (distension of the pelvicalyceal system) with or without megaureter (distension of the ureter) is the most common indication for radionuclide evaluation of the kidneys in pediatric patients [1]. Both obstructive and non-obstructive causes can lead to pelvic dilation and renal pelvis enlargement, impairing renal function and the growth potential of the developing kidney [2]. Diuretic renography is a tool for measuring uptake and drainage at admission and during conservative or postoperative follow-up [3]. Renogram analysis can help clinicians understand the condition and determine specific treatment options. Compared with intravenous urography, contrast-enhanced ultrasonography and conventional radionuclide nephrography, radionuclide scintigraphy can reliably distinguish pelvic dilatation and megaureter caused by obstructive and non-obstructive causes, which is beneficial to the early diagnosis of the disease [4]. Diuretic nephrography is widely used in clinical practice as a safe, valuable and non-invasive method for assessing renal function. Typically, kidney contours are manually marked by radiologists, a time-consuming, laborious, and poorly reproducible process. Different doctors interpret diuretic nephrography differently, so different imaging doctors produce different results when outlining the kidney on the same image. Even the same doctor cannot guarantee identical results when outlining the same image twice. In short, manual delineation is highly subjective. Unified contour criteria significantly impact subsequent processing [5], [6]. In addition, different imaging methods pose different segmentation difficulties because of their differing imaging principles, and existing segmentation methods struggle to obtain consistent results across different tasks. To address these issues, an end-to-end model based on deep learning techniques is proposed for automatic renal segmentation.

In recent years, the rapid development of deep learning techniques has promoted progress in various computer vision tasks [7]. Convolutional Neural Networks (CNNs) have achieved remarkable success in numerous medical image segmentation challenges, gradually replacing traditional manual and machine learning methods. They also play an essential role in the localization and segmentation of organs or lesions, and the kidney is no exception [8], [9]. Automatic processing of medical images reduces the time, cost, and error of human-based processing and effectively improves the working efficiency of clinicians. Many studies have addressed this problem. The well-known Fully Convolutional Network (FCN) [10] is a milestone in semantic segmentation. Unlike a traditional CNN, which uses fully connected layers after the convolution layers to obtain fixed-length feature vectors, FCN can accept input images of any size, perform pixel-level classification, and solve image problems at the semantic level. Subsequently, Ronneberger et al. [11], building on the structure of FCN, proposed Unet, which achieved good results in medical image segmentation competitions. Their network consists of symmetric encoding and decoding paths. The encoding path extracts dimensionality-reduced features from the input data. In the decoding path, feature maps are gradually upsampled, and finally a segmentation map with the same size as the input is generated. Furthermore, skip connections are helpful for further improvement [12]. Over the past few years, many Unet variants have been proposed to deal with various problems [13], [14], [15], [16].
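The symmetric encoder/decoder layout described above can be illustrated with a little size bookkeeping. The sketch below is not the authors' implementation. It assumes padded convolutions (so only pooling changes the spatial size), 2x2 pooling, four resolution levels, and a hypothetical 128 x 128 input, none of which are stated in the text. The function name `unet_spatial_sizes` is introduced here for illustration only.

```python
def unet_spatial_sizes(size, depth=4):
    """Track the side length of square feature maps through a U-shaped network.

    Assumptions (not from the paper): convolutions are padded and keep the
    spatial size, each encoder level halves the size with 2x2 pooling, and
    each decoder level doubles it again by upsampling.
    """
    if size % (2 ** depth) != 0:
        raise ValueError("size must be divisible by 2**depth")

    # Encoding path: the input size followed by one halving per level.
    encoder = [size]
    for _ in range(depth):
        encoder.append(encoder[-1] // 2)

    # Decoding path: upsample step by step; each step must match the size of
    # the encoder feature map it is joined with through a skip connection.
    decoder = []
    current = encoder[-1]
    for skip_size in reversed(encoder[:-1]):
        current *= 2
        if current != skip_size:
            raise RuntimeError("skip connection size mismatch")
        decoder.append(current)

    return encoder, decoder


encoder, decoder = unet_spatial_sizes(128)
print("encoder sizes:", encoder)
print("decoder sizes:", decoder)
```

Under these assumptions the last decoder size equals the input size, which is the property the text relies on when it says the output map has the same size as the input.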

However, all the above methods rely on a single image and do not involve sequence modelling. Segmentation is therefore treated as a static process without any sequence information. Although these methods are simple and commonly used, their segmentation results are not accurate enough to meet the requirements of high-precision medical image processing. Moreover, diuretic renography is a dynamic imaging procedure lasting about 20 min. Taking only a single image at a particular moment as the final prediction for the whole sequence not only loses a great deal of information but also fails to reflect the characteristics of dynamic imaging, so it is not the best choice for sequence segmentation problems [17].

To address these issues and further improve the accuracy of renal segmentation in diuretic renography, an end-to-end deep learning model is proposed. Inspired by convolutional long short-term memory (ConvLSTM), it is a robust segmentation model based on the Unet structure with nested ConvLSTM. This architecture is termed the Sequence-to-Sequence Automatic Renal Segmentation Network (S2S-ARSNet). We apply the proposed network to a challenging clinical segmentation problem: automatic renal segmentation in a diuretic renography dataset collected by Xinhua Hospital Affiliated to Shanghai Jiao Tong University. The contributions of our work can be summarized as follows.

1. A Sequence-to-Sequence Automatic Renal Segmentation Network (S2S-ARSNet) is proposed for diuretic renography.
2. By embedding ConvLSTM into the Unet structure, inter-slice sequence information and intra-slice spatial information can be automatically learned. The spatial and temporal information is enriched, and the segmentation results are further optimized.
3. A pre-trained Unet was used to generate an auxiliary mask at the intermediate moment to supervise each frame, eliminating the interference of displacement caused by patients in clinical diagnosis.
4. Compared with the existing methods, the proposed method achieves a better result with a DSC of 0.9470 and an IOU of 0.9004.
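The DSC (Dice similarity coefficient) and IOU (intersection over union) quoted above are standard overlap measures between a predicted mask and a reference mask. The sketch below shows how they are commonly computed for one pair of binary masks. It is an illustration under stated assumptions rather than the authors' evaluation code: the function name `dice_and_iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions made here.

```python
def dice_and_iou(pred, target):
    """Compute DSC and IoU for two binary masks given as 2D lists of 0/1."""
    intersection = 0
    pred_area = 0
    target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p & t   # pixel counted only if set in both masks
            pred_area += p
            target_area += t

    union = pred_area + target_area - intersection
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 3x4 masks (hypothetical data, for illustration only).
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
target = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]

dice, iou = dice_and_iou(pred, target)
print(f"DSC = {dice:.4f}, IoU = {iou:.4f}")
```

For a single pair of masks the two measures are linked by IoU = DSC / (2 - DSC). Values averaged over many cases need not satisfy this identity exactly.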

This paper is organized as follows. Section 1 gives a general introduction. Section 2 presents previous related work. Section 3 describes the proposed method. Section 4 introduces the implementation and evaluation scheme of the experiments and their results. A discussion is given in Section 5. Finally, conclusions are drawn in Section 6.

Section snippets

Related work

Medical image segmentation identifies the internal voxels and external contours of the region of interest (ROI) and is a crucial task of computer-aided diagnosis in clinical practice. Medical image analysis initially relied on traditional manual feature extraction methods [18], [19], [20]. Such methods are often task-specific, complex, poorly portable, and inaccurate. The emergence of deep learning has changed this situation. It automatically analyzes and learns data features from large-scale

Proposed method

Based on the diuretic renography dataset, a novel Sequence-to-Sequence Automatic Renal Segmentation Network (S2S-ARSNet) is proposed. Unlike many existing methods, our algorithm learns long-term spatiotemporal features directly from training data in an end-to-end manner. Experimental results show that our approach is superior to single-image segmentation methods.
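For readers unfamiliar with ConvLSTM, the recurrence that carries spatiotemporal state from one frame to the next can be written as follows. This is the commonly used formulation without peephole connections, given here as an assumption. The snippet above does not state which variant S2S-ARSNet uses or how the cells are nested in the Unet. Here $X_t$ is the input feature map at time step $t$, $H_t$ and $C_t$ are the hidden and cell states, $*$ denotes convolution, $\circ$ the element-wise product, $\sigma$ the sigmoid function, and the $W$ and $b$ terms are learnable kernels and biases.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

Because every gate is computed by convolution rather than by a fully connected product, the states $H_t$ and $C_t$ keep the spatial layout of the feature maps, which is what allows intra-frame spatial information and inter-frame temporal information to be modelled together.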

Experiments and results

S2S-ARSNet is implemented in PyTorch. The model is trained on Ubuntu 16.04 LTS with an Intel(R) Xeon(R) W-2104 processor and two NVIDIA GeForce GTX 1080 GPUs with 8 GB RAM.

Discussions

As mentioned above, S2S-ARSNet is proposed in this paper to segment kidneys in diuretic renography. Unlike most existing segmentation methods, we take full advantage of the three-dimensional characteristics of the data and use ConvLSTM to model the spatiotemporal relationship, which yields better results. The following conclusions can be drawn from the experimental results.

First of all, the auxiliary mask is necessary for our network. The case data contains many frames, and it is time-consuming and

Conclusion

In this paper, a novel S2S-ARSNet is proposed for automatic renal segmentation. Unlike some existing segmentation methods, we integrate ConvLSTM into Unet to learn the correlation between adjacent slices, obtaining intra-slice spatial information and inter-slice temporal information simultaneously. Additionally, because a large number of labels is not available, we used a pre-trained Unet to preprocess the input data and generate an auxiliary mask at the corresponding time step instead of using a unified

CRediT authorship contribution statement

Gaoyu Cao: Conceptualization, Methodology, Software, Visualization, Writing – original draft. Zhanquan Sun: Writing – review & editing, Project administration, Investigation, Supervision. Chaoli Wang: Writing – review & editing, Resources, Funding acquisition, Project administration, Supervision. Hongquan Geng: Resources, Data curation. Hongliang Fu: Resources, Data curation. Lin Sun: Data curation. Jiao Nan: Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. This paper was partially supported by National Defense Basic Research Program (JCKY2019413D001), Medical Engineering Cross Project of USST (10-21-302-413) and (10-202-302-424).

References (36)

S.A. Koff et al., Assessment of hydronephrosis in children utilizing diuretic radionuclide urography, The Journal of Urology (Apr. 1980)
V. Ficarra et al., Preoperative Aspects and Dimensions Used for an Anatomical (PADUA) Classification of Renal Tumours in Patients who are Candidates for Nephron-Sparing Surgery, European Urology (2009)
A. Kutikov et al., The RENAL nephrometry score: a comprehensive standardized system for quantitating renal tumour size, location and depth, The Journal of Urology (Sep. 2009)
C. Li et al., ANU-Net: Attention-based Nested U-Net to exploit full resolution features for medical image segmentation, Computers & Graphics (2020)
S. Kumar et al., Texture Feature Extraction to Colorize Gray Images, International Journal of Computer Applications (2013)
B.L. Shulkin et al., Interpretation of the renogram: problems and pitfalls in hydronephrosis in children, Journal of Nuclear Medicine (1997)
A. Eskild-Jensen et al., Interpretation of the renogram: problems and pitfalls in hydronephrosis in children, BJU International (2004)
I. Gordon et al., Guidelines for standard and diuretic renogram in children, European Journal of Nuclear Medicine and Molecular Imaging (2011)
A. Taha et al., Kid-Net: Convolution Networks for Kidney Vessels Segmentation from CT-Volumes, MICCAI (2018)
Y. LeCun et al., Deep learning, Nature (2015)
J. Guo, W. Zeng, S. Yu, et al., RAU-Net: U-Net Model Based on Residual and Attention for Kidney and Kidney Tumor...
J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, in Proceedings of the...
O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical image segmentation, in...
M. Drozdzal et al., The Importance of Skip Connections in Biomedical Image Segmentation, DLMIA (2016)
Z. Zhou et al., UNet++: A Nested U-Net Architecture for Medical Image Segmentation, DLMIA (2018)
M. Z. Alom, M. Hasan, C. Yakopcic, T. M. Taha, and V. K. Asari, Recurrent residual convolutional neural network based...
O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz,...
X. Ning et al., YouTube-VOS: Sequence-to-Sequence Video Object Segmentation, ECCV (2018)

