Quaternion Grassmann average network for learning representation of histopathological image

doi:10.1016/j.patcog.2018.12.013

Pattern Recognition

Volume 89, May 2019, Pages 67-76

https://doi.org/10.1016/j.patcog.2018.12.013

Abstract

Histopathological image analysis serves as the ‘gold standard’ for cancer diagnosis. Computer-aided analysis of these images has attracted considerable attention in digital pathology, and its performance depends heavily on the feature representation of the histopathological images. The principal component analysis network (PCANet) is an unsupervised deep learning framework that has proven effective for feature representation learning. However, PCA is susceptible to noise and outliers, which degrades the performance of PCANet. The Grassmann average (GA) is more robust than PCA. In this work, a GA network (GANet) algorithm is proposed by embedding the GA algorithm into the PCANet framework. Moreover, since quaternion algebra is an excellent tool for representing color images, a quaternion-based GANet (QGANet) algorithm is further developed to learn effective feature representations that retain color information for histopathological images. Experimental results on three histopathological image datasets indicate that the proposed QGANet achieves the best performance among all the compared algorithms on the classification of color histopathological images.

Introduction

Cancers seriously threaten human health, so accurate cancer diagnosis is always in high demand. It is well known that the high-resolution histopathological image is the ‘gold standard’ for the diagnosis of almost all types of cancer in clinical practice [1], [2]. However, this kind of pathology diagnosis depends heavily on the subjective decisions of pathologists [2], [3]. Therefore, it is necessary to develop a computer-aided diagnosis (CAD) or quantitative analysis system for histopathological image analysis. Used as a second opinion tool, such a system can help pathologists reduce their workload and, more importantly, the possibility of errors [1], [2], [3], [4], [5], [6], [7]. Moreover, it can also provide an effective decision support tool for the training of young radiologists [1], [2], [3], [4], [5], [6], [7].

In a CAD system, feature representation is one of the most critical steps. For example, the commonly used features in histopathological images mainly represent local cell-level information (e.g., size and shape) or the holistic architecture of the tissue (e.g., topology and layout of all cells) [1], [5]. For more details on histopathological image classification based on hand-crafted features, we refer to [1], [3], [5], [8], [9], [10]. Despite remarkable progress in feature extraction methods, hand-crafted features mostly generalize poorly across histopathological images of different cancers [11], [12]. Therefore, feature representation is a decisive yet still very challenging factor in histopathological image based CAD [12].

In the past several years, deep learning (DL) has demonstrated superior performance to hand-crafted features in various applications [13], [14], [15]. It has also gained a good reputation in the feature representation of histopathological images [6], [7], [12]. Ciresan et al. first used convolutional neural networks (CNN) to detect mitosis in breast histopathological images and won the mitosis detection competition in the ICPR 2012 contest [16]. Subsequently, various CNN-based DL algorithms have been developed for cell detection and counting, cell segmentation and tissue classification in histopathological images [3], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32]. Besides the CNN framework, other DL algorithms, including the stacked auto-encoder (SAE), restricted Boltzmann machine (RBM), deep belief networks (DBN) and their variants, have also been successfully applied to histopathological images, especially for cell detection [8], [33], [34], [35], [36], [37]. All these DL-based methods show superior or even state-of-the-art performance for histopathological images.

More recently, Chan et al. proposed a principal component analysis network (PCANet) algorithm [38]. PCANet is essentially an unsupervised DL framework, which only has three simple basic components: the cascaded PCA as a deep network, binary hashing as a nonlinear layer, and block-wise histograms for feature pooling layer. PCANet has shown its effectiveness for feature representation learning. Moreover, compared with the commonly used DL algorithms, PCANet has the advantages of simpler network architecture and fewer parameters. Therefore, PCANet and its variants, such as DLANet [39], SRDANet [40], SPCANet [41], 2DPCANet [42], MPCANet [43], and R-VCANet [44], have been widely used for various image classification tasks.
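
The binary hashing and block-wise histogram components named above can be illustrated with a minimal sketch. It assumes the filter responses have already been computed, binarizes them at zero, and pools over non-overlapping blocks; the function names `binary_hash` and `block_histograms`, the block size, and the random stand-in responses are illustrative assumptions, not details taken from [38].

```python
import numpy as np

def binary_hash(responses):
    """Combine L filter response maps of shape (L, H, W) into one integer map.

    Each response is binarized at zero and treated as one bit, so every
    pixel receives a code in [0, 2**L - 1].
    """
    L = responses.shape[0]
    bits = (responses > 0).astype(np.int64)          # (L, H, W) of 0/1
    weights = (2 ** np.arange(L)).reshape(L, 1, 1)   # bit l has weight 2**l
    return (bits * weights).sum(axis=0)              # (H, W) integer codes

def block_histograms(codes, n_bins, block=(4, 4)):
    """Histogram the codes inside non-overlapping blocks and concatenate."""
    H, W = codes.shape
    bh, bw = block
    feats = []
    for r in range(0, H - bh + 1, bh):
        for c in range(0, W - bw + 1, bw):
            patch = codes[r:r + bh, c:c + bw].ravel()
            feats.append(np.bincount(patch, minlength=n_bins))
    return np.concatenate(feats)

# Stand-in for the outputs of 3 filters on an 8 x 8 image (assumed data).
rng = np.random.default_rng(0)
responses = rng.standard_normal((3, 8, 8))
codes = binary_hash(responses)
features = block_histograms(codes, n_bins=2 ** 3)
```

With three filters each pixel code takes one of eight values, so each of the four blocks contributes an 8-bin histogram to the final feature vector.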

The effectiveness of PCANet for image representation makes it feasible for histopathological image analysis. Moreover, since PCANet is an unsupervised learning algorithm, it is well suited to medical image processing when only a small dataset is available. On the other hand, noise and outliers are inevitable in histopathological images; they can result from artifacts, non-standardized staining protocols, focusing inaccuracy, and the diversity of imaging devices and resolutions [1], [9]. Therefore, some small regions in a histopathological image may be affected by noise and outliers. PCA is sensitive to noise and outliers, which degrades its robustness [45] and in turn weakens the feature representation learned by PCANet. It is therefore critical to improve the robustness of the PCA in the PCANet framework. Fortunately, Grassmann averages (GA) can address this issue by averaging, on the Grassmann manifold, all subspaces generated by the data to realize dimensionality reduction [46]. It is worth noting that GA coincides with PCA for Gaussian data, while it is more robust than PCA [46]. Experimental results have shown that GA outperforms PCA on several computer vision tasks, such as dimensionality reduction, background modeling, image restoration and shadow removal [46]. Since histopathological images always carry noise and outliers from the imaging procedure, the GA algorithm is expected to outperform PCA for the dimensionality reduction of histopathological images. Moreover, inspired by the PCANet framework, we construct a GA based network (GANet) with a similar network framework in this work.
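
The idea of averaging one-dimensional subspaces can be illustrated with a minimal sketch. It assumes zero-mean data, uses the sample norms as weights, and recovers only the leading direction; the function name `grassmann_average`, its parameters, and the synthetic data are illustrative assumptions rather than the exact procedure of [46].

```python
import numpy as np

def grassmann_average(X, n_iter=100, tol=1e-10, seed=0):
    """Estimate the leading 1-D subspace of zero-mean data X of shape (N, D).

    Each sample spans a line through the origin. The lines are averaged with
    weights equal to the sample norms, after flipping signs so that every
    unit representative points into the same half-space as the current
    estimate.
    """
    norms = np.linalg.norm(X, axis=1)
    keep = norms > 0                       # zero vectors span no subspace
    U = X[keep] / norms[keep, None]        # unit representatives u_n
    w = norms[keep]                        # weights w_n (assumed choice)
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(X.shape[1])    # random starting direction
    q /= np.linalg.norm(q)
    for _ in range(n_iter):
        signs = np.sign(U @ q)
        signs[signs == 0] = 1.0
        q_new = (w * signs) @ U            # weighted sum of aligned directions
        q_new /= np.linalg.norm(q_new)
        converged = np.linalg.norm(q_new - q) < tol
        q = q_new
        if converged:
            break
    return q

# Synthetic zero-mean data with most of its variance along the first axis.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2)) * np.array([5.0, 0.5])
X -= X.mean(axis=0)
q = grassmann_average(X)
```

For this Gaussian example the estimate should lie close to the first coordinate axis, in line with the statement that GA agrees with PCA for Gaussian data. Further components could be obtained by removing the found direction from the data and repeating, which is omitted here.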

Hematoxylin and Eosin (H&E) staining is commonly used in clinical practice to reveal cellular components and enhance the visibility of the spatial structures of histological components, which makes color information particularly important and helpful for diagnosis. As a result, many color descriptors have been widely used to represent histopathological images [1], [2]. However, the existing GA algorithm can only handle grayscale images. When a color histopathological image is converted into a grayscale one, rich color information is lost, which degrades the feature representation. Although we can run the GANet algorithm on each color channel and then concatenate the results to form fused color features, this fusion strategy discards the intrinsic correlation among the color channels [1]. Therefore, it is essential to integrate color information into GANet so that a more effective representation of histopathological images can be learned.

Quaternion algebra was introduced by Hamilton in 1843 [48]. It has been demonstrated that quaternion algebra is mathematically well suited to color image representation. A quaternion comprises one real part and three imaginary parts, and the three imaginary parts are used to represent the three color channels, respectively. Various quaternion-based feature extraction algorithms for color images have been proposed in recent years [49], [50], [51], [52], [53], especially the quaternion PCA (QPCA) [54]. More recently, Zeng et al. proposed the quaternion PCANet, which outperforms PCANet for the feature representation of color images [55]. A novel algorithm, named quaternion GA (QGA), is therefore obtained by integrating quaternion algebra with the GA algorithm. We further develop a novel QGA network (QGANet) algorithm according to the PCANet framework [46].
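
The color encoding described above can be made concrete with a minimal sketch, assuming the common pure-quaternion convention in which the real part is zero and the R, G and B channels fill the i, j and k parts. The function names and the (real, i, j, k) storage order are illustrative assumptions.

```python
import numpy as np

def rgb_to_quaternion(img):
    """Encode an (H, W, 3) RGB image as an (H, W, 4) array of pure quaternions.

    Component order is (real, i, j, k); the real part is zero and the
    R, G, B channels fill the three imaginary parts.
    """
    img = np.asarray(img, dtype=float)
    real = np.zeros(img.shape[:2] + (1,))
    return np.concatenate([real, img], axis=-1)

def hamilton_product(p, q):
    """Hamilton product of quaternion arrays whose last dimension is 4."""
    a1, b1, c1, d1 = np.moveaxis(p, -1, 0)
    a2, b2, c2, d2 = np.moveaxis(q, -1, 0)
    return np.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k part
    ], axis=-1)

# Basis elements, used to check that multiplication is not commutative.
i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
k = np.array([0.0, 0.0, 0.0, 1.0])
pixel = rgb_to_quaternion(np.array([[[0.2, 0.5, 0.7]]]))
```

Because the three channels live in a single algebraic object, operations on the quaternion act on all channels jointly rather than channel by channel. The product is order dependent (for the basis elements, i times j gives k while j times i gives minus k), which any quaternion extension of a real-valued algorithm has to account for.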

In this work, we develop a QGANet algorithm that can effectively learn the feature representation of the color histopathological images for cancer diagnosis based on our previous work [56]. The main contributions of this work are threefold: (1) A GANet algorithm is first proposed to improve the robustness of the learned representative features for gray histopathological images; (2) The QGA algorithm is then developed by incorporating the quaternion algebra into GA algorithm for the dimensionality reduction of color images; (3) The QGANet algorithm, which can learn more effective feature representations from color histopathological images, is finally proposed.

Quaternion Grassmann average network

Fig. 1 shows the proposed QGANet, which is composed of four components: the quaternion representation model, the cascaded QGA model, the binary hashing model and the block-wise histograms model.

We take a two-layer cascaded QGA network as an example in this work. It is worth noting that GANet has four components, the same as QGANet, but it only has the cascaded GA filters without the quaternion representation. We will introduce these components in the following sections.
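
A two-layer cascade of learned filters can be sketched for grayscale inputs as follows. This is a simplified illustration under several assumptions: the filters are learned with ordinary PCA (via SVD) as a stand-in for GA or QGA, only valid (unpadded) positions are filtered, and all function names, patch sizes and filter counts are hypothetical.

```python
import numpy as np

def extract_patches(img, k):
    """All k x k patches of a 2-D image (valid positions), mean-removed, as rows."""
    H, W = img.shape
    rows = [img[r:r + k, c:c + k].ravel()
            for r in range(H - k + 1) for c in range(W - k + 1)]
    P = np.array(rows, dtype=float)
    return P - P.mean(axis=1, keepdims=True)

def learn_filters(images, k, n_filters):
    """Learn n_filters k x k filters from the patches of a list of images.

    Stand-in: leading right singular vectors (ordinary PCA). In GANet the
    basis would come from the Grassmann average instead.
    """
    P = np.vstack([extract_patches(im, k) for im in images])
    _, _, Vt = np.linalg.svd(P, full_matrices=False)
    return Vt[:n_filters].reshape(n_filters, k, k)

def apply_filters(img, filters):
    """Correlate one image with each filter (valid region only)."""
    k = filters.shape[-1]
    H, W = img.shape
    P = np.array([img[r:r + k, c:c + k].ravel()
                  for r in range(H - k + 1) for c in range(W - k + 1)])
    out = P @ filters.reshape(len(filters), -1).T      # (positions, filters)
    return out.T.reshape(len(filters), H - k + 1, W - k + 1)

# Random stand-in images (assumed data): four 12 x 12 grayscale images.
rng = np.random.default_rng(0)
images = [rng.standard_normal((12, 12)) for _ in range(4)]
W1 = learn_filters(images, k=3, n_filters=2)                    # layer 1 filters
layer1 = [m for im in images for m in apply_filters(im, W1)]    # 8 maps, 10 x 10
W2 = learn_filters(layer1, k=3, n_filters=2)                    # layer 2 filters
layer2 = [apply_filters(m, W2) for m in layer1]                 # each (2, 8, 8)
```

The second layer treats every first-layer output map as a new input image, so the number of maps grows multiplicatively with the filter counts. The resulting second-layer maps are what the binary hashing and block-wise histogram components would then consume.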

Dataset

We then evaluate GANet and QGANet respectively on the following three histopathological image datasets, whose typical example images are shown in Fig. 4.

(1) Hepatocellular carcinoma (HCC) image dataset [59]. This HCC dataset was acquired by the Olympus BX51 at the Medical College of Nantong University. It includes 66 HCC images in total (21 well differentiated images, 23 moderately differentiated images and 22 poorly differentiated images) [59]. The image size is 1024 × 768 pixels.
(2) Beth Israel

Discussion

PCANet is a novel DL algorithm with a much simpler network architecture and fewer parameters. In this work, we first propose a GANet algorithm motivated by PCANet. As shown in the experiments, GANet outperforms PCANet on the HCC dataset and the BIDMC dataset, and it obtains competitive results compared with PCANet on the ADL Kidney dataset. The experimental results indicate that GANet is more robust and effective than PCANet in the feature representation of

Conclusions

In conclusion, we first propose a GANet algorithm as an alternative unsupervised representation learning algorithm for images motivated by the PCANet framework, and then apply it to grayscale histopathological image analysis. Secondly, a QGA algorithm is developed by integrating quaternion algebra with GA. It can effectively retain and fuse color information in color histopathological images. Lastly, the QGANet algorithm, with high performance on feature representation of color

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61471231, 81627804, 61671281, 11471208), the Shanghai Science and Technology Foundation (17411953400, 18010500600), and the Shanghai Hospital Development Center (16CR3061B).

Jun Shi received the B.S. degree and the Ph.D. degree from the Department of Electronic Engineering and Information Science, University of Science and Technology of China in 2000 and 2005, respectively. In 2005, he joined the School of Communication and Information Engineering, Shanghai University, China, where he has been a Professor since 2015. From 2011 to 2012, he was a visiting scholar with the University of North Carolina at Chapel Hill. His current research interests include machine

References (64)

C. Zhong et al. When machine vision meets histology: a comparative evaluation of model architecture for classification of histology sections. Med. Image Anal. (2017)
J. Arevalo et al. An unsupervised feature learning framework for basal cell carcinoma image analysis. Artif. Intell. Med. (2015)
L. He et al. Histology image analysis for carcinoma detection and grading. Comput. Methods Programs Biomed. (2012)
C. Lu et al. Automated analysis and diagnosis of skin melanoma on whole slide histopathological images. Pattern Recognit. (2015)
A. Janowczyk et al. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J. Pathol. Inform. (2016)
J. Xu et al. A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images. Neurocomputing (2016)
T. Wan et al. Automated grading of breast cancer histopathology using cascaded ensemble with combination of multi-level image features. Neurocomputing (2017)
Y. Xu et al. Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features. BMC Bioinf. (2017)
Y. Zheng et al. Feature extraction from histopathological images based on nucleus-guided convolutional neural network for breast lesion classification. Pattern Recognit. (2017)
Z. Feng et al. DLANet: a manifold-learning-based discriminative feature learning network for scene classification. Neurocomputing (2015)
D. Ciresan et al. Mitosis detection in breast cancer histology images with deep neural networks (2013)
Y. Xie et al. Beyond classification: structured regression for robust cell detection using convolutional neural network (2015)
Y. Xu et al. Deep convolutional activation features for large scale brain tumor histopathology image classification and segmentation (2015)
F. Liu et al. A novel cell detection method using deep convolutional neural network and maximum-weight independent set (2015)
K. Sirinukunwattana et al. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Trans. Med. Imag. (2016)
Z. Xu et al. Detecting 10000 cells in one second (2016)
S. Wang et al. Subtype cell detection with an accelerated deep convolution neural network (2016)

Xiao Zheng received the B.S. degree in biomedical engineering from Wenzhou Medical University in 2015. She is currently an M.Sc. candidate in the School of Communication and Information Engineering, Shanghai University, China. Her research interests include machine learning for medical images.

Jinjie Wu received the B.S. degree from the School of Electrical and Information Engineering, Jiangsu University in 2013. He is currently an M.Sc. candidate in the School of Communication and Information Engineering, Shanghai University, China. His research interests include machine learning for medical images.

Qi Zhang received his B.S. degree in Electronic Engineering in 2005 and Ph.D. degree in Biomedical Engineering in 2010, both from Fudan University, China. From 2008 to 2009, he was a visiting Ph.D. student at the Department of Biomedical Engineering, Duke University, USA. He joined the Institute of Biomedical Engineering, Shanghai University, China in 2010, and has been an Associate Professor since 2013. His research interests include medical signal processing, biomedical modeling and computer aided diagnosis.

Shihui Ying received his B.Eng. degree in Mechanical Engineering and Ph.D. degree in Applied Mathematics from Xi'an Jiaotong University, Xi'an, China in July 2001 and April 2008, respectively. He is currently a professor with the Department of Mathematics, School of Science, Shanghai University, Shanghai, China. He was a postdoctoral researcher in the Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, U.S.A., from 2012 to 2013. He has been a member of IEEE since 2009 and has served as an Editor of JSM Mathematics and Statistics since January 2013. His research interests cover mathematical theory and methods for machine learning and medical image analysis.

Histopathological image analysis: a review. IEEE Rev. Biomed. Eng.
Methods for nuclei detection, segmentation and classification in digital histopathology: a review - current status and future potential. IEEE Rev. Biomed. Eng.
Breast cancer histopathology image analysis: a review. IEEE Trans. Biomed. Eng.
Computer-aided prostate cancer diagnosis from digitized histopathology: a review on texture-based systems. IEEE Rev. Biomed. Eng.
Robust nucleus/cell detection and segmentation for digital pathology and microscopic images: a comprehensive review. IEEE Rev. Biomed. Eng.
Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci. Rep.
Histopathological image classification with color pattern random binary hashing based PCANet and matrix-form classifier. IEEE J. Biomed. Health Inform.
Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell.
Deep learning. Nature
Biologically inspired model for visual cognition achieving unsupervised episodic and semantic feature learning. IEEE Trans. Cybern.
