Elsevier

Pattern Recognition

Volume 89, May 2019, Pages 67-76
Quaternion Grassmann average network for learning representation of histopathological image

https://doi.org/10.1016/j.patcog.2018.12.013

Abstract

Histopathological image analysis serves as the ‘gold standard’ for cancer diagnosis. Computer-aided histopathological image analysis has attracted considerable attention in the field of digital pathology, and its performance depends heavily on the feature representation of the images. The principal component analysis network (PCANet) is an unsupervised deep learning framework that has shown its effectiveness for feature representation learning. However, PCA is susceptible to noise and outliers, which degrades the performance of PCANet. The Grassmann average (GA) is more robust than PCA. In this work, a GA network (GANet) algorithm is proposed by embedding the GA algorithm into the PCANet framework. Moreover, since quaternion algebra is an excellent tool for representing color images, a quaternion-based GANet (QGANet) algorithm is further developed to learn effective feature representations that contain color information for histopathological images. The experimental results on three histopathological image datasets indicate that the proposed QGANet achieves the best performance on the classification of color histopathological images among all the compared algorithms.

Introduction

Cancers seriously threaten human health, so accurate cancer diagnosis is always in high demand. It is well known that the high-resolution histopathological image is a ‘gold standard’ for the diagnosis of almost all types of cancer in clinical practice [1], [2]. However, this kind of pathological diagnosis depends heavily on the subjective decisions of pathologists [2], [3]. Therefore, it is necessary to develop a computer-aided diagnosis (CAD) or quantitative analysis system for histopathological image analysis, which, as a second-opinion tool, can help pathologists reduce their workload and, more importantly, the possibility of errors [1], [2], [3], [4], [5], [6], [7]. Moreover, it can also provide an effective decision support tool for the training of young radiologists [1], [2], [3], [4], [5], [6], [7].

In a CAD system, feature representation is one of the most critical steps. For example, the commonly used features in histopathological images mainly represent local cell-level information (e.g., size and shape) or the holistic architecture of the tissue (e.g., the topology and layout of all cells) [1], [5]. For more details on histopathological image classification based on hand-crafted features, we refer to [1], [3], [5], [8], [9], [10]. Although remarkable progress has been made in feature extraction methods, hand-crafted features mostly generalize poorly across the histopathological images of different cancers [11], [12]. Therefore, feature representation has become a decisive factor in histopathological image based CAD, and it remains very challenging [12].

In the past several years, deep learning (DL) has demonstrated superior performance to hand-crafted features in various applications [13], [14], [15]. It has also gained a good reputation in the feature representation of histopathological images [6], [7], [12]. Ciresan et al. first used convolutional neural networks (CNN) to detect mitosis in breast histopathological images and won the ICPR 2012 mitosis detection contest [16]. Since then, various CNN-based DL algorithms have been developed for cell detection and counting, cell segmentation and tissue classification in histopathological images [3], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32]. Besides the CNN framework, other DL algorithms, including the stacked auto-encoder (SAE), restricted Boltzmann machine (RBM), deep belief networks (DBN) and their variants, have also been successfully applied to histopathological images, especially for cell detection [8], [33], [34], [35], [36], [37]. All these DL-based methods show superior or even state-of-the-art performance for histopathological images.

More recently, Chan et al. proposed a principal component analysis network (PCANet) algorithm [38]. PCANet is essentially an unsupervised DL framework with only three simple basic components: cascaded PCA as the deep network, binary hashing as the nonlinear layer, and block-wise histograms as the feature pooling layer. PCANet has shown its effectiveness for feature representation learning. Moreover, compared with the commonly used DL algorithms, PCANet has the advantages of a simpler network architecture and fewer parameters. Therefore, PCANet and its variants, such as DLANet [39], SRDANet [40], SPCANet [41], 2DPCANet [42], MPCANet [43], and R-VCANet [44], have been widely used for various image classification tasks.
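The filter-learning step of one PCA stage can be illustrated with a short NumPy sketch. It is a simplified illustration and not the implementation of [38]: it assumes grayscale inputs, stride-1 patches, and per-patch mean removal, and the function names `extract_patches` and `learn_pca_filters` as well as the default patch size and filter count are hypothetical.

```python
import numpy as np

def extract_patches(img, k):
    """Collect every k x k patch (stride 1) of a 2-D image as a column vector."""
    h, w = img.shape
    cols = [img[r:r + k, c:c + k].ravel()
            for r in range(h - k + 1) for c in range(w - k + 1)]
    return np.stack(cols, axis=1)  # shape (k*k, n_patches)

def learn_pca_filters(images, k=5, n_filters=4):
    """Learn n_filters filters of size k x k as leading principal directions of patches."""
    patches = []
    for img in images:
        p = extract_patches(np.asarray(img, dtype=float), k)
        patches.append(p - p.mean(axis=0, keepdims=True))  # remove each patch's mean
    X = np.concatenate(patches, axis=1)
    # Eigen-decomposition of the (unnormalised) patch scatter matrix X X^T.
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)
    order = np.argsort(eigvals)[::-1][:n_filters]  # largest eigenvalues first
    return eigvecs[:, order].T.reshape(n_filters, k, k)
```

In a cascaded network, each image would be convolved with these filters and the same procedure repeated on the resulting maps to obtain the second-stage filters.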

The effectiveness of PCANet for image representation makes it feasible for histopathological image analysis. Moreover, since PCANet is an unsupervised learning algorithm, it is more suitable for medical image processing when only a small dataset is available. On the other hand, noise and outliers are inevitable in histopathological images; they can result from artifacts, non-standardized staining protocols, focusing inaccuracy, and the diversity of imaging devices and resolutions, among other causes [1], [9]. Therefore, some small regions in a histopathological image may be affected by noise and outliers. PCA is sensitive to noise and outliers, which reduces its robustness [45] and in turn weakens the feature representation ability of PCANet. It is therefore critical to improve the robustness of the PCA in the PCANet framework. Fortunately, Grassmann averages (GA) can address this issue by averaging all the subspaces generated by the data on the Grassmann manifold to realize dimensionality reduction [46]. It is worth noting that GA coincides with PCA for Gaussian data, while it is more robust than PCA [46]. Experimental results have shown that GA outperforms PCA on several computer vision tasks, such as dimensionality reduction, background modeling, image restoration and shadow removal [46]. Since histopathological images always pick up noise and outliers during the imaging procedure, the GA algorithm is expected to outperform PCA for the dimensionality reduction of histopathological images. Moreover, inspired by the PCANet framework, we construct a GA based network (GANet) with a similar network framework in this work.
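The idea of averaging one-dimensional subspaces can be sketched as a simple fixed-point iteration. The following is a minimal sketch of a basic Grassmann average under stated assumptions: the data are zero-mean, each sample spans a subspace represented by its unit vector, and samples are weighted by their norms. The function name `grassmann_average` and its parameters are hypothetical, and the robust (for example, trimmed) averaging variants are not shown.

```python
import numpy as np

def grassmann_average(X, n_iter=100, tol=1e-10, seed=0):
    """Estimate the leading 1-D subspace of zero-mean samples (rows of X)."""
    norms = np.linalg.norm(X, axis=1)
    keep = norms > 0                      # zero vectors span no subspace
    U = X[keep] / norms[keep, None]       # unit vector representing each subspace
    w = norms[keep]                       # assumed weights: sample norms
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(X.shape[1])   # random initial direction
    q /= np.linalg.norm(q)
    for _ in range(n_iter):
        signs = np.sign(U @ q)            # flip each u_n into the half-space of q
        signs[signs == 0] = 1.0
        q_new = (w * signs) @ U           # weighted average of the aligned vectors
        q_new /= np.linalg.norm(q_new)
        converged = np.linalg.norm(q_new - q) < tol
        q = q_new
        if converged:
            break
    return q
```

Because a subspace has no preferred sign, each unit vector is flipped toward the current estimate before averaging; the returned direction is therefore defined only up to sign.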

Hematoxylin and Eosin (H&E) staining is commonly used in clinical practice to reveal cellular components and enhance the visibility of the spatial structures of histological components, which makes the color information particularly important and helpful for diagnosis. As a result, many color descriptors have been widely used to represent histopathological images [1], [2]. However, the existing GA algorithm can only handle grayscale images. When a color histopathological image is converted into a grayscale one, rich color information is lost, which degrades the feature representation. Although we can run the GANet algorithm on each color channel image and then concatenate the results to form fused color features, this fusion strategy discards the intrinsic correlation among the color channels [1]. Therefore, it is essential to integrate color information into GANet so that a more effective representation of histopathological images can be learned.

Quaternion algebra was first proposed by Hamilton in 1843 [48]. It has been demonstrated that quaternion algebra is mathematically well suited to representing color images. A quaternion comprises one real part and three imaginary parts, and the three imaginary parts are used to represent the three color channels, respectively. Various quaternion-based feature extraction algorithms for color images have been proposed in recent years [49], [50], [51], [52], [53], especially the quaternion PCA (QPCA) [54]. More recently, Zeng et al. proposed the quaternion PCANet, which outperforms PCANet for the feature representation of color images [55]. A novel algorithm, named quaternion GA (QGA), is therefore considered by integrating quaternion algebra with the GA algorithm. We will further develop a novel QGA network (QGANet) algorithm according to the PCANet framework [46].
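As an illustration of the quaternion representation, the sketch below encodes each RGB pixel as a pure quaternion and implements the Hamilton product. The assignment of R, G and B to the i, j and k parts is an assumed (though common) convention, and the function names `rgb_to_quaternion` and `hamilton_product` are hypothetical.

```python
import numpy as np

def rgb_to_quaternion(img):
    """Map an (H, W, 3) RGB image to (H, W, 4) pure quaternions 0 + R i + G j + B k."""
    h, w, _ = img.shape
    q = np.zeros((h, w, 4), dtype=float)
    q[..., 1:] = img          # real part stays zero; channels fill i, j, k
    return q

def hamilton_product(p, q):
    """Hamilton product of quaternion arrays whose last axis holds (real, i, j, k)."""
    a1, b1, c1, d1 = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    a2, b2, c2, d2 = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k part
    ], axis=-1)
```

The product is not commutative (for instance, ij = k while ji = -k), which is why quaternion versions of algorithms such as PCA or GA need dedicated formulations that handle the three channels jointly.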

In this work, building on our previous work [56], we develop a QGANet algorithm that can effectively learn the feature representation of color histopathological images for cancer diagnosis. The main contributions of this work are threefold: (1) a GANet algorithm is first proposed to improve the robustness of the representative features learned from grayscale histopathological images; (2) the QGA algorithm is then developed by incorporating quaternion algebra into the GA algorithm for the dimensionality reduction of color images; (3) the QGANet algorithm, which can learn more effective feature representations from color histopathological images, is finally proposed.

Quaternion Grassmann average network

Fig. 1 shows the proposed QGANet, which is composed of four components: the quaternion representation model, the cascaded QGA model, the binary hashing model and the block-wise histograms model.

We take a two-layer cascaded QGA network as an example in this work. It is worth noting that GANet has four components, the same as QGANet, but it only has the cascaded GA filters without the quaternion representation. We will introduce these components in the following sections.
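The last two components, binary hashing and block-wise histograms, can be sketched for real-valued filter responses as follows. This is a simplified illustration rather than the authors' procedure: it assumes the filter outputs are already available as real-valued maps, thresholds them at zero, and uses non-overlapping blocks. How the quaternion-valued outputs of QGANet are hashed is not shown, and the function names are hypothetical.

```python
import numpy as np

def binary_hash(feature_maps):
    """Binarize L maps of shape (L, H, W) at zero and pack them into one integer map.

    The first map supplies the most significant bit, so values lie in [0, 2**L).
    """
    n_maps = feature_maps.shape[0]
    bits = (feature_maps > 0).astype(np.int64)
    weights = 2 ** np.arange(n_maps - 1, -1, -1, dtype=np.int64)
    return np.tensordot(weights, bits, axes=1)  # shape (H, W)

def block_histograms(code_map, n_bins, block=4):
    """Concatenate histograms of non-overlapping block x block tiles of the code map."""
    h, w = code_map.shape
    feats = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            tile = code_map[r:r + block, c:c + block].ravel()
            feats.append(np.bincount(tile, minlength=n_bins))
    return np.concatenate(feats)
```

For L hashed maps, `n_bins` would be set to 2**L, and the concatenated histograms form the feature vector that is passed to a classifier.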

Dataset

We then evaluate GANet and QGANet respectively on the following three histopathological image datasets, whose typical example images are shown in Fig. 4.

  (1) Hepatocellular carcinoma (HCC) image dataset [59]. This HCC dataset was acquired by the Olympus BX51 at the Medical College of Nantong University. It includes 66 HCC images in total (21 well differentiated images, 23 moderately differentiated images and 22 poorly differentiated images) [59]. The image size is 1024 × 768 pixels.

  (2) Beth Israel

Discussion

PCANet is a novel DL algorithm with a much simpler network architecture and fewer parameters. In this work, we first propose a GANet algorithm motivated by PCANet. As shown in the experiments, GANet outperforms PCANet on the HCC dataset and the BIDMC dataset, and it obtains competitive results compared with PCANet on the ADL Kidney dataset. The experimental results indicate that GANet is more robust and effective than PCANet in the feature representation of

Conclusions

In conclusion, we first propose a GANet algorithm as an alternative unsupervised representation learning algorithm for images motivated by the PCANet framework, and then apply it to grayscale histopathological image analysis. Secondly, a QGA algorithm is developed by integrating quaternion algebra with GA. It can effectively retain and fuse color information in color histopathological images. Lastly, the QGANet algorithm, with high performance on feature representation of color

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61471231, 81627804, 61671281, 11471208), the Shanghai Science and Technology Foundation (17411953400, 18010500600), and the Shanghai Hospital Development Center (16CR3061B).

References (64)

  • L. Li et al.

    Overview of principal component analysis algorithm

    Opt.-Int. J. Light Electron Opt

    (2016)
  • L. Guo et al.

    Quaternion moment and its invariants for color object classification

    Inform. Sci.

    (2014)
  • H. Li et al.

    Quaternion generic Fourier descriptor for color object recognition

    Pattern Recognit.

    (2015)
  • R. Zeng et al.

    Color image classification via quaternion principal component analysis network

    Neurocomputing

    (2016)
  • J. Shi et al.

    Joint sparse coding based spatial pyramid matching for classification of color medical image

    Comput. Med. Imag. Graph.

    (2015)
  • M. Gurcan et al.

    Histopathological image analysis: a review

    IEEE Rev. Biomed. Eng.

    (2009)
  • H. Irshad et al.

    Methods for nuclei detection, segmentation and classification in digital histopathology: a review — current status and future potential

    IEEE Rev. Biomed. Eng.

    (2014)
  • M. Veta et al.

    Breast cancer histopathology image analysis: a review

    IEEE Trans. Biomed. Eng.

    (2014)
  • C. Lopez et al.

    Computer-aided prostate cancer diagnosis from digitized histopathology: a review on texture-based systems

    IEEE Rev. Biomed. Eng.

    (2015)
  • F. Xing et al.

    Robust nucleus/cell detection and segmentation for digital pathology and microscopic images: a comprehensive review

    IEEE Rev. Biomed. Eng.

    (2016)
  • G. Litjens et al.

    Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis

    Sci. Rep.

    (2016)
  • J. Shi et al.

    Histopathological image classification with color pattern random binary hashing based PCANet and matrix-form classifier

    IEEE J. Biomed. Health Inform

    (2017)
  • Y. Bengio et al.

    Representation learning: a review and new perspectives

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2013)
  • Y. LeCun et al.

    Deep learning

    Nature

    (2015)
  • H. Qiao et al.

    Biologically inspired model for visual cognition achieving unsupervised episodic and semantic feature learning

    IEEE Trans. Cybern.

    (2016)
  • D. Ciresan et al.

    Mitosis Detection in Breast Cancer Histology Images With Deep Neural Networks

    (2013)
  • Y. Xie et al.

    Beyond classification: Structured Regression For Robust Cell Detection Using Convolutional Neural Network

    (2015)
  • Y. Xu et al.

    Deep Convolutional Activation Features For Large Scale Brain Tumor Histopathology Image Classification and Segmentation

    (2015)
  • F. Liu et al.

    A Novel Cell Detection Method Using Deep Convolutional Neural Network and Maximum-Weight Independent Set

    (2015)
  • K. Sirinukunwattana et al.

    Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images

    IEEE Trans. Med. Imag.

    (2016)
  • Z. Xu et al.

    Detecting 10000 Cells in One Second

    (2016)
  • S. Wang et al.

    Subtype Cell Detection With an Accelerated Deep Convolution Neural Network

    (2016)
    Jun Shi received the B.S. degree and the Ph.D. degree from the Department of Electronic Engineering and Information Science, University of Science and Technology of China in 2000 and 2005, respectively. In 2005, he joined the School of Communication and Information Engineering, Shanghai University, China, where he has been a Professor since 2015. From 2011 to 2012, he was a visiting scholar with the University of North Carolina at Chapel Hill. His current research interests include machine learning in medical imaging.

    Xiao Zheng received the B.S. degree in biomedical engineering from Wenzhou Medical University in 2015. She is currently an M.Sc. candidate in the School of Communication and Information Engineering, Shanghai University, China. Her research interests include machine learning for medical images.

    Jinjie Wu received the B.S. degree from the School of Electrical and Information Engineering, Jiangsu University in 2013. He is currently an M.Sc. candidate in the School of Communication and Information Engineering, Shanghai University, China. His research interests include machine learning for medical images.

    Qi Zhang received his B.S. degree in Electronic Engineering in 2005 and Ph.D. degree in Biomedical Engineering in 2010, both from Fudan University, China. From 2008 to 2009, he was a visiting Ph.D. student at the Department of Biomedical Engineering, Duke University, USA. He joined the Institute of Biomedical Engineering, Shanghai University, China in 2010, and has been an Associate Professor since 2013. His research interests include medical signal processing, biomedical modeling and computer aided diagnosis.

    Shihui Ying received his B.Eng. degree in Mechanical Engineering and Ph.D. degree in Applied Mathematics from Xi'an Jiaotong University, Xi'an, China in July 2001 and April 2008, respectively. He is currently a professor with the Department of Mathematics, School of Science, Shanghai University, Shanghai, China. He was a postdoctoral researcher at the Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, U.S.A., from 2012 to 2013. He has been a member of IEEE since 2009 and has served as an Editor of JSM Mathematics and Statistics since January 2013. His research interests cover mathematical theory and methods for machine learning and medical image analysis.
