Group sparsity model for stain unmixing in brightfield multiplex immunohistochemistry images

doi:10.1016/j.compmedimag.2015.04.001

Computerized Medical Imaging and Graphics

Volume 46, Part 1, December 2015, Pages 30-39

https://doi.org/10.1016/j.compmedimag.2015.04.001

Highlights

• Unmixing an RGB image into more than three colors has hardly been studied in the literature.
• A novel IHC image unmixing algorithm is proposed, based on a group sparsity model.
• Biological biomarker co-localization information is used as a prior in this model.
• It unmixes more than three dyes while preserving the biological constraints.
• The new algorithm demonstrates better results than the existing strategies.

Abstract

Multiplex immunohistochemistry (IHC) staining is an emerging technique for detecting multiple biomarkers within a single tissue section. It has become popular because of its efficiency and the rich diagnostic information it provides. The key first step in multiplex IHC image analysis in digital pathology, and one of great clinical importance, is to accurately unmix the IHC image and differentiate each of the stains. Unmixing an RGB image acquired with a three-channel CCD color camera into more than three colors is very challenging and, to the best of our knowledge, has hardly been studied in the academic literature.

This paper presents a novel stain unmixing algorithm for brightfield multiplex IHC images based on a group sparsity model. The proposed framework achieves robust unmixing for more than three chromogenic dyes while preserving the biological constraints of the biomarkers. Typically, a number of biomarkers are known a priori to co-localize in the same cell parts. Given this biological information, the number of stains at one pixel has a fixed upper bound, namely the number of co-localized biomarkers. By leveraging the group sparsity model, the fractions of stain contributions from the co-localized biomarkers are explicitly modeled as one group, yielding the least-squares solution within the group. A sparse solution is obtained among the groups, since ideally only one group of biomarkers is present at each pixel. The algorithm is evaluated on both synthetic and clinical data sets and demonstrates better unmixing results than the existing strategies.

Introduction

As one of the most life-threatening groups of diseases, cancer causes millions of deaths each year. The traditional TNM staging system is often used to provide prognostic information; however, it relies solely on the tumor cells and leads to significant variation in outcomes within the same tumor stage. It is therefore of great clinical importance to have a reliable, reproducible, clinically relevant and biologically meaningful system for cancer identification and staging, in contrast to TNM [1]. Recently, the study of immune regulation within the tumor microenvironment has gained tremendous attention in cancer research [1], [2], [3], and there is evidence that the immune cells are associated with the clinical outcome of certain cancer types [2]. A quantitative and objective evaluation of the different types of immune cells within the tumor microenvironment is hence needed in both research and clinical studies, and digital pathology plays an important role in providing it.

While slides prepared with the popular primary stain, hematoxylin and eosin (H&E), are widely investigated in digital pathology to study tissue morphology, classify cancer types, or grade the cancer [4], [5], [6], [7], [8], [9], special staining techniques such as immunohistochemistry staining also convey important information. A multiplex immunohistochemistry (IHC) slide has the potential advantage of simultaneously identifying multiple biomarkers in one tissue section, as opposed to single biomarker labeling in multiple slides (see Fig. 1 for an example). Multiplex IHC staining is therefore often used for simultaneous assessment of multiple biomarkers in cancerous tissue. For example, tumors often contain infiltrates of immune cells, which may prevent the development of tumors or favor their outgrowth [2]. In this scenario, multiple biomarkers are used to target different types of immune cells, and the population distribution of each immune cell type is then used to study the clinical outcome of the patients. The biomarkers of the immune cells are stained using different chromogenic dyes. Correctly unmixing the IHC digital image into its individual constituent dyes for each biomarker, and obtaining the proportion of each dye in the color mixture, are prerequisites for accurate detection and classification of immune cells in multiplex IHC image analysis.

A typical digital pathology workflow for multiplex staining and stain unmixing is shown in Fig. 2. A tissue slide is stained with the multiplex assay. The stained slide is then imaged using a CCD color camera mounted on a microscope, or a scanner. The acquired RGB color image is a mixture of the underlying co-localized biomarker expressions. Several techniques have been proposed in the literature to decompose each pixel of the RGB image into a collection of constituent stains and the fraction contributed by each of them, that is, to convert the RGB image into biomarker-specific image channels. Stain unmixing is therefore a prerequisite for the subsequent image analysis algorithms: cell detection, segmentation and classification for each biomarker. Ruifrok et al. developed an unmixing method called color deconvolution [10], which unmixes an RGB image with up to three stains in the converted optical density space. Given the reference color vectors x_i ∈ R³ of the pure stains, the method assumes that each pixel of the color mixture y ∈ R³ is a linear combination of the pure stain colors and solves a linear system to obtain the combination weights b ∈ R^M. The linear system is denoted y = Xb, where X = [x₁, …, x_M] (M ≤ 3) is the matrix of reference colors. This technique is currently the most widely used in digital pathology. However, the maximum number of stains that can be resolved is limited to three: with more than three stains, the system has more unknowns than its three equations and is underdetermined. A multilayer perceptron learning-based technique has been proposed in [11] for three-color brightfield image unmixing. In [12], Rabinovich et al. formulated the color unmixing problem as non-negative matrix factorization and proposed a system capable of performing the color decomposition in a fully automated manner, with no reference stain color selection required. These methods share the same limitation when dealing with larger numbers of stains, because they also rely on solving y = Xb.

To the best of our knowledge, no method for unmixing brightfield IHC images with more than three stains is available in the literature. In order to compare with Ruifrok's method, we divide the color space into several systems with up to three colors each, and assign each pixel to one of the systems by nearest color matching. Ruifrok's method can then be used to solve each individual system. Because each pixel is assigned to a system independently, spatial continuity is lost in the unmixed images and artifacts such as holes are observed. Nevertheless, this is the most straightforward modification that makes Ruifrok's method applicable to multiplex brightfield images with more than three colors for comparison purposes.
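
As an illustration of the linear model y = Xb in optical density space, the following is a minimal NumPy sketch of three-stain color deconvolution. It is not the implementation of [10]: the function names, the white-background value of 255, and the reference color matrix are assumptions chosen for the example.

```python
import numpy as np

def rgb_to_od(rgb, background=255.0):
    """Convert RGB intensities to optical density (Beer-Lambert law).

    The background (white) intensity of 255 is an assumption for 8-bit images.
    """
    rgb = np.clip(np.asarray(rgb, dtype=float), 1.0, background)
    return -np.log10(rgb / background)

def color_deconvolution(od_pixels, X):
    """Solve y = X b for every pixel.

    od_pixels: (N, 3) optical densities; X: (3, M) reference colors, M <= 3.
    Returns the (N, M) stain contributions b.
    """
    B, *_ = np.linalg.lstsq(X, od_pixels.T, rcond=None)
    return B.T

# Illustrative reference colors, one column per stain (assumed values,
# not taken from the paper).
X = np.array([[0.65, 0.07, 0.27],
              [0.70, 0.99, 0.57],
              [0.29, 0.11, 0.78]])

# Synthesize one pixel from known stain amounts, then unmix it again.
b_true = np.array([0.8, 0.0, 0.5])
rgb_pixel = 255.0 * 10.0 ** (-(X @ b_true))
b_hat = color_deconvolution(rgb_to_od(rgb_pixel[None, :]), X)[0]
print(np.round(b_hat, 3))
```

With three or fewer linearly independent reference colors the system has a unique solution, which is why the round trip recovers `b_true`; adding a fourth column to `X` would remove that guarantee.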

Alternatively, there exists another class of methods, for multi-spectral image unmixing, that works for a larger number of stain colors [13], [14], [15], [16], [17]. The multi-spectral image differs from the RGB image in how it is acquired: a multi-spectral imaging system captures the image using a set of spectral narrow-band filters rather than a CCD color camera. The number of filters K ranges from about a dozen to as many as a hundred, ultimately yielding a multi-channel image that provides much richer information than the brightfield RGB image. The linear system constructed from it is always over-determined, with X being a K × M (K ≫ M) matrix, which leads to a unique solution. However, the scanning process in a multi-spectral imaging system is time-consuming and provides only a single field of view, manually selected by a trained technician, rather than a whole-slide scan. As an example of multi-spectral image unmixing, the two-stage methods [14], [15] were developed in the remote sensing domain to first learn the reference colors from the image context and then use them to unmix the image.

Sparse models have been widely used in radiology image analysis for image registration, segmentation, shape modeling, low-dose CT analysis, etc. [18], [19], [20], [21], [22], [23], [24], [25], [26] and demonstrate improved performance over the classical models. More recently, a sparse model was proposed by Greer [17] for high-dimensional multi-spectral image unmixing. It adopts the L₀ norm to regularize the combination weights b of the reference colors, and hence leads to a solution in which only a small number of reference colors contribute to the stain color mixture. These works serve as valuable sources of inspiration for selecting regularization terms for the linear system. However, the method proposed in [17] is also designed for multi-spectral images, and it uses no prior biological information about the biomarkers, which may lead to undesired solutions for real data.
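
The difference between the over-determined multi-spectral system (K ≫ M) and the RGB system with more than three stains can be checked numerically. The sketch below uses randomly generated reference spectra, which are purely hypothetical, with M = 5 stains and K = 16 bands chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 5                                    # number of stains (assumed)
b_true = np.array([0.6, 0.0, 0.3, 0.0, 0.1])

# Multi-spectral case: K = 16 narrow-band channels, so X is 16 x 5.
X_ms = rng.uniform(0.05, 1.0, size=(16, M))
y_ms = X_ms @ b_true
b_ms, *_ = np.linalg.lstsq(X_ms, y_ms, rcond=None)

# RGB case: only 3 channels, so X is 3 x 5 and has a non-trivial null space.
X_rgb = rng.uniform(0.05, 1.0, size=(3, M))
y_rgb = X_rgb @ b_true
# For an underdetermined system lstsq returns the minimum-norm solution,
# which fits y exactly but is generally not the true stain mixture.
b_min_norm, *_ = np.linalg.lstsq(X_rgb, y_rgb, rcond=None)

rank_ms = np.linalg.matrix_rank(X_ms)    # equals M: unique solution
rank_rgb = np.linalg.matrix_rank(X_rgb)  # at most 3 < M: infinitely many
print(rank_ms, rank_rgb)
print(np.round(b_ms, 3), np.round(b_min_norm, 3))
```

Both solutions reproduce the observed pixel, but only the multi-spectral one is pinned down by the data; in the RGB case an additional prior or regularizer is needed to select among the candidates.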

In this paper, we propose a novel color unmixing algorithm for multiplex IHC images (scanned using a CCD color camera) that can handle more than three stain colors and maintains the biological properties of the biomarkers. Intuitively, an unmixing algorithm for multiplex IHC images should satisfy the following conditions. (1) Only one group of stains has a non-zero contribution to the color mixture at each pixel. (2) Within that group, the fraction contributed by each constituent stain should be correctly estimated. These conditions motivate us to model the unmixing problem within the group sparsity [27] framework so as to ensure sparsity among the groups but non-sparsity within each group.
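
Conditions (1) and (2) can be illustrated with a generic group-lasso objective, min over b ≥ 0 of ½‖y − Xb‖² + λ Σ_g ‖b_g‖₂, solved here by proximal gradient descent. This is a sketch of the general group sparsity idea only: the exact objective and solver used in the paper are not shown in this excerpt, and the non-negativity constraint, the value of λ, the toy reference colors and the grouping below are all assumptions.

```python
import numpy as np

def group_sparse_unmix(y, X, groups, lam=0.05, n_iter=2000):
    """Proximal-gradient sketch for a non-negative group lasso.

    y: (3,) optical density of one pixel; X: (3, M) reference colors;
    groups: list of index lists, one per group of co-localized stains.
    """
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Gradient step on the least-squares term.
        v = b - step * (X.T @ (X @ b - y))
        # Proximal step: clip to non-negative, then shrink each group's
        # L2 norm; groups with small norm are set to zero as a whole.
        v = np.maximum(v, 0.0)
        for g in groups:
            norm = np.linalg.norm(v[g])
            scale = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
            v[g] = scale * v[g]
        b = v
    return b

# Toy reference colors for five hypothetical stains (unit-norm columns).
X = np.array([[1.0, 0.0, 0.0, 0.6, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.6],
              [0.0, 0.0, 1.0, 0.8, 0.8]])
groups = [[0, 1], [2, 3], [4]]           # hypothetical co-localization groups

# A pixel that contains only the first group of stains.
y = X[:, [0, 1]] @ np.array([0.7, 0.4])
b_hat = group_sparse_unmix(y, X, groups)
print(np.round(b_hat, 4))
```

In this toy case only the first group stays active, while both stains inside it keep non-zero weights that are slightly shrunk by the penalty. That mirrors the intended behavior of sparsity among groups and non-sparsity within a group.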

Section snippets

Methodology

In this section, we present the methodology of our algorithm. We begin by illustrating the basic framework in Fig. 3 using the following example. In the analysis of cancerous tissues, different biomarkers are specific to one or more types of immune cells. For instance, CD3 is a known universal marker for all T-cells, and CD8 stains only the membranes of the cytotoxic T-cells. FoxP3 marks only the regulatory T-cells, in the nuclei, and hematoxylin (HTX) stains all the nuclei. A summary of the

Experiments

In this section, we empirically validate our unmixing algorithm and compare it against the existing techniques.

Conclusion

In this paper, we introduced a novel color unmixing strategy for multiplexed brightfield histopathology images based on a group sparsity model. The biological co-localization information of the biomarkers is explicitly defined in the regularization term to produce biologically meaningful unmixing results. Experiments on both synthetic and clinical data demonstrate the efficacy of the proposed algorithm in terms of accuracy and stability when compared to the existing techniques. A promising

References (30)

Y. Zheng et al., Landmark matching based retinal image alignment by enforcing sparsity in correspondence matrix, Med Image Anal (2014)
Y. Yu et al., Deformable models with sparsity constraints for cardiac motion analysis, Med Image Anal (2014)
S. Zhang et al., Towards robust and effective shape modeling: sparse shape composition, Med Image Anal (2012)
S. Zhang et al., Deformable segmentation via sparse representation and dictionary learning, Med Image Anal (2012)
R. Fang et al., Robust low-dose CT perfusion deconvolution via tensor total-variation regularization, IEEE Trans Med Imaging (2015)
R. Fang et al., Improving low-dose blood–brain barrier permeability quantification using sparse high-dose induced prior for Patlak model, Med Image Anal (2014)
T. Chen et al., Structure preserving color deconvolution for immunohistochemistry images, SPIE Proc (2015)
J. Galon et al., Towards the introduction of the ‘immunoscore’ in the classification of malignant tumour, J Pathol (2013)
J. Galon et al., Type, density, and location of immune cells within human colorectal tumors predict clinical outcome, Science (2006)
S. Nawaz et al., Beyond immune density: critical role of spatial heterogeneity in estrogen receptor-negative breast cancer, Mod Pathol (2015)
X. Zhang et al., Towards large-scale histopathological image analysis: Hashing-based image retrieval, IEEE Trans Med Imaging (2015)
K. Nguyen et al., Structure and context in prostatic gland segmentation and classification (2012)
X. Zhang et al., Mining histopathological images via composite hashing and online learning (2014)
F. Xing et al., Robust selection-based sparse shape model for lung cancer image segmentation (2013)
M. Gurcan et al., Histopathological image analysis: a review, IEEE Rev Biomed Eng (2009)
