Keywords

1 Introduction

In digital brightfield microscopy, tissues are usually stained before digitization and evaluation, with hematoxylin and eosin (H&E) being the most widely used stains. Color deconvolution (CD) aims at separating a color image into the concentration of each stain present in it. This is not an easy task since the exact spectral profile of the stains varies from one image to another [10]. Hence, the stain color-vector matrix, which relates the color image and the stain concentrations, often needs to be estimated for each slide. Once the stain color-vectors are calculated, the color of different images can be normalized to a target image for an easier evaluation. This is usually done by replacing the stain vectors with the target stain vectors obtained from the reference slide, and converting the calculated concentrations back to an RGB image.

One of the first CD methods was proposed by Ruifrok et al. [9]. The values for the stain vector of each dye were obtained by measuring the relative absorption of each color from single-stained images. While these standard values are frequently used, they have a strong dependence on the experimental setting utilized in [9]. In [6] H&E color-vector are obtained from the two largest singular values of the SVD decomposition of the optical density matrix. McCann et al. [7] extend the method in [6] by adjusting the contrast of the eosin channel and including a weak interaction between eosin and hematoxylin in the pixels of the hematoxylin channel where eosin values were changed. The algorithm is tested on a set of three H&E images that were stained and destained to create H-only and E-only images that are used as ground-truth separated images for the H&E image. In [4] the stain color-vectors are estimated by projecting the input color image in the Maxwellian chromaticity plane to form clusters, each one corresponding to one stained tissue type. In [5] color normalization is performed by first deconvolving both source and target images, applying a non-linear mapping of the source to the target image channels and recombining the mapped channels into the normalized source image. In [8] the CD problem is formulated as a blind source separation problem tackled by Non-negative Matrix Factorization (NMF) and Independent Component Analysis (ICA). Vahadane et al. [11] and Xu et al. [13] independently extend the NMF in [8] with regularization and sparsity terms since they assume that most pixels contain one type of biological material. The use of Non-negative Least Squares (NNLS) instead of NMF is proposed in [3]. Alsubaie et al. [1] propose using ICA in the wavelets domain where the independence condition among sources is relaxed. All these methods depend on parameters that need to be manually adjusted for an optimal deconvolution.

In this paper we propose a novel fully automated CD method that simultaneously estimates the color-vector matrix, the concentration of the stains, and all required model parameters. In our Bayesian blind CD problem formulation we introduce a smoothness prior model on the stain concentrations which helps reduce the acquisition noise and takes into account the spatial correlation between adjacent pixels. Despite the variability among images, the color-vector matrices are often assumed to be close to a commonly accepted standard matrix. Our Bayesian modelling allows us to include this additional prior knowledge on the sought after solution.

The rest of the paper is organized as follows: in Sect. 2 we mathematically formulate the blind CD of histopathological images problem. This problem is approached using the Bayesian framework in Sect. 3. In this section we also carry out Bayesian inference to estimate the color-vector matrix, concentrations, and model parameters in a fully automated manner. In Sect. 4 the proposed method is evaluated in a set of H&E stained images and its performance is compared with other classical and state-of-the-art CD methods. Finally, Sect. 5 concludes the paper.

2 Problem Formulation

The RGB intensity image detected by a brightfield microscope observing a stained histological specimen’s slide is the \((M\times N)\times 3\) matrix, \({\mathbf {I}}\), with columns \({\mathbf {i}}_c=(i_{1c},\ldots ,i_{MNc})^\mathrm{T}, c\in \{R,G,B\}\) and MN the number of pixels. According to the monochromatic Beer-Lambert law [9], the Optical Density (OD) for channel c of the slide, \({\mathbf {y}}_c \in \mathbb {R}^{MN\times 1}\), is \({\mathbf {y}}_c = - \log _{10}\left( {{\mathbf {i}}_c}/{{\mathbf {i}}^{0}_c}\right) \), where \({\mathbf {i}}^{0}_c\) denotes the incident light, and the division operation and \(\log _{10}(\cdot )\) function are computed element-wise. For a slide stained using \(n_s\) stains the observed OD image \({\mathbf {Y}}={[{\mathbf {y}}_R,{\mathbf {y}}_G,{\mathbf {y}}_B]} \in \mathbb {R}^{MN\times 3}\) can be obtained from

$$\begin{aligned} {\mathbf {Y}}^{\mathrm {T}}= {\mathbf {M}}{\mathbf {C}}^{\mathrm {T}}+ {\mathbf {N}}^{\mathrm {T}}\,, \end{aligned}$$
(1)

where \({\mathbf {N}}\) is the capture noise matrix of size \(MN\times 3\) with i.i.d. \(\mathcal{N}(0,\beta ^{-1})\) components, \({\mathbf {C}}\in \mathbb {R}^{MN\times n_s}\) is the stain concentration matrix

$$\begin{aligned} {\mathbf {C}}=\left[ \begin{array}{lll} c_{11}&{}\ldots &{}c_{1n_s}\\ \vdots &{}\vdots &{}\vdots \\ c_{MN1}&{}\ldots &{}c_{MNn_s} \end{array} \right] = \left[ \begin{array}{l} {\mathbf {c}}_{1,:}^{\mathrm {T}}\\ \vdots \\ {\mathbf {c}}_{MN,:}^{\mathrm {T}}\end{array} \right] = \left[ \begin{array}{lll} {\mathbf {c}}_1&\ldots&{\mathbf {c}}_{n_s} \end{array} \right] \,, \end{aligned}$$
(2)

with i-th row \({\mathbf {c}}_{i,:}^\mathrm{T}=(c_{i1},\ldots ,c_{in_s})\), \(i=1,\ldots ,MN\) and columns \({\mathbf {c}}_s=(c_{1s},\ldots ,\) \(c_{MNs})^\mathrm{T}\), \(s\in \{1,\ldots ,n_s\}\) and \({\mathbf {M}}\in \mathbb {R}^{3\times n_s}\) is the normalized stains’ specific color-vector matrix. Each column in matrix \({\mathbf {M}}\) is a unit \(\ell 2\) norm stain color-vector containing the relative RGB color composition of the corresponding stain. Color Deconvolution (CD) is a technique that allows to obtain the stain concentration matrix, \({\mathbf {C}}\), and the color-vector matrix, \({\mathbf {M}}\), from the observed optical densities, \({\mathbf {Y}}\). In the following section we will use Bayesian modeling and inference to estimate \({\mathbf {M}}\) and \({\mathbf {C}}\) as well as the model parameters.

3 Bayesian Modelling and Inference

Following the degradation model in (1), we have

$$\begin{aligned} {\mathrm p}({\mathbf {Y}}|{\mathbf {M}},{\mathbf {C}},\beta ) = \prod _{i=1}^{MN}{\mathrm p}({\mathbf {y}}_{i,:}|{\mathbf {M}},{\mathbf {c}}_{i,:},\beta )=&\prod _{i=1}^{MN}{\mathcal {N}}({\mathbf {y}}_{i,:}|{\mathbf {M}}{\mathbf {c}}_{i,:},\beta ^{-1}{\mathbf {I}}_{3\times 3}). \end{aligned}$$
(3)

The stain concentrations at each pixel on the image are expected to have values similar to the ones of the surrounding pixels. So, we impose smoothing prior models on the concentrations \({\mathbf {c}}_s\), \(s=1,\ldots ,n_s\), that is, on the columns of \({\mathbf {C}}\), as the Gaussian distributions of the form

$$\begin{aligned} {\mathrm p}({\mathbf {C}}|{\varvec{\alpha }}) = \! \prod _{s=1}^{n_s}{\mathrm p}({\mathbf {c}}_s|\alpha _s) \propto \! \prod _{s=1}^{n_s}\alpha _s^{\frac{MN}{2}}\exp \left( \! -\frac{1}{2}\alpha _s {\mathbf {c}}_s^{\mathrm {T}}{\mathbf {F}}^{\mathrm {T}}{\mathbf {F}}{\mathbf {c}}_s \! \right) \!, \end{aligned}$$
(4)

where \({\mathbf {F}}\in \mathbb {R}^{MN\times MN}\) is a smoothing filter and \(\alpha _s\), \(s=1,\ldots ,n_s\), controls the amount of smoothness.

The color-vector matrix \({\mathbf {M}}=[{\mathbf {m}}_1,\ldots ,{{\mathbf {m}}_{n_s}}]\) is also unknown, because it depends on the staining procedures and microscopes. In [9], standard color-vectors for hematoxylin, eosin, and DAB stains were proposed. Although those standard color-vectors are not usually exact for each single image, they are very representative and have been frequently used. In this paper we incorporate the similarity to a representative color-vector matrix \(\underline{{\mathbf {M}}}=[\underline{{\mathbf {m}}}_1,\ldots ,{\underline{{\mathbf {m}}}_{n_s}}]\) into the Gaussian prior model

$$\begin{aligned} {\mathrm p}({\mathbf {M}}|{\varvec{\gamma }})\! = \!\! \prod _{s=1}^{n_s} {\mathrm p}({\mathbf {m}}_s|\gamma _s) \!\propto \!\! \prod _{s=1}^{n_s} \gamma _s^{\frac{3}{2}}\exp \left( \! -\frac{1}{2}\gamma _s\Vert {\mathbf {m}}_s \! - \!\underline{{\mathbf {m}}}_s \Vert ^2 \! \right) . \end{aligned}$$
(5)

where \(\gamma _s\), \(s=1,\ldots ,n_s\), controls our confidence on the accuracy of \(\underline{{\mathbf {m}}_s}\).

The joint probability distribution for our problem is

$$\begin{aligned} {\mathrm p}({\mathbf {Y}},{\mathbf {C}}, {\mathbf {M}}|\beta ,{\varvec{\alpha }},{\varvec{\gamma }}) = {\mathrm p}({\mathbf {Y}}|{\mathbf {C}}, {\mathbf {M}},\beta ) {\mathrm p}({\mathbf {M}}|{\varvec{\gamma }}) {\mathrm p}({\mathbf {C}}|{\varvec{\alpha }})\,. \end{aligned}$$
(6)

Following the Bayesian paradigm, inference will be based on the posterior distribution \({\mathrm p}({\mathbf {C}},{\mathbf {M}},\beta ,{\varvec{\alpha }},{\varvec{\gamma }}|{\mathbf {y}})\) which cannot be obtained in closed form, so a variational approach [2] is applied.

The hyperparameters \(\{\beta ,{\varvec{\alpha }},{\varvec{\gamma }}\}\) have not been considered as variables for the variational method, but as model parameters for which a Maximum Likelihood Estimator (MLE) will be obtained and \({\mathrm p}({\mathbf {C}},{\mathbf {M}},\beta ,{\varvec{\alpha }},{\varvec{\gamma }}|{\mathbf {y}})\) is approximated by the distribution

$$\begin{aligned} {\mathrm q}({\mathbf {C}},{\mathbf {M}}) = \prod _{s=1}^{n_s}{\mathrm q}({\mathbf {m}}_s)\prod _{s=1}^{n_s} {\mathrm q}({\mathbf {c}}_s). \end{aligned}$$
(7)

It can then be shown [2] that for each unknown \(\theta \in \varTheta =\{{\mathbf {m}}_1,\ldots ,{\mathbf {m}}_{n_s}\), \({\mathbf {c}}_1,\ldots ,{\mathbf {c}}_{n_s}\}\), \({\mathrm q}(\theta )\) will have the form

$$\begin{aligned} {\mathrm q}(\theta ) \propto \exp \left\langle \log {\mathrm p}({\mathbf {y}},{\mathbf {C}}, {\mathbf {M}}|\beta ,{\varvec{\alpha }},{\varvec{\gamma }})\right\rangle _{{\mathrm q}(\varTheta {\setminus }\theta )}\,, \end{aligned}$$
(8)

where \(\varTheta {\setminus }\theta \) represents all the variables in \(\varTheta \) except \(\theta \) and \(\left\langle \cdot \right\rangle _{{\mathrm q}(\varTheta {\setminus }\theta )}\) denotes expected value calculated using the distribution \({\mathrm q}(\varTheta {\setminus }\theta )\). Estimates for the different variables can be obtained as \(\hat{\theta }= {\left\langle \theta \right\rangle }_{{\mathrm q}(\theta )}\). Let us now obtain the analytic expressions for each unknown estimate.

Concentration Update. Let us define

$$\begin{aligned} {\mathbf {e}}_{i{,:}}^{-s}={\mathbf {y}}_{i,:}-\sum _{k\ne s}\left\langle c_{ik}\right\rangle \left\langle {\mathbf {m}}_k\right\rangle \,\,\,\,\, \mathrm { and }\,\,\,\,\, z_i^{-s}=\left\langle {\mathbf {m}}_s\right\rangle ^{\mathrm {T}}{\mathbf {e}}_{i{,:}}^{-s}, \,\,\,\, \mathrm { for }\,\,\, i=1,\ldots ,MN \end{aligned}$$
(9)

From (6) and (8) we have

$$\begin{aligned} \left\langle \log {\mathrm p}({\mathbf {y}},{\mathbf {C}}, {\mathbf {M}}|\beta ,{\varvec{\alpha }},{\varvec{\gamma }})\right\rangle _{{\mathrm q}(\varTheta {\setminus }{\mathbf {c}}_s)}=&-\frac{\beta }{2}\left( \parallel {\mathbf {c}}_s\parallel ^2\left\langle \parallel {\mathbf {m}}_s\parallel ^2\right\rangle -2{\mathbf {c}}_s^{\mathrm {T}}{\mathbf {z}}^{-s}\right) \nonumber \\&-\frac{1}{2}\alpha _s {\mathbf {c}}_s^{\mathrm {T}}{\mathbf {F}}^{\mathrm {T}}{\mathbf {F}}{\mathbf {c}}_s +\text{ const } \end{aligned}$$
(10)

which produces \({\mathrm q}({\mathbf {c}}_s) = {\mathcal {N}}({\mathbf {c}}_s|\left\langle {\mathbf {c}}_s\right\rangle ,{\varvec{\varSigma }}_{{\mathbf {c}}_s})\,,\) where

$$\begin{aligned} {\varvec{\varSigma }}_{{\mathbf {c}}_s}^{-1} = \beta \left\langle \parallel {\mathbf {m}}_s\parallel ^2\right\rangle {\mathbf {I}}_{MN\times MN} + \alpha _s {\mathbf {F}}^t{\mathbf {F}}\qquad \mathrm { and } \qquad \left\langle {\mathbf {c}}_s\right\rangle =\beta {\varvec{\varSigma }}_{{\mathbf {c}}_s}{\mathbf {z}}^{-s}. \end{aligned}$$
(11)

Color-Vector Update. In a similar way, using Eq. (9), from Eqs. (6) and (8) we now have

$$\begin{aligned} \left\langle \log {\mathrm p}({\mathbf {y}},{\mathbf {C}}, {\mathbf {M}}|\beta ,{\varvec{\alpha }},{\varvec{\gamma }})\right\rangle _{{\mathrm q}(\varTheta {\setminus }{\mathbf {m}}_s)}=&-\frac{\beta }{2}\left( \parallel {\mathbf {m}}_s\parallel ^2\sum _i\left\langle c_{is}^2\right\rangle -2{\mathbf {m}}_s^{\mathrm {T}}\sum _i\left\langle c_{is}\right\rangle {\mathbf {e}}_{i,:}^{-s}\right) \nonumber \\&-\frac{1}{2}\gamma _s\parallel {\mathbf {m}}_s - \underline{{\mathbf {m}}}_s \parallel ^2 +\text{ const } \end{aligned}$$
(12)

which produces \({\mathrm q}({\mathbf {m}}_s) = {\mathcal {N}}({\mathbf {m}}_s|\left\langle {\mathbf {m}}_s\right\rangle ,{\varvec{\varSigma }}_{{\mathbf {m}}_s})\,,\) where

$$\begin{aligned} {\varvec{\varSigma }}_{{\mathbf {m}}_s}^{-1} =(\beta \sum _i\left\langle c_{is}^2\right\rangle +\gamma _s){\mathbf {I}}_{3\times 3} \mathrm {\ \ \ and\ \ } \left\langle {\mathbf {m}}_s\right\rangle ={{\varvec{\varSigma }}_{{\mathbf {m}}_s}}(\beta \sum _i\left\langle c_{is}\right\rangle {\mathbf {e}}_{i,:}^{-s}+\gamma _s\underline{{\mathbf {m}}}_s). \end{aligned}$$
(13)

Notice that \(\left\langle {\mathbf {m}}_s\right\rangle \) may not be an unitary vector. We can always replace \(\left\langle {\mathbf {m}}_s\right\rangle \) by \(\left\langle {\mathbf {m}}_s\right\rangle /\parallel \left\langle {\mathbf {m}}_s\right\rangle \parallel \) and \({\varvec{\varSigma }}_{{\mathbf {m}}_s}\) by \({\varvec{\varSigma }}_{{\mathbf {m}}_s}/\parallel \left\langle {\mathbf {m}}_s\right\rangle \parallel ^2\). Notice also that \(\left\langle c_{is}^2\right\rangle \) can be calculated using (11) and \(\left\langle \parallel {\mathbf {m}}_s\parallel ^2\right\rangle \) can be easily calculated from (13).

Parameter Update. Finally, the MLE estimators of the noise, concentration, and color-vectors precisions are

$$\begin{aligned} \hat{\beta }^{-1}&=\frac{{\mathrm {tr}}\left\langle ({\mathbf {Y}}^{\mathrm {T}}-{\mathbf {M}}{\mathbf {C}}^{\mathrm {T}})({\mathbf {Y}}^{\mathrm {T}}-{\mathbf {M}}{\mathbf {C}}^{\mathrm {T}})^{\mathrm {T}}\right\rangle _{{\mathrm q}({\varvec{\varTheta }})}}{3MN}, \end{aligned}$$
(14)
$$\begin{aligned} \hat{\alpha }_s^{-1}&= \frac{{\mathrm {tr}}({\mathbf {F}}^{\mathrm {T}}{\mathbf {F}}\left\langle {\mathbf {c}}_s{\mathbf {c}}_s^{\mathrm {T}}\right\rangle )}{MN}, \end{aligned}$$
(15)
$$\begin{aligned} \hat{\gamma }_s^{-1}&= \frac{{\mathrm {tr}}(\left\langle ({\mathbf {m}}_s-\underline{{\mathbf {m}}}_s)({\mathbf {m}}_s-\underline{{\mathbf {m}}}_s)^{\mathrm {T}}\right\rangle )}{3}\,. \end{aligned}$$
(16)

respectively. Notice that the involved expected values can be easily calculated.

figure a

The proposed Variational Bayesian Blind Color Deconvolution method, summarized in Algorithm 1, allows to obtain the estimated concentrations \(\hat{{\mathbf {c}}}_s\) and color-vector \(\hat{{\mathbf {m}}}_s\) iterating on the concentration and color-vector updates until convergence. Finally, an RGB image of each separated stain, \(\hat{{\mathbf {i}}}^{\mathrm {sep}}_s\), can be obtained as \(\hat{\mathbf {i}}^{\mathrm {sep}}_s = \exp _{10}({-\hat{{\mathbf {m}}}_s \hat{{\mathbf {c}}}_s^{\mathrm {T}}})\).

4 Experiments

We compared the proposed fully automated approach with classical and state-of-the-art CD methods on the stain separation benchmark in [7]. This dataset is formed by three H&E images and their corresponding H-only and E-only images that can be used as ground truth images for the color deconvolution procedure. Each image in the dataset was obtained by eosin staining the tissue, imaging, destaining, staining with hematoxylin, imaging, staining also with eosin and imaging. An example of H&E stained image in the dataset is shown in Fig. 1(a) and its corresponding E-only and H-only images are shown in the left and right hand side of Fig. 1(b), respectively.

The election of a reference color-vector matrix depends on the used stains. For the H&E stains used in this paper, the value of \(\underline{{\mathbf {M}}}\) was set to the H&E values proposed in [9]. We want to note that, for stains different from H&E, simply measuring the tissue response to a single stain might provide its corresponding reference color-vector. The convergence criterion \(\Vert \left\langle {\mathbf {c}}_s\right\rangle ^{(n)}- \left\langle {\mathbf {c}}_s\right\rangle ^{(n-1)}\Vert ^2/\Vert \left\langle {\mathbf {c}}_s\right\rangle ^{(n)}\Vert ^2 < 10^{-5}\) for both stains, that is, \(s=1,2\), was used. This is met in about 15 iterations of the algorithm. We compare against the classical method in [6] and the recent methods in [7, 11]. For all the competing algorithms, parameters were selected following the recommendations on the original paper or the reference software freely available.

Figure 1(c)–(f) shows the separations with the proposed and competing methods. From the images, it is clear that the method in [7] and the proposed algorithm produce results closer to the ground truth (see Fig. 1(b)) than the methods in [6, 11]. Although all methods effectively separate epithelial and stromal structures, ours seems closer to ground truth. Note, for instance, that the long structure in the center of the image (corresponding to bone tissue [7]) is not clearly shown in the hematoxylin estimations in Figs. 1(c) and (e). All eosin estimations present a higher contrast than the ground truth although, estimations obtained by the proposed method and the method in [7] are more similar to the ground truth. The eosin estimation from the method in [7] seems to be slightly less contrasted than ours.

Numerical results, using the Peak Signal to Noise Ratio (PSNR) and Structural Similarity (SSIM) [12] measures, are presented in Table 1. This table includes the results for the non-blind method in [9] as a reference. The figures-of-merit confirm the visual inspection results. The proposed method performs better than the competitors, except for the case of the eosin stain for the algorithm in [7]. This was expected since this algorithm selectively modifies the obtained values for the stain separations to better accommodate ground truth. More precisely, in [7] the eosin separation is corrected in contrast by adding a small part of the hematoxylin stain, and the hematoxylin stain is then computed again by taking into account interaction between the stains in those places where the contrast of the eosin coefficients is adjusted. Note that, in spite of these adjustments, our fully automated proposed method consistently provides better PSNR results for the hematoxylin stain than the method in [7]. Notice that the obtained PSNR and, especially, the SSIM values are quite low. This is due to the staining-destaining process that makes the tissue to move and deform. These deformations were partially corrected in [7] by geometrically registering the H-only and E-only to their corresponding estimations. Although we used the registered images from [7] as ground truth for all the tests, there still are misalignment between ground truth and estimations that deteriorate the figures-of-merit.

Fig. 1.
figure 1

Ground truth and separations for the proposed and the compared method.

Table 1. PSNR and SSIM for the different methods and images in the dataset.

The proposed method is faster than the recent method in [11], taking 14 s on a i7-5550U @ 2.40 GHz laptop with 16 GB RAM versus 50.2 s. However it is slower than the classical method in [6], that took 0.4 s, and the method in [7] that took 2.78 s.

5 Conclusions

A novel fully automated variational Bayesian blind color deconvolution method for histological images is proposed. The method estimates the color-vector matrix, the concentration of the stains and all the model parameters. The proposed model takes into account the spatial relations between pixels as well as the similarity to a standard color-vector matrix. Comparison with classical and recent methods demonstrated that the proposed method produces better results than the competitors, except for the eosin stain by the algorithm in [7] as already mentioned.