Heterogeneity assessment of histological tissue sections in whole slide images

https://doi.org/10.1016/j.compmedimag.2014.11.006

Abstract

Computerized image analysis (IA) can provide quantitative and repeatable object measurements by means of methods such as segmentation, indexation and classification. Embedded in reliable automated systems, IA could help pathologists in their daily work and thus contribute to a more accurate determination of prognostic histological factors on whole slide images. One of the key concepts that pathologists now want to have at their disposal is a numerical estimate of heterogeneity. The objective of this study is to propose a general framework, based on the diffusion maps technique, for measuring tissue heterogeneity in whole slide images and to apply this methodology to digital histopathology images of breast cancer.

Introduction

Research in signal and image analysis has been going on for many decades and is directly linked to the exceptional development of computer technologies. After all these years, however, it must be admitted that few applications actually work in practice, especially in medicine, where the expert's eye is still more accurate and faster than many automated systems dealing with large amounts of data. Nevertheless, reliable automated systems could really help pathologists in their daily work, as the number of pathological cases increases along with early screening campaigns. Computerized image analysis (IA) can provide quantitative and repeatable object measurements by means of methods such as segmentation, indexation or classification. When IA is used to analyze images of histological sections in medical research, the typical objects to be processed are nuclei, vessels, cell groups or tumors. IA allows these structures to be detected in histopathological images and analyzed automatically in terms of size or shape, in order to assess their proportion or to compute their staining intensities. Indeed, IA can contribute to a more accurate determination of prognostic histological factors by pathologists [1], but its role is limited to providing quantitative information, since no multi-purpose method exists for segmenting or classifying images with the wide range of color and shape configurations with which pathologists are familiar. The advent of digital scanners makes it possible to generate whole slide images (WSI) from histological sections acquired at full resolution. A main challenge of this technology is the size of the data to be processed, which is generally very large and requires huge storage capacity and processing power. Conversely, the main advantage is that the complete tissue structure is easily accessible.
In parallel, the aggressiveness of a cancer can result in morphological and architectural changes that are observable in the tissue structure and can therefore be characterized by the distribution of objects on the slide, by the cross relations between objects and by the texture. This kind of information could help to evaluate a well-known concept: heterogeneity. Heterogeneity is frequently addressed in signal processing, especially in terms of “entropy”, but more rarely in the field of imaging [2]. The objective here is to propose a framework for measuring tissue heterogeneity in WSI and to apply this methodology to digital histopathology images of breast cancer. The key idea in this work is not to rely on the segmentation of individual structures (e.g. [1], [2]) to characterize heterogeneity, but to make use of the classification of square sub-images, hereafter called ‘patches’. In previous works dedicated to the development of a computer-aided diagnosis system (CADS) based on image retrieval and classification [3], [5], we used a method from spectral graph theory, Diffusion Maps (DM) [6], to process WSI split into patches. The DM algorithm computes the eigenvalues and eigenvectors of a Markov matrix defining a random walk on the data. It makes it possible both to cluster non-linear input data, thanks to its inherent classification properties that preserve local neighborhood relationships, and to reduce the dimensionality of the input data to a space (usually 2D or 3D) in which Euclidean distances between the objects to be analyzed can be computed [6], [8]. To briefly describe the complete CADS we are developing, a first step consists in building a knowledge database of many features extracted from a set of well-known images; this is an ‘off-line’ procedure conducted once. These features are represented by vectors of non-linear data acting as a signature.
In a second step, signatures are obtained from new unknown images and compared with those already present in the database; this is an ‘on-line’ procedure that has to be conducted each time a new image is processed. In a last step, before rendering a qualitative measure of similarity/dissimilarity between the supervised images and the new unknown images, a feedback procedure has to be run in order to eliminate most of the artifacts that usually corrupt initial knowledge databases. This general approach, especially with the DM algorithm, can also be adapted to analyze a set of image patches coming from any WSI, with the sole goal of comparing their feature vectors. In this paper, we focus on a way to characterize tissue heterogeneity at a regional level by working on the projected coordinates of image patches in a reduced 3D space.
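
The DM projection described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Gaussian kernel, the median-based choice of the scale `epsilon`, the diffusion time `t` and the function name `diffusion_map` are all assumptions introduced here. The sketch builds the Markov matrix of a random walk on the patch signatures, computes its eigenvalues and eigenvectors through a symmetric conjugate matrix, and keeps the first three non-trivial coordinates.

```python
import numpy as np


def diffusion_map(features, n_components=3, epsilon=None, t=1):
    """Project feature vectors (one row per patch) into a diffusion space.

    Hypothetical helper: kernel, epsilon heuristic and t are assumptions.
    """
    X = np.asarray(features, dtype=float)

    # Pairwise squared Euclidean distances between signatures.
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off

    if epsilon is None:
        # Assumed heuristic: median of the non-zero squared distances.
        epsilon = np.median(d2[d2 > 0])

    # Gaussian affinities and node degrees of the resulting graph.
    K = np.exp(-d2 / epsilon)
    degree = K.sum(axis=1)

    # The Markov matrix is P = D^-1 K. Its spectrum is obtained from the
    # symmetric matrix S = D^-1/2 K D^-1/2, which has the same eigenvalues.
    S = K / np.sqrt(np.outer(degree, degree))
    eigvals, eigvecs = np.linalg.eigh(S)

    # Sort eigenpairs by decreasing eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Right eigenvectors of P are recovered as psi = D^-1/2 v.
    psi = eigvecs / np.sqrt(degree)[:, None]

    # Skip the trivial first eigenpair (eigenvalue 1, constant eigenvector).
    coords = psi[:, 1:n_components + 1] * eigvals[1:n_components + 1] ** t
    return coords, eigvals


# Synthetic stand-in for patch signatures: two groups of 10-dimensional vectors.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.5, size=(20, 10))
group_b = rng.normal(5.0, 0.5, size=(20, 10))
coords, eigvals = diffusion_map(np.vstack([group_a, group_b]))
print(coords.shape)   # one 3D point per signature
print(eigvals[:4])    # leading eigenvalues of the Markov matrix
```

Euclidean distances between rows of `coords` can then be used to compare patches in the reduced space. The dense eigendecomposition is only practical for a few thousand patches; a sparse neighborhood graph would be a natural substitute for larger sets.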

Materials and methods

Images used to illustrate this study come from either histological sections or tissue microarrays (TMAs) of breast cancer, stained according to the Ki67 protocol and the Hematoxylin–Eosin-Saffron protocol (HES). They were acquired at 20× magnification on a microscopic scanner device (ScanScope CS; Aperio Technologies). The resolution of the WSI is 0.5 microns/pixel and the image size of a histological section can reach 50,000 × 40,000 pixels, corresponding to an uncompressed file of size 5.6
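
To give a feel for the data volume involved, the following sketch tiles an image of the size quoted above into non-overlapping square patches and computes its raw size. The patch size of 100 pixels, the 3 bytes per pixel (24-bit RGB) and the helper names are assumptions chosen for illustration; the source text available here does not state them.

```python
def patch_grid(width, height, patch_size):
    """Top-left corners of non-overlapping square patches, row by row.

    Incomplete patches at the right and bottom borders are dropped.
    """
    cols = width // patch_size
    rows = height // patch_size
    return [(c * patch_size, r * patch_size)
            for r in range(rows)
            for c in range(cols)]


def uncompressed_bytes(width, height, bytes_per_pixel=3):
    """Raw image size, assuming 3 bytes per pixel (24-bit RGB)."""
    return width * height * bytes_per_pixel


WIDTH, HEIGHT = 50_000, 40_000   # image size quoted in the text
MICRONS_PER_PIXEL = 0.5          # resolution quoted in the text
PATCH_SIZE = 100                 # hypothetical patch side, in pixels

corners = patch_grid(WIDTH, HEIGHT, PATCH_SIZE)
print(len(corners))                        # number of patches to describe
print(PATCH_SIZE * MICRONS_PER_PIXEL)      # patch side in microns
print(uncompressed_bytes(WIDTH, HEIGHT))   # raw size in bytes
```

With these assumed values the grid holds 200,000 patches of 50 microns per side, and the raw image occupies 6 × 10^9 bytes.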

Results and discussion

To simplify the presentation of our tissue heterogeneity quantification method in this abstract, tissue microarrays (TMAs) and sub-images extracted from WSIs of breast cancer are used. Fig. 2 shows four TMAs with their corresponding pseudo-color maps in the DM cube. The homogeneous parts are visible and are separated by strong transitions in the tissue structure. In Fig. 3, four sub-images of size 3000 × 2000 pixels were selected and qualitatively evaluated by a pathologist. Fig. 3(1) and Fig. 3(2),
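
One simple way to picture a pseudo-color map in the DM cube is to rescale the three projected coordinates of each patch to [0, 1] and read them as red, green and blue. This encoding is an assumption made for illustration, since the text available here does not specify how colors are assigned; `pseudo_color_map` is a hypothetical helper, and patches are assumed to be listed row by row.

```python
import numpy as np


def pseudo_color_map(coords, rows, cols):
    """Turn 3D coordinates (one row per patch) into a rows x cols RGB map."""
    coords = np.asarray(coords, dtype=float)

    # Rescale each axis of the cube independently to the range [0, 1].
    lo = coords.min(axis=0)
    span = coords.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid dividing by zero on a constant axis

    rgb = (coords - lo) / span
    # Patches are assumed to be ordered row by row over the image.
    return rgb.reshape(rows, cols, 3)


# Four patches arranged on a 2 x 2 grid, with made-up 3D coordinates.
demo = pseudo_color_map(
    [[0.0, 0.0, 0.0], [1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [1.0, 0.0, 4.0]],
    rows=2, cols=2)
print(demo)
```

Under this encoding, patches that lie close together in the reduced space receive similar colors, so homogeneous regions appear as uniformly colored areas and abrupt color changes mark transitions in the tissue structure.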

Conclusion

Patch analysis, although it may appear to be a basic approach, allows areas with similar configurations, such as stroma or tumor epithelium, to be segmented, and thus encodes the spatial organization of the same structure within the entire image. In future work, in order to decrease the computational burden generated by the “connected patches” approach, we plan to work on sampled patches obtained from stereological regular grids.
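
Sampling on a regular grid could look like the sketch below, which places patches at a fixed step from a random starting offset, in the manner of systematic uniform random sampling used in stereology. The function name, the step and the patch size are hypothetical; the planned sampling design is not detailed in the text.

```python
import random


def systematic_grid(width, height, patch_size, step, seed=None):
    """Top-left corners of patches sampled on a regular grid.

    The grid origin is drawn at random within one step, and only patches
    lying entirely inside the image are kept.
    """
    rng = random.Random(seed)
    x0 = rng.randrange(step)
    y0 = rng.randrange(step)
    return [(x, y)
            for y in range(y0, height - patch_size + 1, step)
            for x in range(x0, width - patch_size + 1, step)]


# Made-up example: a 1000 x 800 image, 100-pixel patches, one every 300 pixels.
sampled = systematic_grid(1000, 800, patch_size=100, step=300, seed=0)
exhaustive = (1000 // 100) * (800 // 100)
print(len(sampled), "sampled patches instead of", exhaustive)
```

The step controls the trade-off between computational cost and spatial coverage: a larger step processes fewer patches but may miss small structures.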

References (13)

  • R.R. Coifman et al. Diffusion maps. Appl Comput Harmonic Anal (2006)
  • C.R. Rao. Diversity and dissimilarity coefficients: a unified approach. Theor Pop Biol (1982)
  • M. Gurcan et al. Histopathological image analysis: a review. IEEE Rev Biomed Eng (2009)
  • F.J. Brooks et al. Quantification of heterogeneity observed in medical images. BMC Med Imaging (2013)
  • M. Oger et al. Automated region of interest retrieval and classification using spectral analysis. Diagn Pathol (2008)
  • S. Kullback et al. On information and sufficiency. Ann Math Stat (1951)
There are more references available in the full text version of this article.

Special thanks to the Pathology Department, Caen, France & National Center of Pathology, Vilnius, Lithuania.
