Image database analysis of Hodgkin lymphoma

https://doi.org/10.1016/j.compbiolchem.2013.04.003

Highlights

  • We perform automated analysis on a database of Hodgkin lymphoma (HL) images.

  • The database contains images of the two HL types, nodular sclerosis and mixed cellularity, as well as reactive lymphoid tissue.

  • Preprocessing narrows down the region of interest that contains malignant cells.

  • Preprocessing reduces the extensive requirements on computer power and memory for the automated analysis of HL images.

  • The degree of CD30+ staining is a characteristic feature of HL images.

Abstract

Hodgkin lymphoma (HL) is a special type of B-cell lymphoma, arising from germinal center B-cells. Morphological and immunohistochemical features of HL, as well as the spatial distribution of malignant cells, differ from those of other lymphoma and cancer types. Sophisticated protocols for immunostaining and the acquisition of high-resolution images are becoming routine in pathology labs, and large, daily growing databases of high-resolution digital images are emerging. Systematic tissue image analysis and computer-aided exploration may provide new insights into HL pathology. The automated analysis of high-resolution images, however, is a hard task in terms of required computing time and memory. Special concepts and pipelines for analyzing high-resolution images can boost the exploration of image databases.

In this paper, we report an analysis of high-resolution digital color images of HL tissue slides. Applying a protocol of CD30 immunostaining to identify malignant cells, we implement a pipeline to handle and explore image data of stained HL tissue. To the best of our knowledge, this is the first systematic application of image analysis to HL tissue slides. To illustrate the concept and methods, we analyze images of two different HL types, nodular sclerosis and mixed cellularity, as the most common forms, and of reactive lymphoid tissue for comparison. The pipeline is adapted to the special requirements of whole slide images of HL tissue and identifies relevant regions that contain malignant cells.

Using a preprocessing approach, we separate the relevant tissue region from the background. Applying a supervised recognition method, we assign each pixel in the images to one of six predefined classes: Hematoxylin+, CD30+, Nonspecific red, Unstained, Background, and Low intensity. Local areas with pixels assigned to the class CD30+ identify regions of interest. As expected, an increased amount of CD30+ pixels is a characteristic feature of nodular sclerosis, and the non-lymphoma cases show a characteristically low amount of CD30+ stain. Images of mixed cellularity samples include cases of high CD30+ coloring as well as cases of low CD30+ coloring.
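As a rough illustration of assigning pixels to the six classes, the following minimal sketch uses a nearest-centroid rule in RGB space. This is a stand-in chosen for brevity; the abstract does not specify which supervised method is used. The reference colors in `CLASS_CENTROIDS`, the function names, and the choice to exclude Background and Low intensity pixels from the denominator are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical reference colors (R, G, B) for each class. In practice these
# would be estimated from labeled training pixels, not hard-coded.
CLASS_CENTROIDS = {
    "Hematoxylin+": (70, 80, 160),
    "CD30+": (170, 40, 90),
    "Nonspecific red": (215, 150, 170),
    "Unstained": (225, 220, 225),
    "Background": (250, 250, 250),
    "Low intensity": (20, 20, 20),
}


def classify_pixel(rgb, centroids=CLASS_CENTROIDS):
    """Return the class whose centroid is closest in squared RGB distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(rgb, center))
    return min(centroids, key=lambda name: sq_dist(centroids[name]))


def cd30_fraction(pixels):
    """Fraction of tissue pixels labeled CD30+.

    Assumption: Background and Low intensity pixels are not counted as tissue.
    """
    counts = Counter(classify_pixel(p) for p in pixels)
    tissue = sum(n for name, n in counts.items()
                 if name not in ("Background", "Low intensity"))
    return counts["CD30+"] / tissue if tissue else 0.0
```

Under these assumptions, a slide or tile with a high `cd30_fraction` would be a candidate region of interest, while a low value would be more typical of the non-lymphoma cases described above.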

Graphical abstract

Malignant Reed–Sternberg cell in Hodgkin lymphoma (HL). We perform an automated analysis of a dataset of whole slide HL images. Our pipeline identifies regions which contain malignant tissue. Preprocessing reduces the extensive requirements on computer power and memory for an automated analysis of HL images.

Introduction

Hodgkin lymphoma (HL) is a type of lymphoma that originates from germinal center B-cells in most cases (Küppers et al., 1994). In HL, neoplastic cells circumvent immune surveillance and apoptotic control. Characteristically, the malignant cells make up only about 1% of the tumor mass and are outnumbered by a microenvironment of reactive lymphoid cells (Steidl et al., 2011, Mathas et al., 2009). According to the WHO classification, two main types are distinguished: the classical HL (cHL) and the nodular lymphocyte-predominant HL (NLP HL).

The assignment to cHL is based on the identification of characteristic tumor cells termed Hodgkin cells and Reed–Sternberg cells (HRS cells) (Reed, 1902, Sternberg, 1898). For the immunophenotypic differentiation of HL by immunohistochemical methods, the cluster-of-differentiation (CD) protocols are of particular interest. They are used to identify cell surface proteins presented on leukocytes, e.g., CD30, CD20, and CD15. For example, CD30 belongs to the tumor necrosis factor receptor (TNFR) family and is usually expressed by HRS cells, but can also be expressed by activated blasts in reactive lesions like lymphadenitis.

Fig. 1 illustrates a typical CD30+ immunostained HRS cell within its microenvironment. Hematoxylin counterstaining is applied to visualize cell nuclei. Depending on immunophenotypic findings and the composition of the cellular microenvironment, cHL is further divided into two subtypes, the most common one termed nodular sclerosis (NS cHL) and the mixed cellularity (MC cHL) (Hansmann and Willenbrock, 2002). All subtypes of cHL share an immunophenotype of CD30+ (Küppers, 2009). CD15 and CD20 can be variably expressed, with usual CD20-negativity and CD15-positivity. However, CD30 expression can also be observed in reactive lesions. Therefore, a detailed investigation of the location of CD30+ blasts is mandatory, and image analysis may provide systematic information for this task.

Moreover, the expression of various cytokines and chemokines is thought to be responsible for the attraction of different cells, such as small lymphocytes, macrophages, mast cells, plasma cells, stromal cells, histiocytes, or fibroblasts (Aldinucci et al., 2010, Steidl et al., 2011). The extent of cellular infiltration, involving many different cell types as a response to inflammatory signals, represents a key characteristic of HL (Hartmann et al., 2013). The events that happen in the course of malignancy development as well as the complex interaction network of malignant cells in HL are not understood. Here, image analysis may provide new insights into the pathology of HL.

There is a broad range of literature on image analysis of non-lymphoid tumors. For example, Naik et al. (2008) used a Bayesian nuclear segmentation scheme in prostate and breast cancer histopathology. Bayesian models with additional contextual information from Markov random fields have been applied as a probabilistic approach for prostate cancer detection (Monaco et al., 2009). Lei et al. (2011) applied Gaussian mixture models to perform local and global clustering, extracting different tissue constituents in cervix histology images. Sertel et al. (2009) developed a system for segmentation of eosinophilic and basophilic structures in hematoxylin and eosin (H&E) stained tissue section images of neuroblastoma. Large-scale computations and machine learning algorithms have been applied to breast histology images (Petushi et al., 2006). Karacali and Tözeren (2007) proposed a high-throughput method of texture heterogeneity on breast tissue images to identify regions of interest. An automatic classification of three types of malignant lymphomas has been proposed by Orlov et al. (2010), taking into account chronic lymphocytic leukemia, follicular lymphoma, and mantle cell lymphoma. They achieved high accuracy in discriminating the different lymphoma types using various statistical and texture features. Sertel et al. (2008) developed a computer-aided method to detect follicles based on texture features in an immunostained image, followed by the detection of centroblasts within the follicular regions.

Today, a number of software solutions for analyzing images are available. Commercial software includes general tools like MATLAB (Matlab, 2013) as well as software geared especially toward digital pathology like Aperio ImageScope (Aperio, 2013), Definiens Tissue Studio (Definiens), and many others. There also exist freely available tools, the most prominent of which are CellProfiler (Lamprecht et al., 2007), ImageJ (Abramoff et al., 2004), GNU Octave (Eaton et al., 2008), and Scilab (Enterprises and Consortium, 2013). Despite all these approaches, no dedicated work exists on the analysis of HL images that takes into account the specific characteristics of this lymphoma type. This is important because NS cHL and MC cHL represent two different types of HL, which often have distinct patterns of tumor cell distribution. In NS cHL, the tumor cells grow in a nodular pattern with broad collagen bands separating the nodules. In contrast, MC cHL tumor cells are more isolated and often distributed over the whole tissue section. NS cHL accounts for about 64%, whereas MC cHL is found in about 30% of the cHL cases in our database. While the two types of cHL can be visually distinguished in many cases, non-lymphoma (NL) images sometimes exhibit features similar to those of cHL subtypes. An example is lymphadenitis, an infection of the lymph nodes that results from certain bacterial or viral infections and often leads to inflammation and swelling of the lymph nodes.

Our work aims to analyze the considered types based on CD30-stained tissue sections, using pixel-based image processing techniques. To be flexible in handling and exploring HL whole slide tissue images, and to make the software easy to extend in future projects, we implemented a new pipeline. Our software is intended to evolve into a tool that extracts statistical information from a large database of Hodgkin lymphoma images. We aim to combine our software with existing imaging software like CellProfiler (Lamprecht et al., 2007).

Material

The image sources used in this study are CD30 stained tissue slides of lymph node sections. The tissue slides are pretreated, and immunostained for CD30 as previously described (Hartmann et al., 2011). The reddish coloring is basically produced by the enzyme alkaline phosphatase that is bound via a biotin–streptavidin linker to a CD30 antibody and hydrolyzes the magenta dye new fuchsine. Hematoxylin counterstain was applied to stain cell nuclei. The digital slides are captured using an Aperio

Methods

We use OpenSlide 3.2 to handle whole slide images, applying the Java Advanced Imaging API (Goode and Satyanarayanan, 2008). The input images contain stained tissue in front of a bright background. Some of the images show artifacts like air bubbles, small tissue fragments, and stain residues. To eliminate such artifacts, as well as the background, we developed a sequence of preprocessing steps.
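One way such preprocessing could look is sketched below: a brightness threshold separates tissue from the bright background, and small connected components (for example isolated tissue fragments or stain residues) are discarded. This is only an illustration and not the procedure used in the paper; the function names, the 4-connectivity, and the values of `bg_threshold` and `min_size` are assumptions. The input is a plain grid of grayscale values, as could be obtained from a downsampled slide.

```python
from collections import deque


def tissue_mask(gray, bg_threshold=220):
    """Mark pixels darker than bg_threshold as tissue (True)."""
    return [[value < bg_threshold for value in row] for row in gray]


def remove_small_components(mask, min_size=3):
    """Drop 4-connected tissue components with fewer than min_size pixels."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    out = [[False] * width for _ in range(height)]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep only components that are large enough to be real tissue.
            if len(component) >= min_size:
                for cy, cx in component:
                    out[cy][cx] = True
    return out
```

Running the component filter on a low-resolution version of the slide first would keep memory use small, since only the surviving region needs to be read at full resolution afterwards.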

Identification of regions of interest

The fraction of points eliminated by the preprocessing varies strongly with the area of tissue slides and the quality of tissue preparation. We observe a typical value for the fraction of background of 55%, but portions down to 10% or up to 90% background are not exceptional. The area of low intensity is considerably smaller and typically below 2%.
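The per-slide fractions quoted above follow directly from the class counts. A minimal sketch, assuming a hypothetical dictionary of pixel counts keyed by the six class names (the counts in the usage note are made up for illustration and are not data from the study):

```python
def eliminated_fractions(class_counts):
    """Return the fractions of Background and Low intensity pixels on a slide.

    class_counts maps class names to pixel counts; missing classes count as 0.
    """
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("slide has no classified pixels")
    background = class_counts.get("Background", 0) / total
    low = class_counts.get("Low intensity", 0) / total
    return {
        "background": background,
        "low_intensity": low,
        "retained": 1.0 - background - low,
    }
```

For example, illustrative counts of 550 Background and 20 Low intensity pixels out of 1000 give a background fraction of 0.55 and a low-intensity fraction of 0.02, leaving 0.43 of the pixels for further analysis.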

Fig. 3 illustrates the assignment of pixels to the six classes. On the top part of Fig. 3 three sections of images are shown: a region with CD30+

Conclusion

In this work, we present the method and results of a new pipeline for automated image analysis adapted to the requirements of the pathology of HL. The diagnosis of HL is based on the identification of characteristic multinucleated giant cells within an inflammatory milieu. The malignant cells can be rather sparse in the lymphoma and their detection by visual inspection can be a hard and exhausting task even for an experienced pathologist. Our aim is to extract statistical features relevant for

References (29)

  • M. Abramoff et al.

    Image processing with ImageJ

    Biophotonics International

    (2004)
  • D. Aldinucci et al.

    The classical Hodgkin’s lymphoma microenvironment and its role in promoting tumour growth and immune escape

    Journal of Pathology

    (2010)
  • Aperio

    Aperio ImageScope

    (2013)
  • Definiens

    Definiens Tissue Studio 3

    (2013)
  • J. Eaton et al.

    GNU Octave Manual Version 3

    (2008)
  • S. Enterprises et al.

    Scilab: Free and Open Source Software for Numerical Computation

    (2013)
  • R. Gonzalez et al.

    Digital Image Processing

    (2008)
  • A. Goode et al.

    A vendor-neutral library and viewer for whole-slide images. Technical Report

    (2008)
  • M.-L. Hansmann et al.

    Die WHO-Klassifikation des Hodgkin-Lymphoms und ihre molekularpathologische Relevanz

    Der Pathologe

    (2002)
  • S. Hartmann et al.

    Revising the historical collection of epithelioid cell-rich lymphomas of the Kiel lymph node registry: what is Lennert’s lymphoma nowadays?

    Histopathology

    (2011)
  • S. Hartmann et al.

    Spindle-shaped CD163+ rosetting macrophages replace CD4+ T-cells in HIV-related classical Hodgkin lymphoma

    Modern Pathology

    (2013)
  • R. Küppers et al.

    Hodgkin disease: Hodgkin and Reed–Sternberg cells picked from histological sections show clonal immunoglobulin gene rearrangements and appear to be derived from B cells at various stages of development

    Proceedings of the National Academy of Sciences of the United States of America

    (1994)
  • R. Küppers

    The biology of Hodgkin’s lymphoma

    Nature Reviews Cancer

    (2009)
  • B. Karacali et al.

    Automated detection of regions of interest for tissue microarray experiments: an image texture analysis

    BMC Medical Imaging

    (2007)
1 These authors contributed equally.