Abstract
In vitro experiments with cell cultures are essential for studying growth and migration behaviour and thus, for gaining a better understanding of cancer progression and its treatment. While recent progress in lens-free microscopy (LFM) has rendered it an inexpensive tool for continuous monitoring of these experiments, there is only little work on analysing such time-lapse sequences.
We propose (1) a cell detector for LFM images based on residual learning, and (2) a probabilistic model based on moral lineage tracing that explicitly handles multiple detections and temporal successor hypotheses by clustering and tracking simultaneously. (3) We benchmark our method on several hours of LFM time-lapse sequences in terms of detection and tracking scores. Finally, (4) we demonstrate its effectiveness for quantifying cell population dynamics.
You have full access to this open access chapter, Download conference paper PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
1 Introduction
Cell growth and migration play key roles in cancer progression: abnormal cell growth can lead to formation of tumors and cancer cells can spread to other parts of the body, a process known as metastasis. In vitro experiments are essential to understand these mechanisms and for developing anti-cancer drugs. In these experiments, the cells are typically observed with conventional light microscopes. Thanks to recent advances in CMOS sensor technology, lens-free microscopy (LFM) [4, 13] has become a promising alternative. In LFM a part of the incident wavefront originating from the light source is scattered by the sample, in this case the cell. The scattered light then interferes with the unscattered part of the wavefront and the resulting interference pattern is recorded with a CMOS sensor. The components required for LFM are extremely small and cheap. Thus, LFM provides the means for a wide range of applications where a conventional light microscope would be either too big or simply too expensive, such as the continuous monitoring of growing cell cultures inside standard incubators [9].
To quantify the clinically relevant information on cell growth and migration from the large amount of images that are acquired in such continuous monitoring, reliable automatic image analysis methods are crucial. Counting the number of cells in a time series of images gives access to the dynamics of cell growth. Locating and tracing individual cells provides information about cell motility, and over the course of a sequence, reconstructing the lineage trees gives insights into cell cycle timings and allows more selective analysis of cell sub-cultures.
There are several works on these tasks in traditional light microscopy, e.g. focussing on cell segmentation [15], detection and counting [8, 10, 16] or tracking [1, 6, 7, 12, 14], but very few deal with LFM data. One of the few exceptions is [3] which employs a regression framework for estimating the total cell count per image. We aim at the more complex goal of not only counting but also localizing cells and reconstructing their spatio-temporal lineage forest (c.f. Fig. 1). Methods for the latter task range from Kalman filtering [1] to keep track of moving cells, or iteratively composing tracklets by using the Viterbi algorithm [11], and have been compared in [12]. More recently, Jug et al. [7] have proposed a mathematically rigorous framework for lineage reconstruction, the so-called moral lineage tracing problem (MLTP). The MLTP differs fundamentally from all mathematical abstractions of cell tracking whose feasible solutions are either disjoint paths or disjoint trees of detections. Unlike these approaches that select only one detection for each cell in every image, feasible solutions of the MLTP select and cluster an arbitrary set of such detections for each cell. This renders the lineage trees defined by feasible solutions of the MLTP robust to the addition of redundant detections, a property we will exploit in this work.
In this paper, we contribute a framework for analysis of LFM sequences. First, we design and benchmark robust cell detectors for LFM time-lapse sequences derived from most recent work on convolutional neural networks and residual learning. Second, we discuss the MLTP in the context of LFM data. In particular, we define a probability measure for which the MLTP is a maximum a posteriori (MAP) estimator. This allows us to define the costs in the objective function of the MLTP w.r.t. probabilities that we estimate from image data. We validate it experimentally on two annotated sequences. Finally, we demonstrate the capability of our approach to quantify biologically relevant parameters from sequences of two in vitro experiments with skin cancer cells.
2 Methods
We consider the lineage tracing task as a MAP inference over a hypothesis graph containing a multitude of potential lineage forests. We discuss the probability measure and its MAP estimator, the MLTP in Sect. 2.1. In order to construct the hypothesis graph from a sequence of LFM images, we devise a cell detector in Sect. 2.2, which estimates a cell probability map for each given image. The workflow is illustrated in Fig. 2.
2.1 Lineage Tracing
Hypothesis Graph. We construct a spatio-temporal hypothesis graph \(G= (V, E)\) as follows: For every image \(I_t\) in the sequence, we apply a cell detector and define one node \(v \in V_t\) for every local maximum in \(P\left( c_s=1\vert I_t\right) \), the estimated probability map for finding a cell at a particular location s in image \(I_t\). Additionally, we define hypothesized successors to each node that has one or more favourable parents in the previous frame but no immediate successor. This helps avoiding gaps in the final tracklets. The nodes \(v \in V\) represent cells, yet do not need to be unique, i.e. one cell may give rise to several nodes. We then construct edges in space \(E_t^{\mathrm {sp}} = \{ uv \in V_t \times V_t : d(u,v) \le d_{\mathrm {max}}\}\), i.e. between any two nodes that lie within a distance of \(d_{\mathrm {max}}\), and in the same fashion, we construct temporal edges \(E_t^{\mathrm {tmp}} = \{ uv \in V_t \times V_{t+1} : d(u,v) \le d_{\mathrm {max}}\}\) between nodes in adjacent frames.
Probabilistic Model. We introduce a family of probability measures, each defining a conditional probability of any lineage forest, given an image sequence. We describe the learning of this probability from a training set of annotated image sequences as well as the inference of a maximally probable lineage forest, given a previously unseen image sequence. The resulting MAP estimation problem will assume the form of an MLTP with probabilistically justified costs.
First, we encode subgraphs in terms of cut edges with binary indicator variables \(\mathbf {x}\in \{ 0, 1\}^{ E }\). If edge uv is cut, i.e. \(x_{uv}=1\), it means that nodes u and v do not belong together. In order to ensure that the solution describes a lineage forest, we rely on the formulation of the MLTP [7], which describes the set of inequalities that are required to do so. In short, these constraints ensure: (1) spatial and temporal consistency, i.e. if nodes u and v as well as v and w belong together, then u and w must also belong together. (2) Distinct tracklets cannot merge at a later point in time. These are the so called morality constraints. (3) Bifurcation constraints allow cells to split in no more than two distinct successors. We will denote the set of \(\mathbf {x}\) that describe valid lineage forests with \(X_G\). For a more extensive discussion of these constraints, we refer to [7, 14]. We next model the measure of probability:
It is comprised of four parts. First, we have \(P\left( X_G\vert \mathbf {x}\right) \) representing the uniform prior over \(X_G\). Second, the cut probability \(P\left( x_{uv}\vert \mathbf {\Theta }\right) \) describing the probability of u and v being part of the same cell (either in space or along time), and third and fourth, the birth and termination probabilities \(P\left( x_v^+\vert \mathbf {\Theta }\right) \) and \(P\left( x_v^-\vert \mathbf {\Theta }\right) \) for each node \(v\in V\). The variables \(x_v^+, x_v^- \in \{0, 1\}\) are indicating whether the respective event, birth or termination, occurs at node v. \(\mathbf {\Theta }\) denotes the joint set of parameters. We use these parts to incorporate the following assumptions: Two detections u and v that are close are more likely to originate from the same cell, hence we choose \(P\left( x_{uv}=1\vert \mathbf {\Theta }\right) = \min (\frac{d(u,v)}{\theta ^{\mathrm {sp}}}, 1)\). Similarly, two successive detections u at t and v at \(t+1\) are more likely to be related the closer they are, is captured by \(P\left( x_{uv}=1\vert \mathbf {\Theta }\right) = \min (\frac{d(u,v)}{\theta ^{\mathrm {tmp}}}, 1)\). Finally, we assume that birth and termination events occur at a low rate, which is incorporated by \(P\left( x_v^+=1\vert \mathbf {\Theta }\right) = \theta ^+\) and \(P\left( x_v^-=1\vert \mathbf {\Theta }\right) = \theta ^-\). We fit these parameters \(\mathbf {\Theta }\) on training data in a maximum likelihood fashion: For \(\theta ^{-}\) and \(\theta ^{+}\) this boils down to calculating the relative frequency of the respective events on the annotated lineage. For the spatial and temporal parameters \(\theta ^{\mathrm {sp}}\) and \(\theta ^{\mathrm {tmp}}\), we first complement the lineage forest with edges within \(d_{\max }\) as \(\mathcal {E}\). We then maximize the log-likelihood \(\log \mathcal {L}(\theta ) = \sum _{uv \in \mathcal {E}} \log P\left( x_{uv}\vert \theta \right) \) by an extensive search over the interval \(\theta \in \left[ \theta _{\min }, \theta _{\max }\right] \), where we found \(\left[ 1, 80\right] \) to be appropriate.
The MAP estimate \(\mathbf {x}^*= \mathop {\arg \,\mathrm{max}}\nolimits _{\mathbf {x}\in \mathbf {X}} P\left( \mathbf {x}\vert \mathbf {\Theta }, X_G\right) \) can be written as solution to the MLTP:
where the coefficients become \(c_{uv} = -\log \frac{P\left( x_{uv}=1\vert \mathbf {\Theta }\right) }{1-P\left( x_{uv}=1\vert \mathbf {\Theta }\right) }\) for edges, and vice versa for \(c_v\) of the node events. \(X_V\) is the set of \(\mathbf {x}\) that satisfy the auxiliary constraints which tie birth and termination indicator variables \(x_v^-\) and \(x_v^+\) to the respective edge variables. We optimize (3) with the KLB algorithm described in [14].
2.2 Cell Detection with Residual Networks
Cells in LFM images are usually only marked at their center of mass and not segmented since their interference pattern, i.e. their appearance in the image, does not accurately describe their true shape and would therefore be ambiguous in many cases. Thus, we are interested in a detector that outputs the set of cell centers in image \(I_t\). Strong performance of the detector is crucial for the lineage reconstruction as its errors can affect the final lineage trees over many frames. To achieve this, we build on the recent work on residual networks [5]. However, instead of directly regressing bounding boxes or center coordinates in a sliding window fashion, we train our network, denoted with \(f(I_t)\), on a surrogate task: We approximate \(f(I_t) \approx P\left( c_s = 1\vert I_t\right) \), the probablity map of finding a cell at a particular location s in \(I_t\). This detector is fully convolutional and its output \(f(I_t)\) has the same size as \(I_t\). We found this to facilitate the training as it enlargens the spatial support of the sparse cell center annotations and gracefully handles the strongly varying cell density. Similar findings were made with techniques that learn a distance transform to detect cells, e.g. in [8]. We describe next how we arrive at a suitable architecture for this task and how to construct \(P\left( c_s = 1\vert I_t\right) \) from point-wise cell annotations.
Network Architecture. We start from the architecture of ResNet-50 [5]. We first truncate the network at layer 24 to obtain a fully convolutional detector. We found that truncating in the middle of the original ResNet-50, i.e. at layer 24, results in best resolution of the output response maps and allows to distinguish close cells. We then add one convolutional layer of \(1 \times 1 \times 256\) and one up-convolutional layer (also known as deconvolutional layer) of \(8 \times 8 \times 1\) with a stride of 8. The former combines all feature channels, while the latter compensates for previous pooling operations and ensures that the predicted cell probability map has the same resolution as the input image \(I_t\). Finally, a sigmoid activation function is used in the last layer to ensure that \(f(I_t)\) is within the interval \(\left[ 0, 1 \right] \) at any point.
Loss Function and Training. We sample training images of size \(224 \times 224\) from all frames of the training corpus. For each training image \(I_k\), we construct a corresponding cell probability map \(P\left( c_s=1\vert I_k\right) \) by placing a Gaussian kernel \(G^\sigma \) with \(\sigma =8\) at each annotated center. This implicitly represents the assumption that all cells have about the same extent, which is reasonable for our microscopy data. Each image is normalized to zero mean and unit variance. During training, we minimize the cross entropy loss between the predicted map \(f(I_t)\) and \(P\left( c_s\vert I_t\right) \) in order to let our network approximate the constructed cell probability map. We fine tune the network (pre-trained weights from ResNet-50 [5]) with a learning rate of \(10^{-3}\) for 100 epochs with batch size of 8. In each epoch, we sample 4000 training images. Since the annotated dataset for training is typically small and shows strong correlation between cells in consecutive frames, we used dropout of 0.5 after the last convolutional layer to avoid overfitting.
3 Experiments and Results
Datasets. We use a sequence depicting A549 cells, annotated over 250 frames in a region of interest (ROI) of \(1295 \times 971\,\mathrm {px}\), for all training purposes. For testing, we annotated two distinct sequences monitoring 3T3 cells of 350 and 300 frames in a ROI of \(639 \times 511\,\mathrm {px}\) (3T3-I) and \(1051 \times 801\,\mathrm {px}\) (3T3-II), respectively. Images were acquired at an interval of \(3\,\mathrm {min}\) with \(1.4\,\mathrm {\upmu m} \times 1.4\,\mathrm {\upmu m}\) per pixel.
Benchmarking Detectors. We compare four different network configurations, including the described ResNet-23, ResNet-11, a variant of it which was truncated at layer 11, the UNet [15] and CNN-4. In UNet, we obtained better results when replacing the stacks in the expansive path with single up-convolution layers which are merged with the corresponding feature maps from contracting path. CNN-4 is a plain vanilla CNN with three \(5\times 5\) convolutional layers followed by max pooling and finally, one up-convolutional layer of \(8 \times 8 \times 1\) to compensate for the down-sampling operations. We use the same training procedure (Sect. 2.2) for all detectors, but adjust the learning rate for UNet and CNN-4 to \(10^{-2}\).
We match annotated cells to detections within each frame with the hungarian algorithm and consider only matches closer than \(10\,\mathrm {px}\) (\(\approx \) a cell center region) as a true positive (TP). Unmatched annotations are counted as false negative (FN), unmatched detections as false positive (FP). The results are presented in Fig. 3, where we find the ResNet-23 to be the most robust detector.
Lineage Tracing. To compare the quality of different lineages, we match again annotations and detections within each frame to calculate the number of TP, FP and FN as described before. We then determine the number of false links, i.e. how often two matched nodes do not have the same parent. From these, we calculate multiple object detection accuracy (MODA) and multiple object tracking accuracy (MOTA) [2]. Moreover, we derive the number of edit operations needed to get from the predicted lineage to the ground truth lineage, and calculate the tracking accuracy (TRA) score proposed in [12]. We use unit weight for each type of edit (add or delete node or edge). This is justified by the fact that we have point annotations for cells instead of segmentations, making both addition and deletion equally expensive to correct.
For the MLTP, we compare the effect of varying \(\theta ^{\mathrm {tmp}}\), \(\theta ^{\mathrm {sp}}\) together with hypothesis graphs generated from the different detectors in Fig. 4. The optimal parameter choice for ResNet-23 is at 10, i.e. a relatively small merge radius of favourable merges, while the other detectors considerably benefit from wider ranges. In Table 1, we compare different lineage tracing approaches. Our baseline is linear assignment problem tracking (LAPT) [6]. The disjoint trees method (DTP), uses our ResNet-23 detections but solves the disjoint trees problem instead, i.e. it considers only one detection per cell. We find that MLTP outperforms both in terms of detection and tracking metrics.
Assessing Cell Population Dynamics. We apply our method on data from two experiments with skin cancer cells. In each, one population is exposed to an inhibitor substance while the control is not. Figure 5 depicts the resulting statistics. We observe the expected difference in growth rate, yet a more constrained motility of the control cells, which is caused by the limited space.
4 Conclusions
We have presented a framework for automatic analysis of LFM time-lapse sequences. It transfers two recently proposed methods, residual learning [5] and moral lineage tracing [7, 14], to the task at hand. We have shown experimentally that it is able to determine cell lineage forests of high quality and thereby quantify several measures of interest for the analysis of in vitro experiments, including cell population dynamics and motility.
References
Arbelle, A., Drayman, N., Bray, M., Alon, U., Carpenter, A., Raviv, T.R.: Analysis of high-throughput microscopy videos: catching up with cell dynamics. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 218–225. Springer, Cham (2015). doi:10.1007/978-3-319-24574-4_26
Bernardin, K., Stiefelhagen, R.: Evaluating multiple object tracking performance: the clear mot metrics. EURASIP JIVP (1), 1–10 (2008). https://link.springer.com/article/10.1155/2008/246309
Flaccavento, G., et al.: Learning to count cells: applications to lens-free imaging of large fields. In: Microscopic Image Analysis with Applications in Biology (2011)
Greenbaum, A., et al.: Imaging without lenses: achievements and remaining challenges of wide-field on-chip microscopy. Nat. Methods 9(9), 889–895 (2012)
He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Jaqaman, K., et al.: Robust single-particle tracking in live-cell time-lapse sequences. Nat. Methods 5(8), 695–702 (2008)
Jug, F., et al.: Moral lineage tracing. In: CVPR, pp. 5926–5935 (2016)
Kainz, P., Urschler, M., Schulter, S., Wohlhart, P., Lepetit, V.: You should use regression to detect cells. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 276–283. Springer, Cham (2015). doi:10.1007/978-3-319-24574-4_33
Kesavan, S.V., et al.: High-throughput monitoring of major cell functions by means of lensfree video microscopy. Sci. Rep. 4, 5942 (2014)
Khan, A., Gould, S., Salzmann, M.: Deep convolutional neural networks for human embryonic cell counting. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9913, pp. 339–348. Springer, Cham (2016). doi:10.1007/978-3-319-46604-0_25
Magnusson, K.E.G., Jaldén, J.: A batch algorithm using iterative application of the viterbi algorithm to track cells and construct cell lineages. In: ISBI (2012)
Maška, M., et al.: A benchmark for comparison of cell tracking algorithms. Bioinformatics 30(11), 1609–1617 (2014)
Mudanyali, O., et al.: Compact, light-weight and cost-effective microscope based on lensless incoherent holography for telemedicine applications. Lab Chip 10(11), 1417–1428 (2010)
Rempfler, M., et al.: Efficient algorithms for moral lineage tracing arXiv:1702.04111 (2017)
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). doi:10.1007/978-3-319-24574-4_28
Xie, W., et al.: Microscopy cell counting with fully convolutional regression networks. In: MICCAI 1st WDLMIA (2015)
Acknowledgements
With the support of the Technische Universität München – Institute for Advanced Study, funded by the German Excellence Initiative (and the European Union Seventh Framework Programme under grant agreement no. 291763).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Rempfler, M., Kumar, S., Stierle, V., Paulitschke, P., Andres, B., Menze, B.H. (2017). Cell Lineage Tracing in Lens-Free Microscopy Videos. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D., Duchesne, S. (eds) Medical Image Computing and Computer-Assisted Intervention − MICCAI 2017. MICCAI 2017. Lecture Notes in Computer Science(), vol 10434. Springer, Cham. https://doi.org/10.1007/978-3-319-66185-8_1
Download citation
DOI: https://doi.org/10.1007/978-3-319-66185-8_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66184-1
Online ISBN: 978-3-319-66185-8
eBook Packages: Computer ScienceComputer Science (R0)