1 Introduction

Most image partitioning algorithms are defined over a graph encoding purely attractive interactions. No matter whether a segmentation or clustering is then found agglomeratively (as in single linkage clustering/watershed) or divisively (as in spectral clustering or iterated normalized cuts), the user either needs to specify the desired number of segments or a termination criterion. An even stronger form of supervision is in terms of seeds, where one pixel of each segment needs to be designated as such either by a user or automatically. Unfortunately, clustering with automated seed selection remains a fragile and error-fraught process, because every missed or hallucinated seed causes an under- or oversegmentation error. Although the learning of good edge detectors boosts the quality of classical seed selection strategies (such as finding local minima of the boundary map, or thresholding boundary maps), non-local effects of seed placement along with strong variability in region sizes and shapes make it hard for any learned predictor to place exactly one seed in every true region.

Fig. 1.

Left: Overlay of raw data from the ISBI 2012 EM segmentation challenge and the edges for which attractive (green) or repulsive (red) interactions are estimated for each pixel using a CNN. Middle: vertical/horizontal repulsive interactions at intermediate/long range are shown in the top/bottom half. Right: Active mutual exclusion (mutex) constraints that the proposed algorithm invokes during the segmentation process. (Color figure online)

In contrast to the above class of algorithms, multicut/correlation clustering partitions vertices with both attractive and repulsive interactions encoded into the edges of a graph. Multicut has the great advantage that a “natural” partitioning of a graph can be found, without needing to specify a desired number of clusters, or a termination criterion, or one seed per region. Its great drawback is that its optimization is NP-hard.

The main insight of this paper is that when both attractive and repulsive interactions between pixels are available, a generalization of the watershed algorithm can be devised that segments an image without the need for seeds, stopping criteria or thresholds. It examines all graph edges, attractive and repulsive, sorted by their weight, and adds them to an active set iff they are not in conflict with previous, higher-priority decisions. The attractive subset of the resulting active set is a forest, with one tree representing each segment. However, the active set can have loops involving more than one repulsive edge. See Fig. 1 for a visual abstract.

In summary, our principal contribution, the Mutex Watershed, is a “best of both worlds” algorithm that combines the multicut’s desirable lack of hyperparameters with the small computational footprint of a Kruskal-type watershed algorithm.

The algorithm is presented in Sect. 3. In Sect. 4 we evaluate the algorithm against very strong baselines. We choose a challenging dataset for neuron segmentation from electron microscopy (EM) image stacks as benchmark. For this task, watershed segmentation is a key component: EM staining only highlights membrane boundaries, discouraging the use of region cues for segmentation. By incorporating long-range repulsions into the watershed procedure, we obtain an accurate segmentation directly from this step, avoiding costly agglomeration-based post-processing. In addition, we present preliminary results on the BSDS500, demonstrating the applicability of the proposed method to natural images. We describe our future plans, including extensions to semantic segmentation, in Sect. 5. Our implementation is available at https://github.com/hci-unihd/mutex-watershed.git.

2 Related Work

In the original watershed algorithm [1], seeds were automatically placed at all local minima of the boundary map. Unfortunately, this leads to severe over-segmentation. Defining better seeds has been a recurring theme of watershed research ever since. The simplest solution is offered by the seeded watershed algorithm [2]: It relies on an oracle (an external algorithm or a human) to provide seeds and assigns each pixel to its nearest seed in terms of minimax path distance. In the absence of an oracle, automatic seed selection is challenging because exactly one seed must be placed in every region. Simple methods, e.g. defining seeds by connected regions of low boundary probability, do not work: The segmentation quality is usually insufficient because multiple seeds are in the same region and/or seeds leak through the boundary.

This problem is typically addressed by biasing seed selection towards over-segmentation (with seeding at all minima being the extreme case). The watershed algorithm then produces superpixels that are merged into final regions by more or less elaborate postprocessing. This works better than using watersheds alone because it exploits the larger context afforded by superpixel adjacency graphs. Many criteria have been proposed to identify the regions to be preserved during merging, e.g. region dynamics [3], the waterfall transform [4], extinction values [5], region saliency [6], and \((\alpha ,\omega )\)-connected components [7]. A merging process controlled by criteria like these can be iterated to produce a hierarchy of segmentations where important regions survive to the next level. Variants of such hierarchical watersheds are reviewed and evaluated in [8].

These results highlight the close connection of watersheds to hierarchical clustering and minimum spanning trees/forests [9, 10], which inspired novel merging strategies and termination criteria. For example, [11] simply terminated hierarchical merging by fixing the number of surviving regions beforehand. The method of [12] incorporates predefined sets of generalized merge constraints into the clustering algorithm. Graph-based segmentation according to [13] defines a measure of quality for the current regions and stops when the merge costs would exceed this measure. Ultrametric contour maps [14] combine the gPb (global probability of boundary) edge detector with an oriented watershed transform. Superpixels are agglomerated until the ultrametric distance between the resulting regions exceeds a learned threshold. An optimization perspective is taken in [15], which introduces h-increasing energy functions and builds the hierarchy incrementally such that merge decisions greedily minimize the energy. The authors prove that the optimal cut corresponds to a different unique segmentation for every value of a free regularization parameter.

An important line of research is based on the observation that superior partitionings are obtained when the graph has both attractive and repulsive edges. Solutions that optimally balance attraction and repulsion do not require external stopping criteria such as predefined number of regions or seeds. This generalization leads to the NP-hard problem of correlation clustering or (synonymous) multicut (MC) partitioning [16]. Fortunately, modern integer linear programming solvers in combination with incremental constraint generation can solve problem instances of considerable size [17], and good approximations exist for even larger problems [18, 19].

Another beneficial extension is the introduction of additional long-range edges. Thanks to their larger field of view, the strength of these edges can often be estimated with greater certainty than is achievable for the local edges used in standard watersheds. This has been used in [20] to represent object size constraints by repulsive long-range edges, which is still an MC-type problem. When long-range edges are also allowed to be attractive, the problem turns into the more complicated lifted multicut (LMC) [21]. Realistic problem sizes can only be solved approximately [22, 23], but watershed superpixels followed by LMC postprocessing achieve state-of-the-art results on important benchmarks [24]. Long-range edges are also used in [25], but only as side losses for training the boundary detection CNN; they are not used explicitly in any downstream inference.

In general, striking progress in watershed-based segmentation has been achieved by learning boundary maps with convolutional neural networks (CNNs). This is nicely illustrated by the evolution of neuron segmentation for connectomics, an important field we also address in the experimental section. CNNs were introduced to this application in [26] and became, in much refined form [27], the winning entry of the ISBI 2012 Neuro-Segmentation Challenge [28]. Boundary maps and superpixels were further improved by progress in CNN architectures and data augmentation methods, using U-Nets [29], FusionNets [30] or inception modules [24]. Subsequent postprocessing with the GALA algorithm [31, 32], conditional random fields [33] or the lifted multicut [24] pushed the envelope of final segmentation quality. MaskExtend [34] applied CNNs to both boundary map prediction and superpixel merging, while flood-filling networks [35] eliminated superpixels altogether by training a recurrent neural network to perform region growing one region at a time.

Most networks mentioned so far learn boundary maps on pixels, but learning works equally well for edge-based watersheds, as was demonstrated in [36, 37] using CNN-generated edge weights according to [38, 39]. Tailoring the learning objective to the needs of the watershed algorithm by penalizing critical edges along minimax paths [39] or end-to-end training of edge weights and region growing [40] improved results yet again.

Outside of connectomics, [41] obtained superior boundary maps from CNNs by learning not just boundary strength, but also its gradient direction. Holistically-nested edge detection [42, 43] couples the CNN loss at multiple resolutions using deep supervision and is successfully used as a basis for watershed segmentation of medical images in [44].

The present paper combines all these concepts (hierarchical clustering, attractive and repulsive interactions, long-range edges, and CNN-based learning) into a novel efficient segmentation framework. It can be interpreted as a generalization of [12], because we also allow for soft constraints (which can be overridden by strong attractive edges), and constraints are generated on the fly by a neural network rather than predefined. Our method is also related to greedy additive edge contraction (GAEC) according to [22], but we handle attractive and repulsive interactions separately and define edge strength between clusters by a maximum instead of an additive rule.

3 The Mutex Watershed Algorithm

3.1 Definitions and Notation

We consider the problem of clustering a graph \(G(V, E^+\cup E^-, W^+\cup W^-)\) with both attractive and repulsive edge attributes. The scalar attribute \(w^+_e\in \mathbb {R}^+_0\) associated with edge \(e\in E^+\) is a merge affinity: the higher this number, the higher the inclination of the two incident vertices to be assigned to the same cluster. Similarly, \(w^-_e\in \mathbb {R}^+_0\) for \(e\in E^-\) is a split tendency: the higher this number, the greater the tendency of the incident vertices to be in different clusters.

In our application, each vertex corresponds to one pixel in the image to be segmented. Two vertices may have no edge connecting them; or an attractive edge \(e\in E^+\); or a repulsive edge \(e\in E^-\); or two edges at the same time, one attractive and one repulsive. Edges can be either local/short-range (when connecting two pixels that are immediately adjacent in the image) or long-range.

The Mutex Watershed algorithm, defined in Subsect. 3.3, maintains disjoint active sets \(A^+ \subseteq E^+\), \(A^- \subseteq E^-\), \(A^+ \cap A^- = \emptyset \), that encode merges and mutual exclusion constraints, respectively. Clusters are defined via the “connected” predicate:

$$\begin{aligned} \forall i,j\in V: \qquad \Pi _{i \rightarrow j} &= \{\text {path } \pi \text { from } i \text { to } j \text { with } \pi \subseteq E^+\}\\ \mathrm {connected}(i, j) &\Leftrightarrow \exists \text { path } \pi \in \Pi _{i \rightarrow j} \text { with } \pi \subseteq A^+ \subseteq E^+\\ \mathrm {cluster}(i) &= \{i\} \cup \{j: \mathrm {connected}(i,j)\} \end{aligned}$$

Conversely, the active subset \(A^-\subseteq E^-\) of repulsive edges defines mutual exclusion relations by using the following predicate:

$$\begin{aligned} \mathrm {mutex}(i, j) \Leftrightarrow \; &\exists \, e=(k,l)\in A^- \text { with } \\ &k \in \mathrm {cluster}(i) \text { and } l \in \mathrm {cluster}(j) \text { and } \\ &\mathrm {cluster}(i) \ne \mathrm {cluster}(j) \end{aligned}$$

Admissible active edge sets \(A^+\) and \(A^-\) must be chosen such that the resulting clustering is consistent, i.e. nodes engaged in a mutual exclusion constraint cannot be in the same cluster: \(\mathrm {mutex}(i,j) \Rightarrow \mathrm {not} \,\, \mathrm {connected}(i, j)\). The “connected” and “mutex” predicates can be efficiently evaluated using a union-find data structure.
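Both predicates can be made concrete in a few lines. The following is a minimal Python sketch (class and method names are ours, not the reference implementation linked in Sect. 1) of a union-find structure augmented with per-cluster hash sets of active mutex edges; the hash-set bookkeeping anticipates the complexity analysis of Sect. 3.4:

```python
class MutexUnionFind:
    """Union-find forest augmented with per-cluster mutex-edge sets.

    A minimal sketch: mutexes[r] holds the ids of active repulsive
    edges incident to the cluster whose union-find root is r.
    """
    def __init__(self, n_nodes):
        self.parent = list(range(n_nodes))
        self.rank = [0] * n_nodes
        self.mutexes = [set() for _ in range(n_nodes)]

    def find(self, i):
        # Path compression keeps the trees flat.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def connected(self, i, j):
        return self.find(i) == self.find(j)

    def mutex(self, i, j):
        # Is there an active mutual exclusion between the clusters of
        # i and j? (Callers only invoke this when the clusters differ.)
        mi, mj = self.mutexes[self.find(i)], self.mutexes[self.find(j)]
        if len(mi) > len(mj):
            mi, mj = mj, mi
        return any(e in mj for e in mi)  # search smaller set in larger

    def add_mutex(self, i, j, edge_id):
        self.mutexes[self.find(i)].add(edge_id)
        self.mutexes[self.find(j)].add(edge_id)

    def merge(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return
        if self.rank[ri] < self.rank[rj]:
            ri, rj = rj, ri
        self.parent[rj] = ri
        if self.rank[ri] == self.rank[rj]:
            self.rank[ri] += 1
        # The merged cluster inherits both mutex sets; moving the
        # smaller set into the larger one bounds the average cost.
        if len(self.mutexes[ri]) < len(self.mutexes[rj]):
            self.mutexes[ri], self.mutexes[rj] = self.mutexes[rj], self.mutexes[ri]
        self.mutexes[ri] |= self.mutexes[rj]
        self.mutexes[rj] = set()
```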

Fig. 2.

Two equivalent representations of the seeded watershed clustering obtained using (a) a maximum spanning tree computation or (b) Algorithm 1. Both graphs share the weighted attractive (green) edges and seeds (hatched nodes). The infinitely attractive connections to the auxiliary node (gray) in (a) are replaced by infinitely repulsive (red) edges between each pair of seeds in (b). The two final clusterings are defined by the active sets (bold edges) and are identical. Node colors indicate the clustering result, but are arbitrary. (Color figure online)

3.2 Seeded Watershed from a Mutex Perspective

One interpretation of the proposed method is as a generalization of the edge-based watershed algorithm [9, 45, 46] or image foresting transform [47]. This algorithm can only ingest a graph with purely attractive interactions, \(G(V, E^+, W^+)\). Without further constraints, the algorithm would yield only the trivial result of a single cluster comprising all vertices. To obtain more interesting output, an oracle needs to provide seeds, namely precisely one node per cluster. These seed vertices are all connected to an auxiliary node (see Fig. 2(a)) by auxiliary edges with infinite merge affinity. A maximum spanning tree (MST) on this augmented graph can be found in linearithmic time; and the maximum spanning tree (or, in the case of degeneracy, at least one of the maximum spanning trees) will include the auxiliary edges. When the auxiliary edges are deleted from the MST, a forest results, with each tree representing one cluster [9, 45, 47].

We now reformulate this well-known algorithm in a way that will later emerge as a special case of the proposed Mutex Watershed: we eliminate the auxiliary node and edges, and replace them by a set of infinitely repulsive edges, one for each pair of seeds (Fig. 2(b)). Algorithm 1 is a variation of Kruskal’s MST algorithm operating on the seed mutex graph just defined, and gives results identical to seeded watershed on the original graph.

Algorithm 1. Kruskal's maximum spanning tree algorithm with an added mutex check (pseudocode figure; sketched below).
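Since the typeset pseudocode does not survive text extraction, here is a hedged Python sketch of Algorithm 1 (function and argument names are ours), built on the MutexUnionFind sketch from Sect. 3.1: every pair of seeds receives a predefined mutex constraint, standing in for the infinitely repulsive edges of Fig. 2(b), and attractive edges are then processed in descending order of weight as in Kruskal's algorithm.

```python
def seeded_watershed(n_nodes, edges, weights, seeds):
    """Algorithm 1 sketch: Kruskal's algorithm plus a mutex check.

    edges[e] = (i, j) are attractive edges with merge affinities
    weights[e]; seeds holds exactly one node per desired cluster.
    Returns the active set A+ (a maximum spanning forest).
    """
    uf = MutexUnionFind(n_nodes)
    # Predefined mutex constraints between every pair of seeds
    # replace the auxiliary node and edges of Fig. 2(a).
    mutex_id = len(edges)
    for a in range(len(seeds)):
        for b in range(a + 1, len(seeds)):
            uf.add_mutex(seeds[a], seeds[b], mutex_id)
            mutex_id += 1
    active = []
    for e in sorted(range(len(edges)), key=lambda e: -weights[e]):
        i, j = edges[e]
        if not uf.connected(i, j) and not uf.mutex(i, j):
            uf.merge(i, j)
            active.append(edges[e])
    return active
```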

This algorithm differs from Kruskal’s only by the check for mutual exclusion in the if-statement. Obviously, the modified algorithm has the same effect as the original algorithm, because the final set \(A^+\) is exactly the maximum spanning forest obtained after removing the auxiliary edges from the original solution.

In the sequel, we generalize this construction by admitting less-than-infinitely repulsive edges. Importantly, these can be dense and are hence much easier to estimate automatically than seeds with their strict requirement of only-one-per-cluster.

3.3 Mutex Watersheds

We now introduce the core contribution: an algorithm that is empirically no more expensive than an MST computation, but that can ingest both attractive and repulsive cues and partition a graph into a number of clusters that does not need to be specified beforehand. There is no requirement of one seed per cluster, nor even of a hyperparameter that would implicitly determine the number of resulting clusters.

The Mutex Watershed, Algorithm 2, proceeds as follows. Given a graph with sets of attractive and repulsive edges \(E^+\) and \(E^-\), with edge weights \(W^+\) and \(W^-\) respectively, sort all edges \(E^+\cup E^-\), attractive or repulsive, by their weight in descending order into a priority queue. Iteratively pop all edges from the queue and add them to the active set one by one, provided a set of conditions is satisfied. More specifically, if the next edge popped from the priority queue is attractive and its incident vertices are not yet in the same tree, then connect the respective trees, provided this is not ruled out by a mutual exclusion constraint. If, on the other hand, the popped edge is repulsive and its incident vertices are not yet in the same tree, then add a mutual exclusion constraint between the two trees.

Algorithm 2. The Mutex Watershed (pseudocode figure; sketched below).
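Rendered as a hedged Python sketch in the same style as above (names again ours): attractive and repulsive edges now share a single descending-weight queue, and mutex constraints are created on the fly.

```python
def mutex_watershed(n_nodes, edges, weights, is_repulsive):
    """Algorithm 2 sketch. edges[e] = (i, j); weights[e] is a merge
    affinity (attractive) or split tendency (repulsive), as flagged
    by is_repulsive[e]. Returns the final MutexUnionFind clustering."""
    uf = MutexUnionFind(n_nodes)
    # One priority queue over E+ and E-, highest weight first.
    for e in sorted(range(len(edges)), key=lambda e: -weights[e]):
        i, j = edges[e]
        if uf.connected(i, j):
            continue  # same tree: a repulsive edge here is simply ignored
        if is_repulsive[e]:
            uf.add_mutex(i, j, e)  # mutex constraint created dynamically
        elif not uf.mutex(i, j):
            uf.merge(i, j)  # attractive edge, not overruled by any mutex
    return uf
```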

The crucial difference to Algorithm 1 is that mutex constraints are no longer predefined, but created dynamically whenever a repulsive edge is found. However, new exclusion constraints can never override earlier, higher-priority merge decisions; a repulsive edge that conflicts with one is simply ignored. Similarly, an attractive edge must never override earlier, and thus higher-priority, must-not-link decisions.

3.4 Time Complexity Analysis

Before analyzing the time complexity of Algorithm 2, we first review the complexity of Kruskal’s algorithm. Using a union-find data structure, the time complexity of \(\mathrm {merge(i,~j)}\) and \(\mathrm {connected(i,~j)}\) is \(\mathcal {O}(\alpha (V))\), where \(\alpha \) is the slowly growing inverse Ackermann function, and the total runtime complexity is dominated by the initial sorting of the edges, \(\mathcal {O}(E \log E)\) [48].

To check for mutex constraints efficiently, we maintain a set of all active mutex edges

$$ M[C_i] = \{(u,~v) \in A^- | u \in C_i \vee v \in C_i\} $$

for every \(C_i = \mathrm {cluster}(i)\) using hash tables, where insertion of new mutex edges (i.e. \(\mathrm {addmutex}\)) and search have an average complexity of \(\mathcal {O}(1)\). Note that every cluster can be efficiently identified by the root node of its union-find tree. For \(\mathrm {mutex(i,~j)}\) we check whether \(M[C_i] \cap M[C_j] = \emptyset \) by searching for all elements of the smaller hash table in the larger hash table. Therefore \(\mathrm {mutex(i,~j)}\) has an average complexity of \(\mathcal {O}(\min (|M[C_i]|, |M[C_j]|))\). Similarly, during \(\mathrm {merge(i,~j)}\), mutex constraints are inherited by merging the two hash tables, which also has an average complexity of \(\mathcal {O}(\min (|M[C_i]|, |M[C_j]|))\).
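A toy example (node indices are arbitrary) illustrates how constraints are inherited through \(\mathrm {merge}\) in the sketch from Sect. 3.1:

```python
uf = MutexUnionFind(4)
uf.add_mutex(0, 3, edge_id=0)  # repulsive edge between clusters {0} and {3}
uf.merge(0, 1)                 # cluster {0, 1} inherits the mutex edge
assert uf.mutex(1, 3)          # the constraint now also covers node 1
assert not uf.mutex(1, 2)      # no constraint between {0, 1} and {2}
```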

In conclusion, the average runtime contributions of attractive edges, \(\mathcal {O}(|E^{+}| \cdot \alpha (V) + |E^{+}| \cdot M)\) (checking mutex constraints and possibly merging), and of repulsive edges, \(\mathcal {O}(|E^{-}| \cdot \alpha (V) + |E^{-}|)\) (insertion of one mutex edge), result in the following total average runtime complexity of Algorithm 2:

$$\begin{aligned} \mathcal {O}(E \log E + E \cdot \alpha (V) + EM). \end{aligned}$$
(1)

where M is the expected value of \(\min (|M[C_i]|, |M[C_j]|)\). Using \(\alpha (V) \in \mathcal {O}(\log V) \subseteq \mathcal {O}(\log E)\), this simplifies to

$$\begin{aligned} \mathcal {O}(E \log E \;+\; EM). \end{aligned}$$
(2)

In the worst case \(\mathcal {O}(M) = \mathcal {O}(E)\), so the Mutex Watershed algorithm has a worst-case runtime complexity of \(\mathcal {O}(E^2)\). Empirically, however, we find that \(\mathcal {O}(EM) \approx \mathcal {O}(E\log E)\) by measuring the runtime of the Mutex Watershed for different sub-volumes of the ISBI challenge (see Fig. 3), leading to an

$$\begin{aligned} \text {Empirical Mutex Watershed Complexity: }\mathcal {O}(E \log E) \end{aligned}$$
(3)
Fig. 3.

Runtime T of the Mutex Watershed (without sorting of edges) measured on sub-volumes of the ISBI challenge of different sizes (thereby varying the total number of edges E). We plot \(\frac{T}{|E|}\) over |E| in a logarithmic plot, which makes \(T \sim |E| \log (|E|)\) appear as a straight line. A logarithmic function (green line) is fitted to the measured \(\frac{T}{|E|}\) (blue crosses) with \(R^2 = 0.9896\). The good fit suggests that empirically \(T \approx \mathcal {O}(E \log E)\). (Color figure online)

4 Experiments

We evaluate the Mutex Watershed on the challenging task of neuron segmentation in electron microscopy (EM) image volumes. This application is of key interest in connectomics, a field of neuroscience that strives to reconstruct neural wiring diagrams spanning complete central nervous systems. The task requires segmentation of neurons from electron microscopy images of neural tissue: a challenging endeavor, since segmentation has to be based only on boundary information (cell membranes) and some of the boundaries are not very pronounced. Moreover, cells contain membrane-bound organelles, which have to be suppressed in the segmentation. Some of the neuron protrusions are very thin, but all of them have to be preserved in the segmentation to arrive at the correct connectivity graph. While a lot of progress has been made recently, only manual tracing yields sufficient accuracy for correct circuit reconstruction [49].

We validated the Mutex Watershed algorithm on the most popular neuron segmentation challenge: ISBI 2012 [28]. We estimate the edge weights using a CNN as described in Sect. 4.1 and compare with other entries in the leaderboard, as well as with other common post-processing methods applied to the same network predictions, in Sect. 4.2.

4.1 Estimating Edge Weights with a CNN

The common approach to EM segmentation is to predict which pixels belong to a cell membrane using a CNN. Different post-processing methods are used on top to obtain a segmentation; see Sect. 2 for an overview of these methods. The CNN can be trained to predict either boundary pixels [24, 27] or undirected affinities [25, 50], which express how likely it is for a pixel to belong to a different cell than its neighbors in the 6-neighborhood. In this case, the output of the network contains three channels, corresponding to the left, down and next-imaging-plane neighbors in 3D. The affinities do not have to be limited to immediate neighbors: in fact, [25] have shown that the introduction of long-range affinities is beneficial for the final segmentation, even if they are only used as an auxiliary loss during training. Building on the work of [25], we train a CNN to predict short- and long-range affinities and then use those directly as weights for the Mutex Watershed algorithm.

We estimate the affinities/edge weights for the neighborhood structure shown in Fig. 4. To that end, we define local attractive and long-range repulsive edges. The choice of this structure has to be motivated by the underlying data; we use a different pattern for in-plane and between-plane edges due to the anisotropy of the validation datasets. In more detail, we picked a sparse ring of in-plane repulsive edges plus additional longer-range in-plane edges, which were necessary to split regions reliably (see Fig. 4a). We also added connections to the indirect neighbors in the lower adjacent slice to ensure correct 3D connectivity (see Fig. 4b).

In total, \(C^+\) attractive and \(C^-\) repulsive edges are defined for each pixel, resulting in \(C^+ + C^-\) output channels in the network. We partition the set of attractive/repulsive edges into subsets \(H^+_{c}\) and \(H^-_{c}\) that contain all edges at a specific offset: \(E^+ = {\bigcup _{c}^{C^+}} H^+_{c}\) for attractive edges, and analogously for repulsive edges. Each subset \(H^+_{c}\) or \(H^-_{c}\) corresponds to a specific channel predicted by the network. We further assume that weights take values in [0, 1] and adopt the same conventions for attractiveness/repulsion as in Sect. 3. For more details on network architecture and training see Supplementary 1.

In our experiments, we pick a subset of repulsive edges, by using strides of 2 in the XY-plane in order to avoid artifacts caused by occasional very thick membranes. Note that the stride is not applied to local (attractive) edges, but only to long-range (repulsive) edges.
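To make the construction concrete, the following hedged sketch (function name and stride handling are ours and simplified relative to the exact pattern of Fig. 4) flattens one affinity channel per offset into the edge list consumed by the Mutex Watershed:

```python
import numpy as np

def affinities_to_edges(affinities, offsets, shape, stride=1):
    """Flatten affinity channels into (edge, weight) lists.

    affinities: array of shape (C,) + shape with values in [0, 1];
    offsets:    C pixel offsets, e.g. (0, 0, 1) or (0, 9, 9);
    stride:     subsampling factor (2 for our repulsive edges).
    """
    node_ids = np.arange(np.prod(shape)).reshape(shape)
    edges, weights = [], []
    for c, offset in enumerate(offsets):
        for index in np.ndindex(*shape):
            if any(i % stride for i in index):
                continue  # strided subsampling of this channel
            neighbor = tuple(i + o for i, o in zip(index, offset))
            if any(n < 0 or n >= s for n, s in zip(neighbor, shape)):
                continue  # offset points outside the volume
            edges.append((int(node_ids[index]), int(node_ids[neighbor])))
            weights.append(float(affinities[c][index]))
    return edges, weights
```

In this sketch, attractive channels would be flattened with stride 1 and repulsive channels with stride 2; the two edge lists, together with the repulsive flags, feed directly into the mutex_watershed sketch of Sect. 3.3.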

Fig. 4.

Local neighborhood structure of attractive (green) and repulsive (red) edges in the Mutex Watershed graph. Due to point symmetry to the origin, we only predict half of the directions with the neural network. (Color figure online)

4.2 ISBI Challenge

The ISBI 2012 EM Segmentation Challenge [28] is the neuron segmentation challenge with the largest number of competing entries. The challenge data contains two volumes of dimensions \(1.5 \times 2 \times 2\) microns with a resolution of \(50 \times 4 \times 4\) nm per pixel. The groundtruth is provided as binary membrane labels, which can easily be converted to a 2D, but not a 3D, segmentation. To train a 3D model, we follow the procedure described in [24].

Fig. 5.

Mutex Watershed applied to the ISBI challenge test data. For further images and a detailed comparison to the baseline segmentation methods, see Supplementary Sect. 2.

The test volume has private groundtruth; results can be submitted to the leaderboard. They are evaluated based on the Adapted Rand Score (Rand-Score) and the Variation of Information Score (VI-Score) [28], separately for each 2D slice.

Our method holds the top entry in the challenge’s leaderboard at the time of submission, see Table 1a. This is especially remarkable because it is simpler than the methods holding the other top entries. Similar to us, they rely on a CNN to predict boundary locations, but postprocess its output with the complex pipeline described in [24], which involves an NP-hard partitioning step.

In addition, we compare to baseline post-processing methods starting from our network predictions: thresholding (THRESH), two watershed variants (WS, WSDT) and one multicut variant (MC-LOCAL) take into account only short-range predictions; lifted multicut (LMC) and another multicut variant (MC-FULL) also use long-range predictions. For these baseline methods we have only produced 2D segmentations for the individual slices, either because the 3D results were inferior (THRESH, WS, WSDT) or infeasible to obtain (MC, LMC). In contrast, the Mutex Watershed benefited from 3D segmentation. See Table 1b for the evaluation results and Supplementary 2 for further details on the baseline methods and a qualitative comparison.

The three methods that use short- and long-range connectivity perform significantly better than the other methods. Somewhat surprisingly, MWS performs better than MC-FULL and LMC, which are based on an NP-hard partitioning problem. This might be explained by the lack of 3D information in the latter two approaches (solving the 3D model was infeasible).

4.3 Study on Natural Image Segmentation

We conducted preliminary experiments on the Berkeley segmentation dataset BSD500 [53] to study the Mutex Watershed’s applicability to natural images. Training a state-of-the-art edge detection network on this small dataset requires a set of dataset-specific optimization tricks, such as training with external data, multi-resolution architectures and auxiliary losses [43]. In this preliminary study we train a 2D version of the network used for the ISBI experiments to predict the 2D connectivity pattern depicted in Fig. 4a. To compensate for the small size of the training set, we present this network with the predictions from [42] as an additional input channel.

Table 1. Results on the ISBI 2012 EM Segmentation Challenge.

In order to isolate the influence of the quality of the underlying affinities, we run ablation experiments in which we interpolate (via a weighted average) between (a) affinities predicted by our neural network, (b) affinities obtained from the ground-truth and (c) uniform noise. We obtain Mutex Watershed segmentations from the interpolated affinities for the BSD test set, size-filter them (as the only post-processing step) and evaluate with the Rand Index. The “phase transition diagram” resulting from these experiments is shown in Fig. 6a; Fig. 6b lists the Rand Index and Variation of Information obtained for several points on this diagram.
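The interpolation itself is simple; a minimal sketch (array names are ours, and the mixing weights are barycentric coordinates on the triangle of Fig. 6a):

```python
import numpy as np

def mix_affinities(nn_aff, gt_aff, w_nn, w_gt, rng=None):
    """Weighted average of NN predictions, GT affinities and uniform
    noise; (w_nn, w_gt, 1 - w_nn - w_gt) are assumed to sum to one."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.uniform(0.0, 1.0, size=gt_aff.shape)
    return w_nn * nn_aff + w_gt * gt_aff + (1.0 - w_nn - w_gt) * noise
```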

Observe that the vertices corresponding to (a) and (c) can be interpreted as structured and unstructured noise on the ground-truth affinities, respectively. Hence, the results of our experiments show that the Mutex Watershed is fairly robust against both types of noise: when mixing the GT with noise, the quality of the segmentations is unaffected by up to 60% noise. When mixing GT with NN predictions, it is unaffected to an even higher degree.

In addition, we compare to the result of [22], who use an approach similar to ours and solve a lifted multicut based on long-range potentials extracted from a pre-computed probability map. In Supplementary 3, we show the segmentations resulting at different stages of interpolation between GT, NN predictions and noise.

Fig. 6.

(a) BSD500 segmentation quality of the MWS algorithm, given affinities from ground truth (top corner), from a neural network (right corner) or pure noise (left corner); plus hundreds of experiments on weighted combinations of the above. MWS segmentation quality (evaluated with the Rand index) degrades only once a large amount of noise is added to the affinities. (b) BSD500 scores at various interpolations between the neural network predictions (NN), ground-truth (GT) and noise. See Supplementary Sect. 3 for example images of the interpolated affinities. We include [22] as a reference point, because they also use long-range potentials in their segmentation method.

5 Conclusion

We have presented a fast algorithm for the clustering of graphs with both attractive and repulsive edges. The ability to consider both obviates the need for the stopping criteria or seeds that all popular algorithms except correlation clustering require. The proposed method has low computational complexity, in imitation of its close relative, Kruskal’s algorithm.

Finally, we have found that the proposed algorithm, when presented with informative edge costs from a good neural network, outperforms all known methods on a competitive bioimage partitioning benchmark, including methods that operate on the very same network predictions.

In future work we want to generalize our algorithm to semantic instance segmentation commonly found in natural image segmentation challenges [54,55,56].