1 Introduction

Image segmentation is a widely used concept of considerable importance in computer vision. It is applicable to content-based image retrieval, medical imaging, object detection, recognition tasks, etc. In the current paper, the articulated components of a 3D digital object are identified by segmentation. There are four main approaches to segmentation, namely thresholding, boundary detection, region-based, and hybrid methods [7]. A survey of different segmentation and partitioning techniques for boundary meshes has been presented in [10], where the segmentation problem has been formulated as an optimization problem. Similarly, [13] provides a broad overview of the common binarization and segmentation methods used in 3D image processing. Further, a comparative study of some mesh segmentation algorithms and their results has been provided in [2].

A work on hierarchical segmentation of articulated shapes involving eigenfunctions of the Laplace-Beltrami operator and persistent homology has been reported in [9]. A volume-based shape function called the shape-diameter function (SDF) has been used in [11] to construct partitioning of skeletons which remain consistent across a family of objects. A novel hierarchical pose-invariant mesh segmentation algorithm has been proposed in [4], which helps in the extraction of important feature points and of the core component of the mesh using a spherical mirroring operation. A semantic-oriented hierarchical 3D mesh segmentation method in [12] uses an enhanced topological skeleton to identify junction areas and obtain a fine segmentation of the object. An interactive and automatic method of model segmentation based on random walks has been demonstrated in [6].

2 Definitions and Preliminaries

  • Digital Object : A digital object \(\mathcal {A}\) is defined as a finite subset of \({\mathbb Z^3}\), with all its constituent points (i.e., voxels) having integer coordinates and connected in 26-neighborhood. Each voxel is equivalent to a 3-cell [5] centered at the concerned integer point. The isothetic distance between two points \(p(x_1,y_1,z_1)\) and \(q(x_2,y_2,z_2)\) is defined as the Minkowski norm \(L_{\infty }\) given by \(d_{\top }(p,q)=\max \{|{x_{1} - x_{2}}|,|{y_{1} - y_{2}}|,|{z_{1} - z_{2}}|\}\). The (isothetic) distance of a point p from an object \(\mathcal {A}\) is \(d_{\top }(p,\mathcal {A})=\min \{d_{\top }(p,q):q\in \mathcal {A}\}\), and the distance between two connected components \(\mathcal {A}_{1}\) and \(\mathcal {A}_{2}\) is \(d_{\top }(\mathcal {A}_{1}, \mathcal {A}_{2})=\min \{d_{\top }(p,q):p\in \mathcal {A}_{1},q\in \mathcal {A}_{2}\}\).

  • Antipodal points : Two points are antipodal (i.e., each is the antipode of the other) if they are diametrically opposite [1]. In mathematics, the concept of antipodal points is generalized to spheres of any dimension: two points on the sphere are antipodal if they are opposite through the centre. In the orthogonal domain, two voxels \(v_1\) and \(v'_1\) are termed as antipodal voxels w.r.t another voxel \(v_2\) if the following conditions are satisfied.

    • \(v_1\) and \(v'_1\) are located at diametrically opposite positions w.r.t \(v_2\).

    • \(|d_{\top } (v_1,v_2) - d_{\top } (v'_1,v_2)| < t\), where \(d_{\top } (v_1,v_2)\) and \(d_{\top } (v'_1,v_2)\) denote the isothetic distances between the corresponding pair of voxels and t is a threshold.

  • Articulated Components : A 3D object is said to be articulated if it consists of two or more flexibly connected rigid 3D object-components [8]. The components are attached through joints such that they can move w.r.t one another. The 3D model of a human body is an example of an articulated object where the hands, legs, and head are the articulated components.
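
The distance and antipodality definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, and "diametrically opposite" is interpreted here (as an assumption) to mean that the two voxels lie along opposite rays of one of the 26 neighborhood directions through \(v_2\).

```python
from typing import Iterable, Optional, Tuple

Voxel = Tuple[int, int, int]


def isothetic_distance(p: Voxel, q: Voxel) -> int:
    """L-infinity distance d_T(p, q) between two integer points."""
    return max(abs(a - b) for a, b in zip(p, q))


def distance_to_object(p: Voxel, obj: Iterable[Voxel]) -> int:
    """d_T(p, A): smallest isothetic distance from p to any voxel of A."""
    return min(isothetic_distance(p, q) for q in obj)


def _direction(d: Voxel) -> Optional[Voxel]:
    """Return the 26-neighborhood unit direction of offset d, or None.

    Assumption: an offset lies on a 26-direction ray only if all of its
    nonzero components have the same magnitude.
    """
    mags = {abs(c) for c in d if c != 0}
    if len(mags) != 1:
        return None
    return tuple((c > 0) - (c < 0) for c in d)


def is_antipodal_pair(v1: Voxel, v1p: Voxel, v2: Voxel, t: int) -> bool:
    """Check both antipodality conditions for (v1, v1p) w.r.t. v2."""
    u1 = _direction(tuple(a - b for a, b in zip(v1, v2)))
    u2 = _direction(tuple(a - b for a, b in zip(v1p, v2)))
    if u1 is None or u2 is None:
        return False
    opposite = all(a == -b for a, b in zip(u1, u2))
    balanced = abs(isothetic_distance(v1, v2) - isothetic_distance(v1p, v2)) < t
    return opposite and balanced
```

For example, with \(v_2\) at the origin, the voxels (3, 0, 0) and (-2, 0, 0) lie on opposite rays and their distances differ by 1, so they form an antipodal pair for t = 2 but not for t = 1.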

3 Proposed Work

Let us consider a 3D digital object A provided as a triangulated data set such that exactly two triangles are incident on each edge of the triangulation. Let the object be embedded on a 3D digital grid represented as a set of unit grid cubes (UGCs), each of length g. Our objective is to identify the articulated components of the object by exploiting its curve skeleton. The curve skeleton of an object is defined as a 3D curve which is single voxel thick, connected, and centered w.r.t the object, thereby capturing the object shape [3]. Figure 1(Right) shows the segmented articulated components of the digital object Tiger (Fig. 1(Left)) extracted by using its 3D curve skeleton in Fig. 1(Middle).

Fig. 1. A digital object Tiger (Left), its 3D curve skeleton (Middle), and identified articulated components (Right) (Color figure online).

3.1 Preprocessing

The 3D curve skeleton has been extracted using the 3D isothetic inner cover \(\underline{P}_{\mathbb G}(A)\). The 3D isothetic inner cover is defined as the 3D polyhedron of maximum volume defined w.r.t an underlying grid \({\mathbb G}\) having surfaces parallel to the coordinate planes and inscribing the entire object [3]. If a voxel p is intersected by one or more triangles on the object surface, then it is considered as an object voxel. A UGC is considered as partially object-occupied if at least one of its constituent voxels is a background voxel. Hence, a UGC intersected by one or more triangles is a partially object-occupied UGC. All object voxels within partially occupied UGCs constitute the set of boundary voxels \(\mathcal {B}\).

The combinatorial algorithm in [3] has been used to extract the 3D curve skeleton of a digital object. The voxels at the boundary of the inner cover are represented in a topological space. The object voxels enclosed by the inner cover and satisfying certain conditions along the three coordinate planes are expressed in another topological space. The topological spaces are related by exploiting the concepts of homotopy equivalence and homology computation. Homotopy equivalence of the two topological spaces indicates that the 3D curve skeleton accurately represents the shape of the object. The \(n^{th}\) homology groups of the two topological spaces are computed to be isomorphic to demonstrate that they are topologically identical in n-dimensional space. The 3D curve skeleton, thus obtained, is single voxel thick, connected, and centered w.r.t the object.

3.2 Segmentation of the 3D Curve Skeleton

Let S(A) be the set of voxels representing the 3D curve skeleton of the digital object A. If a voxel belonging to the curve skeleton is such that exactly one of the voxels in its 26-neighborhood belongs to the curve skeleton, then the voxel is termed as leaf voxel. Figure 2(a) shows a part of a digital object where the voxels representing the skeleton and the leaf voxel \(v_k\) are shown (in red).
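
The leaf-voxel test can be sketched as follows. This is a minimal illustration that assumes the skeleton S(A) is stored as a Python set of integer triples; the names `OFFSETS_26`, `skeleton_neighbors`, and `leaf_voxels` are hypothetical and do not come from the original implementation.

```python
from itertools import product

# The 26 neighbor offsets of a voxel (all nonzero offsets in {-1, 0, 1}^3).
OFFSETS_26 = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def skeleton_neighbors(v, skeleton):
    """Return the 26-neighbors of v that belong to the skeleton set."""
    x, y, z = v
    return [
        (x + dx, y + dy, z + dz)
        for dx, dy, dz in OFFSETS_26
        if (x + dx, y + dy, z + dz) in skeleton
    ]


def leaf_voxels(skeleton):
    """Skeleton voxels with exactly one skeleton voxel in their 26-neighborhood."""
    return sorted(v for v in skeleton if len(skeleton_neighbors(v, skeleton)) == 1)


# A short bent curve: its two end points are the leaf voxels.
S = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 1, 0)}
leaves = leaf_voxels(S)
```

Each voxel is examined once against a fixed number of offsets, which is consistent with detecting the leaves in time linear in the number of skeleton voxels.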

Fig. 2. (a) A sample 3D curve skeleton (red), a leaf voxel \(v_k\), and pairs of antipodal voxels \(\{u_1,u_1'\}\) (blue), \(\{u_2,u_2'\}\) (green), and \(\{u_3,u_3'\}\) (orange) w.r.t \(v_k\). (b) The boundary voxel \(v_i\) is assigned the color of its nearest skeleton voxel \(v_{min}\) (red) (Color figure online).

Let L denote the list of leaf voxels in S(A). Let the voxel \(v_k\) belong to the curve skeleton such that \(v_k \in L\). Starting from \(v_k\), the object voxels may be traversed in 26 directions so that a boundary voxel is reached at the end of traversal in each direction. Hence, we have 26 boundary voxels w.r.t \(v_{k}\). The 26 boundary voxels form 13 pairs of antipodal voxels (Sect. 2). Figure 2(a) shows three such pairs of antipodal voxels, \(\{u_1,u_1'\}\) (blue), \(\{u_2,u_2'\}\) (green), and \(\{u_3,u_3'\}\) (orange).

Let \((u_{j},u'_j)\) denote a pair of antipodal voxels w.r.t \(v_{k}\). Let \(d_{j}\) and \(d'_j\) be the distances in terms of voxels from \(v_{k}\) to \(u_{j}\) and \(u'_j\) respectively. If the difference between \(d_{j}\) and \(d'_j\) lies within a threshold, say t, then \(d_{m} = \max (d_{j},d'_j)\) is calculated. A pair of antipodal voxels is said to be at more or less equal distance from \(v_k\) if the above threshold criterion is satisfied. Let \(dmin_{k}\) be the minimum of the values of \(d_{m}\) obtained from those of the thirteen pairs of antipodal voxels w.r.t \(v_{k}\) that satisfy the criterion. A segment number is assigned to \(v_{k}\). Let E denote the list of segmented skeleton voxels. \(v_{k}\) along with its segment number is inserted into E. As \(v_{k}\) has been traversed, it is marked as visited.
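
The distance computation described above may be sketched as follows. This is a hedged illustration rather than the procedure FindDist of Fig. 3 itself: the object is assumed to be a Python set of voxels, the threshold test is written as a strict inequality to match Sect. 2, the `pick` parameter is a hypothetical convenience, and returning `None` when no pair satisfies the criterion is an assumption, since the text does not cover that case.

```python
from itertools import product


def direction_pairs():
    """The 13 pairs (d, -d) of opposite 26-neighborhood directions."""
    # Keeping only lexicographically positive offsets selects one of each pair.
    dirs = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]
    return [(d, tuple(-c for c in d)) for d in dirs]


def ray_length(v, d, obj):
    """Number of steps from v along d before leaving the object."""
    x, y, z = v
    steps = 0
    while (x + d[0], y + d[1], z + d[2]) in obj:
        x, y, z = x + d[0], y + d[1], z + d[2]
        steps += 1
    return steps


def find_dist(v, obj, t, pick=min):
    """Minimum (or, with pick=max, maximum) d_m over the balanced antipodal pairs."""
    values = []
    for d, neg in direction_pairs():
        d_j, d_j_prime = ray_length(v, d, obj), ray_length(v, neg, obj)
        if abs(d_j - d_j_prime) < t:  # pair is at more or less equal distance
            values.append(max(d_j, d_j_prime))
    return pick(values) if values else None


# Example: an 11 x 5 x 5 box of object voxels with v at its center.
box = {(x, y, z) for x in range(11) for y in range(5) for z in range(5)}
dmin_k = find_dist((5, 2, 2), box, t=1)
```

In this example the shortest balanced pair runs across the thin dimension of the box, so the returned value reflects the local thickness of the part around the skeleton voxel.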

Since the 3D curve skeleton is single voxel thick and connected, a single unvisited neighbor is always available for a skeleton voxel. Let \(v_i\) be the unvisited neighbor of \(v_k\). The procedure of finding the minimum value of \(d_m\) is repeated with \(v_i\). Let \(dmin_{i}\) be the minimum value of \(d_{m}\) obtained w.r.t \(v_i\). If the difference between \(dmin_{k}\) and \(dmin_{i}\) lies within a threshold, say \(t_{1}\), then \(v_{i}\) is assigned the same segment number as \(v_{k}\) and is inserted into E. The procedure is repeated as long as the threshold criterion with threshold \(t_{1}\) is satisfied. Every time a voxel is traversed, it is marked as visited. Once this condition fails, we start with the next leaf voxel in L and a new segment number is assigned to it. The entire procedure is repeated for every leaf voxel in L. The list E obtained at the end contains all the segmented skeleton voxels.
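
The leaf-driven traversal can be sketched as below. Several points are assumptions made only for this illustration: the per-voxel minimum distances are taken as a precomputed dictionary `dmin` (hypothetical name), each candidate voxel is compared against the value of the leaf that started the segment, a leaf already absorbed into an earlier segment is skipped, and the voxel at which the criterion fails is left unvisited so that a later leaf can reach it.

```python
from itertools import product

OFFSETS_26 = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def _neighbors(v, skeleton):
    """26-neighbors of v that are skeleton voxels."""
    return [
        (v[0] + dx, v[1] + dy, v[2] + dz)
        for dx, dy, dz in OFFSETS_26
        if (v[0] + dx, v[1] + dy, v[2] + dz) in skeleton
    ]


def segment_skeleton(skeleton, dmin, t1):
    """Return E as a dict mapping each segmented skeleton voxel to its segment number."""
    leaves = sorted(v for v in skeleton if len(_neighbors(v, skeleton)) == 1)
    visited, E, segment = set(), {}, 0
    for leaf in leaves:
        if leaf in visited:  # assumption: skip leaves reached by an earlier walk
            continue
        segment += 1
        E[leaf] = segment
        visited.add(leaf)
        current = leaf
        while True:
            unvisited = [n for n in _neighbors(current, skeleton) if n not in visited]
            if not unvisited:
                break
            nxt = unvisited[0]
            if abs(dmin[leaf] - dmin[nxt]) > t1:  # threshold criterion fails
                break
            E[nxt] = segment
            visited.add(nxt)
            current = nxt
    return E


# A straight skeleton whose local thickness jumps from 1 to 4 halfway along.
S = {(x, 0, 0) for x in range(6)}
dmin = {(x, 0, 0): (1 if x < 3 else 4) for x in range(6)}
E = segment_skeleton(S, dmin, t1=1)
```

In the example the walk from the first leaf stops where the thickness jumps, and the walk from the opposite leaf labels the remaining voxels with a second segment number.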

3.3 Identification of Articulated Components

Initially, the triangulated data set representing the 3D digital object is provided. \(\mathcal {B}\) is the set of boundary voxels representing the object surface. Let us consider the segmented skeleton in the form of the list E. For a boundary voxel \(v_{i} \in \mathcal {B}\), let \(v_{min}\) be the nearest (in terms of number of voxels) skeleton voxel (Fig. 2(b)). Let \(d_{min}\) be the distance between \(v_{min}\) and \(v_{i}\). Now w.r.t \(v_{min}\), we consider the 13 pairs of antipodal voxels. Let \((u_{j},u'_j)\) denote a pair of such antipodal voxels. Let \(d_{j}\) and \(d'_j\) be the distances in terms of voxels from \(v_{min}\) to \(u_{j}\) and \(u'_j\) respectively. For a pair of antipodal voxels, if the difference between \(d_{j}\) and \(d'_j\) lies within a threshold t, then \(d_{m} = \max (d_{j},d'_j)\) is computed. Out of the thirteen available pairs of antipodal voxels w.r.t \(v_{min}\), let \(d_{max}\) be the maximum value of \(d_{m}\). If \(d_{min}\) is less than or equal to \((t_{2} \times d_{max})\), then the segment number of \(v_{min}\) is assigned to the boundary voxel \(v_{i}\). In Fig. 2(b), the color of the nearest skeleton voxel \(v_{min}\) is assigned to \(v_i\). The same procedure is followed for every boundary voxel belonging to \(\mathcal {B}\). The assignment of respective segment numbers to the boundary voxels leads to the proper identification of the articulated components of the digital object.
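
The boundary labelling step can be sketched as follows. This is a brute-force illustration under stated assumptions: E is a dictionary from segmented skeleton voxels to segment numbers, `dmax` (hypothetical name) holds the precomputed maximum balanced antipodal distance for each of those voxels, the nearest skeleton voxel is searched only among the voxels in E, and ties between equally near skeleton voxels are broken arbitrarily.

```python
def isothetic_distance(p, q):
    """L-infinity distance between two integer points."""
    return max(abs(a - b) for a, b in zip(p, q))


def surface_seg(boundary, E, dmax, t2):
    """Assign to each boundary voxel the segment number of its nearest skeleton voxel.

    A boundary voxel stays unlabelled when d_min > t2 * d_max.
    """
    labels = {}
    for v_i in boundary:
        # O(m) linear scan over the segmented skeleton voxels.
        v_min = min(E, key=lambda s: isothetic_distance(v_i, s))
        d_min = isothetic_distance(v_i, v_min)
        if d_min <= t2 * dmax[v_min]:
            labels[v_i] = E[v_min]
    return labels


# Two skeleton voxels in different segments and three boundary voxels.
E = {(0, 0, 0): 1, (10, 0, 0): 2}
dmax = {(0, 0, 0): 3, (10, 0, 0): 3}
B = [(1, 2, 0), (9, -2, 0), (5, 9, 0)]
labels = surface_seg(B, E, dmax, t2=1.5)
```

Here the third boundary voxel is at distance 9 from both skeleton voxels, which exceeds 1.5 × 3, so it receives no segment number.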

4 Algorithm Segmentation3D

The algorithm for the proposed work has been given in Fig. 3. Given the set of voxels S(A) which belongs to the 3D curve skeleton of the digital object A and the set of boundary voxels \(\mathcal {B}\) (voxels intersected by the triangulated surface of A), all the voxels of S(A) are initially marked unvisited (Steps 1 and 2 of Segmentation3D). A leaf voxel \(v_{k}\) of the 3D curve skeleton is considered. For each pair of antipodal voxels \((u_{j}, u'_j)\), Steps 2–8 of procedure FindDist are carried out and the resultant distance is returned (Step 12). The procedure FindDist calculates the minimum distance among the 13 pairs of antipodal voxels which are situated at more or less equal distance w.r.t \(v_{k}\). This minimum distance is stored in \(dmin_k\) (Step 5 of Segmentation3D). The leaf voxel \(v_{k}\) is then assigned a segment number, is marked as visited, and is inserted into a list E along with its segment number (Steps 6–9 of Segmentation3D). Let \(v_i\) be an unvisited neighbor of \(v_k\) (Steps 10, 16, and 17). The procedure FindDist is called for \(v_{i}\) and the calculated distance is stored in \(dmin_{i}\) (Step 18). If the difference between \(dmin_{k}\) and \(dmin_{i}\) is within a threshold \(t_{1}\) (where \(t_{1}\) is a threshold based on some factors), then the segment number of \(v_{k}\) is assigned to \(v_i\) (Step 20). \(v_i\) is marked as visited and is inserted into E along with its segment number (Steps 21–22). Steps 13–23 are repeated for the unvisited neighbor of each \(v_{i}\). This procedure is continued until the condition in Step 19 fails. Once this condition fails, the entire procedure is repeated for the next leaf voxel. Steps 5–23 are carried out for every leaf voxel.

Fig. 3. The algorithm for the segmentation of articulated components (Color figure online).

Now that the 3D curve skeleton is segmented, the Surface-Seg procedure is used to segment the surface of the digital object A, which leads to the identification of the articulated components of A. For each boundary voxel \(v_{i}\), the nearest skeleton voxel \(v_{min}\) is determined. The isothetic distance from \(v_{i}\) to \(v_{min}\) is stored in \(d_{min}\) (Steps 3–6 of Surface-Seg). All the voxels from \(v_i\) through \(v_{min}\) that constitute the isothetic distance are object voxels. The maximum distance (\(d_{max}\)) among the 13 pairs of antipodal voxels which are situated at more or less equal distance w.r.t \(v_{min}\) is determined in Step 7. If \(d_{min}\) is less than or equal to \((t_2 \times d_{max})\) (where \(t_{2}\) is a threshold based on some factors), then the segment number of \(v_{min}\) is assigned to \(v_{i}\) (Steps 8–9). The process is repeated for each boundary voxel in Steps 2–9. Thus, a segment number is assigned to all the voxels on the object surface belonging to each articulated component. Hence, the surface of the object is segmented.

Fig. 4. Results for identification of articulated components in some digital objects (Color figure online).

5 Time Complexity

Let n be the number of voxels on the object surface and m be the number of voxels belonging to the 3D curve skeleton where \(m = O(n^{3/2})\). The leaf voxels are detected by traversing the curve skeleton in O(m) time. Starting from each leaf voxel the curve skeleton is segmented by traversing the voxels of the curve skeleton exactly once in O(m) time. For each skeleton voxel the procedure FindDist is executed in constant time, i.e., O(1).

Fig. 5. The 3D curve skeleton (black) of Human with the leaf voxels (red) (Left), the segmented curve skeleton (Middle) and the identified articulated components (Right) (Color figure online).

For each voxel on the surface, the closest skeleton voxel is identified in O(m) time. Assignment of segment number to the voxels on the object surface takes O(1) time (procedure FindDist). Hence, the Surface-Seg procedure is executed in \(O(n) \times O(m) = O(mn)\) time. The following improvement may, however, be suggested to reduce the complexity. While executing the procedure FindDist w.r.t each skeleton voxel, the skeleton voxel closest to each voxel on the surface may be identified. Hence, the procedure Surface-Seg will be executed in \(O(n) \times O(1) = O(n)\) time. Therefore, the total complexity of the segmentation algorithm is given by \(O(m) + O(m) + O(n) = O(n^{3/2})\,+\,O(n^{3/2})\,+\,O(n) \simeq O(n^{3/2})\).

6 Experimental Results

The proposed algorithm has been implemented in C on Ubuntu 16.04, running on an Intel Core i5-7500 CPU @ 3.40 GHz \(\times \) 4. The segmented articulated components of the digital objects Hand, Ant, Horse, and Turtle have been demonstrated by the experimental results in Fig. 4. The step-by-step approach of the proposed method has been illustrated for the digital object Human in Fig. 5. It is observed that the algorithm can separate the articulated portions of the objects up to a high degree of accuracy.

7 Conclusion

The speciality of the proposed algorithm lies in considering all three orthogonal directions along the coordinate axes (x-, y-, and z-axes) together for the operations instead of considering the coordinate planes (yz-, zx-, and xy-planes) separately. Since the process takes place at the voxel level, the accuracy of segmentation is independent of the grid resolution of the objects. As the segmentation of the curve skeleton is dependent on the geometry of the object rather than the topology of the curve skeleton, there have been instances of over-segmentation in some cases (Fig. 4, Turtle). Improvements based on topological techniques may be attempted in the future to ensure natural segmentation. The robustness of the algorithm may also be tested in the future in terms of rotation invariance and pose invariance.