Abstract
Cephalometric analysis is an important tool used by dentists for the diagnosis and treatment of patients, and tools that automate this time-consuming task would be of great assistance. To provide the dentist with such tools, a robust and accurate identification of the necessary landmarks is required. However, the poor image quality of lateral cephalograms, such as low contrast or noise, as well as duplicate structures resulting from the way these images are acquired, make this task difficult. In this paper, a fully automatic approach for teeth segmentation is presented that aims to support the identification of dental landmarks. A 2-D coupled shape model is used to capture statistical knowledge about the teeth's shape variation and spatial relation, enabling a robust segmentation despite poor image quality. 14 individual teeth are segmented and labeled using gradient image features, and the quality of the generated results is compared to manually created gold-standard segmentations. Experiments on a set of 14 test images show promising results, with a DICE overlap of 77.2% and precision and recall values of 82.3% and 75.4%, respectively.
1 Introduction
Radiographic images are a common diagnostic tool in dentistry. They support the dentist in identifying many teeth-related problems, such as caries, infections, and bone abnormalities, that would be hard or impossible to detect by visual inspection alone, and thus allow the dentist to choose the optimal treatment plan for the patient. There are two categories of dental radiographic images: intra-oral and extra-oral [9]. Intra-oral images are obtained inside the patient's mouth and only show specific regions of the set of teeth or individual teeth; they are mostly used to obtain more detailed information. Extra-oral images like cephalograms or panoramic radiographs capture the entire teeth region as well as the surrounding areas and provide fundamental information about the teeth of a patient.
Cephalometric analysis aims to extract this fundamental information from lateral cephalometric images. To this end, several landmark positions on soft tissue, dental, or bony structures have been defined and are identified in the image. The type and number of landmarks vary between different analysis methods (e.g. Steiner, Schwarz, Ricketts). Linear and angular measurements are computed based on the relative positions of these landmarks. Figure 1 shows an example of a cephalometric analysis using different landmarks.
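As an illustration of such an angular measurement, the following minimal sketch computes the angle at a vertex landmark between two other landmarks, e.g. the SNA angle (sella, nasion, A-point) used in the Steiner analysis. The landmark coordinates below are hypothetical pixel positions for illustration, not values from this work:

```python
import numpy as np

def angle_deg(vertex, p_a, p_b):
    """Angle (in degrees) at `vertex` between the rays towards p_a and p_b."""
    v1 = np.asarray(p_a, float) - np.asarray(vertex, float)
    v2 = np.asarray(p_b, float) - np.asarray(vertex, float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Hypothetical landmark positions (pixel coordinates)
sella, nasion, a_point = (910.0, 540.0), (1390.0, 520.0), (1430.0, 980.0)
print(f"SNA = {angle_deg(nasion, sella, a_point):.1f} degrees")
```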
Several methods for automatic landmark detection in lateral cephalograms have been proposed in the past. In 2014 and 2015, Wang et al. [9] organized two Grand Challenges at the International Symposium on Biomedical Imaging (ISBI) on this topic and compared the performance of state-of-the-art methods. The best results were achieved by approaches utilizing Random Forests for classifying the intensity appearance of different landmarks while exploiting the spatial relations between landmarks using statistical shape models. Lindner et al. [6] presented a fully automatic landmark annotation (FALA) system in which Random Forest regression-voting is used for both the detection of the skull and the localization of individual landmarks. Recently, Arik et al. [1] employed a convolutional neural network to detect landmarks and a statistical shape model to refine the landmark positions.
Despite all these efforts, the detection of these landmarks is still done manually or semi-automatically in the clinical context, which is a very time-consuming process [6]. Moreover, most of the presented approaches rely on the publicly available dataset from the ISBI 2015 Grand Challenge [9], which is composed of 400 images and features 19 landmark positions. However, these 19 landmarks include only two dental landmarks, namely the incisal edges of the maxillary and mandibular central incisors (upper and lower incisal incision). Other dental landmarks, like the root tip of the central incisors, the tip of the mesiobuccal cusp of the first molar, or the posterior point of occlusion, are not included and therefore not covered by these approaches.
To fill this gap, we propose an approach for fully automatic teeth segmentation in these lateral cephalograms. To the best of our knowledge, no other automatic segmentation method for teeth in cephalometric radiographs exists. The generated segmentations can later be used to support the identification of the dental landmarks by directly using the detected contours of the corresponding teeth. Furthermore, the model could be extended to include more structures like bones or skin to support the identification of additional landmarks and increase the robustness of the detection. Teeth segmentation in cephalograms is a challenging task. The lateral cephalogram is a projection of the patient's skull onto a 2-D image plane from a lateral position, which results in overlapping structures. This is especially evident in the teeth region. Asymmetries between the teeth on the left and right hemisphere of the patient, such as shape variations or different spatial configurations, as well as variations in the patient's head position during image acquisition, result in duplicate structures. Like other radiographs, cephalograms also suffer from intensity variations, noise, and low contrast.
To overcome these challenges, this paper presents an approach for the automatic segmentation of teeth in lateral cephalometric radiographs using a coupled shape model. 14 individual teeth (excluding wisdom teeth) are segmented and labeled using a coupled shape model approach based on [10]. The 2-D coupled model combines statistical knowledge about the shape of each tooth with information about their spatial relation. This combination of gradient image features (bottom-up information) with a priori statistical knowledge about the shape and position of the teeth (top-down information) leads to a more robust segmentation process [7], especially in case of poor image quality or unreliable image features. However, when local search algorithms like active shape models are used to find suitable image features, statistical models depend highly on a good initialization [4]. To solve this problem, we present a pre-processing step that computes the required initialization parameters of the model, such as position and scale. The initialized model is then adapted to the cephalometric images using a step-wise adaptation process.
2 Methods
2-D Coupled Shape Model. The presented segmentation method is based on a coupled shape model consisting of individual deformable model items which are coupled by their spatial relation. It has already been used successfully on 3-D CT images to segment different structures in the head & neck area [8], and for teeth segmentation in 2-D panoramic radiographs, where it was combined with a convolutional neural network to handle the initialization [10]. The individual 2-D deformable model items are represented as statistical shape models and are generated using a point distribution model (PDM) [2] and principal component analysis (PCA). The contour of an individual item is represented by 100 landmark points in the form of 2-dimensional pixel coordinates. During PCA, only the principal components describing 95% of the shape variation are kept. Additionally, each individual item contains its relative position with respect to the center of mass of the complete model, described by an affine 2-D transformation. The coupled model is then created by combining all individual items, each one containing its shape information and its relative position. For more details about the 2-D coupled shape model, please refer to [10].
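The construction of such a per-tooth shape model can be sketched as follows. This is a minimal illustration of a PDM with PCA truncated at 95% of the shape variation, assuming the 100-point contours have already been aligned (e.g. via Procrustes analysis); it is not the authors' exact implementation:

```python
import numpy as np

def build_shape_model(shapes, var_keep=0.95):
    """Build a point distribution model from aligned training shapes.

    shapes: array of shape (n_samples, 100, 2) -- 100 contour landmarks
    per tooth, assumed already aligned. Returns the mean shape and the
    principal modes covering `var_keep` of the total shape variation.
    """
    n = shapes.shape[0]
    X = shapes.reshape(n, -1)                 # flatten to (n, 200) vectors
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD of the centered data matrix
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / (n - 1)                    # eigenvalues of the covariance
    k = np.searchsorted(np.cumsum(var) / var.sum(), var_keep) + 1
    return mean.reshape(-1, 2), Vt[:k], var[:k]

def synthesize(mean, modes, b):
    """Generate a shape from mode weights b: x = x_mean + sum_i b_i * phi_i."""
    return mean + (b @ modes).reshape(-1, 2)
```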
The coupled shape model used in this approach contains 14 individual teeth, namely the central and lateral incisors, the canine, the first and second pre-molars, and the first and second molars, both maxillary and mandibular, for the right hemisphere of the patient. The reason for using only the teeth of one hemisphere of the patient, and not the complete set of 28 teeth, is the lateral position the image is captured from. The teeth of both hemispheres are roughly superimposed onto each other during image acquisition. However, the teeth are never perfectly superimposed but rather slightly shifted (mostly in the horizontal direction), resulting in duplicated structures with a high overlap. Since the value and direction of the shift between the two hemispheres are arbitrary for each individual image, no meaningful statistical information would be gained by using the full set of 28 teeth. Wisdom teeth have not been included in the model due to the limited amount of training data available. The coupled model was trained on a set of 14 manually annotated lateral cephalometric images.
Fig. 2. Extraction of the two lines used for the approximation of the orientation of the occlusal plane. The left image in (a) and (b) depicts the contours found in the binary image; the colors indicate different line segments after contour splitting. The thick green and red lines represent the detected jaw-line and spa-line, respectively. (Color figure online)
Model Initialization. A robust initialization of the mean model in terms of position and scale is required in order to adapt the model to the image features and segment the teeth successfully. Estimates for both of these values are computed from the input image. Additionally, the orientation of the occlusal plane is estimated and considered during model initialization. As a first step, histogram equalization and normalization are applied to the input image to ensure a similar brightness and contrast among all images. Then, several references are extracted from the image.
The estimation of the orientation of the occlusal plane is based on the mandibular jaw line and a line close to the anterior nasal spine (spa). Both are extracted from a binarized version of the input image using a contour segmentation based on the detection of zero crossings of the Laplacian of Gaussian (LoG) (cf. Grau et al. [3]). The set of closed contours is split into parts based on the curvature of individual line fragments. The sought-after lines can then be extracted based on their length and orientation (see Fig. 2). The orientation of the occlusal plane is approximated by the orientation of the bisecting line of those two lines.
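The edge-map part of this step could look roughly as follows. This is a sketch of LoG zero-crossing detection only; the curvature-based contour splitting and the length/orientation filtering of the line segments are omitted, and the sigma value is a placeholder rather than the value used in this work:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_edge_map(image, sigma=3.0):
    """Binary edge map from zero crossings of the Laplacian of Gaussian."""
    log = gaussian_laplace(image.astype(float), sigma=sigma)
    # A zero crossing occurs where the LoG response changes sign
    # between horizontally or vertically adjacent pixels.
    zc = np.zeros_like(log, dtype=bool)
    zc[:-1, :] |= np.signbit(log[:-1, :]) != np.signbit(log[1:, :])
    zc[:, :-1] |= np.signbit(log[:, :-1]) != np.signbit(log[:, 1:])
    return zc

def occlusal_orientation(theta_jaw, theta_spa):
    """Approximate the occlusal-plane orientation as the bisector of the
    jaw-line and spa-line orientations (angles in radians)."""
    return 0.5 * (theta_jaw + theta_spa)
```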
The initial position of the mean model is determined by finding the tips of the central incisors. The region of interest (RoI) is restricted using the previously detected lines. After applying binary thresholding (Otsu's method) to the RoI, pre-defined starting positions are used to analyze the contour of the binary mask and detect the target points. A rough approximation of these tip points is sufficient for a good initial model position. To estimate the scale factor for the initialization, the size of individual teeth in the input image is approximated. A reference line is defined using the previously determined position of the incisors as an anchor point, in combination with the approximated orientation of the occlusal plane. Individual teeth are then separated similarly to the approach of Jain and Chen [5]. Integral projection is used to compute the sum of pixel values along lines perpendicular to the reference line. The 'gaps' between teeth can be detected by analyzing these sums for local minima. After removing outliers, the scale factor for the initialization is computed by comparing the detected distances to the known distances of the mean model.
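A possible implementation of the integral-projection step is sketched below: the image is rotated so that the reference line becomes horizontal, and the column sums are then analyzed for local minima. The rotation sign convention, the `min_separation` parameter, and the outlier handling are assumptions for illustration, not the authors' exact choices:

```python
import numpy as np
from scipy.ndimage import rotate
from scipy.signal import argrelextrema

def tooth_gaps(image, reference_angle_deg, min_separation=40):
    """Find candidate gaps between teeth via integral projection.

    The image is rotated so the reference (occlusal) line is horizontal
    (sign convention depends on how the angle is measured); column sums
    then correspond to projections along lines perpendicular to the
    reference line. The dark 'gaps' between neighboring teeth show up
    as local minima of this projection profile.
    """
    upright = rotate(image.astype(float), reference_angle_deg, reshape=False)
    profile = upright.sum(axis=0)                      # integral projection
    minima = argrelextrema(profile, np.less, order=min_separation)[0]
    return minima  # column indices of candidate inter-tooth gaps

# After outlier removal, the scale factor could then (hypothetically) be
# the ratio of detected gap distances to the corresponding distances in
# the mean model, e.g.:
# scale = np.median(np.diff(minima)) / mean_model_tooth_distance
```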
Model Adaptation. After initializing the model in terms of position, rotation and scale, the model is adapted to the input image. The adaptation is done by minimizing an energy functional of the form

\(E(t, f) = E_{ext}(t, f) + E_{int}(f)\)

Hereby, \(E_{ext}\) is the external energy, which is responsible for ensuring that the contours of the model items move in the direction of strong image features, and \(E_{int}\) is the internal energy, which restricts the model to stay within or close to the learned configuration space. t stands for the transformation describing the global position of the model and f for the vector describing the configuration of the coupled model. A gradient descent optimizer is used for the optimization process: the transformation parameters t are optimized first, and then the configuration and transformation parameters f, t are optimized jointly. Please refer to [10] and [8] for more information.
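The pose-first, then-joint optimization order can be sketched as follows. The energy gradient callables, the learning rate, and the iteration counts are placeholders and not taken from this work; the sketch only illustrates the two-stage scheme described above:

```python
import numpy as np

def adapt_model(grad_t, grad_f, t0, f0, lr=1e-2, n_pose=50, n_joint=200):
    """Two-stage gradient descent on E(t, f) = E_ext(t, f) + E_int(f).

    grad_t / grad_f return the partial derivatives of the total energy
    with respect to the pose t and the configuration f (e.g. computed
    via finite differences). The pose is optimized first, then pose and
    configuration are refined jointly.
    """
    t, f = np.array(t0, float), np.array(f0, float)
    for _ in range(n_pose):              # stage 1: global pose only
        t -= lr * grad_t(t, f)
    for _ in range(n_joint):             # stage 2: joint refinement
        t -= lr * grad_t(t, f)
        f -= lr * grad_f(t, f)
    return t, f
```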
A multi-step approach is used to adapt the model to the image features, which are gradient features computed on the input image. Thereby, the set of model items which are actively adapted to the input image is progressively enlarged. This is done to ensure the best possible overlap between a model item and the corresponding tooth in the image before adapting the respective model item. All model items which are not actively matched to image features are only passively modified through the learned statistical information. Initially, only the incisors are adapted, since they are used for the model initialization and therefore always have a good overlap. The teeth farther away from the incisors (e.g. the molars) might, at that point, not match as well, depending on the patient's tooth configuration (cf. Fig. 3(b)). By adapting these teeth during a later adaptation step, they have already been (passively) moved closer to their correct position and more reliable image features can be found. Starting from the central incisors, a new category of teeth (i.e. lateral incisor, canine, first pre-molar, and so on) is added after each adaptation step until the complete set of teeth is actively adapted. The final step of the adaptation process is a refinement step: here, the contours of the individual teeth are adapted based on the gradient features only and are no longer restricted by the statistical information. The final segmentation result of each individual tooth is stored as a binary image.
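Written out explicitly, the activation schedule looks as follows. The category identifiers are illustrative (each covering the maxillary and mandibular tooth of that category), and `adapt_step` is a hypothetical hook for one energy-minimization pass:

```python
# Progressive activation: starting from the central incisors, one tooth
# category is added to the actively adapted set after each step.
SCHEDULE = [
    ["central_incisor"],
    ["lateral_incisor"],
    ["canine"],
    ["first_premolar"],
    ["second_premolar"],
    ["first_molar"],
    ["second_molar"],
]

def multistep_adaptation(model, image, adapt_step):
    """adapt_step(model, image, active) runs one energy minimization in
    which only the items in `active` are matched to image features; all
    other items follow passively through the learned coupling."""
    active = set()
    for category in SCHEDULE:
        active.update(category)
        adapt_step(model, image, active)
    # Final refinement pass: contours follow gradient features only
    # (hypothetical flag; the statistical restriction is switched off).
    adapt_step(model, image, active)
```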
3 Experiments and Results
The presented fully automatic segmentation approach has been evaluated on a separate test set of 14 manually annotated cephalometric images (referred to as gold-standard segmentations). These 14 images were not part of the training set. The test images have a resolution of either 1800 × 2148 pixels or 1935 × 2400 pixels. As a first step, it was visually inspected whether the model was positioned, rotated, and scaled correctly by the automatic initialization process, since the quality of the final segmentation depends highly on a good initialization. Visual inspection was used because the multi-step adaptation approach only requires a good overlap of certain structures. The initial position was considered correct if the incisor teeth of the mean model overlap with the incisor teeth in the input image; this was the case for all 14 test cases. The rotation of the model was regarded as correct if the orientation of the occlusal plane of the initialized mean model and the orientation of the occlusal plane of the teeth in the image are roughly the same; this was also true for all 14 test instances. The scale estimate was considered correct if the size of the scaled mean model roughly matches the size of the set of teeth in the input image; this estimate was sufficiently accurate for 12 out of the 14 test cases. For the two failed cases, the initial size of the model was too large: while the incisor teeth were still positioned correctly, the molar teeth were far away from their intended position. Even with the multi-step adaptation process, the model was unable to segment the teeth successfully in these cases. Figure 3 shows two correct initializations and one failed initialization. The incorrect scale factor was caused by an incorrect separation of the teeth based on the integral projection, i.e. some teeth were not separated at all. Therefore, the reference distances extracted from the image were too large, resulting in a scale factor that was too big for the model.
The final teeth segmentations of the 12 cases with a successful initialization have been compared to the manually created gold-standard segmentations and evaluated in terms of the following metrics: precision, recall, accuracy, specificity, F-score, and DICE overlap. Since both specificity and accuracy consider the number of background pixels that have been correctly labeled as background (true negatives), the evaluation needs to be restricted to a smaller region to retrieve meaningful results. Therefore, the evaluation is only performed on the minimum bounding box that covers both the automatic and the gold-standard segmentation. The metric values for an individual test instance are computed by first calculating the values for each tooth separately; these values are then averaged over all teeth in that test instance. Finally, the average is computed over the remaining 12 instances. Table 1 shows the average metric values as well as the minimum and maximum values for each category. Exemplary segmentation results are depicted in Fig. 4.
The metric values for the individual teeth of the upper jaw are provided in Table 3; the ones for the lower jaw are provided in Table 2.
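The per-tooth metric computation with the bounding-box restriction described above can be sketched as follows (assuming non-empty, overlapping binary masks; the exact bounding-box convention is our own choice):

```python
import numpy as np

def eval_tooth(auto_mask, gold_mask):
    """Overlap metrics for one tooth, restricted to the minimum bounding
    box covering both the automatic and the gold-standard mask, so that
    true negatives do not dominate accuracy and specificity."""
    union = auto_mask | gold_mask
    ys, xs = np.nonzero(union)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    a = auto_mask[y0:y1, x0:x1]
    g = gold_mask[y0:y1, x0:x1]
    tp = np.sum(a & g); fp = np.sum(a & ~g)
    fn = np.sum(~a & g); tn = np.sum(~a & ~g)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "specificity": tn / (tn + fp),
        "f_score": 2 * precision * recall / (precision + recall),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }
```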
4 Discussion
The presented approach uses a coupled shape model to segment teeth in lateral cephalograms. The statistical knowledge about the shape and spatial configuration of the teeth helps to handle the challenges of cephalometric images, like overlapping structures, noise, and low contrast. Instead of relying on image information only, the a priori knowledge about the teeth helps to guide the search for suitable image features. The proposed initialization process provides robust estimates in terms of model placement and rotation. Only the scale estimation leaves room for improvement, as it failed in 2 out of 14 cases, making a successful adaptation impossible. To the best of our knowledge, this is the first approach that successfully performs automatic teeth segmentation in lateral cephalograms.
Wisdom teeth are currently not included in the model. Their position and shape vary greatly between individual patients, and not all patients have all (or any) of their wisdom teeth. With the limited amount of data available, a meaningful shape model and estimate of their spatial position could not be computed. Wisdom teeth can be added in a future version of the model once sufficient training data is available.
From experience with other data modalities, we know that the approach is able to handle missing teeth if the space originally occupied by the missing tooth is still present. In that case, the mean shape model of the corresponding tooth can be placed into the gap and the subsequent teeth can be positioned correctly. However, subsequent teeth will be labeled incorrectly if the gap is too small or no longer present. In the current test set, no patient was missing any teeth except for wisdom teeth. Overall, the presented approach provides promising segmentation results on a test set of 14 images.
Based on the segmentation result, a robust identification of dental landmarks for the cephalometric analysis should be possible. Moreover, many of the references extracted from the image for the model initialization can also be used to identify other landmarks. Furthermore, the statistical model can easily be extended to include additional structures like skin and bones to further improve the robustness of the segmentation and provide references for even more cephalometric landmarks.
5 Conclusion and Future Work
In this paper, an automatic model-based approach for teeth segmentation in lateral cephalograms was presented. It provides a robust segmentation of the teeth and is a good basis for identifying dental landmarks for cephalometric analysis. Out of a set of 14 test images, 12 could be segmented successfully; for the 2 unsuccessful cases, the initialization of the model failed due to an incorrect scale estimation. The achieved average DICE overlap is 77.2%, with average precision and recall values of 82.3% and 75.4%, respectively.
Future work includes increasing the robustness of the scale estimation for the initialization and improving the segmentation accuracy. The amount of training data could be extended based on the data from the 2015 ISBI Grand Challenge on cephalometric landmark detection; however, this would require manually labeling all of these images. Most importantly, the approach is to be extended to identify the dental landmarks, and potentially other landmarks as well, based on the segmentations.
References
Arik, S., Ibragimov, B., Xing, L.: Fully automated quantitative cephalometry using convolutional neural networks. J. Med. Imaging 4, 1–11 (2017)
Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J.: Training models of shape from sets of examples. In: Hogg, D., Boyle, R. (eds.) BMVC92, pp. 9–18. Springer, London (1992). https://doi.org/10.1007/978-1-4471-3201-1_2
Grau, V., Alcañiz, M., Juan, M., Monserrat, C., Knoll, C.: Automatic localization of cephalometric landmarks. J. Biomed. Inform. 34(3), 146–156 (2001)
Heimann, T., Meinzer, H.P.: Statistical shape models for 3D medical image segmentation: a review. Med. Image Anal. 13(4), 543–563 (2009)
Jain, A.K., Chen, H.: Matching of dental X-ray images for human identification. Pattern Recognit. 37(7), 1519–1532 (2004)
Lindner, C., Wang, C.W., Huang, C.T., Li, C.H., Chang, S.W., Cootes, T.F.: Fully automatic system for accurate localisation and analysis of cephalometric landmarks in lateral cephalograms. Sci. Rep. 6, 33581 (2016)
McInerney, T., Terzopoulos, D.: Deformable models in medical image analysis: a survey. Med. Image Anal. 1(2), 91–108 (1996)
Steger, S., Jung, F., Wesarg, S.: Personalized articulated atlas with a dynamic adaptation strategy for bone segmentation in CT or CT/MR head and neck images. In: Medical Imaging 2014: Image Processing, vol. 9034, p. 90341I. International Society for Optics and Photonics (2014)
Wang, C.W., et al.: A benchmark for comparison of dental radiography analysis algorithms. Med. Image Anal. 31, 63–76 (2016)
Wirtz, A., Mirashi, S.G., Wesarg, S.: Automatic teeth segmentation in panoramic X-ray images using a coupled shape model in combination with a neural network (2018, accepted for publication at MICCAI 2018)
Acknowledgements
We thank Dr. Jan H. Willmann, University Hospital of DĂ¼sseldorf for providing the cephalometric images used in this work.