Elsevier

Neurocomputing

Volume 153, 4 April 2015, Pages 286-299

Traffic sign segmentation and classification using statistical learning methods

https://doi.org/10.1016/j.neucom.2014.11.026

Highlights

  • We propose a complete procedure for traffic sign detection and shape classification.

  • It is robust against traffic sign rotations, translations, and scale variations.

  • It performs well under a variety of conditions (e.g., non-uniform lighting).

  • Our procedure performed well in experiments with real-world images.

Abstract

Traffic signs are an essential part of any circulation system, and a driver's failure to detect them may significantly increase the risk of accidents. Currently, automatic traffic sign detection systems still have some performance limitations, especially for achromatic signs and variable lighting conditions. In this work, we propose an automatic traffic-sign detection method capable of detecting both chromatic and achromatic signs, while taking into account rotations, scale changes, shifts, partial deformations, and shadows. The proposed system is divided into three stages: (1) segmentation of chromatic and achromatic scene elements using Lab and HSI spaces, where two machine learning techniques (k-Nearest Neighbors and Support Vector Machines) are benchmarked; (2) post-processing in order to discard non-interest regions, to connect fragmented signs, and to separate signs located at the same post; and (3) sign-shape classification by using Fourier Descriptors, which yield a significant advantage in comparison to other contour-based methods, and subsequent shape recognition with machine learning techniques. Experiments with two databases of real-world images captured with different cameras yielded a sign detection rate of about 97% with a false alarm rate between 3% and 4%, depending on the database. Our method can be readily used for maintenance, inventory, or driver support system applications.

Introduction

Traffic signs constitute an essential part of any circulation system to control and guide traffic and to favor road safety [1], with a twofold role: conveniently regulating traffic and informing pedestrians and drivers about different aspects of road circulation. Nowadays, automatic traffic sign detection and recognition systems are of special interest in many applications, such as intelligent vehicle development and road maintenance. Regarding the former, on-board automatic traffic sign detection systems aim to help users detect and interpret traffic signs. Examples of industrial developments in this field are the traffic sign recognition module [2] used in Opel Eye® [3], or the method for detecting and recognizing traffic signs [4] used in Mercedes-Benz Traffic Sign Assist® [5]. Regarding road maintenance, appropriate positioning and maintenance of traffic signs clearly improve road safety, and both are supervised by authorities through regular inventory campaigns.

However, despite the vast amount of research recently conducted on automatic traffic sign detection and recognition, current traffic sign inventory and monitoring are still mainly carried out manually by an operator who watches a video recording and checks the presence, position, and status of each traffic sign. This tedious task requires intense concentration for a long time, and errors can arise from the operator's fatigue or from poor visibility conditions (as shown in Fig. 1). An automatic system can be designed to overcome these difficulties, as well as to provide a significant cost reduction in the inventory process. The interest in automatic traffic sign identification is further supported by competitive challenges proposed by scientific and technical societies, see e.g. [6], [7].

Recent research works usually split the identification process into two stages, namely, detection and classification (recognition), which are designed using a representative set of images or videos (training set). Image segmentation techniques are used for the detection stage, and different approaches have been proposed depending on the type of image, i.e., true-color or gray-level. Regarding true-color images, there are two approaches, either working with the standard RGB color space used by digital cameras, or performing a deeper analysis of color information using other spaces for separating color and intensity information (such as HSI, HSV, YIQ, YUV, Luv or Lab) [8]. Many authors have addressed the segmentation by thresholding on RGB images, either at pixel level [9], or with more elaborate schemes, such as preprocessing with a Simple Vector Filter Algorithm before thresholding [10].

The main drawback of the RGB space is its high sensitivity to lighting changes, which hampers segmentation in scenes with excessive or insufficient light. For this reason, many works use color spaces that are theoretically more robust to lighting conditions than RGB. For instance, a nonlinear transformation of the H and S components of the HSI color space was carried out in [11], with subsequent thresholding segmentation on the transformed components. Segmentation in [12], [13] was carried out in two stages: chromatic analysis, where components H and S were used to segment signs with predominant chromatic colors; and achromatic analysis, where RGB thresholding was performed for signs with prevalent achromatic colors. Other works resorted to other color spaces that also allow an independent control of chromatic and achromatic information. The ab components of the Lab space were used to extract features by using a Gabor filter in [14], which were subsequently used to detect traffic signs. In [15], [16], the YUV space was used for thresholding segmentation. Also, the YCrCb space has been used for segmentation by using a dynamic thresholding scheme [17]. Whether to use the RGB space or other spaces separating chromatic and achromatic information remains controversial. On the one hand, the review in [18] evaluated and benchmarked several thresholding segmentation methods. The authors concluded that the best ones were those using normalization with respect to illumination, such as normalized RGB or Ohta Normalized, and that the use of HSI or YUV spaces did not provide a significant advantage. On the other hand, other authors [13], [14] proposed the use of HSI or YUV spaces with more elaborate segmentation schemes, improving the results provided by simpler thresholding-based methods.
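To illustrate what normalization with respect to illumination means in practice, the following minimal sketch computes normalized RGB (chromaticity) coordinates. It is not taken from [18]; the function name and the pixel values are hypothetical, and lighting is modeled, as a simplifying assumption, as a uniform scaling of the three channels.

```python
def normalized_rgb(r, g, b):
    """Return chromaticity coordinates (r, g, b) / (r + g + b)."""
    total = r + g + b
    if total == 0:
        # Pure black carries no chromatic information; return neutral values.
        return (1 / 3, 1 / 3, 1 / 3)
    return (r / total, g / total, b / total)


# Hypothetical pixel of the same red surface under bright and dim lighting
# (assumption: dimming scales all three channels by the same factor, 0.5).
bright = (200, 40, 40)
dim = (100, 20, 20)

print(normalized_rgb(*bright))
print(normalized_rgb(*dim))
```

Under this simplified lighting model both pixels map to the same chromaticity, so a fixed threshold on the normalized components is less affected by overall brightness than a threshold on raw RGB values.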

Systems dealing with gray-level images are mainly focused on edge detection and subsequent edge analysis. A shape-based approach for de-restriction sign detection was presented in [19], which used a black band detector to highlight regions of interest (ROI). A set of histogram-of-oriented-gradients features was used in [20] to design a classifier with a boosting approach to detect pedestrians and traffic signs. A transformation for angle vertex and bisector detection was used in [21] to implement a gradient geometric model to detect triangular signs. In [22], a restricted Hough transform applied on the image contours was proposed as a traffic sign detection method. Nevertheless, detection techniques based on image gradients and object edges are very sensitive to noise and computationally expensive, requiring in most cases a complex preprocessing stage. In order to improve the detection stage, several works proposed using consecutive video frames to track traffic signs and to reduce false alarm and miss rates by means of Kalman filtering [23], [24].

Three conclusions arise from the above review: (1) separating chromatic and achromatic segmentation by using HSI or Lab color spaces seems to improve segmentation performance; (2) achromatic segmentation is a difficult task, addressed by several works but with limited success; and (3) the potential of Lab and HSI spaces to separately segment chromatic and achromatic elements has not been fully exploited yet. These conclusions motivate the search for advanced segmentation techniques based on these spaces.
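As a rough illustration of how a space such as HSI separates chromatic from achromatic content, the sketch below converts an RGB pixel to HSI with the standard textbook formulas and flags low-saturation pixels as achromatic. This is only a sketch under stated assumptions: the function names and the saturation threshold of 0.2 are hypothetical and are not the values or the procedure used in this work.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, intensity)."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return (0.0, 0.0, 0.0)
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        hue = 0.0  # hue is undefined for gray pixels; 0 is a placeholder
    else:
        # Clamp to [-1, 1] to guard against rounding before acos.
        theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        hue = theta if b <= g else 360.0 - theta
    return (hue, saturation, intensity)


def is_achromatic(r, g, b, sat_threshold=0.2):
    """Flag a pixel as achromatic when its saturation is low.

    sat_threshold is a hypothetical value chosen for illustration only.
    """
    _, saturation, _ = rgb_to_hsi(r, g, b)
    return saturation < sat_threshold


print(rgb_to_hsi(255, 0, 0))         # saturated red
print(is_achromatic(128, 128, 128))  # mid gray
print(is_achromatic(200, 40, 40))    # reddish pixel
```

Because saturation and hue are decoupled from intensity, white, gray, and black sign elements can be routed to a separate achromatic branch while colored elements are handled by hue-based criteria.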

The problem of traffic sign recognition has often been tackled with matching techniques. As an example, a distortion-invariant Fringe-adjusted joint transform correlation technique was used in [14] to find correlation peaks among segmented regions and a set of patterns extracted from different traffic signs. Also, a dissimilarity measurement was used in [25] to classify the sign by matching its color with a set of patterns. Machine learning techniques have been applied to the traffic sign recognition problem too. For instance, a combination of Convolutional Neural Networks and Multilayer Perceptron was applied in [26] on images normalized by a contrast-limited adaptive histogram equalization, which achieved good performance for German traffic sign recognition. In [17], a hybrid classifier composed of Support Vector Machines (SVM) and Naive Bayes was fed with features provided by a Gabor filter bank. Other works [10], [12], [13], [18], [27], [28], [29] also used SVM as a classifier, taking into account different features such as shape signature or Pseudo-Zernike moments. Genetic Algorithms (GAs) have been used less often, probably due to their high computational cost, and their application has been focused on adapting the use of certain features with other machine learning schemes. A GA was applied in [11], followed by a two-layer neural network, according to the Adaptive Resonance Theory paradigm for classification. In [30], affine transformation coefficients were used as GA parameters for sign detection. Regardless of the classification algorithm, the previously mentioned features are highly dependent on traffic sign scaling, translation, or rotation. In [31], a robust contour-based descriptor called Shape Context was presented, which describes each contour pixel through a coarse bidimensional histogram characterizing the edge distribution in its surrounding region. The Shape Context descriptor is scale and rotation invariant, and has been successfully applied to describe cartoon characters, signs and other objects [32], [33], [34], [35]. A simpler descriptor that is also invariant to shift, scale and rotation is the Fourier Descriptor (FD), which has been successfully applied in different scenarios [36], [37]. Therefore, our aim was to design a high-quality sign shape classifier, based on the robust classification capacity of the SVM, and on the promising performance of FDs.
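The invariance properties of FDs can be checked with a short sketch. It uses one common normalization (drop the zero-frequency term for shift invariance, keep only magnitudes for rotation and starting-point invariance, divide by the first harmonic for scale invariance); this is an assumption made for illustration, not necessarily the exact variant used in this work. All function names and the test contour are hypothetical, and the contour is assumed to be ordered so that its first harmonic is non-zero.

```python
import cmath
import math


def fourier_descriptors(contour, n_coeffs=8):
    """Shift-, scale- and rotation-invariant descriptors of a closed contour."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    if n < n_coeffs + 2:
        raise ValueError("contour has too few points")

    def dft(k):
        # k-th coefficient of the discrete Fourier transform of the contour.
        return sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))

    scale = abs(dft(1))
    if scale == 0:
        raise ValueError("first harmonic is zero; check the contour ordering")
    # Skipping k = 0 removes the shift; magnitudes remove the rotation and the
    # starting point; dividing by the first harmonic removes the scale.
    return [abs(dft(k)) / scale for k in range(2, n_coeffs + 2)]


def square_contour(points_per_side=8):
    """Hypothetical test shape: unit square sampled counter-clockwise."""
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
    points = []
    for i in range(4):
        (x0, y0), (x1, y1) = corners[i], corners[(i + 1) % 4]
        for s in range(points_per_side):
            t = s / points_per_side
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points


def transform(contour, scale, angle_rad, shift):
    """Scale, rotate, and translate a contour."""
    factor = cmath.rect(scale, angle_rad)
    moved = [complex(x, y) * factor + complex(*shift) for x, y in contour]
    return [(w.real, w.imag) for w in moved]


square = square_contour()
fd_original = fourier_descriptors(square)
fd_moved = fourier_descriptors(transform(square, 2.5, 0.7, (10.0, -3.0)))
print(max(abs(a - b) for a, b in zip(fd_original, fd_moved)))
```

The printed difference is at the level of floating-point rounding, which is the property that makes such descriptors attractive as classifier inputs.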

We present here a new automatic method to separately detect chromatic and achromatic signs in images taken in realistic scenarios. The proposed method achieves sign shape classification, and it is robust to sign rotations, scale changes, translations, shadows, and minor deformations. Our procedure is structured in three stages. First, the image is segmented using the Lab and HSI spaces, in order to separate the chromatic and achromatic traffic sign elements. For the chromatic segmentation, two machine learning techniques, k-Nearest Neighbors (k-NN) and SVM, were benchmarked, and an additional HSI thresholding was used in the achromatic branch. Second, a post-processing stage improves the segmentation result by filtering out non-interest regions, merging fragmented signs, and separating signs located at the same post (co-located signs). The latter task is carried out by combining eigenvector decomposition and maxima dynamics. Third, a scheme of parallel SVMs is used to classify the shape of the segmented regions and to identify traffic signs. Because FDs are invariant to scaling, shift, and rotation, they are used as the SVM input features. To sum up, the main contributions of the proposed system are: (1) a high performance segmentation scheme, based on Lab and HSI spaces, which consists of chromatic and achromatic sub-procedures; (2) a novel algorithm for subsequent separation of co-located signs; and (3) a shift, scale, and rotation invariant shape classification procedure, based on the use of FDs and SVMs, which is able to classify different complex shapes.
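As a minimal sketch of the pixel classification idea behind the chromatic branch, the following k-NN classifier labels a pixel from its two chromatic components. It assumes, for illustration only, that each pixel is represented by its (a, b) values in the Lab space; the training samples, the class names, and the value k=3 are hypothetical and do not come from the databases or settings used in this work.

```python
import math
from collections import Counter


def knn_classify(sample, training, k=3):
    """Label a sample by majority vote among its k nearest training points."""
    neighbors = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled pixels as ((a, b), label); values are made up.
training = [
    ((62.0, 45.0), "red"), ((58.0, 38.0), "red"), ((66.0, 50.0), "red"),
    ((12.0, -52.0), "blue"), ((18.0, -48.0), "blue"), ((8.0, -55.0), "blue"),
    ((0.0, 2.0), "background"), ((-28.0, 30.0), "background"),
    ((3.0, -4.0), "background"),
]

for pixel in [(60.0, 40.0), (10.0, -50.0), (1.0, 0.0)]:
    print(pixel, knn_classify(pixel, training))
```

Applying such a classifier to every pixel yields a label map from which candidate sign regions can be extracted; an SVM could be substituted for the k-NN rule without changing the surrounding logic.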

The paper is organized as follows. The next section presents an overview of the proposed methodology. Stages for segmentation, post-processing and detection are detailed in Sections 3 (Segmentation stage), 4 (Post-processing stage), and 5 (Shape classification stage), respectively. The database description and experimental results are presented in Section 6. Finally, the main conclusions are drawn in Section 7.


System overview

The proposed system aims to detect traffic signs with two assumptions: first, signs must be totally inside the image frame (not cut by the image borders); and second, they must be located at a suitable distance (distinguishable by the naked eye). The traffic signs in our experiments corresponded to those used by the Traffic Department of Spain, which have very particular and distinctive colors and shapes (see examples in Fig. 2). As sketched in Fig. 3, the procedure has the following three

Segmentation stage

The goal of this stage is to extract those ROIs which are likely to be traffic signs. Segmentation algorithms usually consider criteria of similarity or connectivity. In our case, traffic signs usually stand out from their surroundings mainly due to their characteristic colors, and segmentation is tackled by a pixel classification process based on color criteria. Since some traffic signs are black and white, chromatic and achromatic image components are independently treated, and the Lab and

Post-processing stage

The aim of the post-processing stage is to refine the previous segmentation by discarding certain regions and correcting undesired effects. To this end, separate procedures are proposed for the results of the chromatic and achromatic segmentations, which are described in detail next (see an overview in Fig. 3).

Shape classification stage

Regions provided by the post-processing stage are now classified into six geometric shapes, namely, circle, triangle, square, rectangle, arrow, and semicircle (though the procedure can be easily generalized to any other shape). Two different situations are taken into account, namely, signs segmented in a single ROI (see example in Fig. 5b) and signs fragmented in several ROIs (see examples in Figs. 8f and 9b). The shape classification has two objectives: first, to provide a way of filtering

Experimental results

In this section, the global performance of the proposed system is evaluated. First, the database used in the system design and evaluation is presented. Second, the procedure is revisited and results are illustrated with two complete examples. Finally, the figures of merit used for the assessment and the statistical results are described.

Conclusions

This paper presented a complete procedure for chromatic and achromatic traffic sign detection and shape classification. A segmentation stage based on Lab and HSI spaces represents the first contribution of this work. This procedure has shown a high performance with color and black and white signs. The second contribution is the post-processing stage, where the algorithm for separating co-located signs provides excellent results. The description of sign shapes by means of Fourier Descriptors,

Acknowledgments

The authors would like to thank IPS-Vial [43], a Spanish company specializing in inventory and traffic sign projects, for its help in providing the image database, collected as part of their inventory process.

This work has been partly supported by the TSI-020100-2009-735, TEC2010-19263 and TEC2013-48439-C4-1-R Projects from the Spanish Government, and by the Prometeo Project from the Secretariat for Higher Education, Science, Technology and Innovation of the Republic of Ecuador (Ref: BA821926). J.M.L.C. is


References (45)

  • J. Abukhait, I. Abdel-Qader, J. Oh, O. Abudayyeh, Occlusion-invariant tilt angle computation for automated road sign...
  • Z. Chen, J. Yang, B. Kong, A robust traffic sign recognition system for intelligent vehicles, in: International...
  • S. Maldonado-Bascón et al., Road-sign detection and recognition based on support vector machines, IEEE Trans. Intell. Transp. Syst. (2007)
  • J. Khan et al., Image segmentation and shape analysis for road-sign detection, IEEE Trans. Intell. Transp. Syst. (2011)
  • J. Miura, T. Kanda, Y. Shirai, An active vision system for real-time traffic sign recognition, in: IEEE Conference on...
  • P. Sermanet, Y. LeCun, Traffic sign recognition with multi-scale convolutional networks, in: International Joint...
  • Y. Fatmehsan, A. Ghahari, R. Zoroofi, Gabor wavelet for road sign detection and recognition using a hybrid classifier,...
  • H. Gómez-Moreno et al., Goal evaluation of segmentation algorithms for traffic sign recognition, IEEE Trans. Intell. Transp. Syst. (2010)
  • C. Caraffi, E. Cardarelli, P. Medici, P. Porta, G. Ghisio, G. Monchiero, An algorithm for Italian de-restriction signs...
  • G. Overett, L. Petersson, L. Andersson, N. Pettersson, Boosting a heterogeneous pool of fast hog features for...
  • R. Belaroussi, J. Tarel, Angle vertex and bisector geometric model for triangular road sign detection, in: Workshop on...
  • M. García-Garrido et al., Complete vision-based traffic sign recognition supported by an I2V communication system, Sensors (2012)

    J.M. Lillo-Castellano received the Telecommunication Engineering degree and the MSc in Information and Communication Technology in Biomedical Engineering from the Rey Juan Carlos University, Madrid, Spain, in 2010 and 2012, respectively. Currently, he is a PhD candidate and works as a hired researcher at the Department of Signal Theory and Communications, Telematics and Computing, Rey Juan Carlos University. His research interests include Multivariate Data Analysis, Machine Learning, Digital Image Processing, and their applications to Bioengineering.

    I. Mora-Jiménez received the Telecommunication Engineering degree in 1998 from the Polytechnic University of Valencia, Spain, and the PhD degree in Telecommunication in 2004 from Carlos III University of Madrid, Spain. Currently, she is an associate professor in the Department of Signal Theory and Communications, Telematics and Computing at Rey Juan Carlos University, Madrid, Spain. Her main research interests include Statistical Learning Theory, Neural Networks, and their applications to Image Processing, Bioengineering, and Communications.

    C. Figuera-Pozuelo received the Telecommunication Engineering degree in 2002 from the Polytechnic University of Madrid, Spain, and the PhD degree in Telecommunication in 2009 from Carlos III University of Madrid, Spain. He is currently working as an associate professor in the Department of Signal Theory and Communications, Telematics and Computing at Rey Juan Carlos University, Madrid, Spain. His research interests include Signal Processing for Wireless Communications and Statistical Learning Theory with applications.

    J.L. Rojo-Álvarez received the Telecommunication Engineering degree in 1996 from the University of Vigo, Spain, and the PhD degree in Telecommunication in 2000 from the Polytechnic University of Madrid, Spain. Since 2006, he has been an associate professor at the Department of Signal Theory and Communications, Telematics and Computing, Rey Juan Carlos University, Madrid, Spain. He has published more than 70 papers in JCR journals and more than 100 international conference communications. He has participated in more than 50 projects (with public and private funding), and directed more than 10 of them, including several actions in the National Plan for Research and Fundamental Science. In 2009, he was awarded the I3 Prize of the Spanish Ministry of Science and Innovation for his research career.
