Elsevier

Neurocomputing

Volume 396, 5 July 2020, Pages 556-568
Deep learning-based visual ensemble method for high-speed railway catenary clevis fracture detection

https://doi.org/10.1016/j.neucom.2018.10.107

Abstract

This paper proposes an automatic visual inspection method for detecting clevis fractures in the catenary systems of high-speed railways, using catenary images captured by an inspection vehicle. First, the clevises are extracted from the catenary image using a convolutional neural network-based algorithm, the faster region-based convolutional neural network (Faster R-CNN). Because the structure of catenary systems varies little and the contextual information near a catenary fitting may correlate strongly with its category, the architecture of the original Faster R-CNN is modified to exploit the contextual information of the regions of interest for object recognition. A crack detection process is then used to recognize clevis fractures. To detect the cracks, the edge map of the clevis sub-image is generated using a region-scalable fitting model. Areas where cracks are most likely to occur are projected from a standard clevis image onto the clevis sub-image by shape context matching and affine transformation matrix computation. The cracks are then recognized by calculating the wavelet entropy inside these areas, followed by morphological filtering. Experimental results show that the modified Faster R-CNN architecture achieves better clevis extraction results than the original architecture and several other state-of-the-art object detection models. The detection is not affected by the changes in scale, texture and grayscale of the clevises caused by variations in shooting distance, shooting angle and illumination. Clevis fractures can be detected accurately and reliably with the proposed fracture detection method, and the performance of this visual inspection method meets the strict requirements for catenary system maintenance.

Introduction

The maintenance of catenary systems is a crucial task for ensuring the safety of electrical railway operation. Traditionally, this task is carried out by railway workers who search for damaged catenary fittings that need to be replaced along the railway. With the development of the high-speed railway network, manual inspection is not able to meet the required efficiency and reliability. In recent years, computer vision-based detection methods have drawn great attention from railway companies and research institutions. The advantages of these methods include minimal interference with railway operation, low investment costs and high detection efficiency. Currently, computer vision-based detection methods have been successfully used in dynamic stagger measurement [1], [2], rail maintenance [3], [4], [5], [6] and active pantograph control [7], but the recognition of catenary fittings and the diagnosis of faults are still heavily dependent on the observation of workers.

In this paper, an automatic visual inspection method is proposed to detect fractures of the cross link clevises of the high-speed railway catenary. The cross link clevises connect the registration arms and cantilevers in catenary systems. Fig. 1 shows the physical locations of the clevises, the registration arms and the cantilevers, where cantilevers are marked in red and registration arms are marked in blue. The clevises are marked in green and enclosed in red circles in the images. The pictures in Fig. 1(a) and (b) are taken from opposite directions. For convenience, we call the clevises in Fig. 1(a) and (b) left clevises and right clevises, respectively. They are detected and analyzed separately in this paper. Fig. 2 shows the details of the clevises. Clevis fractures are caused by the constant vibration of the catenary system triggered by high-speed trains. A fracture weakens the mechanical strength of the catenary system, which increases the possibility of a pantograph-catenary accident. Fig. 3 shows examples of clevis fractures highlighted in red rectangles.

The process of clevis fracture detection can be divided into two steps: clevis extraction and fracture detection. In the field of object detection, a breakthrough came in 2001 when Viola and Jones proposed the boosted cascade framework based on Haar-like features [8]. After that, the combination of a machine learning-based classifier and hand-crafted local features dominated the field for many years. A classifier trained with different types of local features was applied to a sliding window over the image to determine the presence of the object. Widely used hand-crafted local feature descriptors included SIFT features [9], SURF features [10], Haar-like features [8], Histogram of Oriented Gradients (HOG) features [11] and Local Binary Pattern (LBP) features [12]. In [11], a linear Support Vector Machine (SVM) classifier trained with HOG features was used to detect pedestrians. In [13], Zhu et al. used integral histograms to calculate HOG features efficiently and adopted a cascade of rejecters to simplify the detection. In [14], deformable part models were proposed to detect objects that may vary greatly in shape and appearance. The object was divided into multiple components, and the relative positions of the components as well as the label of the object were learned by discriminative learning. In [15], five different features were combined using a weighted score-level feature fusion approach to improve the accuracy of object detection. In [16], sparse representation features were generated from HOG and LBP features using K-singular value decomposition, and the dimensions of the HOG and LBP features were reduced using principal component analysis (PCA).

Although machine learning-based object detection methods have been widely used, designing feature descriptors that are both discriminative and general is not easy. It requires careful engineering and considerable domain expertise [17]. With the development of neuroscience and biology, the hierarchy of the visual cortex was discovered. As neural excitation propagates from lower layers to higher layers of the visual cortex, the optical signal perceived by the retina is transformed into increasingly abstract feature representations. Inspired by this hierarchy, convolutional neural networks (CNNs) were proposed by LeCun et al. in 1989 and used in zip code recognition [18]. In 2012, Krizhevsky et al. created a “large, deep convolutional neural network” [19] named AlexNet. This network won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [20] with a substantial improvement in image classification accuracy. In the following years, the champions of ILSVRC were all convolutional neural networks, which demonstrated the superior ability of convolutional neural networks in object recognition and natural image classification. Compared with traditional image processing methods, deep neural networks can adaptively extract features that effectively represent the key information of the image and learn the complex mapping between the raw data and the image label.

Progress has been made in bridging the gap between natural image classification and object detection. In object detection, not only the type but also the position of the object needs to be determined. Deep learning-based object detection methods can be broadly divided into two types: region proposal-based methods and regression-based methods. Region proposal-based methods first generate from the image a series of region proposals that may contain objects. The region proposals are then fed to a sub-network for object classification. Examples of this type include R-CNN (region-based convolutional network) [21], fast R-CNN (fast region-based convolutional networks) [22] and faster R-CNN (towards real-time object detection with region proposal networks) [23]. Regression-based methods do not rely on region proposals. Instead, object localization is treated as a regression problem from the beginning. Object detection is achieved based on the responses in different default boxes on the output spaces of a multi-task network. Examples of this type include YOLO (you only look once) [24], [25] and SSD (single shot multi-box detector) [26]. In this paper, the extraction of clevises is based on the faster R-CNN method.

The detection of fractures is based on the detection of cracks. This is achieved by analyzing the edge information of the clevis sub-image. Compared to traditional edge detection methods, active contour models adapt dynamically to the contours of objects with more flexibility and accuracy, and are capable of detecting weak edges that other gradient-based methods may miss. The introduction of the level set method has broadened the application range of active contour models. The basic idea of the level set method is that a curve can be implicitly represented by the zero level set of a function in a higher dimension (called the level set function). The level set function evolves according to a partial differential equation (PDE) [27]. Existing implicit active contour models can be roughly categorized into two basic classes: edge-based models [28] and region-based models [29]. We focus in this paper on region-based models, which generally perform better in the presence of weak or discontinuous boundaries. Early popular region-based models tend to rely on intensity homogeneity [29]. However, the unevenly distributed light intensity on the surface of clevises (caused by the spotlights used for illumination and the cylinder-like shape of the clevis) may introduce grayscale inhomogeneity. The piecewise constant model [30] is able to handle intensity inhomogeneity but suffers from high computational cost. The region-scalable fitting (RSF) model proposed by Li et al. [31] draws upon intensity information in spatially varying local regions whose size depends on a scale parameter. This method performs well in processing magnetic resonance images [32] and in retinal blood vessel segmentation [33], and can be further incorporated into other models [34], [35], [36]. In this paper, the edge information of clevises is extracted based on the RSF model. The cracks are then detected in the crucial areas (the areas in which cracks are most likely to occur), which are obtained by image registration between the clevis sub-image and a standard clevis image.
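The implicit curve representation used by the level set method can be illustrated with a short sketch. The code below is not taken from the paper: the grid size, circle radius, speed and time step are made-up values, and the function names are hypothetical. It stores a circle as the zero level set of a signed distance function and moves the curve outward by one explicit step of the PDE phi_t + c|grad phi| = 0, using the fact that |grad phi| = 1 for a signed distance function.

```python
import math

def circle_level_set(size, cx, cy, radius):
    """Signed distance to a circle: negative inside, positive outside."""
    return [[math.hypot(x - cx, y - cy) - radius for x in range(size)]
            for y in range(size)]

def zero_level_pixels(phi):
    """Pixels where phi changes sign against the right or lower neighbour.

    These pixels approximate the zero level set, i.e. the curve itself.
    """
    h, w = len(phi), len(phi[0])
    contour = set()
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and (phi[y][x] < 0) != (phi[ny][nx] < 0):
                    contour.add((x, y))
    return contour

def inside_area(phi):
    """Number of pixels enclosed by the curve (phi < 0)."""
    return sum(v < 0 for row in phi for v in row)

# Illustrative values only (assumptions, not parameters from the paper).
phi = circle_level_set(size=41, cx=20, cy=20, radius=8.5)
speed, dt = 1.0, 3.0
# One explicit step of phi_t + speed * |grad phi| = 0 with |grad phi| = 1.
phi_moved = [[v - speed * dt for v in row] for row in phi]

contour_before = zero_level_pixels(phi)
contour_after = zero_level_pixels(phi_moved)
```

The curve is never stored as a list of points. It is recovered from the sign changes of phi, which is why topology changes such as splitting or merging need no special handling in level set-based active contours.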

The process of fracture detection is shown in Fig. 4. First, the RSF model is used to extract the edge information of the clevis sub-images. Then, the crucial areas, which are manually delineated in a standard clevis image, are projected onto the clevis sub-image by shape context matching and affine transformation matrix computation. Finally, clevis fracture detection is achieved by detecting cracks in the crucial areas. This is done by calculating the wavelet entropy inside the crucial areas, followed by morphological filtering.
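The projection and scoring steps above can be sketched as follows, under several assumptions that are not taken from the paper. The matched point pairs are given directly (shape context matching is not implemented), the affine matrix is estimated by ordinary least squares, and the wavelet entropy is taken to be the Shannon entropy of the relative energies of Haar detail coefficients computed on a 1-D intensity profile. All function names, point coordinates and profiles are hypothetical.

```python
import math

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

def fit_affine(src, dst):
    """Least-squares affine map from src to dst (needs >= 3 non-collinear pairs).

    Returns ((a, b, c), (d, e, f)) with x' = a*x + b*y + c and y' = d*x + e*y + f.
    """
    rows = [[x, y, 1.0] for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(tuple(solve3(ata, atb)))
    return tuple(params)

def apply_affine(params, points):
    """Map points through the affine transformation."""
    (a, b, c), (d, e, f) = params
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points]

def haar_wavelet_entropy(signal, levels):
    """Shannon entropy of relative Haar detail energies (assumed definition).

    len(signal) must be divisible by 2**levels.
    """
    approx = list(signal)
    energies = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(p - q) / math.sqrt(2) for p, q in pairs]
        approx = [(p + q) / math.sqrt(2) for p, q in pairs]
        energies.append(sum(d * d for d in detail))
    total = sum(energies)
    if total == 0:
        return 0.0
    return -sum(e / total * math.log(e / total) for e in energies if e > 0)

# Made-up ground-truth transform and matched points, for illustration only.
true_map = ((1.5, -0.5, 4.0), (0.5, 1.5, -2.0))
template_pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 2.0)]
image_pts = apply_affine(true_map, template_pts)
estimated = fit_affine(template_pts, image_pts)

# Corners of a hypothetical crucial area drawn on the standard clevis image.
crucial_area = [(2.0, 2.0), (6.0, 2.0), (6.0, 5.0), (2.0, 5.0)]
projected_area = apply_affine(estimated, crucial_area)

# Intensity profiles across a crucial area: uniform surface vs. a thin dark crack.
flat_profile = [10.0] * 16
crack_profile = [10.0] * 11 + [1.0] + [10.0] * 4
entropy_flat = haar_wavelet_entropy(flat_profile, levels=3)
entropy_crack = haar_wavelet_entropy(crack_profile, levels=3)
```

In this toy setting the uniform profile has no detail energy, so its entropy is zero, while the thin crack spreads energy over several scales and yields a positive entropy. A practical implementation would work on 2-D areas of the edge map, and the decision threshold would have to be tuned on real data.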

Catenary suspension image acquisition

Images of catenary systems are taken by CCD cameras mounted on the top of an inspection vehicle. As the inspection vehicle runs along the railway, the cameras are triggered automatically when a catenary pillar is detected. A sketch of the inspection vehicle is shown in Fig. 5. In order to eliminate the interference of image background, the catenary images are taken at night. LED spotlights are utilized for illumination. The captured catenary images are stored with the IDs of the corresponding

Clevis extraction

Because illumination conditions vary, the grayscale distribution and texture on the surface of clevises are not constant. In addition, the scale of the clevises may change with the shooting angle and shooting distance. Moreover, the presence of cantilevers, overhead lines and insulators greatly increases the complexity of catenary images, making the extraction of clevises difficult. Considering the outstanding performance of CNN based methods in object detection and image

Edge information extraction

The edge information of the clevis sub-image is extracted using the RSF model, which was first proposed by Li et al. [31]. Unlike other popular region-based active contour models, the RSF model is capable of segmenting images with intensity inhomogeneity.
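To illustrate why fitting intensities locally copes with inhomogeneity, the sketch below compares global region means with kernel-weighted local means on a 1-D signal. It assumes the standard RSF formulation, in which the fitting value at a point is a Gaussian-weighted average of the intensities on one side of the contour. The signal, the contour position and the kernel scale are made-up values, and the function names are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at three standard deviations."""
    radius = int(3 * sigma)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def local_fit(intensity, member, kernel):
    """Kernel-weighted local mean of intensity over samples where member is 1."""
    radius = len(kernel) // 2
    n = len(intensity)
    fit = []
    for x in range(n):
        num = den = 0.0
        for k, w in enumerate(kernel):
            y = x + k - radius
            if 0 <= y < n:
                num += w * member[y] * intensity[y]
                den += w * member[y]
        # Far from the region there is nothing to average; return 0 there.
        fit.append(num / den if den > 1e-12 else 0.0)
    return fit

# Toy signal (assumption): an illumination ramp plus an object that is
# locally brighter by 20 gray levels on the left of the contour at x = 30.
n, edge = 60, 30
inside = [1.0 if x < edge else 0.0 for x in range(n)]
outside = [1.0 - m for m in inside]
intensity = [50.0 + 2.0 * x + 20.0 * inside[x] for x in range(n)]

kernel = gaussian_kernel(sigma=3.0)
f1 = local_fit(intensity, inside, kernel)    # local fit inside the contour
f2 = local_fit(intensity, outside, kernel)   # local fit outside the contour

# Global means, as used by models that assume homogeneous regions.
global_in = sum(i * m for i, m in zip(intensity, inside)) / sum(inside)
global_out = sum(i * m for i, m in zip(intensity, outside)) / sum(outside)
```

On this signal the global means rank the two regions in the wrong order because of the illumination ramp (`global_in < global_out`), whereas the local fits at the contour preserve the true contrast (`f1[edge] > f2[edge]`).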

The basic idea of the RSF model is to define a region-scalable fitting (RSF) energy function. For a given point x in the image, the local intensity fitting energy can be defined as

$$F^{RSF}(C, f_1, f_2) = \lambda_1 \int_{\mathrm{inside}(C)} \left[ \int K_\sigma(x - y)\,|I(y) - f_1(x)|^2\,dy \right] dx + \lambda_2 \int_{\mathrm{outside}(C)} \Big[ \cdots$$

Experimental results and performance analysis

In this section, the performance of the proposed fracture detection method is evaluated both in the clevis extraction stage and the fracture detection stage.

Conclusions

This paper proposes a visual inspection method to detect clevis fractures in the high-speed railway catenary system based on multiple local features and the RSF model. The clevis extractor, trained with the modified Faster R-CNN network, can accurately extract the clevises from images acquired under different acquisition and illumination conditions. The proposed fracture detection method based on the RSF model and crucial area projection is reliable in most cases. Although false alarms

Conflict of interest

The authors declare that they do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Acknowledgment

This study was partially supported by the National Natural Science Foundation of China (U1734202, U1434203), the China Railway Science and Technology Major Research Project (2015J008-A), and the Sichuan Province Youth Science and Technology Innovation Team (2016TD0012). Dataset collection was assisted by Guangzhou Railway Company and the China Academy of Railway Sciences.

References (39)

  • İ. Aydin et al.

    A new computer vision approach for active pantograph control

  • P. Viola et al.

    Rapid object detection using a boosted cascade of simple features

  • D.G. Lowe

    Distinctive image features from scale-invariant keypoints

    Int. J. Comput. Vis.

    (2004)
  • N. Dalal et al.

    Histograms of oriented gradients for human detection

  • T. Ojala et al.

    Multiresolution gray-scale and rotation invariant texture classification with local binary patterns

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2002)
  • Q. Zhu et al.

    Fast human detection using a cascade of histograms of oriented gradients

  • P. Felzenszwalb et al.

    Object detection with discriminatively trained part based models

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2010)
  • C. Zheng et al.

    Multi-class fruit detection based on image region selection and improved object proposals

    Neurocomputing

    (2018)
  • H. Kuang et al.

    Pedestrian detection based on gradient and texture feature integration

    Neurocomputing

    (2017)

    Ye Han received his B.Sc. degree in 2011 from Southwest Jiaotong University, Chengdu, China. He is now a Ph.D. candidate in the School of Electrical Engineering, Southwest Jiaotong University. His main research field is intelligent detection for traction power supply systems.

    Zhigang Liu received his B.Sc. degree in 1997, M.Sc. degree in 2000 and Ph.D. in 2003, all from Southwest Jiaotong University, Chengdu, China. Now he is a professor in School of Electrical Engineering, Southwest Jiaotong University. His current research interests include electrical relationships of vehicle grids in high-speed railways, power quality considering grid connections of new energies, pantograph-catenary dynamics, fault detection, status assessment, and active control.

    Yang Lv received his B.Sc. degree in 2017 from Southwest Jiaotong University, Chengdu, China, where he is currently pursuing the M.Sc. degree, with a focus on the detection and diagnosis of the railway pantograph-catenary system.

    Kai Liu received his B.Sc. degree in 2017 from Southwest Jiaotong University, Chengdu, China, where he is currently pursuing the M.Sc. degree, with a focus on the detection and diagnosis of the railway pantograph-catenary system.

    Changjiang Li received his B.Sc. degree in 2017 from Southwest Jiaotong University, Chengdu, China, where he is currently pursuing the M.Sc. degree, with a focus on the detection and diagnosis of the railway pantograph-catenary system.

    Wenxuan Zhang is currently an Assistant Research Fellow with the Infrastructure Inspection Research Institute, China Academy of Railway Sciences. His research interests include data analysis for pantographs and the development of inspection equipment for catenary systems in high-speed railways.
