In intelligent transportation systems, parsing traffic signs and transmitting traffic information to humans is an urgent need. However, despite the success achieved in the detection and recognition of low-level circular or triangular traffic signs, parsing the more complex and informative rectangular traffic signs remains unexplored and challenging. Our work is devoted to “Traffic Sign Understanding (TSU)”, which aims to parse various traffic signs and generate semantic descriptions for them. To achieve this goal, we propose an end-to-end framework that integrates component detection, content reasoning, and semantic description generation. The component detection module first detects initial components in the sign image. The content reasoning module then acquires the detailed content of the sign, including the final components, their relations, and the layout category, which provide local and global information for the subsequent module. Finally, the semantic description generation module mines relational attributes and text semantic attributes from the preceding results, embeds them together with the layout category, and transforms them into semantic descriptions through a dynamic prediction transformer. The three modules are trained jointly in an end-to-end manner to optimize overall performance. On the CASIA-Tencent CTSU Dataset, the method achieves state-of-the-art performance not only in the final semantic description generation stage but also on multiple subtasks. Extensive ablation experiments demonstrate the effectiveness of the method.
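The data flow between the three modules can be illustrated with a minimal Python sketch. It is not the proposed framework: every class and function name is hypothetical, the detector, the reasoning step, and the dynamic prediction transformer are replaced by fixed placeholders, and the layout label "guide" and the direction of the pointing relation are assumptions made only for illustration. The sketch shows only what each stage consumes and produces.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Component:
    kind: str                                # "text", "symbol", or "arrowhead"
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
    text: str = ""                           # recognized string, for text components


@dataclass
class Relation:
    kind: str   # "association" or "pointing"
    src: int    # index of the first linked component
    dst: int    # index of the second linked component


@dataclass
class SignContent:
    components: List[Component]   # final components (local information)
    relations: List[Relation]     # relations between components
    layout: str                   # layout category (global information)


def detect_components(image) -> List[Component]:
    """Stage 1 placeholder: returns fixed components instead of running a detector."""
    return [
        Component("text", (10, 10, 60, 30), "Airport"),
        Component("arrowhead", (70, 10, 90, 30)),
    ]


def reason_content(components: List[Component]) -> SignContent:
    """Stage 2 placeholder: links every text to every arrowhead.

    The layout label "guide" is a hypothetical category name.
    """
    relations = [
        Relation("pointing", i, j)
        for i, a in enumerate(components) if a.kind == "text"
        for j, b in enumerate(components) if b.kind == "arrowhead"
    ]
    return SignContent(components, relations, layout="guide")


def generate_descriptions(content: SignContent) -> List[str]:
    """Stage 3 placeholder: fills a string template instead of decoding with a transformer."""
    descriptions = []
    for rel in content.relations:
        if rel.kind == "pointing":
            label = content.components[rel.src].text
            descriptions.append(f"[{content.layout}] {label} -> arrowhead #{rel.dst}")
    return descriptions


def understand_sign(image) -> List[str]:
    """Chains the three stages in the order described above."""
    return generate_descriptions(reason_content(detect_components(image)))
```

Calling `understand_sign(None)` runs the placeholders end to end and returns one description string per pointing relation. In the actual framework the three stages are trained jointly rather than chained as independent functions.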
The CASIA-Tencent CTSU Dataset analysed during the current study is available at http://www.nlpr.ia.ac.cn/databases/CASIA-Tencent CTSU/index.html. The Chinese Traffic Sign Database (CTSDB) analysed during the current study is available at http://www.nlpr.ia.ac.cn/pal/trafficdata/recognition.html.
The dataset is available at http://www.nlpr.ia.ac.cn/databases/CASIA-TencentCTSU/index.html.
Two example English traffic signs with their relation annotations and semantic descriptions. Some of the auxiliary annotations are shown on the right, including components (texts in green boxes, symbols in yellow boxes, and arrowheads in red boxes) and their relations (association relations shown as pink lines and pointing relations shown as light blue lines) (Color figure online)
