Abstract:
Deep learning-based methods have shown excellent potential for object detection and pose estimation, but they require vast amounts of training data to achieve good performance. Obtaining comprehensive, manually labeled training data is a time-consuming and error-prone task in industrial scenes, and most current time-saving synthetic data generation methods require detailed, high-quality CAD models. Instead, we introduce a synthetic data generation approach based on image-to-image translation that requires only untextured CAD models and a small number of real images. The proposed approach starts with edge maps extracted by mesh model projection. Each edge map is given a realistic appearance learned from real images through image-to-image translation, achieving patch-level realism, and is then superimposed on a randomly selected background with a simple Cut-and-Paste strategy. Experiments on our custom dataset and the T-LESS dataset show that our synthetic data is competitive with incomplete real data and gives a considerable improvement over the same amount of model-rendered images. In addition, the proposed approach extends readily to new parts with similar texture and structure, and it shows the potential to train the image translation model on a single object and then synthesize data for multiple objects in one pass.
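The Cut-and-Paste superimposition step can be sketched as a masked copy of a translated object patch onto a random background. The following is a minimal illustration under assumed array shapes; the function name, signature, and mask convention are illustrative and are not taken from the paper's implementation.

```python
import numpy as np

def cut_and_paste(patch, mask, background, top_left):
    """Superimpose an object patch onto a background image.

    patch:      (h, w, 3) uint8 object crop (e.g. the output of the
                image-to-image translation for one edge map)
    mask:       (h, w) boolean foreground mask of the object
    background: (H, W, 3) uint8 randomly selected background
    top_left:   (row, col) paste position, assumed fully inside the background

    Returns a new composite image; inputs are not modified.
    """
    out = background.copy()
    h, w = patch.shape[:2]
    r, c = top_left
    region = out[r:r + h, c:c + w]
    # Copy only foreground pixels so the background shows through
    # everywhere the mask is False.
    region[mask] = patch[mask]
    return out
```

A real pipeline would additionally randomize position, scale, and rotation per paste and record the resulting bounding box or pose label, but the masked copy above is the core of the strategy.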
Published in: IEEE Robotics and Automation Letters ( Volume: 7, Issue: 3, July 2022)