Abstract:
As the global population ages and the labor force shrinks, using artificial intelligence (AI) technology to promote labor productivity growth has become a hot topic. The emergence of empty-dish recycling robots has helped alleviate the impact of declining labor productivity. This article proposes a multiscale stereoscopic attention (MSA) network, YOLO-MSA, to detect postprandial dishes for empty-dish recycling robots. First, the standard convolution is replaced with a Res2Net module, which improves the multiscale expressiveness of the network at a finer-grained level. Second, a Res2Net with different dilation rates is combined with a novel stereoscopic attention mechanism to form an MSA module, which is used for coarse-grained multiscale expression. Third, for multiscale feature learning during dimensionality reduction, dimension reduction spatial pyramid pooling (DRSPP) is proposed to fuse feature maps of different scales. Extensive experiments demonstrate the effectiveness of the proposed MSA module for multiscale feature learning. Furthermore, YOLO-MSA achieves a mean average precision (mAP) of 98.47% on Dish-21, a dataset of postprandial dishes, which is much higher than that of other state-of-the-art (SOTA) models, and an inference speed of 33.93 frames per second (FPS), which meets the real-time detection requirements of the empty-dish recycling robot. Test results on other public datasets show that the proposed YOLO-MSA has better generalization ability. In summary, YOLO-MSA exhibits satisfactory multiscale feature expression ability, demonstrates effectiveness and robustness in postprandial dish detection, and has far-reaching significance for the development of empty-dish recycling robots.
Published in: IEEE Transactions on Instrumentation and Measurement (Volume: 72)
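
To make the abstract's distinction between fine-grained multiscale expression (a Res2Net module) and coarse-grained multiscale expression (Res2Net with different dilation rates) more concrete, the following is a minimal sketch of how the receptive field grows across the channel splits of a Res2Net-style block. It relies only on the standard Res2Net hierarchy (y_1 = x_1, y_2 = K_2(x_2), y_i = K_i(x_i + y_{i-1}) for i > 2). The function name, the choice of 4 splits, the 3x3 kernel, and the dilation rates 1, 2, and 3 are illustrative assumptions; the abstract does not state the configuration used in YOLO-MSA.

```python
def res2net_receptive_fields(scales, kernel=3, dilation=1):
    """Largest receptive field (pixels along one axis, stride 1) of each
    output split y_1..y_s of a Res2Net-style block.

    Hypothetical helper for illustration only; not taken from the article.
    """
    if scales < 1:
        raise ValueError("scales must be at least 1")
    # Each stacked convolution widens the receptive field by this amount.
    growth = dilation * (kernel - 1)
    # y_1 is an identity path, so it sees a single input pixel.
    fields = [1]
    for i in range(2, scales + 1):
        # The deepest path into y_i passes through (i - 1) convolutions,
        # because y_i = K_i(x_i + y_{i-1}) reuses the previous split's output.
        fields.append(1 + (i - 1) * growth)
    return fields


if __name__ == "__main__":
    # Assumed dilation rates, chosen only to show the trend.
    for d in (1, 2, 3):
        print(f"dilation={d}: {res2net_receptive_fields(4, dilation=d)}")
```

Under these assumptions, a single block already mixes several receptive-field sizes across its splits (the fine-grained effect), while changing the dilation rate shifts the whole range toward larger contexts (the coarse-grained effect).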