Abstract:
Real-time and precise semantic segmentation is essential for robotic vision systems, particularly in dynamic and complex environments such as RoboCup SPL. This paper introduces a lightweight semantic segmentation model that combines the U-Net architecture with a MobileNetV2 subnetwork and the Convolutional Block Attention Module (CBAM). U-Net's encoder-decoder structure facilitates effective multi-scale feature fusion, while MobileNetV2 reduces computational overhead. CBAM further enhances the model's ability to focus on critical features. On the RoboCup SPL instance segmentation dataset, the model achieves a mean accuracy of 0.9370. Through comparative analysis with various backbone networks and ablation studies, we investigate the effects of CBAM, joint loss functions, and data augmentation on model performance. Results indicate that CBAM significantly improves segmentation accuracy, and that both joint loss functions and data augmentation offer additional benefits. The proposed model achieves a favorable balance between accuracy and computational efficiency, making it well suited to real-time applications.
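Illustrative note: the abstract does not say how CBAM is wired into the network, so the following is only a minimal sketch of the channel-attention stage of CBAM as it is generally described (average- and max-pooled channel descriptors passed through a shared two-layer MLP, summed, and squashed with a sigmoid). The function names, the plain-list tensor layout, and the weight matrices `w1` and `w2` are assumptions made for illustration and are not taken from the paper. The spatial-attention stage is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP with ReLU. w1 is (hidden x C), w2 is (C x hidden)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature_map, w1, w2):
    """Hypothetical helper: feature_map is a list of C channels (H x W lists).

    Returns the per-channel weights and the reweighted feature map.
    """
    avg_desc, max_desc = [], []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        avg_desc.append(sum(values) / len(values))  # global average pooling
        max_desc.append(max(values))                # global max pooling

    # The same MLP (shared weights) processes both descriptors.
    avg_out = shared_mlp(avg_desc, w1, w2)
    max_out = shared_mlp(max_desc, w1, w2)
    weights = [sigmoid(a + m) for a, m in zip(avg_out, max_out)]

    # Scale every value in a channel by that channel's attention weight.
    refined = [
        [[v * wt for v in row] for row in channel]
        for channel, wt in zip(feature_map, weights)
    ]
    return weights, refined


if __name__ == "__main__":
    # Toy input: 2 channels of size 2 x 2, hidden width 1 (arbitrary values).
    fm = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 5.0]]]
    w1 = [[1.0, 1.0]]
    w2 = [[1.0], [-1.0]]
    weights, refined = channel_attention(fm, w1, w2)
    print(weights)
```

In a trained model the MLP weights are learned, and a framework such as PyTorch would replace the nested lists with tensors. The sketch only shows how per-channel weights in (0, 1) rescale the feature map.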
Published in: 2024 3rd International Conference on Artificial Intelligence, Internet of Things and Cloud Computing Technology (AIoTC)
Date of Conference: 13-15 September 2024
Date Added to IEEE Xplore: 13 November 2024