Abstract:
Saliency prediction is an important tool for understanding human behavior and has a wide range of applications. Although many algorithms have been designed to predict saliency for planar images, few works address 360° images. In this paper, we propose an encoder-decoder network for panoramic image saliency prediction. The network uses dilated convolutional layers, which extract more representative features and improve the accuracy of saliency prediction. To deal with the distortions in 360° images, our network takes the cube map format as input and processes the six faces of the cube map simultaneously. To respect the saliency distribution of the ground truth, we also propose a new data augmentation method for training the network, which is validated to improve performance. Extensive experiments show that our method achieves new state-of-the-art results on 360° image saliency prediction.
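As a rough illustration of why dilated convolutions can capture wider context, the sketch below computes the receptive field of a stack of stride-1 convolutional layers. The function name, kernel size, and dilation rates are hypothetical choices for illustration; the abstract does not state the rates the network actually uses.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in pixels along one axis, of stacked stride-1 convolutions.

    `dilations` holds one dilation rate per layer (illustrative values only).
    """
    rf = 1
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 input pixels,
        # so each layer widens the receptive field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


plain = receptive_field(3, [1, 1, 1])    # three ordinary 3x3 layers
dilated = receptive_field(3, [1, 2, 4])  # same depth, increasing dilation
print(plain, dilated)  # prints "7 15"
```

With the same number of layers and parameters, the dilated stack covers roughly twice the spatial extent in this example, which is the general property that motivates dilation in dense prediction tasks.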
Published in: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 04-08 May 2020
Date Added to IEEE Xplore: 09 April 2020