Abstract:
Shape representation for 3-D models is an important topic in computer vision, multimedia analysis, and computer graphics. Recent multiview-based methods demonstrate promising performance for 3-D shape recognition and retrieval. However, most multiview-based methods either ignore the correlations among multiple views or suffer from high computational cost. In this paper, we propose a novel multiview-based network architecture for 3-D shape recognition and retrieval. Our network combines convolutional neural networks (CNNs) with long short-term memory (LSTM) to exploit the correlative information from multiple views. Pretrained CNNs with residual connections are first used to extract a low-level feature from each view image rendered from a 3-D shape. Then, an LSTM and a sequence voting layer are employed to aggregate these features into a shape descriptor. A highway network and a three-step training strategy are also adopted to ease the optimization of the deep network. Experimental results on two public datasets demonstrate that the proposed method achieves promising performance for 3-D shape recognition and state-of-the-art performance for 3-D shape retrieval.
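The aggregation stage described above (per-view features, an LSTM over the view sequence, a voting step, and a highway layer) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the network from the paper: the per-view CNN features are taken as given, the "sequence voting" is assumed here to be a softmax-weighted sum of LSTM hidden states, the highway layer is placed after the vote, and all names (`lstm_forward`, `sequence_vote`, `highway`, `describe_shape`, `init_params`) are hypothetical. The abstract does not specify any of these details, and the three-step training strategy is not covered.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(views, W, U, b):
    """Single-layer LSTM over per-view features.
    views: (T, D), W: (4H, D), U: (4H, H), b: (4H,). Returns hidden states (T, H)."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    outputs = []
    for x in views:
        z = W @ x + U @ h + b
        i, f, o, g = np.split(z, 4)          # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def sequence_vote(hidden, v):
    """ASSUMPTION: voting modeled as a softmax-weighted sum of hidden states."""
    scores = hidden @ v                      # one score per view, shape (T,)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ hidden, weights

def highway(x, Wh, bh, Wt, bt):
    """Highway layer: y = T(x) * H(x) + (1 - T(x)) * x."""
    t = sigmoid(Wt @ x + bt)
    return t * np.tanh(Wh @ x + bh) + (1.0 - t) * x

def init_params(D, H, seed=0):
    """Small random parameters, for illustration only (no training)."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "W": s * rng.standard_normal((4 * H, D)),
        "U": s * rng.standard_normal((4 * H, H)),
        "b": np.zeros(4 * H),
        "v": s * rng.standard_normal(H),
        "Wh": s * rng.standard_normal((H, H)), "bh": np.zeros(H),
        "Wt": s * rng.standard_normal((H, H)), "bt": np.zeros(H),
    }

def describe_shape(view_features, p):
    """Aggregate (T, D) per-view features into one H-dimensional descriptor."""
    hidden = lstm_forward(view_features, p["W"], p["U"], p["b"])
    pooled, weights = sequence_vote(hidden, p["v"])
    return highway(pooled, p["Wh"], p["bh"], p["Wt"], p["bt"]), weights
```

For example, with 12 rendered views and 512-dimensional features per view (both numbers chosen arbitrarily here), `describe_shape(features, init_params(512, 64))` returns a 64-dimensional descriptor together with one voting weight per view.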
Published in: IEEE Transactions on Multimedia (Volume: 21, Issue: 5, May 2019)