Abstract:
The bilinear convolutional neural network (BCNN) model, the state of the art in fine-grained image recognition, fails to distinguish categories with subtle visual differences. We design a novel BCNN model guided by user click data (C-BCNN) that improves performance by capturing both the visual and semantic content of images. Specifically, to deal with the heavy noise in large-scale click data, we propose a weakly supervised learning approach for training the C-BCNN, namely W-C-BCNN, which automatically weights the training images based on their reliability. Extensive experiments on the public Clickture-Dog dataset show that (1) integrating the CNN with click features largely improves performance, and (2) both click data and visual consistency help to model image reliability. Moreover, the method can be easily customized for medical image recognition. Our model performs much better than conventional BCNN models on both the Clickture-Dog dataset and a medical image dataset.
Date of Conference: 10-14 July 2017
Date Added to IEEE Xplore: 31 August 2017
ISBN Information:
Electronic ISSN: 1945-788X