MVANet: Multi-Task Guided Multi-View Attention Network for Chinese Food Recognition


Abstract:

Food recognition plays a critical role in various health-care applications. However, it poses many challenges to current approaches because of the diverse appearances of food dishes and the non-uniform composition of ingredients within the same food category. Current methods have two shortcomings. First, they focus primarily on the appearance of foods without considering semantic information, so they easily attend to the wrong regions of food images. Second, they lack dynamic weighting of multiple semantic features during modeling. To address these issues, this paper proposes a novel Multi-View Attention Network (MVANet) within a multi-task learning framework that incorporates multiple semantic features into the food recognition task from both ingredient recognition and recipe modeling. It also uses a multi-view attention mechanism to automatically adjust the weights of the different semantic features and lets the tasks interact with each other to obtain a more comprehensive feature representation. Experiments on the ChineseFoodNet and VIREO Food-172 benchmark datasets validate the proposed method, showing a clear improvement in performance with a smaller parameter size.
Published in: IEEE Transactions on Multimedia (Volume: 23)
Page(s): 3551 - 3561
Date of Publication: 06 October 2020
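
The abstract's idea of automatically adjusting the weights of different semantic features can be illustrated with a generic attention-style fusion. The sketch below is not the paper's architecture, which the abstract does not specify; it swaps in plain scaled dot-product scoring followed by a softmax as an assumed stand-in. The function names (`softmax`, `fuse_views`), the three toy "views" (standing in for dish, ingredient, and recipe features), and the query vector are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(views, query):
    """Fuse several feature vectors ("views") into one weighted vector.

    Assumption: each view is scored by a scaled dot product with a query
    vector. This is a generic attention recipe, not the method in the paper.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(v_i * q_i for v_i, q_i in zip(view, query)) / scale
        for view in views
    ]
    weights = softmax(scores)
    # Weighted sum of the views, computed one dimension at a time.
    fused = [
        sum(w * view[i] for w, view in zip(weights, views))
        for i in range(len(query))
    ]
    return weights, fused


# Hypothetical 2-dimensional features for three views of one image.
views = [
    [1.0, 0.0],  # stand-in for a dish-appearance feature
    [0.0, 1.0],  # stand-in for an ingredient feature
    [1.0, 1.0],  # stand-in for a recipe feature
]
query = [1.0, 0.0]

weights, fused = fuse_views(views, query)
print("weights:", weights)
print("fused:", fused)
```

In this toy setup the views that align with the query receive larger weights, so their contribution to the fused representation grows. A learned model would produce both the query and the features from data rather than fixing them by hand.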
