Abstract:
In this work, we develop an effective person search algorithm with natural language descriptions. The contributions of this work mainly include two aspects. First, we design a baseline language-based person search framework with three basic components: a deep CNN model to extract visual features, a bi-directional LSTM to encode language descriptions, and a triplet loss to learn the cross-modal feature embedding. Second, we propose a novel mutually connected classification loss to fully exploit the identity-level information, which not only introduces the identification information into both image and language descriptions but also encourages the cross-modal classification probabilities of the same identity to be more similar. The experimental results on the CUHK-PEDES dataset demonstrate that our method achieves significantly better performance than other state-of-the-art algorithms.
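The two loss terms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the hinge form of the triplet loss, the margin value, the use of a symmetric KL divergence to pull the image and text class probabilities together, and all function names (`triplet_loss`, `identity_loss_with_consistency`, and so on) are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss (assumed form): the matching image/text pair
    should be closer than the mismatched pair by at least `margin`."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def cross_entropy(probs, label):
    """Standard identity classification loss for one sample."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def identity_loss_with_consistency(image_logits, text_logits, label, weight=1.0):
    """Assumed stand-in for a 'mutually connected' classification loss:
    classify both modalities by identity, then penalize disagreement
    between their class probabilities with a symmetric KL term."""
    p_img = softmax(image_logits)
    p_txt = softmax(text_logits)
    classification = cross_entropy(p_img, label) + cross_entropy(p_txt, label)
    consistency = kl_divergence(p_img, p_txt) + kl_divergence(p_txt, p_img)
    return classification + weight * consistency
```

In this sketch, the consistency term vanishes when the image and text branches predict identical class probabilities, so it only adds a penalty when the two modalities disagree about the identity.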
Published in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 12-17 May 2019
Date Added to IEEE Xplore: 17 April 2019