Abstract
As easy-to-search semantic information, human clothing attributes have important research value in the field of computer vision. Existing attribute recognition methods suffer from problems such as interference from environmental factors and, as a result, show poor clothing positioning accuracy. To address these problems, a human attribute recognition method based on human pose estimation and multiple-feature fusion is proposed. First, retrieval results are obtained through appearance feature matching for use in subsequent attribute recognition. Then, a deep SSD-based human pose estimation method locates the foreground area belonging to the human in the image and excludes background interference. Finally, the analytical results of the various methods are combined: an iterative smoothing process and maximum a posteriori probability assignment are adopted to enhance the correlation between attribute labels and pixels, yielding the final attribute recognition results. Experiments on the benchmark dataset show that the proposed model improves performance and solves the problems of inaccurate clothing label recognition and pixel resolution area deviation that arise in a single recognition mode.
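The final fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the per-method results are combined or how smoothing is performed, so the weighted log-linear fusion, the 4-neighbour averaging, and the names `fuse_and_assign`, `prob_maps`, `weights`, and `n_iters` are all assumptions made for illustration. Each input is taken to be a per-pixel probability map over attribute labels produced by one method.

```python
import numpy as np


def fuse_and_assign(prob_maps, weights=None, n_iters=3, eps=1e-8):
    """Fuse per-pixel label probabilities and assign MAP labels.

    prob_maps: list of arrays shaped (H, W, L), one per method (hypothetical input).
    Returns (labels, posterior) with shapes (H, W) and (H, W, L).
    """
    stacked = np.stack(prob_maps, axis=0)  # (M, H, W, L)
    if weights is None:
        weights = np.ones(len(prob_maps)) / len(prob_maps)
    weights = np.asarray(weights, dtype=float).reshape(-1, 1, 1, 1)

    # Assumed fusion rule: weighted sum of log-probabilities (log-linear pooling).
    log_post = (weights * np.log(stacked + eps)).sum(axis=0)
    post = np.exp(log_post - log_post.max(axis=-1, keepdims=True))
    post /= post.sum(axis=-1, keepdims=True)

    # Assumed smoothing rule: blend each pixel with the mean of its 4 neighbours.
    for _ in range(n_iters):
        padded = np.pad(post, ((1, 1), (1, 1), (0, 0)), mode="edge")
        neigh = (
            padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:]
        ) / 4.0
        post = 0.5 * post + 0.5 * neigh
        post /= post.sum(axis=-1, keepdims=True)

    # Maximum a posteriori assignment: the most probable label at each pixel.
    return post.argmax(axis=-1), post
```

In this sketch, an isolated pixel whose fused posterior disagrees with all of its neighbours is pulled back toward the surrounding label by the smoothing passes before the MAP assignment is made, which is one simple way to strengthen the agreement between labels and neighbouring pixels.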
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 61972097, Grant 61672159, and Grant 61672158, in part by the Technology Guidance Project of Fujian Province under Grant 2017H0015, and in part by the Natural Science Foundation of Fujian Province under Grant 2018J1798.
Cite this article
Ke, X., Liu, T. & Li, Z. Human attribute recognition method based on pose estimation and multiple-feature fusion. SIViP 14, 1441–1449 (2020). https://doi.org/10.1007/s11760-020-01690-8
DOI: https://doi.org/10.1007/s11760-020-01690-8