Abstract
In this paper, we address the problem of automatic clothing parsing in surveillance images using information from user-generated tags, such as “jeans” and “T-shirt.” Although clothing parsing has achieved great success in the fashion domain, parsing targets under practical surveillance conditions remains challenging because of complex environmental interference, such as low resolution, viewpoint variations and lighting changes. Our method captures target information from the fashion domain and applies it to the surveillance domain through weakly supervised transfer learning. Most target tags convey strong location information (e.g., a “T-shirt” always appears in the upper region), which can be used as weak labels for our transfer method. Both quantitative and qualitative experiments conducted on practical surveillance datasets demonstrate the effectiveness of the proposed surveillance data enhancing method.
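To illustrate how a tag's location information could act as a weak label, the following is a minimal sketch and not the authors' implementation. It assumes each tag maps to a vertical band of a pedestrian image; the names `TAG_BANDS` and `weak_row_labels` and the band fractions are hypothetical, chosen only for illustration.

```python
# Hypothetical vertical bands, given as fractions of image height (top = 0.0).
# These values are illustrative assumptions, not taken from the paper.
TAG_BANDS = {
    "T-shirt": (0.15, 0.55),
    "jeans": (0.45, 1.00),
}


def weak_row_labels(tags, height):
    """Return, for each pixel row, the sorted tags whose band covers that row.

    Tags without a known band are ignored, so they add no constraint.
    """
    rows = []
    for r in range(height):
        center = (r + 0.5) / height  # normalized vertical centre of the row
        allowed = sorted(
            t for t in tags
            if t in TAG_BANDS and TAG_BANDS[t][0] <= center < TAG_BANDS[t][1]
        )
        rows.append(allowed)
    return rows


if __name__ == "__main__":
    # Print the candidate labels for each row of a 10-row image.
    for r, allowed in enumerate(weak_row_labels(["jeans", "T-shirt"], 10)):
        print(r, allowed)
```

In a sketch like this, the per-row candidate sets would only restrict which clothing labels a parser may assign in each region; they do not supply pixel-level ground truth, which is what makes the supervision weak.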
Funding
Funding was provided by the National Natural Science Foundation of China (Grant Nos. 61872277, 61862015, U1611461, U1736206, 61876135, 61872362, 61671336, 61801335 and 61671332), the National Key R&D Program of China (Grant No. 2017YFC0803700), the Technology Research Program of the Ministry of Public Security (Grant No. 2016JSYJA12), the Hubei Province Technological Innovation Major Project (Grant Nos. 2016AAA015, 2017AAA123 and 2018AAA062), the Natural Science Foundation of Hubei Province (Grant Nos. 2018CFA024 and 2019CFB472) and the Natural Science Foundation of Jiangsu Province (Grant No. BK20160386).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zheng, Q., He, Z., Liang, C. et al. Transferring fashion to surveillance with weak labels. Neural Comput & Applic 35, 13021–13035 (2023). https://doi.org/10.1007/s00521-020-05528-9
DOI: https://doi.org/10.1007/s00521-020-05528-9