
Transform, Warp, and Dress: A New Transformation-guided Model for Virtual Try-on

Published: 16 February 2022

Abstract

Virtual try-on has recently emerged in the computer vision and multimedia communities with the development of architectures that can generate realistic images of a target person wearing a custom garment. This research interest is motivated by the large role played by e-commerce and online shopping in our society. Indeed, the virtual try-on task offers many opportunities to improve the efficiency of preparing fashion catalogs and to enhance the online user experience. The problem, however, is far from solved: current architectures do not reach sufficient accuracy with respect to manually generated images and can only be trained on image pairs with limited variety. Existing virtual try-on datasets have two main limits: they contain only female models, and all the images are available only in low resolution. This not only affects the generalization capabilities of the trained architectures but also makes deployment to real applications impractical. To overcome these issues, we present Dress Code, a new dataset for virtual try-on that contains high-resolution images of a large variety of upper-body clothes and both male and female models. Leveraging this enriched dataset, we propose a new model for virtual try-on capable of generating high-quality and photo-realistic images using a three-stage pipeline. The first two stages perform two different geometric transformations to warp the desired garment and fit it to the target person’s body pose and shape. Then, we generate the new image of that same person wearing the try-on garment using a generative network. We test the proposed solution on the most widely used dataset for this task as well as on our newly collected dataset, and demonstrate its effectiveness when compared to current state-of-the-art methods. Through extensive analyses on our Dress Code dataset, we show the adaptability of our model, which can generate try-on images even at a higher resolution.
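The geometric warping stages mentioned in the abstract are, in the virtual try-on literature, commonly parameterized as thin-plate spline (TPS) transformations whose parameters are regressed by a matching network. The paper's exact formulation is not reproduced here; the following is only an illustrative sketch of how a 2-D TPS warp is applied once its affine part and control-point weights have been estimated (all function and parameter names are hypothetical):

```python
import math

def tps_kernel(r2: float) -> float:
    """TPS radial basis U(r) = r^2 * log(r^2), with U(0) = 0 by convention."""
    return 0.0 if r2 == 0.0 else r2 * math.log(r2)

def tps_warp(point, control_pts, weights, affine):
    """Warp a 2-D point: affine part plus a weighted sum of radial terms.

    affine: 2x3 matrix [[a00, a01, a02], [a10, a11, a12]] applied as
            (a00 + a01*x + a02*y, a10 + a11*x + a12*y).
    control_pts / weights: matched lists of (x, y) pairs; in a try-on
            pipeline these would be regressed from the garment and
            person representations by a warping module.
    """
    x, y = point
    out_x = affine[0][0] + affine[0][1] * x + affine[0][2] * y
    out_y = affine[1][0] + affine[1][1] * x + affine[1][2] * y
    for (cx, cy), (wx, wy) in zip(control_pts, weights):
        u = tps_kernel((x - cx) ** 2 + (y - cy) ** 2)
        out_x += wx * u
        out_y += wy * u
    return (out_x, out_y)

# With an identity affine part and no control points, the warp is a no-op.
identity = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(tps_warp((1.0, 2.0), [], [], identity))  # -> (1.0, 2.0)
```

In practice a model applies such a warp densely over a sampling grid (e.g., via a grid-sampling operator) rather than point by point; the final generative stage then blends the warped garment with the person image.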



• Published in

  ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2
  May 2022, 494 pages
  ISSN: 1551-6857
  EISSN: 1551-6865
  DOI: 10.1145/3505207


          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 16 February 2022
          • Accepted: 1 August 2021
          • Revised: 1 July 2021
          • Received: 1 March 2021
Published in TOMM Volume 18, Issue 2


          Qualifiers

          • research-article
          • Refereed
