Elsevier

Neurocomputing

Volume 341, 14 May 2019, Pages 156-167

Toward AI fashion design: An Attribute-GAN model for clothing match

https://doi.org/10.1016/j.neucom.2019.03.011

Abstract

Dressing according to the matching rules of color, texture, shape, etc., can have a major impact on perception, including making people appear taller or thinner, as well as exhibiting personal style. Unlike the extant fashion mining literature, in which style is usually classified according to similarity, this paper investigates clothing match rules based on semantic attributes under the generative adversarial network (GAN) framework. Specifically, we propose an Attribute-GAN to generate clothing-match pairs automatically. At the core of Attribute-GAN, a generator is trained under the supervision of an adversarially trained collocation discriminator and an attribute discriminator. To implement Attribute-GAN, we built a large-scale outfit dataset and manually annotated clothing attributes. Extensive experimental results confirm the effectiveness of our proposed method in comparison to several state-of-the-art methods.

Introduction

Fashion is a form of expression that can highlight each person’s personality when he or she is authentic with individual style choices. Coco Chanel expressed this succinctly when she stated, “Fashion fades, only style remains the same.” Traditional fashion design relies on the designer’s personal creative sense, which can be uncertain and subjective. With the advent of the Big Data era, fashion design patterns have changed. Indeed, fashion style can now be analyzed by machine learning from images and textual descriptions of clothing, shoes, or accessories. Fashion trends are the result of numerous factors, such as color, shape, texture, and pattern. The items in an outfit should also be at least subtly compatible regarding these factors. Fig. 1 presents several attractive outfits, each composed of a set of clothes. Although they are all fine collocations, these outfits have very different styles.

Observing people’s dressing habits, we find that whether or not an outfit matches is mostly determined by clothing attributes, e.g., boxy matches stretchy; light blue matches white; panels match stripes, etc. It is possible to mine these clothing match rules through artificial intelligence (AI). Specifically, this paper aims to elucidate latent match rules over clothing attributes under the framework of the generative adversarial network (GAN). These latent match rules are then utilized to generate outfit compositions.

Fashion learning has recently attracted great attention in the computer vision field due to its many lucrative applications. In fact, a large body of literature exists that focuses on clothing segmentation [10], [14], [34], [36], [37], [39], recognition [15], [17], and fashion image retrieval [7], [8], [20], [21], [22]. Instead of assigning a semantic clothing label to each pixel of a person in an image, some other works have focused on identifying fashionability [28], [38] or occupation [30] from the clothing in images. In addition, some researchers have explored methods for clothing retrieval, including within-scenario retrieval [20] and cross-scenario retrieval [21], [22]. However, modeling fashion collocation presents certain challenges. On the one hand, fashion style is subtle and subjective. Consequently, since the sensitivity of fashion varies from person to person, it can be difficult to unify the labeled data. On the other hand, it is quite challenging to obtain a detailed and complete set of attributes to describe whether or not a match is stylish. As a result, few existing studies focus on identifying why one outfit looks good, and then provide advice for creating a well-composed outfit. Since Goodfellow et al. [9] proposed the generative adversarial network (GAN) in 2014, various derivative GAN models have been proposed. The innovative components of these models include model structure improvement [6], [24], [25], [27], [35], [41], theoretical extension [1], [2], [3], [16], [40], [42], and applications [12], [32], [35], [43], [44]. In this paper, we pilot the use of AI in the fashion industry by developing an Attribute-GAN model. Attribute-GAN takes a fashion outfit, associated with clothing attributes, as input, and its generator is trained to produce clothing pairs.
Two adversarially trained discriminators then supervise the generator: a collocation discriminator predicts whether or not the clothing pairs match, and an attribute discriminator predicts whether or not the attributes of the generated (fake) clothing in these pairs are correct. Adversarial training strengthens the generator's ability to produce pairs that match on specific attributes, which accords with people's dressing habits and aesthetics rather than mere style similarity.
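The paper's exact loss functions are not reproduced in this excerpt; purely as an illustrative sketch (our own formulation, with a hypothetical weighting hyperparameter `lam`), the generator's objective can be thought of as combining cross-entropy feedback from both discriminators:

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy on sigmoid (probability) outputs."""
    eps = 1e-12
    return -np.mean(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))

def generator_loss(d_collocation_out, d_attribute_out, lam=1.0):
    """Toy combined generator objective (a sketch, not the paper's exact loss).

    d_collocation_out: collocation discriminator's probability that the
        generated pair matches (the generator wants this close to 1).
    d_attribute_out: attribute discriminator's per-attribute probabilities
        that the generated item carries the target attributes.
    lam: hypothetical weight balancing the two adversarial terms.
    """
    loss_match = bce(d_collocation_out, np.ones_like(d_collocation_out))
    loss_attr = bce(d_attribute_out, np.ones_like(d_attribute_out))
    return loss_match + lam * loss_attr
```

In this toy formulation, a generated pair that fools both discriminators (probabilities near 1) yields a small loss, while an unconvincing pair is penalized by both terms.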

To evaluate the effectiveness of our proposed model, we built an outfit dataset containing over 160,000 clothing images, approximately 40,000 of which were annotated with attributes by human labelers. The attributes were compiled from key terms frequently retrieved on several major e-commerce sites. For evaluation, we employed two tactics, a subjective method and an objective method, to assess the authenticity of fake images and the collocation degree of generated clothing pairs. The subjective method consisted of a “real or fake” study with human labelers, whereas the objective method used a regression model to score the matching degree of the generated outfit and a trained inception model to calculate the inception score of fake images, similar to [27].  Extensive experimental results demonstrate the effectiveness of our method in comparison to state-of-the-art models.
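The inception score referenced above (following [27]) is the exponential of the mean KL divergence between each image's conditional class distribution p(y|x) and the marginal p(y). A minimal sketch, assuming the per-image class probabilities have already been produced by a trained inception model:

```python
import numpy as np

def inception_score(probs):
    """Inception score from per-image class probabilities.

    probs: (n_images, n_classes) array whose rows are p(y|x) from a
        trained classifier.  IS = exp( E_x[ KL(p(y|x) || p(y)) ] ).
    A high score requires each row to be confident (low entropy) while
    the marginal over all images stays diverse (high entropy).
    """
    probs = np.asarray(probs, dtype=float)
    marginal = probs.mean(axis=0)  # p(y), averaged over images
    kl = np.sum(probs * (np.log(probs + 1e-12) - np.log(marginal + 1e-12)), axis=1)
    return float(np.exp(kl.mean()))
```

For example, if every image gets a uniform class distribution the score is 1 (its minimum), while perfectly confident and perfectly diverse predictions over k classes give a score of k.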

Section snippets

Related work

There are many bodies of related work; we focus on attribute learning, compatibility learning, and GAN models.

Attribute-GAN model

Our model can be regarded as an extension of cGAN, which learns a mapping from a conditioning label x and a random noise vector z to an output y, G: {x, z} → y. Our proposed approach, Attribute-GAN, learns a mapping between the items of an outfit pair, conditioned on clothing attributes. Both the generator network G and the discriminator network D perform feed-forward inference conditioned on clothing attributes. As shown in Fig. 2, the model consists of three components, i.e., a generator and two discriminators. Inspired by Isola
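As a structural illustration only (layer sizes, dimensions, and the MLP form are our assumptions, not the paper's architecture), conditioning a generator on attributes can be sketched by concatenating an attribute vector with the noise input before the feed-forward pass:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_generator(noise_dim, attr_dim, hidden_dim, out_dim):
    """Tiny MLP generator conditioned on attributes (illustrative sketch only)."""
    w1 = rng.standard_normal((noise_dim + attr_dim, hidden_dim)) * 0.1
    w2 = rng.standard_normal((hidden_dim, out_dim)) * 0.1

    def forward(z, attrs):
        # Condition the generator by concatenating noise z with the attribute vector.
        x = np.concatenate([z, attrs], axis=-1)
        h = np.maximum(x @ w1, 0.0)   # ReLU hidden layer
        return np.tanh(h @ w2)        # outputs squashed to [-1, 1], like normalized pixels

    return forward

G = make_generator(noise_dim=8, attr_dim=4, hidden_dim=16, out_dim=32)
fake = G(rng.standard_normal(8), np.array([1.0, 0.0, 1.0, 0.0]))
```

The discriminators can be conditioned the same way, by concatenating the attribute vector with their image inputs (or intermediate features) before inference.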

Dataset

In order to train and evaluate our proposed model, Attribute-GAN, we compiled a large-scale dataset from Polyvore,1 a free, easy-to-use web-based application for mixing and matching images from anywhere on the Internet. Users create fashion outfits from items on the website or item images uploaded by themselves. The items in a fashion outfit are collaged according to users’ preferences to beautifully exhibit specific fashion styles. In this platform, tonality

Conclusion

This paper investigates the clothing match problem under the cGAN framework. Specifically, we proposed an Attribute-GAN model, a scalable image-to-image translation model that maps between different domains via a generator and two discriminators, and generates collocation clothing images based on semantic attributes. Besides producing higher image quality, Attribute-GAN achieved the best diversity of synthetic images and matching degree of generated clothing outfits, owing to the

Conflicts of interest

No conflict of interest exists in the submission of this manuscript, and the manuscript is approved by all authors for publication. I would like to declare on behalf of my co-authors that the work described is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part. All the authors listed have approved the enclosed manuscript.

Acknowledgments

This work was supported in part by the National Key R&D Program of China under grant nos. 2018YFB1003800 and 2018YFB1003805, the Natural Science Foundation of China under grant no. 61832004, and the Shenzhen Science and Technology Program under grant no. JCYJ20170413105929681.

Linlin Liu received the B.S. degree in computer science from Zhengzhou University of Aeronautics, Zhengzhou, China, in 2012, and the M.S. degree in computer science from the Harbin Institute of Technology, Shenzhen, China, in 2016, where she is currently pursuing the Ph.D. degree in computer science. Her research interests include data mining, computer vision, image processing, and deep learning.

References (44)

  • M. Arjovsky et al.

Wasserstein GAN

    (2017)
  • D. Berthelot et al.

BEGAN: boundary equilibrium generative adversarial networks

    (2017)
  • T. Che et al.

    Mode regularized generative adversarial networks

    (2016)
  • H. Chen et al.

    Describing clothing by semantic attributes

    European Conference on Computer Vision

    (2012)
  • Q. Chen et al.

    Deep domain adaptation for describing people based on fine-grained clothing attributes

    IEEE Conference on Computer Vision and Pattern Recognition

    (2015)
  • E.L. Denton et al.

    Deep generative image models using a Laplacian pyramid of adversarial networks

    Advances in Neural Information Processing systems

    (2015)
  • W. Di et al.

    Style finder: fine-grained clothing style detection and retrieval

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops

    (2013)
  • J. Fu et al.

    Efficient clothing retrieval with semantic-preserving visual phrases

    Asian Conference on Computer Vision

    (2012)
  • I. Goodfellow et al.

    Generative adversarial nets

    Advances in Neural Information Processing Systems

    (2014)
  • B. Hasan et al.

    Segmentation using deformable spatial priors with application to clothing

    BMVC

    (2010)
  • J. Huang et al.

    Cross-domain image retrieval with a dual attribute-aware ranking network

    IEEE International Conference on Computer Vision

    (2015)
  • P. Isola et al.

    Image-to-image translation with conditional adversarial networks

    (2016)
  • T. Iwata et al.

    Fashion coordinates recommender system using photographs from fashion magazines

    IJCAI

    (2011)
  • N. Jammalamadaka et al.

    Parsing clothes in unrestricted images

    BMVC

    (2013)
  • M.H. Kiapour et al.

    Hipster wars: discovering elements of fashion styles

    European Conference on Computer Vision

    (2014)
  • T. Kim et al.

    Learning to discover cross-domain relations with generative adversarial networks

    (2017)
  • I.S. Kwak et al.

    From bikers to surfers: visual recognition of urban tribes

    BMVC

    (2013)
  • C. Li et al.

    Precomputed real-time texture synthesis with Markovian generative adversarial networks

    European Conference on Computer Vision

    (2016)
  • Y. Li et al.

    Mining fashion outfit composition using an end-to-end deep learning approach on set data

    IEEE Transactions on Multimedia

    (2017)
  • S. Liu et al.

    Hi, magic closet, tell me what to wear!

    Proceedings of the 20th ACM International Conference on Multimedia

    (2012)
  • S. Liu et al.

    Street-to-shop: cross-scenario clothing retrieval via parts alignment and auxiliary set

    IEEE Conference on Computer Vision and Pattern Recognition

    (2012)
  • Z. Liu et al.

DeepFashion: powering robust clothes recognition and retrieval with rich annotations

    IEEE Conference on Computer Vision and Pattern Recognition

    (2016)

    Haijun Zhang received the B.Eng. and Master’s degrees from Northeastern University, Shenyang, China, and the Ph.D. degree from the Department of Electronic Engineering, City University of Hong Kong, Hong Kong, in 2004, 2007, and 2010, respectively. He was a Post-Doctoral Research Fellow with the Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON, Canada, from 2010 to 2011. Since 2012, he has been with the Harbin Institute of Technology, Shenzhen, China, where he is currently an associate professor of computer science. His current research interests include multimedia data mining, machine learning, and computational advertising. He is currently an associate editor of Neurocomputing, Neural Computing and Applications, and Pattern Analysis and Applications.

    Yuzhu Ji received the B.S. degree in computer science from PLA Information Engineering University, Zhengzhou, China, in 2012, and the M.S. degree in computer engineering from the Harbin Institute of Technology, Shenzhen, China, in 2015, where he is currently pursuing the Ph.D. degree in computer science. His research interests include data mining, computer vision, image processing, and deep learning.

    Q.M. Jonathan Wu received the Ph.D. degree in electrical engineering from the University of Wales, Swansea, U.K., in 1990. He was with the National Research Council of Canada for ten years from 1995, where he became a senior research officer and a group leader. He is currently a professor with the Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON, Canada. He has published more than 250 peer-reviewed papers in computer vision, image processing, intelligent systems, robotics, and integrated microsystems. His current research interests include 3-D computer vision, active video object tracking and extraction, interactive multimedia, sensor analysis and fusion, and visual sensor networks. Dr. Wu holds the Tier 1 Canada Research Chair in Automotive Sensors and Information Systems. He was an associate editor of the IEEE Transactions on Systems, Man, and Cybernetics, Part A and the International Journal of Robotics and Automation. He has served on technical program committees and international advisory committees for many prestigious conferences.
