DOI: 10.1145/3123266.3123314
research-article

NeuroStylist: Neural Compatibility Modeling for Clothing Matching

Published: 19 October 2017

ABSTRACT

As a beauty-enhancing product, clothing plays an important role in people's social lives, and the key to a proper outfit usually lies in harmonious clothing matching. Nevertheless, not everyone is good at clothing matching. Fortunately, with the proliferation of fashion-oriented online communities, fashion experts can publicly share their fashion tips by showcasing their outfit compositions, where each fashion item (e.g., a top or bottom) usually has an image and context metadata (e.g., title and category). Such rich fashion data offer a new opportunity to investigate the code of clothing matching. However, challenges co-exist with opportunities. First, complicated factors, such as color, material, and shape, affect the compatibility of fashion items. Second, each fashion item involves multiple modalities (i.e., image and text), so coping with heterogeneous multi-modal data poses a great challenge. Third, our pilot study shows that the composition relation between fashion items is rather sparse, which makes traditional matrix factorization methods inapplicable. To this end, we propose a content-based neural scheme to model the compatibility between fashion items based on the Bayesian personalized ranking (BPR) framework. The scheme jointly models the coherent relation between the modalities of items and their implicit matching preference. Experiments verify the effectiveness of our scheme, and we deliver deep insights that can benefit future research.
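As a rough illustration of the BPR idea the abstract refers to, the sketch below computes the standard pairwise BPR loss for one (top, matched bottom, unmatched bottom) triplet. It is not the paper's model: the function names are hypothetical, and scoring compatibility as an inner product of latent vectors is an assumption made only for this example, since the abstract does not specify how the scheme scores a pair.

```python
import math


def compatibility(top_vec, bottom_vec):
    # Assumption for illustration: compatibility is the inner product
    # of two latent vectors of equal length.
    return sum(t * b for t, b in zip(top_vec, bottom_vec))


def bpr_loss(top_vec, pos_bottom_vec, neg_bottom_vec):
    # Standard BPR pairwise loss: -ln(sigmoid(m_pos - m_neg)),
    # where m_pos scores the observed pair and m_neg the unobserved one.
    margin = (compatibility(top_vec, pos_bottom_vec)
              - compatibility(top_vec, neg_bottom_vec))
    # -ln(sigmoid(x)) equals ln(1 + exp(-x)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical 2-D latent vectors for one top and two bottoms.
top = [1.0, 0.0]
matched_bottom = [1.0, 0.0]
unmatched_bottom = [0.0, 1.0]

print(bpr_loss(top, matched_bottom, unmatched_bottom))
```

The loss shrinks as the matched pair scores higher than the unmatched one, which is what lets a ranking objective of this kind learn from sparse, implicit matching data rather than explicit ratings.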


Published in

MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%
