research-article

NeuroStylist: Neural Compatibility Modeling for Clothing Matching

Authors:
Xuemeng Song (Shandong University, Jinan, China)
Fuli Feng (National University of Singapore, Singapore, Singapore)
Jinhuan Liu (Shandong University, Jinan, China)
Zekun Li (Shandong University, Jinan, China)
Liqiang Nie (Shandong University, Jinan, China)
Jun Ma (Shandong University, Jinan, China)
MM '17: Proceedings of the 25th ACM international conference on Multimedia, October 2017, Pages 753–761. https://doi.org/10.1145/3123266.3123314

Published: 19 October 2017

ABSTRACT

Clothing, as a beauty-enhancing product, plays an important role in people's social lives, and the key to a proper outfit usually lies in harmonious clothing matching. Not everyone is good at clothing matching, however. Fortunately, with the proliferation of fashion-oriented online communities, fashion experts can publicly share their fashion tips by showcasing their outfit compositions, where each fashion item (e.g., a top or bottom) usually has an image and context metadata (e.g., title and category). Such rich fashion data offer a new opportunity to investigate the code of clothing matching. This opportunity comes with challenges. First, complicated factors such as color, material, and shape affect the compatibility of fashion items. Second, as each fashion item involves multiple modalities (i.e., image and text), coping with the heterogeneous multi-modal data is also difficult. Third, our pilot study shows that the composition relation between fashion items is rather sparse, which makes traditional matrix factorization methods inapplicable. To this end, we propose a content-based neural scheme to model the compatibility between fashion items based on the Bayesian personalized ranking (BPR) framework. The scheme jointly models the coherent relation between the modalities of items and their implicit matching preference. Experiments verify the effectiveness of our scheme, and we deliver deep insights that can benefit future research.
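To make the BPR idea in the abstract concrete, the sketch below shows a generic pairwise ranking loss for a top paired with a matched and an unmatched bottom. It is not the paper's model: the inner-product score, the function names `compatibility` and `bpr_loss`, and the toy two-dimensional vectors are all assumptions chosen for illustration. In the paper the item representations are learned by a neural network from images and text, which is omitted here.

```python
import numpy as np

def compatibility(top_vec, bottom_vec):
    """Hypothetical compatibility score: inner product of two latent item vectors."""
    return float(np.dot(top_vec, bottom_vec))

def bpr_loss(top_vec, pos_bottom_vec, neg_bottom_vec):
    """Generic BPR pairwise loss: -log(sigmoid(score_pos - score_neg))."""
    margin = compatibility(top_vec, pos_bottom_vec) - compatibility(top_vec, neg_bottom_vec)
    # -log(sigmoid(x)) equals log(1 + exp(-x)); logaddexp evaluates it stably.
    return float(np.logaddexp(0.0, -margin))

# Toy latent vectors (assumed values, not learned from data).
top = np.array([1.0, 0.0])
matched = np.array([2.0, 0.0])      # bottom that appears with the top in an outfit
unmatched = np.array([-2.0, 0.0])   # bottom sampled as a negative

loss_good = bpr_loss(top, matched, unmatched)  # matched bottom scored higher
loss_bad = bpr_loss(top, unmatched, matched)   # ordering reversed
print(loss_good < loss_bad)
```

The loss is small when the matched bottom scores higher than the unmatched one and grows when the ordering is reversed, so minimizing it pushes the model toward ranking observed pairs above unobserved ones.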

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
  2. World Wide Web

Published in
MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266
General Chairs:
Qiong Liu (FXPAL, USA)
Rainer Lienhart (Universität Augsburg, Germany)
Haohong Wang (TCL America, USA)
Program Chairs:
Sheng-Wei "Kuan-Ta" Chen (Academia Sinica, Taiwan)
Susanne Boll (University of Oldenburg, Germany)
Phoebe Chen (La Trobe University, Australia)
Gerald Friedland (Lawrence Livermore National Lab, USA)
Jia Li (Google, USA)
Shuicheng Yan (Qihoo 360, China)
Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
compatibility modeling
fashion analysis
multi-modal
Acceptance Rates
MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%