
Visual-based Deep Learning for Clothing from Large Database

Published: 07 October 2015

ABSTRACT

Mining information from big data can yield large benefits. Conventional machine learning techniques and computational analysis are limited in their ability to analyze large volumes of consumption behavior data, and this becomes a critical problem as big data is examined. Furthermore, as pictures have become a core content component on the Internet, there is a need for powerful visual-based analytics tools. In this study, we therefore explore deep learning with convolutional neural networks, with the goal of solving clothing style classification and retrieval tasks. To reduce training complexity, we incorporate transfer learning by fine-tuning models pre-trained on large-scale datasets. Furthermore, because any given deep net has a vast number of parameters, we design an architecture inspired by AdaBoost that uses multiple deep nets, each trained on a sub-dataset. Training can thus be accelerated if each net is computed on one client node in a distributed computing environment. Moreover, to increase system flexibility, we propose two architectures that use multiple two-output deep nets for binary-class classification. As a result, when new classes are added, no additional computation over all the training data is needed. Experiments compare the proposed system with existing systems that use hand-crafted features and conventional learning models. According to the results, the proposed system provides significant improvements in style classification on three public clothing datasets.
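
The flexibility argument in the abstract (one two-output net per class, so a new class does not force retraining on all the data) can be illustrated with a minimal sketch. This is not the paper's architecture: each deep net is replaced here by a tiny logistic-regression scorer over hand-made 2-D features, and the names `BinaryEnsemble`, `add_class`, the class labels, and the toy data are all hypothetical. It shows only the bookkeeping: adding a class trains one new binary model and leaves the existing ones untouched.

```python
import math


def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_binary(features, labels, epochs=200, lr=0.5):
    """Stand-in for one two-output deep net: logistic regression via SGD."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return (w, b)


def score(model, x):
    """Probability that x belongs to the class this binary model was trained for."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


class BinaryEnsemble:
    """One independent binary model per class (hypothetical helper)."""

    def __init__(self):
        self.models = {}

    def add_class(self, name, positives, negatives):
        # Only the new class's model is trained; existing models are not touched.
        features = positives + negatives
        labels = [1] * len(positives) + [0] * len(negatives)
        self.models[name] = train_binary(features, labels)

    def predict(self, x):
        # Pick the class whose binary model is most confident.
        return max(self.models, key=lambda name: score(self.models[name], x))


# Toy 2-D "features" standing in for image descriptors (assumed data).
casual = [[0.0, 0.1], [0.1, 0.0], [0.1, 0.1]]
formal = [[1.0, 0.1], [0.9, 0.0], [1.0, 0.0]]
sporty = [[0.0, 1.0], [0.1, 0.9], [0.0, 0.9]]

ens = BinaryEnsemble()
ens.add_class("casual", casual, formal)
ens.add_class("formal", formal, casual)
pred_formal = ens.predict([0.95, 0.05])
pred_casual = ens.predict([0.05, 0.05])

# Add a new class later: only one new binary model is trained.
before = dict(ens.models)
ens.add_class("sporty", sporty, casual + formal)
unchanged = all(ens.models[name] is before[name] for name in before)
print(pred_formal, pred_casual, unchanged)
```

In a setting like the one the abstract describes, each call to `train_binary` would correspond to training or fine-tuning one net, and the independent models could in principle be trained on separate nodes. The sketch does not address how scores from models trained at different times should be calibrated against each other.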


Published in

ASE BD&SI '15: Proceedings of the ASE BigData & SocialInformatics 2015
October 2015
381 pages
ISBN: 9781450337359
DOI: 10.1145/2818869

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
