ABSTRACT
Mining information from Big Data can yield substantial benefits. However, conventional machine learning techniques and computational analysis are limited in their ability to handle large volumes of consumption behavior data, which makes analyzing such data a critical problem. Furthermore, powerful visual-based analytics tools are needed now that pictures have become a core content component on the Internet. Hence, in this study, we explore Deep Learning with convolutional neural networks with the goal of resolving clothing style classification and retrieval tasks. To reduce training complexity, transfer learning is incorporated by fine-tuning pre-trained models on large-scale datasets. Furthermore, because any given deep net has a vast number of parameters, an architecture inspired by AdaBoost is designed to use multiple deep nets, each trained on a sub-dataset. Thus, training can be accelerated if each net is computed on one client node in a distributed computing environment. Moreover, to increase system flexibility, two architectures built from multiple deep nets, each with two outputs, are proposed for binary-class classification. Therefore, when new classes are added, no additional computation over all the training data is needed. Experiments are performed to compare the proposed system with existing systems that use hand-crafted features and conventional learning models. According to the results, the proposed system provides significant improvements on three public clothing datasets for style classification.
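The idea of one binary-class net per clothing style, where a new style is added without recomputing the existing nets, can be illustrated with a minimal sketch. This is not the paper's architecture: the tiny logistic-regression `BinaryNet` stands in for a two-output deep net (its two outputs would be `p` and `1 - p`), and the class names, the one-hot "feature vectors", and the `OneNetPerClass` wrapper are all hypothetical choices made for illustration.

```python
import math


class BinaryNet:
    """Logistic-regression stand-in (an assumption) for one two-output deep net."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def score(self, x):
        # Probability that x belongs to this net's class.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, xs, ys, epochs=200, lr=0.5):
        # Plain stochastic gradient descent on the logistic loss.
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                err = self.score(x) - y
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * err
        return self


class OneNetPerClass:
    """Hypothetical wrapper: one independent binary net per class label."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.nets = {}

    def add_class(self, label, positives, negatives):
        # Only the new net is trained; nets already in self.nets are untouched.
        xs = positives + negatives
        ys = [1.0] * len(positives) + [0.0] * len(negatives)
        self.nets[label] = BinaryNet(self.n_features).fit(xs, ys)

    def predict(self, x):
        # Pick the class whose binary net is most confident.
        return max(self.nets, key=lambda label: self.nets[label].score(x))


# Toy feature vectors (assumed); real inputs would be CNN features of images.
CASUAL, FORMAL, SPORTY = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

model = OneNetPerClass(n_features=3)
model.add_class("casual", positives=[CASUAL], negatives=[FORMAL])
model.add_class("formal", positives=[FORMAL], negatives=[CASUAL])

# Adding a third class later trains one new net and leaves the others as they were.
weights_before = list(model.nets["casual"].w)
model.add_class("sporty", positives=[SPORTY], negatives=[CASUAL, FORMAL])
```

Because each net is trained independently, the nets could in principle be fitted on separate client nodes, which matches the distributed setting mentioned in the abstract. One caveat of this toy version is that the older nets never saw the new class, so their scores on it are uninformative rather than confidently low.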
Index Terms
- Visual-based Deep Learning for Clothing from Large Database