Abstract
In this paper, we develop a novel image tag completion method. We represent images with a convolutional neural network (CNN) and predict the complete tags from the convolutional representations. The prediction is performed by a linear predictive model, and the complete tags are also constrained to be consistent with the existing elements of the incomplete tag matrix. We learn the CNN parameters, the complete tags, and the predictive model parameters jointly. The learning problem is formulated as the minimization of an objective function composed of four terms: a consistency term between the learned complete tag vectors and the existing incomplete tag matrix, a prediction error term, a convolutional similarity regularization term, and a sparsity term on the complete tag vectors. The minimization problem is solved by an augmented Lagrangian method. Experiments on benchmark data sets show that our method outperforms state-of-the-art image tag completion methods.
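As a rough illustration only, the four terms named in the abstract could be evaluated as in the Python sketch below. The abstract does not give the formulas, so everything specific here is an assumption: squared-error forms for the consistency and prediction terms, a graph-smoothness form for the similarity regularizer, an L1 norm for sparsity, the weights alpha, beta and gamma, and the function name objective. Fixed feature vectors stand in for the CNN outputs, and the augmented Lagrangian solver is not implemented.

```python
import numpy as np

def objective(T, T_obs, mask, F, W, S, alpha=1.0, beta=0.1, gamma=0.01):
    """Evaluate a hypothetical four-term tag completion objective.

    T      (n, m) candidate complete tag matrix, one row per image
    T_obs  (n, m) incomplete tag matrix
    mask   (n, m) 1 where T_obs has an existing entry, 0 where it is missing
    F      (n, d) image representations (stand-in for CNN outputs)
    W      (d, m) parameters of the linear predictive model
    S      (n, n) symmetric similarity between image representations
    """
    # Consistency: match the complete tags to the existing entries only.
    consistency = np.sum(mask * (T - T_obs) ** 2)
    # Prediction error of the linear model on the representations.
    prediction = np.sum((T - F @ W) ** 2)
    # Assumed similarity regularizer: 0.5 * sum_ij S_ij ||t_i - t_j||^2,
    # written with the graph Laplacian as trace(T^T L T).
    laplacian = np.diag(S.sum(axis=1)) - S
    similarity = np.trace(T.T @ laplacian @ T)
    # Sparsity: L1 norm of the complete tag vectors.
    sparsity = np.abs(T).sum()
    return consistency + alpha * prediction + beta * similarity + gamma * sparsity

# Toy data: two images, two tags, identity features.
T_obs = np.array([[1.0, 0.0], [0.0, 1.0]])
mask = np.ones_like(T_obs)
F = np.eye(2)
W = T_obs.copy()
S = np.array([[0.0, 1.0], [1.0, 0.0]])
value = objective(T_obs, T_obs, mask, F, W, S)
```

In this toy setting the consistency and prediction terms vanish, so the value is driven only by the similarity and sparsity terms. In the method described above, F would depend on the CNN parameters and would be updated jointly with T and W.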
Acknowledgements
This work was supported by the Natural Science Foundation of Hebei Province (D2015207008), the Talent Training Project of Hebei Province (A201400215), the National High Technology Research and Development Program of China (863 Program No. 2014AA06A511), and the National Natural Science Foundation of China (41371358).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest for the work described in this paper.
Cite this article
Wu, Y., Zhai, H., Li, M. et al. Learning image convolutional representations and complete tags jointly. Neural Comput & Applic 31, 2593–2604 (2019). https://doi.org/10.1007/s00521-017-3216-0
DOI: https://doi.org/10.1007/s00521-017-3216-0