Accumulative image categorization: a personal photo classification method for progressive collection

Multimedia Tools and Applications

Abstract

With the explosive growth of personal photos, an effective classification tool is urgently needed to organize progressively growing image collections. To handle such dynamically growing collections, we present a new method that categorizes images effectively by integrating image clustering, incremental updating and user feedback in an online framework. Taking into account both the user's burden and user-specific preferences during image classification, we propose several strategies to progressively learn a customized classification model for each user. First, we use a multi-view learning method to learn the user's preferred classification perspective. Second, we cluster similar images into groups according to the user's preference, so that the images in a group can be categorized together efficiently. Third, we propose a multi-centroid nearest class mean classifier that learns the user's preferred category granularity online, and use it to classify the image groups. Unlike offline systems, where pre-labeling and batch training often take hours or even days, our approach is fully online: it alternates between learning the classification model and classifying newly acquired images with little delay. Experimental results and a user study demonstrate the effectiveness of the proposed method.
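To make the idea of a multi-centroid nearest class mean classifier with online updates more concrete, the following is a minimal Python sketch. It is not the authors' implementation: the class name `MultiCentroidNCM`, the distance threshold `new_centroid_dist` used to decide when a category gets an additional centroid, the use of Euclidean distance, and the `predict_group` helper that classifies a cluster by its mean feature are all assumptions made for illustration only.

```python
import math


class MultiCentroidNCM:
    """Toy multi-centroid nearest class mean classifier (illustrative only)."""

    def __init__(self, new_centroid_dist=1.0):
        # Hypothetical threshold: a labeled sample farther than this from
        # every centroid of its class starts a new centroid for that class.
        self.new_centroid_dist = new_centroid_dist
        self.centroids = {}  # label -> list of [mean_vector, sample_count]

    @staticmethod
    def _dist(a, b):
        # Euclidean distance between two equal-length feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def partial_fit(self, x, label):
        """Update the model online with one user-labeled feature vector."""
        entries = self.centroids.setdefault(label, [])
        if entries:
            nearest = min(entries, key=lambda e: self._dist(e[0], x))
            if self._dist(nearest[0], x) <= self.new_centroid_dist:
                # Running-mean update of the nearest centroid of this class.
                nearest[1] += 1
                n = nearest[1]
                nearest[0] = [m + (xi - m) / n for m, xi in zip(nearest[0], x)]
                return
        # First sample of the class, or too far from all existing centroids.
        entries.append([list(x), 1])

    def predict(self, x):
        """Return the label of the closest centroid, or None if untrained."""
        best_label, best_dist = None, math.inf
        for label, entries in self.centroids.items():
            for mean, _count in entries:
                d = self._dist(mean, x)
                if d < best_dist:
                    best_label, best_dist = label, d
        return best_label

    def predict_group(self, group):
        """Classify a whole cluster of similar images via its mean feature."""
        mean = [sum(col) / len(group) for col in zip(*group)]
        return self.predict(mean)


# Usage with made-up 2-D features; real image features would be much larger.
clf = MultiCentroidNCM(new_centroid_dist=1.0)
clf.partial_fit([0.0, 0.0], "beach")
clf.partial_fit([0.2, 0.0], "beach")   # close: merged into the first centroid
clf.partial_fit([5.0, 5.0], "beach")   # far: becomes a second "beach" centroid
clf.partial_fit([0.0, 5.0], "family")

print(len(clf.centroids["beach"]))
print(clf.predict([4.8, 5.1]))
print(clf.predict_group([[0.1, 4.9], [0.0, 5.2]]))
```

In this sketch, letting a class hold several centroids is what allows one user-defined category to cover visually distinct sub-groups, and each update touches only a single centroid, so learning and prediction can alternate without any batch retraining.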

Acknowledgements

This work is supported by the National High Technology Research and Development Program of China (No. 2007AA01Z334); the National Natural Science Foundation of China (No. 61321491, 61272219); the Innovation Fund of State Key Laboratory for Novel Software Technology (No. ZZKT2013A12, ZZKT2016A11); the Program for New Century Excellent Talents in University of China (No. NCET-04-04605); and the Nanjing University Innovation and Creative Program for PhD candidate (No. 2016013).

Author information

Corresponding author

Correspondence to Zhengxing Sun.

Cite this article

Hu, J., Sun, Z., Sun, Y. et al. Accumulative image categorization: a personal photo classification method for progressive collection. Multimed Tools Appl 77, 32179–32211 (2018). https://doi.org/10.1007/s11042-018-6152-9

  • DOI: https://doi.org/10.1007/s11042-018-6152-9
