A novel multimodal clustering framework for images with diverse associated text

Multimedia Tools and Applications

Abstract

With the enormous growth in the number of images on the web, image clustering has become an essential part of any image retrieval system. Since web images are often accompanied by related text or tags, both visual and textual features can be exploited to improve the precision of web image clustering. Existing clustering methods either use the two kinds of features separately in a specific order, or use them simultaneously but independently. In this work, we propose a new framework, Multimodal Hierarchical Clustering for Images (MHCI), which exploits the coexistence of visual and textual patterns to establish a relationship between them. We propose textual and visual weights to quantify the relationship established between images and their features. The proposed framework can be applied to a wide variety of image datasets with different characteristics, namely search results with noisy surrounding text and tagged images. It can also cluster image search queries and their corresponding clicked images. The corresponding datasets are image search results, Flickr (NUS-WIDE), and Clickture (Bing query log). We also demonstrate the framework's versatility on the Clickture dataset, which none of the previous approaches has examined. The experimental results show that MHCI significantly improves the quality of image clusters compared to existing methods.
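To make the setting concrete, the following is a minimal sketch of the kind of baseline the abstract contrasts MHCI with: visual and textual features are scored independently and then fused by a fixed weight before average-linkage agglomerative clustering. It is not the MHCI algorithm, whose weighting scheme is not described on this page. The function names, the `text_weight` parameter, and the toy feature vectors and tags are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """1 - cosine similarity of two equal-length visual feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def jaccard_distance(s, t):
    """1 - Jaccard similarity of two tag sets."""
    union = s | t
    return 1.0 - len(s & t) / len(union) if union else 0.0


def fused_distance(a, b, text_weight=0.5):
    """Fixed-weight late fusion; text_weight is a hypothetical knob,
    not the textual/visual weights proposed in the paper."""
    return (text_weight * jaccard_distance(a[1], b[1])
            + (1.0 - text_weight) * cosine_distance(a[0], b[0]))


def agglomerative(items, n_clusters, text_weight=0.5):
    """Average-linkage agglomerative clustering over (vector, tag_set) items.

    Returns a list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        total = sum(fused_distance(items[i], items[j], text_weight)
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair; combinations guarantees x < y,
        # so deleting index y leaves index x valid.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Toy data (made up): each image is (visual feature vector, tag set).
images = [
    ([0.9, 0.1, 0.0], {"beach", "sea"}),
    ([0.8, 0.2, 0.1], {"sea", "sand"}),
    ([0.1, 0.9, 0.2], {"forest", "tree"}),
    ([0.0, 0.8, 0.3], {"tree", "green"}),
]
result = agglomerative(images, n_clusters=2)
print(result)
```

In this sketch the two modalities never inform each other: a tag cannot change how a visual feature is scored, and vice versa. That independence is the limitation the abstract says MHCI addresses by relating co-occurring visual and textual patterns.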

Author information

Corresponding author

Correspondence to Chandramani Chaudhary.

About this article

Cite this article

Chaudhary, C., Goyal, P., Tuli, S. et al. A novel multimodal clustering framework for images with diverse associated text. Multimed Tools Appl 78, 17623–17652 (2019). https://doi.org/10.1007/s11042-018-7131-x

  • DOI: https://doi.org/10.1007/s11042-018-7131-x
