An effective and efficient parallel large-scale cross-media retrieval in mobile cloud network

Multimedia Tools and Applications

Abstract

With the rapid growth of multimedia data on the web (e.g., text, image, video, audio, and 3D models), multimedia documents such as webpages contain large numbers of media objects of different modalities that exhibit latent semantic correlations. Cross-media retrieval, a new type of multimedia retrieval, is becoming increasingly attractive: by submitting a query of any media type, users can obtain results of various media types that share the same semantic information. The explosive growth in the number of media objects, however, makes it difficult for a traditional standalone system to process such queries efficiently. We therefore adopt the powerful parallel processing capability of cloud computing to support efficient large-scale cross-media retrieval. In this paper, based on a Multi-Layer-Cross-Reference-Graph (MLCRG) model, we propose an efficient parallel cross-media retrieval (PCMR) method in which two enabling techniques, (1) an adaptive cross-media data allocation algorithm and (2) the PCIndex scheme, are employed to speed up retrieval. To the best of our knowledge, there is little research on parallel retrieval over large-scale cross-media databases in mobile cloud networks. Extensive experiments show that our proposed PCIndex method outperforms three competitors, namely PFAR [22], MBSR [42], and SPECH [40], in terms of both effectiveness and efficiency.

Notes

  1. Note that, for a media object Xi, its corresponding embedded correlation subspace is assumed to be a virtual one in which the correlated media objects of different modalities have the same semantics as Xi.

  2. In Fig. 11, the query object is the image Iq. The aim of PCMR here is to return the media objects of other modalities (e.g., audio and video) that are related to Iq.

References

  1. Beckmann N, Kriegel H-P, Schneider R, Seeger B (1990) The R*-tree: An efficient and robust access method for points and rectangles. In: Proc. ACM SIGMOD, pp 322–331

  2. Berchtold S, Keim DA, Kriegel HP (1996) The X-tree: An index structure for high-dimensional data. In: Proc. VLDB, pp 28–37

  3. Berchtold S, Böhm C, Kriegel H-P (1998) The pyramid technique: Towards breaking the curse of dimensionality. In: Proc. ACM SIGMOD

  4. Berchtold S, Böhm C, Kriegel HP, Sander J, Jagadish HV (2000) Independent quantization: An index compression technique for high-dimensional data spaces. In: Proc. ICDE, pp 577–588

  5. Böhm C, Berchtold S, Keim D (2001) Searching in High-dimensional Spaces: Index Structures for Improving the Performance of Multimedia Databases. ACM Comput Surv 33(3)

  6. Bozkaya T, Ozsoyoglu M (1997) Distance-based indexing for high-dimensional metric spaces. In: Proc. ACM SIGMOD, pp 357–368

  7. Chang SF, Chen W, Meng HJ, Sundaram H, Zhong D (1997) VideoQ: An automated content based video search system using visual cues. In: Proc. of ACM Multimedia, pp 313–324

  8. Chávez E, Navarro G, Baeza-Yates R, Marroquín J (2001) Searching in metric spaces. ACM Comput Surv 33(3):273–321

  9. Chen F, Shao J, Zhang Y, Xu X, Shen H (2021) Interclass-relativity-adaptive metric learning for cross-modal matching and beyond. IEEE Trans. on Multimedia 23:3073–3084

  10. Ciaccia P, Patella M, Zezula P (1997) M-tree: An efficient access method for similarity search in metric spaces. In: Proc. 23rd VLDB, pp 426–435

  11. Filho R, Traina A, Faloutsos C (2001) Similarity search without tears: The Omni family of all-purpose access methods. In: Proc. ICDE, pp 623–630

  12. Flickner M, Sawhney H, Niblack W et al (1995) Query by image and video content: The QBIC system. IEEE Computer 28(9):23–31

  13. Fonseca M, Jorge JA (2003) Indexing high-dimensional data for content-based retrieval in large databases. In: Proc. DASFAA, Kyoto, Japan, pp 267–274

  14. Frey B, Dueck D (2007) Clustering by passing messages between data points. Science 315(5814):972–976

  15. Guttman A (1984) R-trees: A dynamic index structure for spatial searching. In: Proc. ACM SIGMOD, pp 47–54

  16. Jagadish H, Ooi B, Tan K, Yu C, Zhang R (2005) iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Trans on Database Systems 30(2):364–397

  17. Jagadish H, Ooi B, Shen H, Tan K-L (2006) Towards efficient multi-feature query processing. IEEE Trans Knowl Data Eng 18(3):350–362

  18. Katayama N, Satoh S (1997) The SR-tree: An index structure for high-dimensional nearest neighbor queries. In: Proc. ACM SIGMOD, pp 32–42

  19. Li Z, Ling F, Xu C, Zhang C, Ma H (2021) Cross-media hash retrieval using multi-head attention network. In: Proc. 25th Int’l Conf. on Pattern Recognition (ICPR)

  20. Lin K, Jagadish H, Faloutsos C (1994) The TV-tree: An index structure for high-dimensional data. VLDB J

  21. Lu B, Wang GR, Yuan Y (2012) A novel approach towards large scale cross-media retrieval. J Comput Sci Technol 27:1140–1149

  22. Mao X, Lin B, Cai D, He X, Pei J (2013) Parallel field alignment for cross media retrieval. In: Proc. 21st ACM Multimedia

  23. McGurk H, MacDonald J (1976) Hearing lips and seeing voices. Nature 264:746–748

  24. Microsoft Encarta, http://encarta.msn.com/, 2006.

  25. Peng Y, Chi J (2020) Unsupervised cross-media retrieval using domain adaptation with scene graph. IEEE Trans. on Circuits and Systems for Video Technology 30(11)

  26. Peng Y, Huang X, Zhao Y (2017) An overview of cross-media retrieval: concepts, methodologies, benchmarks and challenges. IEEE Trans. on Circuits and Systems for Video Technology

  27. Rui Y, Huang T-S, Chang S-F (1999) Image Retrieval: Current Techniques, Promising Directions and Open Issues. J of Visual Communication and Image Representation 10:39–62

  28. Sakurai Y, Yoshikawa M, Uemura S, Kojima H (2000) The A-tree: An index structure for high-dimensional spaces using relative approximation. In: Proc. VLDB, pp 516–526

  29. Shen H, Zhou X, Cui B (2006) Indexing and Integrating Multiple Features for WWW images. World Wide Web J. 9(3):343–364

  30. Smith JR, Chang S-F (1996) VisualSEEK: a fully automated content-based image query system. In: Proc. of ACM Multimedia

  31. Smith J, Chang S (1997) Visually Searching the Web for Content. IEEE Multimedia Magazine 4(3):12–20

  32. Traina Jr C, Traina A, Seeger B, Faloutsos C (2000) Slim-trees: High Performance Metric Trees Minimizing Overlap Between Nodes. In: Proc. EDBT, Konstanz, Germany

  33. Weber R, Schek H, Blott S (1998) A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proc. VLDB, pp 194–205

  34. White DA, Jain R (1996) Similarity Indexing with the SS-tree. In: Proc. ICDE, pp 516–523

  35. Wu F, Zhang H, Zhuang Y-T (2006) Learning Semantic Correlations for Cross-Media Retrieval. In: Proc. of ICIP, pp 1465–1468

  36. Wu G, Han J, Lin Z et al (2019) Joint Image-Text Hashing for Fast Large-Scale Cross-Media Retrieval Using Self-Supervised Deep Learning. IEEE Trans on Industrial Electronics 66(12):9868–9877

  37. Xu X, Tian J, Lin K, Lu H, Shao J, Shen H (2021) Zero-Shot Cross-Modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network. ACM Trans on Multimedia Computing Communications and Applications, 17(1s): Article 3, 17 pages

  38. Yang J, Li Q, Zhuang Y-T (2002) Octopus: Aggressive Search of Multi-Modality Data Using Multifaceted Knowledge Base. In: Proc. of WWW, USA, pp 54–64

  39. Yang J, Li Q, Liu LW, Zhuang Y (2004) Searching for Flash Movies on the Web: a Content and Context Based Framework. World Wide Web J 8(4):495–517

  40. Yang F, Ding X, Liu Y et al (2022) Scalable semantic-enhanced supervised hashing for cross-modal retrieval. Knowl-Based Syst 251(5):1–13

  41. Zhai X, Peng Y, Xiao J (2013) Cross-media retrieval by intra-media and inter-media correlation mining. Multimedia Systems 19(5):395–406

  42. Zhao X, Zhang C, Zhang Z (2015) Distributed cross-media multiple binary subspace learning. Int’l J of Multimedia Information Retrieval 4(2):153–164

  43. Zhuang Y, Yang Y, Wu F (2008) Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-media Retrieval. IEEE Trans on Multimed 10(2):221–229

  44. Zhuang Y, Li Q, Chen L (2009) A Unified Indexing Structure for Efficient Cross-Media Retrieval. In: Proc. DASFAA’09, Brisbane, Australia

Acknowledgements

The authors would like to thank the editors and anonymous reviewers for their helpful comments. This work is partially supported by the Zhejiang Province Philosophy and Social Science Planning Project under Grant No. 23NDJC165YB; the Zhejiang Provincial Natural Science Foundation of China under Grant Nos. LGF19F020004, LY22F020010, LGF22H180039, and LTGY23F020002; and the Zhejiang Traditional Chinese Medicine Science and Technology Project under Grant No. 2023ZL119.

Author information

Corresponding author

Correspondence to Yi Zhuang.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflicts of interest related to this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jiang, N., Zhuang, Y. & Chiu, D.K. An effective and efficient parallel large-scale cross-media retrieval in mobile cloud network. Multimed Tools Appl 83, 13821–13850 (2024). https://doi.org/10.1007/s11042-023-16060-y

  • DOI: https://doi.org/10.1007/s11042-023-16060-y
