Abstract
With the rapid growth of multimedia data on the web (e.g., text, images, video, audio, and 3D models), multimedia documents such as webpages contain large numbers of media objects of different modalities that exhibit latent semantic correlations. As a new type of multimedia retrieval, cross-media retrieval is becoming increasingly attractive: by submitting a query of any media type, users can obtain results of various media types that carry the same semantic information. The explosive growth in the number of media objects, however, makes it difficult for the traditional standalone mode to process such queries efficiently. The parallel processing capability of cloud computing is therefore adopted to support efficient large-scale cross-media retrieval. In this paper, based on a Multi-Layer-Cross-Reference-Graph (MLCRG) model, we propose an efficient parallel cross-media retrieval (PCMR) method in which two enabling techniques, namely (1) an adaptive cross-media data allocation algorithm and (2) the PCIndex scheme, are adopted to speed up retrieval. To the best of our knowledge, there is little research on parallel retrieval processing over large-scale cross-media databases in the mobile cloud network. Extensive experiments show that the proposed PCIndex method outperforms three competitors, namely PFAR (Mao et al. 2013), MBSR (Zhao et al. 2015), and SPECH (Yang et al. 2022), in terms of both effectiveness and efficiency.
Notes
Note that, for a media object Xi, its corresponding embedded correlation subspace is assumed to be a virtual one in which the semantics of the correlated media objects of different modalities are the same as those of Xi.
In Fig. 11, the query object is the image Iq. The aim of PCMR is to return the media objects of other modalities (e.g., audio and video) that are related to Iq.
References
Beckmann N, Kriegel H-P, Schneider R, Seeger B (1990) The R*-tree: An efficient and robust access method for points and rectangles. In: Proc. ACM SIGMOD, pp 322–331
Berchtold S, Keim DA, Kriegel HP (1996) The X-tree: An index structure for high-dimensional data. In: Proc. VLDB, pp 28–37
Berchtold S, Böhm C, Kriegel H-P (1998) The pyramid technique: Towards breaking the curse of dimensionality. In: Proc. ACM SIGMOD
Berchtold S, Böhm C, Kriegel H-P, Sander J, Jagadish HV (2000) Independent quantization: An index compression technique for high-dimensional data spaces. In: Proc. ICDE, pp 577–588
Böhm C, Berchtold S, Keim D (2001) Searching in High-dimensional Spaces: Index Structures for Improving the Performance of Multimedia Databases. ACM Comput Surv 33(3)
Bozkaya T, Ozsoyoglu M (1997) Distance-based indexing for high-dimensional metric spaces. In: Proc. ACM SIGMOD, pp 357–368
Chang SF, Chen W, Meng HJ, Sundaram H, Zhong D (1997) VideoQ: An automated content based video search system using visual cues. In: Proc. of ACM Multimedia, pp 313–324
Chávez E, Navarro G, Baeza-Yates R, Marroquín J (2001) Searching in metric spaces. ACM Comput Surv 33(3):273–321
Chen F, Shao J, Zhang Y, Xu X, Shen H (2021) Interclass-relativity-adaptive metric learning for cross-modal matching and beyond. IEEE Trans. on Multimedia 23:3073–3084
Ciaccia P, Patella M, Zezula P (1997) M-trees: An efficient access method for similarity search in metric space. In: Proc. the 23rd VLDB, pp 426–435
Filho R, Traina A, Faloutsos C (2001) Similarity search without tears: The Omni family of all-purpose access methods. In: Proc. ICDE, pp 623–630
Flickner M, Sawhney H, Niblack W et al (1995) Query by image and video content: The QBIC system. IEEE Computer 28(9):23–31
Fonseca M, Jorge JA (2003) Indexing high-dimensional data for content-based retrieval in large databases. In: Proc. DASFAA, Kyoto, Japan, pp 267–274
Frey B, Dueck D (2007) Clustering by passing messages between data points. Science 315(5814):972–976
Guttman A (1984) R-trees: A dynamic index structure for spatial searching. In: Proc. ACM SIGMOD, pp 47–57
Jagadish H, Ooi B, Tan K, Yu C, Zhang R (2005) iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Trans on Data Base Systems 30(2):364–397
Jagadish H, Ooi B, Shen H, Tan K-L (2006) Towards efficient multi-feature query processing. IEEE Trans Knowl Data Eng 18(3):350–362
Katayama N, Satoh S (1997) The SR-tree: An index structure for high-dimensional nearest neighbor queries. In: Proc. ACM SIGMOD, pp 32–42
Li Z, Ling F, Xu C, Zhang C, Ma H (2021) Cross-media hash retrieval using multi-head attention network. In: Proc. 25th Int’l Conf. on Pattern Recognition (ICPR)
Lin K, Jagadish HV, Faloutsos C (1994) The TV-tree: An index structure for high-dimensional data. VLDB J
Lu B, Wang GR, Yuan Y (2012) A novel approach towards large scale cross-media retrieval. J Comput Sci Technol 27:1140–1149
Mao X, Lin B, Cai D, He X, Pei J (2013) Parallel field alignment for cross media retrieval. In: Proc. 21st ACM Multimedia
McGurk H, MacDonald J (1976) Hearing lips and seeing voices. Nature 264:746–748
Microsoft Encarta, http://encarta.msn.com/, 2006.
Peng Y, Chi J (2020) Unsupervised cross-media retrieval using domain adaptation with scene graph. IEEE Trans. on Circuits and Systems for Video Technology 30(11)
Peng Y, Huang X, Zhao Y (2017) An overview of cross-media retrieval: concepts, methodologies, benchmarks and challenges. IEEE Trans. on Circuits and Systems for Video Technology
Rui Y, Huang T-S, Chang S-F (1999) Image Retrieval: Current Techniques, Promising Directions and Open Issues. J Vis Commun Image Represent 10:39–62
Sakurai Y, Yoshikawa M, Uemura S, Kojima H (2000) The A-tree: An index structure for high-dimensional spaces using relative approximation. In: Proc. VLDB, pp 516–526
Shen H, Zhou X, Cui B (2006) Indexing and Integrating Multiple Features for WWW images. World Wide Web J. 9(3):343–364
Smith JR, Chang S-F (1996) VisualSEEK: a fully automated content-based image query system. In: Proc. of ACM Multimedia
Smith J, Chang S (1997) Visually Searching the Web for Content. IEEE Multimedia Magazine 4(3):12–20
Traina Jr C, Traina A, Seeger B, Faloutsos C (2000) Slim-trees: High Performance Metric Trees Minimizing Overlap Between Nodes. In: Proc. EDBT, Konstanz, Germany
Weber R, Schek H, Blott S (1998) A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proc. VLDB, pp 194–205
White DA, Jain R (1996) Similarity Indexing with the SS-tree. In: Proc. ICDE, pp 516–523
Wu F, Zhang H, Zhuang Y-T (2006) Learning Semantic Correlations for Cross-Media Retrieval. In: Proc. of ICIP, pp 1465–1468
Wu G, Han J, Lin Z et al (2019) Joint Image-Text Hashing for Fast Large-Scale Cross-Media Retrieval Using Self-Supervised Deep Learning. IEEE Trans on Industrial Electronics 66(12):9868–9877
Xu X, Tian J, Lin K, Lu H, Shao J, Shen H (2021) Zero-Shot Cross-Modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network. ACM Trans on Multimedia Computing Communications and Applications, 17(1s): Article 3, 17 pages
Yang J, Li Q, Zhuang Y-T (2002) Octopus: Aggressive Search of Multi-Modality Data Using Multifaceted Knowledge Base. In: Proc. of WWW, USA, pp 54–64
Yang J, Li Q, Liu LW, Zhuang Y (2004) Searching for Flash Movies on the Web: a Content and Context Based Framework. World Wide Web J 8(4):495–517
Yang F, Ding X, Liu Y et al (2022) Scalable semantic-enhanced supervised hashing for cross-modal retrieval. Knowl-Based Syst 251(5):1–13
Zhai X, Peng Y, Xiao J (2013) Cross-media retrieval by intra-media and inter-media correlation mining. Multimedia Systems 19(5):395–406
Zhao X, Zhang C, Zhang Z (2015) Distributed cross-media multiple binary subspace learning. Int’l J of Multimedia Information Retrieval 4(2):153–164
Zhuang Y, Yang Y, Wu F (2008) Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-media Retrieval. IEEE Trans on Multimed 10(2):221–229
Zhuang Y, Li Q, Chen L (2009) A Unified Indexing Structure for Efficient Cross-Media Retrieval. In: Proc. DASFAA’09, Brisbane, Australia
Acknowledgements
The authors would like to thank the editors and anonymous reviewers for their helpful comments. This work is partially supported by the Zhejiang Province Philosophy and Social Science Planning Project under Grant No. 23NDJC165YB; the Zhejiang Provincial Natural Science Foundation of China under Grant Nos. LGF19F020004, LY22F020010, LGF22H180039 and LTGY23F020002; and the Zhejiang Traditional Chinese Medicine Science and Technology Project under Grant No. 2023ZL119.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflicts of interest related to this work.
About this article
Cite this article
Jiang, N., Zhuang, Y. & Chiu, D.K. An effective and efficient parallel large-scale cross-media retrieval in mobile cloud network. Multimed Tools Appl 83, 13821–13850 (2024). https://doi.org/10.1007/s11042-023-16060-y
DOI: https://doi.org/10.1007/s11042-023-16060-y