Abstract
With the rapid development of location-based social networks (LBSNs), users continuously upload ever more media data. The asynchrony between visual and textual information makes it extremely difficult to manage this multimodal information for retrieval and personalized recommendation without manual annotation. Consequently, automated discovery of image semantics in location-related user-generated content (UGC) has become essential for the user experience. Most existing work leverages single-modality data or correlated multimedia data for image semantic detection. However, the intrinsically heterogeneous UGC in LBSNs is usually independent and uncorrelated, so it is hard to build correlations between textual and visual information. In this paper, we propose a cross-domain semantic modeling method for automatic annotation of images from social network platforms. First, we extract a set of hot topics from the collected textual information to prepare the image dataset. Then, the proposed noisy sample filtering is applied to remove low-relevance photos. Finally, we leverage cross-domain datasets to discover the common knowledge of each semantic concept from UGC and boost semantic annotation performance through semantic transfer. Comparison experiments on cross-domain datasets demonstrate the superiority of the proposed method.
Acknowledgments
I would like to express my deep gratitude to Prof. Chua and the NeXT group at the National University of Singapore. This work was supported in part by the National Natural Science Foundation of China (61100124, 21106095, 61170239, and 61202168), a grant from the Elite Scholar Program of Tianjin University, a grant for Introducing Talents to Tianjin Normal University (5RL123), and a grant from the Introduction of One Thousand High-level Talents in Three Years in Tianjin program.
Cite this article
Nie, W., Liu, A. & Su, Y. Cross-domain semantic transfer from large-scale social media. Multimedia Systems 22, 75–85 (2016). https://doi.org/10.1007/s00530-014-0394-9
DOI: https://doi.org/10.1007/s00530-014-0394-9