Abstract
Collaborative image tagging systems, such as Flickr, are attractive for supporting keyword-based image retrieval, but some user-provided tags of collaboratively tagged social images may be imprecise. To save time and effort, some people tag their images with general or high-level words (i.e., abstract tags), but such tags are too abstract to describe the visual content of social images precisely. As a result, users may fail to find what they need when they specify queries with specific keywords. To tackle the problem of abstract tags, an ontology with three-level semantics is constructed for detecting candidate abstract tags in large-scale social image collections. The image context (nearest neighbors) and tag context (most relevant tags) of social images with abstract tags are then used to confirm whether these candidates are indeed abstract and to identify the specific tags that further describe the images carrying abstract tags. In addition, all relevant tags that correspond to intermediate nodes between the abstract tags and the specific tags in our concept ontology are added to enrich the tags of social images, so that users have more keywords to choose from for query specification. We have tested the proposed algorithms on two types of data sets (revised standard data sets and a self-constructed data set) and compared our approach with other approaches.
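The refinement and enrichment steps described above can be illustrated with a minimal sketch. It is not the algorithm evaluated in the article: the toy ontology `PARENT`, the helper names `ancestors` and `refine`, and the simple majority vote over neighbor tags are all assumptions made for illustration, and the nearest neighbors are assumed to have been retrieved already.

```python
from collections import Counter

# Hypothetical toy ontology (child -> parent); illustrative only,
# not the three-level ontology built in the article.
PARENT = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "husky": "dog",
    "poodle": "dog",
    "siamese": "cat",
}


def ancestors(tag):
    """Return the ancestors of `tag` in the toy ontology, nearest first."""
    chain = []
    node = PARENT.get(tag)
    while node is not None:
        chain.append(node)
        node = PARENT.get(node)
    return chain


def refine(abstract_tag, neighbor_tag_lists):
    """Pick a more specific tag for an image carrying `abstract_tag`.

    Each neighbor casts one vote per distinct tag that lies below
    `abstract_tag` in the ontology (an assumed stand-in for the image
    and tag context). Returns (specific_tag, intermediate_tags), or
    (None, []) when no neighbor tag lies below the abstract tag.
    """
    votes = Counter(
        tag
        for tags in neighbor_tag_lists
        for tag in set(tags)
        if tag in PARENT and abstract_tag in ancestors(tag)
    )
    if not votes:
        return None, []
    specific = votes.most_common(1)[0][0]
    chain = ancestors(specific)
    # Nodes strictly between the specific tag and the abstract tag
    # are the enrichment tags.
    intermediate = chain[: chain.index(abstract_tag)]
    return specific, intermediate


# Tags of three assumed nearest neighbors of an image tagged "animal".
neighbors = [["husky", "snow"], ["husky", "dog"], ["poodle"]]
specific, extra = refine("animal", neighbors)
```

In this toy case the image tagged "animal" would be refined to "husky" and enriched with the intermediate tag "dog". A fuller version would presumably weight votes by visual similarity and tag relevance rather than counting them equally.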
Acknowledgments
This work is partly supported by the Doctorate Foundation of Northwestern Polytechnical University (No. CX201113), the Doctoral Program of Higher Education of China (Grant Nos. 20106102110028 and 20116102110027), and the National Science Foundation of China (Grant Nos. 61075014 and 61272285).
Cite this article
Xia, Z., Peng, J., Feng, X. et al. Automatic Abstract Tag Detection for Social Image Tag Refinement and Enrichment. J Sign Process Syst 74, 5–18 (2014). https://doi.org/10.1007/s11265-013-0756-0
DOI: https://doi.org/10.1007/s11265-013-0756-0