Interactive object-based image retrieval and annotation on iPad

Han, Junwei; Xu, Ming; Li, Xin; Guo, Lei; Liu, Tianming

doi:10.1007/s11042-013-1509-6

Published: 11 June 2013

Multimedia Tools and Applications, volume 72, pages 2275–2297 (2014)

Junwei Han¹, Ming Xu¹, Xin Li², Lei Guo¹ & Tianming Liu²

Abstract

The Apple iPad is a portable tablet computer that offers users a general-purpose platform for consumer media, including games, books, and movies. Although the iPad is gaining popularity quickly, its application to content-based image retrieval and annotation is still in its infancy. This paper develops an interactive system for efficiently retrieving and annotating image objects on the iPad. The system consists of two components: a front-end GUI (graphical user interface) and a back-end retrieval model. For the first component, we implement an iPad-based GUI that gives users an efficient way to select query objects and that facilitates annotation. For the second component, we propose an object-based image retrieval algorithm that combines a novel feature descriptor based on context-preserving bags-of-words (BoW) with a two-stage re-ranking technique to measure the similarity between the query image and each image in the database. The retrieval results are returned and visualized on the iPad-based GUI, and annotations provided by users can be propagated among the retrieved images. The front-end GUI and the back-end module communicate over a wireless network. Comprehensive experiments on several benchmark datasets demonstrate the effectiveness of the proposed framework.
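
The abstract does not spell out the context-preserving descriptor or the two re-ranking stages, so the following is only a generic sketch of the broad idea: rank database images by a plain bag-of-words similarity, then re-order a short list with a second score. Everything here is an assumption made for illustration, including the function names, the use of cosine similarity, the `shortlist` parameter, and the shared-word ratio used as the second-stage score. It should not be read as the authors' algorithm.

```python
import math
from collections import Counter


def bow_histogram(visual_words, vocab_size):
    """Build an L2-normalised histogram of visual-word ids (plain BoW)."""
    counts = Counter(visual_words)
    hist = [float(counts.get(w, 0)) for w in range(vocab_size)]
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def cosine(a, b):
    """Cosine similarity of two already-normalised histograms."""
    return sum(x * y for x, y in zip(a, b))


def shared_word_ratio(query_words, image_words):
    """Hypothetical second-stage score: the fraction of distinct
    query words that also occur in the candidate image."""
    q = set(query_words)
    return len(q & set(image_words)) / len(q) if q else 0.0


def retrieve(query_words, database, vocab_size, shortlist=2):
    """Rank image names in `database` (name -> list of visual-word ids)."""
    q_hist = bow_histogram(query_words, vocab_size)
    # First pass: order every image by BoW cosine similarity.
    ranked = sorted(
        database,
        key=lambda name: cosine(q_hist, bow_histogram(database[name], vocab_size)),
        reverse=True,
    )
    head, tail = ranked[:shortlist], ranked[shortlist:]
    # Second pass: re-order only the shortlist with the second score.
    head = sorted(
        head,
        key=lambda name: shared_word_ratio(query_words, database[name]),
        reverse=True,
    )
    return head + tail


if __name__ == "__main__":
    db = {"x": [0, 0, 0, 0], "y": [0, 1, 2, 2]}
    print(retrieve([0, 0, 0, 1], db, vocab_size=4, shortlist=2))
```

In this toy data the first pass prefers `x`, whose histogram is dominated by the query's most frequent word, while the second pass promotes `y` because it contains both distinct query words. Restricting the second pass to a shortlist keeps the more expensive check off most of the database, which is the usual motivation for re-ranking schemes of this kind.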

Acknowledgments

This work was supported by the National Science Foundation of China under Grants 61005018 and 91120005, by NPU-FFR-JC20120237, and by the Program for New Century Excellent Talents in University under Grant NCET-10-0079.

Author information

Authors and Affiliations

School of Automation, Northwestern Polytechnic University, Xi’an, 710072, China
Junwei Han, Ming Xu & Lei Guo
Department of Computer Science, The University of Georgia, Athens, GA, USA
Xin Li & Tianming Liu

Corresponding authors

Correspondence to Junwei Han or Tianming Liu.

About this article

Cite this article

Han, J., Xu, M., Li, X. et al. Interactive object-based image retrieval and annotation on iPad. Multimed Tools Appl 72, 2275–2297 (2014). https://doi.org/10.1007/s11042-013-1509-6

Issue Date: October 2014
DOI: https://doi.org/10.1007/s11042-013-1509-6
