Abstract
This paper presents the development of an Augmented Reality mobile application that aims to raise young children's awareness of abstract concepts of music, such as musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As document image acquisition and processing matures on mobile platforms, we explore how to build a markerless, real-time application that augments physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to SIFT local descriptors, in terms of result quality and computational efficiency, both for document model identification and for perspective transform estimation. All experiments are performed on an original, public dataset that we introduce here.
Notes
Simply avoiding OpenCV's Java wrapper and implementing the pipeline in C++ would already yield a significant speedup.
Acknowledgements
This work was supported by the Spanish project TIN2014-52072-P, by the People Programme (Marie Curie Actions) of the Seventh Framework Programme of the European Union (FP7/2007-2013) under REA grant agreement no. 600388, by the Agency of Competitiveness for Companies of the Government of Catalonia, ACCIO, by the CERCA Programme / Generalitat de Catalunya, and by the MOBIDEM project, part of the “Systematic Paris-Region” and “Images & Network” Clusters, funded by the French Government and its economic development agencies. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research.
Cite this article
Rusiñol, M., Chazalon, J. & Diaz-Chito, K. Augmented songbook: an augmented reality educational application for raising music awareness. Multimed Tools Appl 77, 13773–13798 (2018). https://doi.org/10.1007/s11042-017-4991-4
DOI: https://doi.org/10.1007/s11042-017-4991-4