Abstract
One of the latest developments by publishing companies is the introduction of mixed and augmented reality into their printed media (e.g. to produce augmented books). An important computer vision problem they face is the classification of book pages from video frames. The problem is non-trivial: typical training data is limited to a single digital original per book page, yet the trained classifier must be suitable for real-time use on mobile devices, where the camera can be exposed to highly diverse conditions and computing resources are limited. In this paper we address this problem by proposing an automated classifier development process that allows training classification models that run in real time, with high usability, on low-end mobile devices and achieve an average accuracy of 88.95% on our in-house test set of over 20 000 frames from real videos of 5 children's books. At the same time, deployment tests reveal that the classifier development time is reduced approximately 16-fold.
Acknowledgements
This work has been partially supported by Gdańskie Wydawnictwo Oświatowe and Statutory Funds of Electronics, Telecommunications and Informatics Faculty, Gdansk University of Technology.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Brzeski, A. et al. (2020). Automated Classifier Development Process for Recognizing Book Pages from Video Frames. In: Bellatreche, L., et al. ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium. TPDL ADBIS 2020. Communications in Computer and Information Science, vol 1260. Springer, Cham. https://doi.org/10.1007/978-3-030-55814-7_14
DOI: https://doi.org/10.1007/978-3-030-55814-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55813-0
Online ISBN: 978-3-030-55814-7
eBook Packages: Computer Science (R0)