Automated Classifier Development Process for Recognizing Book Pages from Video Frames

  • Conference paper
ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium (TPDL 2020, ADBIS 2020)

Abstract

One of the latest developments by publishing companies is the introduction of mixed and augmented reality into their printed media (e.g. to produce augmented books). An important computer vision problem they face is the classification of book pages from video frames. The problem is non-trivial, especially since typical training data is limited to a single digital original per book page, while the trained classifier must be suitable for real-time use on mobile devices, where the camera can be exposed to highly diverse conditions and computing resources are limited. In this paper we address this problem by proposing an automated classifier development process that allows training classification models that run in real time, with high usability, on low-end mobile devices and achieve an average accuracy of 88.95% on our in-house test set of over 20 000 frames from real videos of 5 children's books. At the same time, deployment tests reveal that the classifier development time is reduced approximately 16-fold.

Acknowledgements

This work has been partially supported by Gdańskie Wydawnictwo Oświatowe and the Statutory Funds of the Electronics, Telecommunications and Informatics Faculty, Gdansk University of Technology.

Author information

Corresponding author

Correspondence to Tomasz Dziubich.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Brzeski, A. et al. (2020). Automated Classifier Development Process for Recognizing Book Pages from Video Frames. In: Bellatreche, L., et al. ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium. TPDL 2020, ADBIS 2020. Communications in Computer and Information Science, vol 1260. Springer, Cham. https://doi.org/10.1007/978-3-030-55814-7_14

  • DOI: https://doi.org/10.1007/978-3-030-55814-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55813-0

  • Online ISBN: 978-3-030-55814-7

  • eBook Packages: Computer Science, Computer Science (R0)
