Automated Classifier Development Process for Recognizing Book Pages from Video Frames

Brzeski, Adam; Cychnerski, Jan; Draszawka, Karol; Dziubich, Krystyna; Dziubich, Tomasz; Korłub, Waldemar; Rościszewski, Paweł

doi:10.1007/978-3-030-55814-7_14

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1260)

Abstract

One of the latest developments by publishing companies is the introduction of mixed and augmented reality into their printed media (e.g. to produce augmented books). An important computer vision problem they face is the classification of book pages from video frames. The problem is non-trivial, especially because typical training data is limited to a single digital original per book page, while the trained classifier must run in real time on mobile devices, where the camera can be exposed to highly diverse conditions and computing resources are limited. In this paper we address this problem by proposing an automated classifier development process that allows training classification models that run in real time, with high usability, on low-end mobile devices and achieve an average accuracy of 88.95% on our in-house test set consisting of over 20 000 frames from real videos of 5 children's books. At the same time, deployment tests reveal that the classifier development process time is reduced approximately 16-fold.
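
The abstract notes that only one digital original is available per page, but it does not say how the training set is built from it. One common way to work with so little data is to generate synthetic variants of each original. The sketch below is an illustrative assumption, not the authors' pipeline: the function names `augment` and `build_training_set`, the image representation (a grayscale page as a list of rows with values 0..255) and all parameter values are hypothetical.

```python
import random


def augment(page, rng, max_shift=2, brightness=0.3, noise=10):
    """Return one synthetic variant of a grayscale page (hypothetical helper).

    Applies a random translation, a random brightness gain and per-pixel
    noise, loosely imitating a hand-held camera under varying lighting.
    """
    height, width = len(page), len(page[0])
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    gain = 1.0 + rng.uniform(-brightness, brightness)
    variant = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clamp source coordinates so shifted pixels replicate the edge.
            sy = min(max(y + dy, 0), height - 1)
            sx = min(max(x + dx, 0), width - 1)
            value = page[sy][sx] * gain + rng.uniform(-noise, noise)
            # Keep the result inside the valid 0..255 intensity range.
            row.append(int(min(max(value, 0), 255)))
        variant.append(row)
    return variant


def build_training_set(originals, variants_per_page, seed=0):
    """Expand {page_label: image} into a list of (image, page_label) samples."""
    rng = random.Random(seed)  # seeded so the generated set is reproducible
    samples = []
    for label, page in originals.items():
        for _ in range(variants_per_page):
            samples.append((augment(page, rng), label))
    return samples


if __name__ == "__main__":
    # Two tiny made-up "pages" stand in for the digital originals.
    originals = {
        "page_1": [[0, 128, 255], [64, 200, 32]],
        "page_2": [[10, 20, 30], [40, 50, 60]],
    }
    training_set = build_training_set(originals, variants_per_page=4)
    print(len(training_set), "samples generated")
```

A real pipeline would use an image library and richer distortions such as perspective warps, blur and occlusion. The structure would stay the same: each page label is paired with many perturbed copies of its single original.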

Acknowledgements

This work has been partially supported by Gdańskie Wydawnictwo Oświatowe and by the Statutory Funds of the Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology.

Author information

Authors and Affiliations

Computer Vision & Artificial Intelligence Laboratory, Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
Adam Brzeski, Jan Cychnerski, Karol Draszawka, Krystyna Dziubich, Tomasz Dziubich, Waldemar Korłub & Paweł Rościszewski

Corresponding author

Correspondence to Tomasz Dziubich.

Editor information

Editors and Affiliations

ISAE-ENSMA, Poitiers, France
Ladjel Bellatreche
Slovak University of Technology, Bratislava, Slovakia
Mária Bieliková
Université Lumière Lyon 2, Lyon, France
Omar Boussaïd
University of Genova, Genova, Italy
Barbara Catania
Université Lumière Lyon 2, Lyon, France
Jérôme Darmont
Leibniz University of Hannover, Hannover, Niedersachsen, Germany
Elena Demidova
Université Claude Bernard Lyon 1, Lyon, France
Fabien Duchateau
The Open University, Milton Keynes, UK
Mark Hall
University of Ljubljana, Ljubljana, Slovenia
Tanja Merčun
National Research University Higher School of Economics, St. Petersburg, Russia
Boris Novikov
Ionian University, Corfu, Greece
Christos Papatheodorou
Goethe University Frankfurt, Frankfurt am Main, Hessen, Germany
Thomas Risse
Universitat Politècnica de Catalunya, Barcelona, Spain
Oscar Romero
AgroParisTech, Montpellier, France
Lucile Sautot
University of Lyon, Lyon, France
Guilaine Talens
Poznań University of Technology, Poznań, Poland
Robert Wrembel
University of Ljubljana, Ljubljana, Slovenia
Maja Žumer

About this paper

Cite this paper

Brzeski, A. et al. (2020). Automated Classifier Development Process for Recognizing Book Pages from Video Frames. In: Bellatreche, L., et al. ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium. TPDL ADBIS 2020. Communications in Computer and Information Science, vol 1260. Springer, Cham. https://doi.org/10.1007/978-3-030-55814-7_14

DOI: https://doi.org/10.1007/978-3-030-55814-7_14
Published: 18 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55813-0
Online ISBN: 978-3-030-55814-7
eBook Packages: Computer Science, Computer Science (R0)
