DOI:10.1145/2072298.2071930
short-paper

Combining image and text features: a hybrid approach to mobile book spine recognition

Published: 28 November 2011

Abstract

Despite the successful use of local image features for large-scale object recognition, they are not effective at recognizing book spines on bookshelves. This is because some book spines contain only text components, which do not yield distinguishing image features. To overcome this issue, we develop a new approach that combines a text-based spine recognition pipeline with an image feature-based spine recognition pipeline. The text within the book spine image is recognized and used as keywords to search a book spine text database. The image features of the book spine image are searched against a book spine image database. The search results of the two approaches are then carefully combined to form the final result. We implement the proposed hybrid book recognition pipeline in a book inventory management system and conduct extensive experiments to evaluate its performance. The experimental results show that while text-based or image feature-based systems alone achieve a recall of only 72%, the proposed hybrid system achieves a recall of ~91%.
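
The abstract does not specify how the two result lists are combined. As a hedged illustration only, the sketch below merges a text-search ranking and an image-search ranking with reciprocal rank fusion, a generic rank-merging technique that is swapped in here and is not claimed to be the authors' method. The function name `fuse_rankings`, the constant `k=60`, and the book IDs are all hypothetical.

```python
from collections import defaultdict


def fuse_rankings(text_hits, image_hits, k=60):
    """Merge two ranked lists of book IDs (best first) by reciprocal rank fusion.

    Assumption: this is a stand-in for the paper's unspecified combination step.
    """
    scores = defaultdict(float)
    for hits in (text_hits, image_hits):
        for rank, book_id in enumerate(hits, start=1):
            # A book ranked highly by either pipeline gets a larger contribution.
            scores[book_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda b: (-scores[b], b))


# Hypothetical results from the text-based and image feature-based searches.
text_hits = ["book_17", "book_03", "book_42"]
image_hits = ["book_03", "book_99", "book_17"]
fused = fuse_rankings(text_hits, image_hits)
print(fused)
```

Books returned by both pipelines rise to the top of the fused list, while a book found by only one pipeline is still kept as a candidate, which is the general behaviour a hybrid system relies on when a spine has only text or only distinctive imagery.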

    Published In

    MM '11: Proceedings of the 19th ACM international conference on Multimedia
    November 2011
    944 pages
    ISBN:9781450306164
    DOI:10.1145/2072298

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. image retrieval
    2. text recognition
    3. visual search

    Qualifiers

    • Short-paper

    Conference

    MM '11: ACM Multimedia Conference
    November 28 - December 1, 2011
    Scottsdale, Arizona, USA

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Cited By

    • (2023) Libraries of Things: Understanding the Challenges of Sharing Tangible Collections and the Opportunities for HCI. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-18. DOI:10.1145/3544548.3581094. Online publication date: 19-Apr-2023
    • (2022) An RFID and Computer Vision Fusion System for Book Inventory using Mobile Robot. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications, pp. 1239-1248. DOI:10.1109/INFOCOM48880.2022.9796711. Online publication date: 2-May-2022
    • (2022) Library on-shelf book segmentation and recognition based on deep visual features. Information Processing and Management: an International Journal, 59:6. DOI:10.1016/j.ipm.2022.103101. Online publication date: 1-Nov-2022
    • (2020) Book spine recognition with the use of deep neural networks. Computer Optics, 44:6, pp. 968-977. DOI:10.18287/2412-6179-CO-731. Online publication date: Dec-2020
    • (2020) An Event-Based Framework for Virtual Libraries. 2020 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 1322-1327. DOI:10.1109/CSCI51800.2020.00247. Online publication date: Dec-2020
    • (2019) Text Detection on Books Using CNN Trained with Another Domain Data. 2019 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 170-176. DOI:10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00041. Online publication date: Aug-2019
    • (2018) A Ubiquitous Approach for Automated Library Book Location Management. Proceedings of the 2018 International Conference on Computing and Big Data, pp. 78-82. DOI:10.1145/3277104.3277115. Online publication date: 8-Sep-2018
    • (2018) Recognizing Call Numbers for Library Books: The Problem and Issues. 2018 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 386-391. DOI:10.1109/CSCI46756.2018.00079. Online publication date: Dec-2018
    • (2017) Smart library. Proceedings of the 17th ACM/IEEE Joint Conference on Digital Libraries, pp. 245-248. DOI:10.5555/3200334.3200363. Online publication date: 19-Jun-2017
    • (2017) Con-Text: Text Detection for Fine-Grained Object Classification. IEEE Transactions on Image Processing, 26:8, pp. 3965-3980. DOI:10.1109/TIP.2017.2707805. Online publication date: Aug-2017
