Rule-Based Page Segmentation for Palm Leaf Manuscript on Color Image

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10075)

Abstract

Palm leaf manuscripts are an important source of history and ancient wisdom. A large number of manuscripts have already been digitized in the form of folio images. To extract useful information from them, optical character recognition (OCR) is often considered the first step towards text mining. Unfortunately, a folio image contains multiple unsegmented palm leaf images, which makes it difficult to handle in the OCR process. This motivates us to propose a new page segmentation method for palm leaf manuscripts. The method consists of two main steps. The first step detects objects in folio images using the Connected Component Labeling method in a transformed L*a*b* color space. The second step is a rule-based selection that classifies each object as either palm leaf or not palm leaf. Experiments performed on 20 publicly available palm leaf manuscripts, comprising 384 folio images, demonstrated that the proposed method effectively segmented folio images into separate palm leaf images, with 99.86% precision and 96.67% recall.
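The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a binary foreground mask has already been obtained, for example by thresholding a channel of the L*a*b* image, and it uses hypothetical rules (minimum area, width-to-height ratio, and bounding-box fill ratio as a crude rectangularity measure) with illustrative thresholds that are not taken from the paper. The function names `label_components`, `is_leaf`, and `segment_leaves` are likewise invented for this example.

```python
from collections import deque


def label_components(mask):
    """Step 1 (sketch): 4-connected component labeling by breadth-first search.

    `mask` is a list of rows of 0/1 values. Returns one pixel list per
    component, in row-major order of each component's first pixel.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right), inclusive."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def is_leaf(pixels, min_area=20, min_aspect=3.0, min_fill=0.8):
    """Step 2 (sketch): hypothetical rules, not the rules used in the paper.

    A component is kept if it is large enough, much wider than tall,
    and fills most of its axis-aligned bounding box.
    """
    top, left, bottom, right = bounding_box(pixels)
    height = bottom - top + 1
    width = right - left + 1
    area = len(pixels)
    return (
        area >= min_area
        and width / height >= min_aspect
        and area / (width * height) >= min_fill
    )


def segment_leaves(mask):
    """Return bounding boxes of the components that pass the leaf rules."""
    return [bounding_box(p) for p in label_components(mask) if is_leaf(p)]
```

Given a mask that contains two long horizontal strips and one small speck, `segment_leaves` would return the bounding boxes of the two strips and discard the speck. Each box could then be used to crop one palm leaf image from the folio.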

Acknowledgments

This study was funded under the Royal Golden Jubilee Ph.D. Program by the Thailand Research Fund. We would like to thank the Faculty of Science and Lan na Studies, Chiang Mai University, Thailand, for financial support and for the collection of digital Lanna archives, respectively.

Corresponding author

Correspondence to Jeerayut Chaijaruwanich.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Inkeaw, P., Bootkrajang, J., Charoenkwan, P., Marukatat, S., Ho, SY., Chaijaruwanich, J. (2016). Rule-Based Page Segmentation for Palm Leaf Manuscript on Color Image. In: Morishima, A., Rauber, A., Liew, C. (eds) Digital Libraries: Knowledge, Information, and Data in an Open Access Society. ICADL 2016. Lecture Notes in Computer Science, vol 10075. Springer, Cham. https://doi.org/10.1007/978-3-319-49304-6_16

  • DOI: https://doi.org/10.1007/978-3-319-49304-6_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49303-9

  • Online ISBN: 978-3-319-49304-6

  • eBook Packages: Computer Science, Computer Science (R0)
