A Tesseract-based OCR framework for historical documents lacking ground-truth text (IEEE Conference Publication)