Abstract
The need to digitize printed documents for easier storage and retrieval of information is growing, particularly in the medical and healthcare industries. In our previous work, we presented a method for extracting prescriptions from images using CRAFT and Tesseract so that patients could quickly save and review their medication information. However, its slow processing speed and limited set of recognized medication names made it impractical. Building on that model structure, we introduce a new system that uses bounding-box clustering heuristics to detect the relevant text areas and then applies the VietOCR tool to recognize the text in prescription images. In addition, we develop a fast and accurate technique for extracting prescription information based on word embeddings and a vector search algorithm. Experimental results show that the proposed model significantly reduces the error of the retrieved data on two standard measures, word error rate (WER) and character error rate (CER), with CER lowered to 26.95. Furthermore, the execution time decreases from 17.81 s to an average of 3.64 s, a substantial improvement over the prior system.
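For readers unfamiliar with the two error measures named in the abstract, the following is a minimal sketch of how WER and CER are conventionally computed: the Levenshtein edit distance between the recognized text and the ground truth, divided by the length of the ground truth, counted in words or characters respectively. The function names and the sample strings are illustrative assumptions, as is the choice to report values as percentages; this is not the authors' evaluation script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate, as a percentage of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate, as a percentage of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: the OCR output confuses the final "l" with "1".
truth = "Paracetamol 500mg"
ocr_output = "Paracetamo1 500mg"
print(f"CER = {cer(truth, ocr_output):.2f}")
print(f"WER = {wer(truth, ocr_output):.2f}")
```

A single misread character changes only one of the 17 reference characters but invalidates one of the two reference words, which is why WER is typically much higher than CER on the same output.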
Acknowledgements
This research is funded by the University of Science, VNU-HCM, Vietnam under grant number CNTT 2021-12 and Advanced Program in Computer Science.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, NT., Vo, H., Tran, K., Ha, D., Nguyen, D., Le, T. (2022). Medical Prescription Recognition Using Heuristic Clustering and Similarity Search. In: Nguyen, N.T., Manolopoulos, Y., Chbeir, R., Kozierkiewicz, A., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2022. Lecture Notes in Computer Science, vol 13501. Springer, Cham. https://doi.org/10.1007/978-3-031-16014-1_60
DOI: https://doi.org/10.1007/978-3-031-16014-1_60
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16013-4
Online ISBN: 978-3-031-16014-1
eBook Packages: Computer Science, Computer Science (R0)