Abstract
Flowcharts are integral visual aids that capture both logical flow and component-level information in a form humans interpret easily. Automated parsing of these diagrams, however, remains challenging because of their intricate logical structure and text-rich content. In this paper, we introduce GenFlowchart, a novel framework that employs generative AI to enhance the parsing and understanding of flowcharts. First, the Segment Anything Model (SAM) is used to delineate the components and geometric shapes within the flowchart. Second, Optical Character Recognition (OCR) extracts the text residing in each component for deeper functional comprehension. Finally, we use prompt engineering to formulate prompts that let the generative AI integrate the segmentation results with the extracted text, thereby reconstructing the flowchart’s workflows. To validate the effectiveness of GenFlowchart, we evaluate its performance across multiple flowcharts and benchmark it against several baseline approaches. GenFlowchart is available at https://github.com/ResponsibleAILab/GenFlowchart.
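The final step of the pipeline, combining segmented regions with their extracted text into a prompt, can be illustrated with a minimal sketch. This is not the GenFlowchart implementation: the `Component` class, the `reading_order` heuristic, and the prompt wording are all hypothetical, and the segmentation boxes and OCR strings are assumed to have been produced already by upstream tools such as SAM and an OCR engine.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Component:
    """One flowchart element (hypothetical structure, not from the paper)."""
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels, from segmentation
    text: str                       # text recognised inside the region, from OCR


def reading_order(components: List[Component]) -> List[Component]:
    """Sort components top-to-bottom, then left-to-right.

    This is an assumed heuristic for presenting components to the model;
    it does not recover the actual arrows between them.
    """
    return sorted(components, key=lambda c: (c.box[1], c.box[0]))


def build_prompt(components: List[Component]) -> str:
    """Merge region geometry and OCR text into a single prompt string."""
    lines = [
        "The following components were detected in a flowchart.",
        "Each line gives an id, a bounding box (x, y, w, h) and the text inside.",
    ]
    for i, c in enumerate(reading_order(components), start=1):
        x, y, w, h = c.box
        lines.append(f'{i}. box=({x}, {y}, {w}, {h}) text="{c.text}"')
    lines.append("Describe the workflow these components represent, step by step.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; real boxes and text would come from SAM and OCR.
    demo = [
        Component((10, 200, 80, 40), "End"),
        Component((10, 20, 80, 40), "Start"),
        Component((10, 110, 80, 40), "Is input valid?"),
    ]
    print(build_prompt(demo))
```

The resulting string would then be sent to a generative model of choice; how the authors actually phrase their prompts is not stated in the abstract.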
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Arbaz, A., Fan, H., Ding, J., Qiu, M., Feng, Y. (2024). GenFlowchart: Parsing and Understanding Flowchart Using Generative AI. In: Cao, C., Chen, H., Zhao, L., Arshad, J., Asyhari, T., Wang, Y. (eds) Knowledge Science, Engineering and Management. KSEM 2024. Lecture Notes in Computer Science, vol 14884. Springer, Singapore. https://doi.org/10.1007/978-981-97-5492-2_8
DOI: https://doi.org/10.1007/978-981-97-5492-2_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5491-5
Online ISBN: 978-981-97-5492-2
eBook Packages: Computer Science, Computer Science (R0)