GenFlowchart: Parsing and Understanding Flowchart Using Generative AI

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14884)

Abstract

Flowcharts are integral visual aids that capture both logical flow and component-level detail in a form humans interpret easily. Automated parsing of these diagrams is nonetheless challenging because of their intricate logical structure and text-rich content. In this paper, we introduce GenFlowchart, a novel framework that employs generative AI to enhance the parsing and understanding of flowcharts. First, the Segment Anything Model (SAM) is deployed to delineate the components and geometric shapes within the flowchart. Second, Optical Character Recognition (OCR) extracts the text inside each component to capture its function. Finally, we use prompt engineering to formulate prompts that let the generative AI integrate the segmentation results with the extracted text and thereby reconstruct the flowchart’s workflows. To validate the effectiveness of GenFlowchart, we evaluate its performance across multiple flowcharts and benchmark it against several baseline approaches. GenFlowchart is available at https://github.com/ResponsibleAILab/GenFlowchart.
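
The third stage of the pipeline described in the abstract, combining segmented components with their OCR text into a prompt, can be illustrated with a minimal sketch. It is not taken from the GenFlowchart repository: the `Component` structure, the `build_prompt` function, the prompt wording, and the sample data are all hypothetical, and the segmentation and OCR outputs are assumed to be already available as plain Python values.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Component:
    """One segmented flowchart region (hypothetical structure, not from the paper)."""
    shape: str                       # e.g. "rectangle", "diamond", "arrow"
    bbox: tuple[int, int, int, int]  # (x, y, width, height) in pixels
    text: str = ""                   # OCR output for the region; may be empty


def build_prompt(components: list[Component]) -> str:
    """Serialize components into a text prompt for a generative model.

    Components are ordered top-to-bottom, then left-to-right, which is only
    a rough reading-order heuristic assumed for this sketch.
    """
    ordered = sorted(components, key=lambda c: (c.bbox[1], c.bbox[0]))
    lines = [
        "The following components were detected in a flowchart image.",
        "Each line gives an id, a shape, a bounding box (x, y, w, h), and OCR text.",
    ]
    for i, c in enumerate(ordered, start=1):
        label = c.text.strip() or "<no text>"
        lines.append(f"{i}. shape={c.shape} bbox={c.bbox} text={label!r}")
    lines.append("Describe the workflow as an ordered list of steps and decisions.")
    return "\n".join(lines)


# Made-up sample data standing in for segmentation and OCR results.
components = [
    Component("diamond", (40, 120, 100, 60), "x > 0?"),
    Component("rectangle", (40, 20, 100, 40), "Start"),
    Component("arrow", (85, 60, 10, 60)),
]
prompt = build_prompt(components)
print(prompt)
```

Passing geometry and text together in one prompt is one plausible way to let the model infer which labelled shapes an arrow connects; the actual prompt design used by GenFlowchart may differ.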

Notes

  1. https://docs.opencv.org/4.x/d9/d61/tutorial_py_morphological_ops.html

  2. https://pymupdf.readthedocs.io/

Author information

Corresponding author

Correspondence to Yunhe Feng.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Arbaz, A., Fan, H., Ding, J., Qiu, M., Feng, Y. (2024). GenFlowchart: Parsing and Understanding Flowchart Using Generative AI. In: Cao, C., Chen, H., Zhao, L., Arshad, J., Asyhari, T., Wang, Y. (eds) Knowledge Science, Engineering and Management. KSEM 2024. Lecture Notes in Computer Science, vol 14884. Springer, Singapore. https://doi.org/10.1007/978-981-97-5492-2_8

  • DOI: https://doi.org/10.1007/978-981-97-5492-2_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5491-5

  • Online ISBN: 978-981-97-5492-2

  • eBook Packages: Computer Science (R0)
