Abstract
Convolutional Neural Networks (CNNs) and derivative architectures have become increasingly popular for image and signal processing applications such as detection and classification. Recently, a CNN architecture with a Wavelet Packet feature selector (Ultra-CNN) was introduced for Ultrasonic Non-Destructive Evaluation applications. This CNN-based classifier detects the presence of flaws with an accuracy of up to 92% on experimental data. In this study, an FPGA-based Ultra-CNN design using high-level synthesis (HLS) is presented. Implementing the algorithm on a portable FPGA platform enables high-accuracy detection of ultrasonic flaws in the field, even when high-performance computation resources are not available. Unlike most other CNN designs used for pattern recognition in images, Ultra-CNN's fully connected layers require more operations than its convolutional layers. To maximize throughput, the design must therefore be optimized for both layer types, so we introduce a new design with two pipelined processors, one optimized for the convolutional layers and the other for the fully connected layers. The results demonstrate the highest utilization efficiency compared with other CNN implementations and validate the low-cost, real-time operation of the design.
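The observation that fully connected layers can require more operations than convolutional layers can be illustrated with a rough multiply-accumulate (MAC) count. The sketch below is only an illustration: the input length, channel counts, kernel sizes, and hidden-layer width are hypothetical values chosen for the example, not the actual Ultra-CNN dimensions, and the helper names `conv1d_macs` and `fc_macs` are made up here.

```python
def conv1d_macs(in_len, in_ch, out_ch, kernel, stride=1):
    """MACs for a 1-D convolution without padding; also returns the output length."""
    out_len = (in_len - kernel) // stride + 1
    return out_len * out_ch * in_ch * kernel, out_len


def fc_macs(in_features, out_features):
    """MACs for a dense (fully connected) layer."""
    return in_features * out_features


# Hypothetical network: a 1024-sample input signal, two conv layers, two FC layers.
length, in_ch = 1024, 1
conv_total = 0
for out_ch, kernel in [(8, 5), (16, 5)]:  # assumed (channels, kernel size) pairs
    macs, length = conv1d_macs(length, in_ch, out_ch, kernel)
    conv_total += macs
    in_ch = out_ch

flat = length * in_ch                             # flattened feature vector size
fc_total = fc_macs(flat, 128) + fc_macs(128, 2)   # assumed hidden width 128, 2 classes

print("conv MACs:", conv_total)
print("FC MACs:  ", fc_total)
```

With these assumed sizes the fully connected layers account for more MACs than the convolutional layers, because the flattened feature vector feeding the first dense layer is large while the 1-D kernels are small. In such a case an accelerator tuned only for convolution would leave the dominant workload unoptimized, which is the motivation the abstract gives for a separate processor per layer type.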













Cite this article
Yuan, Y., Virupakshappa, K. & Oruklu, E. FPGA Implementation of an Ultrasonic Flaw Detection Algorithm Based on Convolutional Neural Networks. J Sign Process Syst 94, 1447–1457 (2022). https://doi.org/10.1007/s11265-022-01756-5