Abstract
In the context of big data, transmission line inspection in the power grid has evolved from manual inspection to intelligent inspection, in which large quantities of image data are collected and analyzed by machines. Since the foreground may have greater value than the background, Region of Interest (ROI) image coding is applied. However, traditional image coding aims to maintain good human-perceived visual quality and is not designed for semantic analysis. Image coding paradigms that balance human visual quality and automatic analysis performance are therefore needed. In this paper, a Rate-Accuracy Optimized quantization algorithm based on ROI image coding is proposed to obtain the optimal analysis performance at a given coded bit rate. First, a machine vision-oriented attention map is determined. Since features reflect abstract semantic meaning, which is vital for image analysis tasks, it is reasonable to regard the region containing the feature vector information as the key area and the rest as the non-key area. The key area is compressed with fine-grained quantization, while the non-key area is compressed with coarse-grained quantization. Then the relationship among rate, accuracy, and quantization parameters is analyzed and modeled. Finally, the optimal quantization parameters are determined based on the Rate-Accuracy criterion. The proposed algorithm is verified on an insulator dataset (images of insulator defects collected by drone inspection). Experimental results show that the accuracy of defect identification is improved by 12% at the same bit rate.
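The pipeline outlined in the abstract (key and non-key areas quantized with different granularity, then quantization parameters chosen for the best accuracy at a given rate) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the functions `roi_quantize` and `select_steps`, the toy `rate_model` and `accuracy_model`, and the exhaustive search are hypothetical stand-ins, since the abstract does not specify the fitted models or the optimization procedure used in the paper.

```python
from itertools import product


def quantize(value, step):
    """Uniform scalar quantization: snap value to the nearest multiple of step."""
    return step * round(value / step)


def roi_quantize(samples, key_mask, fine_step, coarse_step):
    """Quantize key-area samples with the fine step and all others with the coarse step."""
    return [
        quantize(v, fine_step if is_key else coarse_step)
        for v, is_key in zip(samples, key_mask)
    ]


def select_steps(candidates, rate_model, accuracy_model, rate_budget):
    """Pick the (fine, coarse) pair with the highest modeled accuracy within the rate budget.

    Exhaustive search is an illustrative choice, not the paper's method.
    Returns (accuracy, fine, coarse), or None if no pair fits the budget.
    """
    best = None
    for fine, coarse in product(candidates, repeat=2):
        if fine > coarse:
            continue  # the key area is never quantized more coarsely than the background
        if rate_model(fine, coarse) > rate_budget:
            continue
        acc = accuracy_model(fine, coarse)
        if best is None or acc > best[0]:
            best = (acc, fine, coarse)
    return best


# Hypothetical toy models (assumed, not taken from the paper):
# the key area covers 30% of the image, rate falls as the step grows,
# and accuracy is more sensitive to the key-area step.
def rate_model(fine, coarse):
    return 0.3 / fine + 0.7 / coarse


def accuracy_model(fine, coarse):
    return 1.0 - 0.02 * fine - 0.005 * coarse


best = select_steps([2, 4, 8, 16], rate_model, accuracy_model, rate_budget=0.15)
coded = roi_quantize([13, 27, 41, 58], [True, False, True, False], best[1], best[2])
```

With these toy models the search settles on a fine step for the key area and the coarsest available step for the background, which mirrors the qualitative behavior the abstract describes. In a real codec the steps would be derived from the encoder's quantization parameters, and the two models would be fitted to measured rate and detection accuracy.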
Data availability statement
The raw/processed data required to reproduce these findings cannot be shared at this time, as the data also form part of an ongoing study.
Acknowledgments
This paper was partially financially supported by the National Natural Science Foundation of China (61202369, 61401269), Shanghai Technology Innovation Project (17020500900), and “Shuguang Program” sponsored by Shanghai Education Development Foundation and Shanghai Municipal Education Commission (17SG51).
About this article
Cite this article
Zhang, Z., Jiang, W., Zhang, Y. et al. Rate-accuracy optimized quantization algorithm based on ROI image coding in power line inspection. Multimed Tools Appl 83, 16139–16160 (2024). https://doi.org/10.1007/s11042-023-15271-7
DOI: https://doi.org/10.1007/s11042-023-15271-7