Rate-accuracy optimized quantization algorithm based on ROI image coding in power line inspection

Multimedia Tools and Applications

Abstract

With the rise of big data, power grid transmission line inspection has moved from manual inspection to intelligent inspection, in which large quantities of image data are collected and analyzed by machines. Because the foreground is generally more valuable than the background, Region of Interest (ROI) image coding is applied. However, traditional image coding aims to preserve human-perceived visual quality and is not designed for semantic analysis, so coding paradigms that balance human visual quality against automatic analysis performance are needed. In this paper, a Rate-Accuracy Optimized quantization algorithm based on ROI image coding is proposed to obtain the best analysis performance at a given coded bit rate. First, a machine-vision-oriented attention map is determined. Since features reflect the abstract semantic meaning that is vital for image analysis tasks, the region containing feature vector information is regarded as the key area and the rest as the non-key area. The key area is compressed with fine-grained quantization, while the non-key area is compressed with coarse-grained quantization. Then the relationship among rate, accuracy, and quantization parameters is analyzed and modeled. Finally, the optimal quantization parameters are determined according to the Rate-Accuracy criterion. The proposed algorithm is verified on an insulator dataset (images of insulator defects collected by drone inspection). Experimental results show that the accuracy of defect identification improves by 12% at the same bit rate.
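The three steps in the abstract (attention map, two-level quantization, rate-accuracy selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the 8-pixel block size, uniform quantization applied directly to pixel values (a real codec would quantize transform coefficients), the exhaustive search, and the `toy_rate` and `toy_accuracy` functions, which stand in for the fitted rate and accuracy models that the abstract mentions but does not specify.

```python
from itertools import product

BLOCK = 8  # assumed block size in pixels


def key_block_mask(height, width, keypoints, block=BLOCK):
    """Mark a block as key if at least one feature point (row, col) lies in it."""
    rows = -(-height // block)  # ceiling division
    cols = -(-width // block)
    mask = [[False] * cols for _ in range(rows)]
    for r, c in keypoints:
        if 0 <= r < height and 0 <= c < width:
            mask[r // block][c // block] = True
    return mask


def quantize_roi(image, mask, step_key, step_bg, block=BLOCK):
    """Uniform quantization: fine step in key blocks, coarse step elsewhere."""
    out = []
    for r, row in enumerate(image):
        out_row = []
        for c, value in enumerate(row):
            step = step_key if mask[r // block][c // block] else step_bg
            # Round to the nearest multiple of the step (integer arithmetic).
            out_row.append((value + step // 2) // step * step)
        out.append(out_row)
    return out


def select_steps(candidates, rate_fn, accuracy_fn, rate_budget):
    """Return (accuracy, step_key, step_bg) with the highest modeled accuracy
    among step pairs whose modeled rate does not exceed the budget."""
    best = None
    for step_key, step_bg in product(candidates, repeat=2):
        if step_key > step_bg:
            continue  # the key area is never quantized more coarsely
        if rate_fn(step_key, step_bg) > rate_budget:
            continue
        accuracy = accuracy_fn(step_key, step_bg)
        if best is None or accuracy > best[0]:
            best = (accuracy, step_key, step_bg)
    return best


def toy_rate(step_key, step_bg):
    """Hypothetical rate model: key area covers 20% of the image and
    rate is inversely proportional to the quantization step."""
    return 0.2 * 64 / step_key + 0.8 * 64 / step_bg


def toy_accuracy(step_key, step_bg):
    """Hypothetical accuracy model: coarser key-area quantization hurts more."""
    return 1.0 - 0.004 * step_key - 0.001 * step_bg


if __name__ == "__main__":
    mask = key_block_mask(16, 16, keypoints=[(3, 5)])
    image = [[37] * 16 for _ in range(16)]
    coded = quantize_roi(image, mask, step_key=4, step_bg=32)
    print(coded[0][0], coded[8][8])  # key-block pixel vs. background pixel
    print(select_steps([4, 8, 16, 32], toy_rate, toy_accuracy, rate_budget=10.0))
```

In this toy setting the search spends most of the bit budget on the key area, which matches the intuition behind the proposed scheme. With measured rate and accuracy curves, the two toy functions would be replaced by the fitted models.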

Data availability statement

The raw/processed data required to reproduce these findings cannot be shared at this time, as the data also forms part of an ongoing study.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (61202369, 61401269), the Shanghai Technology Innovation Project (17020500900), and the “Shuguang Program” sponsored by the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission (17SG51).

Author information

Corresponding author

Correspondence to Wei Jiang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Z., Jiang, W., Zhang, Y. et al. Rate-accuracy optimized quantization algorithm based on ROI image coding in power line inspection. Multimed Tools Appl 83, 16139–16160 (2024). https://doi.org/10.1007/s11042-023-15271-7

  • DOI: https://doi.org/10.1007/s11042-023-15271-7
