Abstract
Semantic image compression can greatly reduce the amount of transmitted data by representing and reconstructing images from semantic information. Because objects in an image are not equally important at the semantic level, we propose a semantic importance-based deep image compression scheme in which a generative approach produces a visually pleasing image from segmentation information. A base-layer image is reconstructed with a conditional generative adversarial network (GAN) that takes the importance of objects into account. Because the generative capability of the GAN varies, a generative compensation module is designed to ensure that objects with the same semantic importance have similar perceptual fidelity. The base-layer image can be further refined using residuals, prioritizing regions with high semantic importance. Experimental results show that images reconstructed by the proposed scheme are more visually pleasing than those of related schemes, and that objects with high semantic importance achieve both good pixel fidelity and good semantic-perceptual fidelity.
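The residual refinement step can be illustrated with a minimal sketch. It is not the scheme proposed in the paper: it assumes a plain uniform quantizer in place of the actual residual codec, and the function name `refine_with_residual`, the `importance` lookup table, and the step sizes are hypothetical choices made only for illustration. The sketch shows one way to spend more precision on regions with high semantic importance, by giving them a finer quantization step.

```python
import numpy as np

def refine_with_residual(original, base, seg_map, importance,
                         coarse_step=32.0, fine_step=4.0):
    """Refine a base-layer image with an importance-weighted residual.

    original, base: float arrays of shape (H, W, 3) with values in [0, 255].
    seg_map: integer array of shape (H, W) holding a class id per pixel.
    importance: 1-D array mapping class id -> importance in [0, 1]
                (hypothetical table, not taken from the paper).
    """
    # Look up the importance of every pixel from its class id.
    weights = importance[seg_map]
    # Importance 1 gets the fine step, importance 0 gets the coarse step.
    step = coarse_step + (fine_step - coarse_step) * weights
    step = step[..., None]  # broadcast over the colour channels

    residual = original - base
    # These integer symbols are what a real codec would entropy-code.
    symbols = np.round(residual / step)
    refined = np.clip(base + symbols * step, 0.0, 255.0)
    return refined, symbols

# Toy data: the right half of the image belongs to the important class.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(8, 8, 3)).astype(np.float64)
base = np.clip(original + rng.normal(0.0, 20.0, size=original.shape), 0.0, 255.0)
seg_map = np.zeros((8, 8), dtype=int)
seg_map[:, 4:] = 1
importance = np.array([0.0, 1.0])

refined, symbols = refine_with_residual(original, base, seg_map, importance)
abs_err = np.abs(refined - original)
err_low = abs_err[seg_map == 0].mean()
err_high = abs_err[seg_map == 1].mean()
```

In this toy setting the reconstruction error of each pixel is bounded by half of its quantization step, so the important region ends up closer to the original than the background, at the cost of larger residual symbols to encode there.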
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gu, X., Xu, Y., Zhu, K. (2024). Semantic Importance-Based Deep Image Compression Using a Generative Approach. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14555. Springer, Cham. https://doi.org/10.1007/978-3-031-53308-2_6
DOI: https://doi.org/10.1007/978-3-031-53308-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53307-5
Online ISBN: 978-3-031-53308-2
eBook Packages: Computer Science, Computer Science (R0)