Abstract
Advances in hardware, such as head-mounted displays, have made augmented reality (AR) widely used. In AR, virtual objects added to the real environment may partially overlap with real-world objects, which degrades the display. Thus, in addition to adding virtual objects to the real world, there is an urgent need for diminished reality (DR), which virtually removes, hides, or sees through real objects in panoramas. In this paper, we propose a pipeline for diminished reality in indoor panoramic images using rich prior information. Specifically, to restore structure information, we develop a structure restoration module that aggregates the layout boundary features of the masked panoramic image. We then design a structured region texture extraction module to assist the restoration of real textures after the target object is removed. Finally, to explore the relations between structure and texture, we design a fast Fourier convolution fusion module that generates inpainting results respecting real-world structures and textures. We also create a structured panoramic image diminished reality dataset (SD) for the diminished reality task. Extensive experiments show that the proposed pipeline produces more realistic results, which are also consistent with the human eye’s perception of structural changes in indoor panoramic images.
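To make the idea of a fast Fourier convolution fusion step more concrete, the following is a minimal NumPy sketch, not the architecture used in the paper. It assumes that structure and texture features are simply concatenated along the channel axis and then passed through a local branch (a spatial 1x1 convolution) and a global branch (a 1x1 convolution applied in the Fourier domain). The function names `spectral_transform` and `fuse`, the weight matrices, and all tensor sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_transform(x, weight):
    """Global branch: pointwise channel mixing in the Fourier domain.

    x      : real feature map, shape (C_in, H, W)
    weight : mixing matrix, shape (2 * C_out, 2 * C_in)  (hypothetical parameter)
    """
    _, h, w = x.shape
    freq = np.fft.rfft2(x, norm="ortho")                 # (C_in, H, W // 2 + 1), complex
    stacked = np.concatenate([freq.real, freq.imag], 0)  # real and imaginary parts as channels
    mixed = np.einsum("oc,chw->ohw", weight, stacked)    # 1x1 convolution over frequencies
    mixed = np.maximum(mixed, 0.0)                       # ReLU
    c_out = weight.shape[0] // 2
    out_freq = mixed[:c_out] + 1j * mixed[c_out:]        # back to complex coefficients
    return np.fft.irfft2(out_freq, s=(h, w), norm="ortho")


def fuse(structure, texture, w_local, w_global):
    """Fuse structure and texture features with a local and a global branch (assumed design)."""
    x = np.concatenate([structure, texture], axis=0)     # (C_s + C_t, H, W)
    local = np.einsum("oc,chw->ohw", w_local, x)         # spatial 1x1 convolution
    return local + spectral_transform(x, w_global)


# Illustrative random inputs; shapes are arbitrary assumptions.
rng = np.random.default_rng(0)
structure = rng.standard_normal((2, 8, 16))
texture = rng.standard_normal((2, 8, 16))
w_local = rng.standard_normal((3, 4))
w_global = rng.standard_normal((6, 8))

fused = fuse(structure, texture, w_local, w_global)
print(fused.shape)  # (3, 8, 16)
```

Because every frequency coefficient depends on every spatial position, the global branch has an image-wide receptive field in a single layer. This property is likely useful for panoramas, where the context needed to fill a masked region may lie far from it.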
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, J., Zhang, Q., Shen, X., Wu, W., Wang, X. (2024). Hybrid Prior-Based Diminished Reality for Indoor Panoramic Images. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14497. Springer, Cham. https://doi.org/10.1007/978-3-031-50075-6_30
DOI: https://doi.org/10.1007/978-3-031-50075-6_30
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50074-9
Online ISBN: 978-3-031-50075-6
eBook Packages: Computer Science, Computer Science (R0)