Abstract
Reconstructing 3D scenes from unconstrained collections of in-the-wild photographs has long been a challenging problem. The main difficulty lies in the varying appearance conditions and transient occluders present in uncontrolled image captures. With the advancement of Neural Radiance Fields (NeRF), previous works have developed effective strategies to tackle this issue. However, limited by deep networks and volumetric rendering, these methods generally incur substantial time costs. Recently, the advent of 3D Gaussian Splatting (3DGS) has significantly accelerated training and rendering for 3D reconstruction tasks. Nevertheless, vanilla 3DGS struggles to distinguish the varying appearances of in-the-wild photo collections. To address these problems, we propose Appearance-Aware 3D Gaussian Splatting (AAGS), a novel extension of 3DGS to unconstrained photo collections. Specifically, we employ an appearance extractor to capture a global feature for each image, enabling the distinction of visual conditions, e.g., illumination and weather, across different observations. Furthermore, to mitigate the impact of transient occluders, we design a transient-removal module that adaptively learns a 2D visibility map to decompose the static target from complex real-world scenes. Extensive experiments validate the effectiveness and superiority of AAGS: compared with previous works, our method not only achieves better reconstruction and rendering quality but also significantly reduces both training and rendering overhead. Code will be released at https://github.com/Zhang-WenCong/AAGS.
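To make the two components named above concrete, the sketch below illustrates in PyTorch how an appearance-conditioned color head and a learned 2D visibility map could plug into a photometric loss. This is a minimal sketch under assumptions: the module names (AppearanceMLP, VisibilityHead, masked_l1), dimensions, and the tiny convolutional visibility network are all illustrative, not the authors' implementation; the full architecture is described in the paper itself.

```python
# Hypothetical sketch of appearance conditioning and transient masking.
import torch
import torch.nn as nn

class AppearanceMLP(nn.Module):
    """Predicts per-Gaussian color conditioned on a per-image appearance code."""
    def __init__(self, feat_dim=32, app_dim=48):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + app_dim, 64), nn.ReLU(),
            nn.Linear(64, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, gauss_feat, app_code):
        # gauss_feat: (N, feat_dim) per-Gaussian features
        # app_code:   (app_dim,) global appearance code of the current image
        app = app_code.expand(gauss_feat.shape[0], -1)
        return self.mlp(torch.cat([gauss_feat, app], dim=-1))

class VisibilityHead(nn.Module):
    """Predicts a 2D visibility map (1 = static, 0 = transient) from an image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, image):   # image: (1, 3, H, W)
        return self.net(image)  # (1, 1, H, W)

def masked_l1(render, target, vis, reg=0.01):
    # Down-weight pixels the visibility map marks as transient, and
    # regularize the map toward 1 so it cannot mask everything.
    recon = (vis * (render - target).abs()).mean()
    return recon + reg * (1.0 - vis).mean()

if __name__ == "__main__":
    N, H, W = 1024, 64, 64
    color_head, vis_head = AppearanceMLP(), VisibilityHead()
    colors = color_head(torch.randn(N, 32), torch.randn(48))  # (N, 3), fed to the rasterizer
    image = torch.rand(1, 3, H, W)
    vis = vis_head(image)                                     # (1, 1, H, W)
    render = torch.rand(1, 3, H, W, requires_grad=True)       # stand-in for the splatted image
    masked_l1(render, image, vis).backward()
```

The regularizer on (1 − vis) mirrors a common trick in in-the-wild reconstruction: penalizing the mask keeps the model from explaining away every pixel as transient.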
Data Availability
All datasets used in this study are publicly available. The data were obtained from open-access repositories, and detailed information, including links to the original datasets, is provided in the references section. There are no restrictions on data access, allowing for replication and further analysis.
References
Kaviani, H.R., Shirani, S.: An adaptive patch-based reconstruction scheme for view synthesis by disparity estimation using optical flow. IEEE TCSVT 28(7), 1540–1552 (2017)
Liu, B., Peng, B., Zhang, Z., Huang, Q., Ling, N., Lei, J.: Unsupervised single-view synthesis network via style guidance and prior distillation. IEEE TCSVT (2023)
Mildenhall, B., Srinivasan, P., Tancik, M., Barron, J., Ramamoorthi, R., Ng, R.: NeRF: Representing scenes as neural radiance fields for view synthesis. In: ECCV (2020)
Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D Gaussian splatting for real-time radiance field rendering. ACM TOG 42(4), 1–14 (2023)
Barron, J.T., Mildenhall, B., Tancik, M., Hedman, P., Martin-Brualla, R., Srinivasan, P.P.: Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields. In: ICCV (2021)
Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In: CVPR, pp. 5470–5479 (2022)
Hu, W., Wang, Y., Ma, L., Yang, B., Gao, L., Liu, X., Ma, Y.: Tri-MipRF: Tri-mip representation for efficient anti-aliasing neural radiance fields. In: ICCV, pp. 19774–19783 (2023)
Müller, T., Evans, A., Schied, C., Keller, A.: Instant neural graphics primitives with a multiresolution hash encoding. ACM TOG 41(4), 1–15 (2022)
Martin-Brualla, R., Radwan, N., Sajjadi, M.S., Barron, J.T., Dosovitskiy, A., Duckworth, D.: NeRF in the wild: Neural radiance fields for unconstrained photo collections. In: CVPR, pp. 7210–7219 (2021)
Chen, X., Zhang, Q., Li, X., Chen, Y., Feng, Y., Wang, X., Wang, J.: Hallucinated neural radiance fields in the wild. In: CVPR, pp. 12943–12952 (2022)
Yang, Y., Zhang, S., Huang, Z., Zhang, Y., Tan, M.: Cross-ray neural radiance fields for novel-view synthesis from unconstrained image collections. In: ICCV (2023)
Dahmani, H., Bennehar, M., Piasco, N., Roldao, L., Tsishkou, D.: SWAG: Splatting in the wild images with appearance-conditioned Gaussians. arXiv preprint arXiv:2403.10427 (2024)
Zhang, D., Wang, C., Wang, W., Li, P., Qin, M., Wang, H.: Gaussian in the wild: 3D Gaussian splatting for unconstrained image collections. In: ECCV, pp. 341–359 (2025). Springer
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Fridovich-Keil, S., Yu, A., Tancik, M., Chen, Q., Recht, B., Kanazawa, A.: Plenoxels: Radiance fields without neural networks. In: CVPR, pp. 5501–5510 (2022)
Guo, Y.-C., Kang, D., Bao, L., He, Y., Zhang, S.-H.: NeRFReN: Neural radiance fields with reflections. In: CVPR, pp. 18409–18418 (2022)
Pumarola, A., Corona, E., Pons-Moll, G., Moreno-Noguer, F.: D-NeRF: Neural radiance fields for dynamic scenes. In: CVPR, pp. 10318–10327 (2021)
Park, K., Sinha, U., Barron, J.T., Bouaziz, S., Goldman, D.B., Seitz, S.M., Martin-Brualla, R.: Nerfies: Deformable neural radiance fields. In: ICCV, pp. 5865–5874 (2021)
Li, Z., Niklaus, S., Snavely, N., Wang, O.: Neural scene flow fields for space-time view synthesis of dynamic scenes. In: CVPR, pp. 6498–6508 (2021)
Wang, Z., Wu, S., Xie, W., Chen, M., Prisacariu, V.A.: NeRF--: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064 (2021)
Bian, W., Wang, Z., Li, K., Bian, J.-W., Prisacariu, V.A.: Nope-NeRF: Optimising neural radiance field with no pose prior. In: CVPR, pp. 4160–4169 (2023)
Chibane, J., Bansal, A., Lazova, V., Pons-Moll, G.: Stereo radiance fields (SRF): Learning view synthesis from sparse views of novel scenes. In: CVPR (2021). IEEE
Irshad, M.Z., Zakharov, S., Liu, K., Guizilini, V., Kollar, T., Gaidon, A., Kira, Z., Ambrus, R.: Neo 360: Neural fields for sparse view synthesis of outdoor scenes. In: ICCV, pp. 9187–9198 (2023)
Guo, S., Wang, Q., Gao, Y., Xie, R., Li, L., Zhu, F., Song, L.: IEEE TCSVT, 1–1 (2024). https://doi.org/10.1109/TCSVT.2024.3385360
Kim, I., Choi, M., Kim, H.J.: UP-NeRF: Unconstrained pose-prior-free neural radiance fields. In: NeurIPS (2023)
Wu, G., Yi, T., Fang, J., Xie, L., Zhang, X., Wei, W., Liu, W., Tian, Q., Wang, X.: 4D Gaussian splatting for real-time dynamic scene rendering. arXiv preprint arXiv:2310.08528 (2023)
Luiten, J., Kopanas, G., Leibe, B., Ramanan, D.: Dynamic 3D Gaussians: Tracking by persistent dynamic view synthesis. arXiv preprint arXiv:2308.09713 (2023)
Yang, Z., Gao, X., Zhou, W., Jiao, S., Zhang, Y., Jin, X.: Deformable 3D Gaussians for high-fidelity monocular dynamic scene reconstruction. arXiv preprint arXiv:2309.13101 (2023)
Yu, Z., Chen, A., Huang, B., Sattler, T., Geiger, A.: Mip-Splatting: Alias-free 3D Gaussian splatting. arXiv preprint arXiv:2311.16493 (2023)
Fan, Z., Wang, K., Wen, K., Zhu, Z., Xu, D., Wang, Z.: LightGaussian: Unbounded 3D Gaussian compression with 15x reduction and 200+ fps. arXiv preprint arXiv:2311.17245 (2023)
Navaneet, K., Meibodi, K.P., Koohpayegani, S.A., Pirsiavash, H.: Compact3D: Compressing Gaussian splat radiance field models with vector quantization. arXiv preprint arXiv:2311.18159 (2023)
Niedermayr, S., Stumpfegger, J., Westermann, R.: Compressed 3D Gaussian splatting for accelerated novel view synthesis (2023)
Liu, Y., Guan, H., Luo, C., Fan, L., Peng, J., Zhang, Z.: CityGaussian: Real-time high-quality large-scale scene rendering with Gaussians (2024)
Kerbl, B., Meuleman, A., Kopanas, G., Wimmer, M., Lanvin, A., Drettakis, G.: A hierarchical 3D Gaussian representation for real-time rendering of very large datasets. ACM TOG 43(4) (2024)
Yang, Z., Gao, X., Sun, Y., Huang, Y., Lyu, X., Zhou, W., Jiao, S., Qi, X., Jin, X.: Spec-Gaussian: Anisotropic view-dependent appearance for 3D Gaussian splatting. arXiv preprint arXiv:2402.15870 (2024)
Meng, J., Li, H., Wu, Y., Gao, Q., Yang, S., Zhang, J., Ma, S.: Mirror-3DGS: Incorporating mirror reflections into 3D Gaussian splatting. arXiv preprint arXiv:2404.01168 (2024)
Fu, Y., Liu, S., Kulkarni, A., Kautz, J., Efros, A.A., Wang, X.: COLMAP-free 3D Gaussian splatting. arXiv preprint arXiv:2312.07504 (2023)
Zwicker, M., Pfister, H., Van Baar, J., Gross, M.: EWA volume splatting. In: Proceedings Visualization, pp. 29–538 (2001). IEEE
Chen, G., Wang, W.: A survey on 3D Gaussian splatting. arXiv preprint arXiv:2401.03890 (2024)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: MICCAI, pp. 234–241 (2015). Springer
Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K.M., Trulls, E.: Image matching across wide baselines: From paper to practice. IJCV 129(2), 517–547 (2021)
Schonberger, J.L., Frahm, J.-M.: Structure-from-motion revisited. In: CVPR, pp. 4104–4113 (2016)
Rudnev, V., Elgharib, M., Smith, W., Liu, L., Golyanik, V., Theobalt, C.: NeRF for outdoor scene relighting. In: ECCV, pp. 615–631 (2022). Springer
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE TIP 13(4), 600–612 (2004)
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR, pp. 586–595 (2018)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. In: NeurIPS (2019)
Chandrasekar, A., Chakrabarty, G., Bardhan, J., Hebbalaguppe, R., AP, P.: ReMOVE: A reference-free metric for object erasure. In: CVPR, pp. 7901–7910 (2024)
Author information
Authors and Affiliations
Contributions
Wencong Zhang contributed to the methodology, coding, experiments, and manuscript writing. Zhiyang Guo provided guidance on writing and reviewed the manuscript. Wengang Zhou and Houqiang Li supervised the research and provided overall guidance for the paper.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by Bing-kun Bao.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, W., Guo, Z., Zhou, W. et al. AAGS: Appearance-Aware 3D Gaussian Splatting with Unconstrained Photo Collections. Multimedia Systems 31, 173 (2025). https://doi.org/10.1007/s00530-025-01742-4