Abstract
Real-to-Sim transfer is an active research topic in robotics. By working in a simulated environment, development becomes cheaper and testing easier. Moreover, after Real-to-Sim transfer, the simulated environment reduces texture and lighting effects, which benefits downstream computer vision tasks such as robot grasping. Unlike artistic style transfer, Real-to-Sim transfer imposes stricter accuracy requirements on content preservation. In this paper, we employ a transformer to perform Real-to-Sim transfer. We design a novel restoration stage to preserve content information, and we propose a corresponding restoration loss function. With these improvements, our architecture achieves better performance in light removal, content preservation, and feature embedding.
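The abstract does not specify the form of the proposed restoration loss; as a rough illustrative sketch only (not the authors' implementation), a restoration loss of this kind is commonly a pixel- or feature-level reconstruction error between the restored output and the original content image, for example a mean-squared error:

```python
import numpy as np

def restoration_loss(restored: np.ndarray, content: np.ndarray) -> float:
    """Mean-squared reconstruction error between a restored image and the
    original content image (both H x W x C arrays with values in [0, 1]).

    This is a generic content-preservation penalty, shown only as an
    assumption about what a 'restoration loss' could look like.
    """
    assert restored.shape == content.shape
    return float(np.mean((restored - content) ** 2))

# A perfect restoration incurs zero loss; any deviation is penalized.
content = np.full((4, 4, 3), 0.5)
print(restoration_loss(content, content))  # 0.0
```

In practice such a penalty would be combined with the style-transfer objective, trading off faithfulness to the content image against adaptation to the simulated domain.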
Change history
03 February 2023
In the originally published version of this chapter (chapter 33), an incorrect second affiliation had erroneously been added for all of the authors. This has been corrected.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ma, Y., Yang, F., Li, X., Jiang, C., Basu, A. (2022). SimFormer: Real-to-Sim Transfer with Recurrent Restoration. In: Berretti, S., Su, GM. (eds) Smart Multimedia. ICSM 2022. Lecture Notes in Computer Science, vol 13497. Springer, Cham. https://doi.org/10.1007/978-3-031-22061-6_33
DOI: https://doi.org/10.1007/978-3-031-22061-6_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22060-9
Online ISBN: 978-3-031-22061-6
eBook Packages: Computer Science (R0)