Abstract
In this paper, we present a real-time, online augmented/diminished reality system that runs entirely on a hand-held, moving mobile device. Specifically, we introduce an improved inpainting algorithm designed for online use (i.e., live streaming video) with a moving viewpoint. Unlike previous approaches, the proposed algorithm produces reasonably high-quality inpainted imagery by exploiting the 3D camera tracking and scene information from SLAM: it selects the proper source frame from which to extract the content that replaces the masked region, and robustly applies a homography to fill the region in from that source. The algorithm is evaluated and compared with state-of-the-art video inpainting methods in terms of execution time and objective and subjective inpainted image quality. Although the algorithm is applicable to any dynamic objects, the current implementation is limited to removing and filling in human figures only. The evaluation results show that the inpainted image quality was on par with that of off-line state-of-the-art systems, yet the algorithm ran at an interactive rate on a mobile device. A user study was also conducted to assess user perception and experience in an outdoor interactive AR application that uses the proposed algorithm to remove interfering pedestrians. Compared to other nominal test conditions, the algorithm significantly reduced the level of distraction and improved the AR user experience by lowering visual inconsistency and artifacts.
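For concreteness, the core fill-in step described above could look like the following Python/OpenCV sketch. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a SLAM front end that supplies per-frame camera poses and a set of past keyframes, picks the keyframe whose camera position is nearest to the current one as the source frame, relates the two views with ORB features and the MAGSAC++ robust estimator (both available in OpenCV), and warps the source over the masked region. All function and variable names are hypothetical.

```python
import cv2
import numpy as np

def inpaint_from_source_frame(cur_gray, cur_bgr, mask, keyframes, cur_pose):
    """Fill the masked region of the current frame from a past keyframe.

    Minimal sketch, not the paper's implementation. `keyframes` is assumed
    to be a list of (gray, bgr, pose) tuples from a SLAM front end, where
    `pose` is a 4x4 camera-to-world matrix; `mask` is 255 on the pixels to
    replace (e.g., a segmented pedestrian).
    """
    # 1. Pick the keyframe whose camera position is closest to the current one.
    cur_t = cur_pose[:3, 3]
    src_gray, src_bgr, _ = min(
        keyframes, key=lambda kf: np.linalg.norm(kf[2][:3, 3] - cur_t))

    # 2. Match ORB features, ignoring the masked (dynamic) region of the
    #    current frame so the pedestrian does not contaminate the matches.
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(src_gray, None)
    kp2, des2 = orb.detectAndCompute(cur_gray, cv2.bitwise_not(mask))
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

    src_pts = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # 3. Robustly estimate the source-to-current homography with MAGSAC++.
    H, _ = cv2.findHomography(src_pts, dst_pts, cv2.USAC_MAGSAC, 3.0)

    # 4. Warp the source frame into the current view and composite it
    #    into the masked region only.
    h, w = cur_bgr.shape[:2]
    warped = cv2.warpPerspective(src_bgr, H, (w, h))
    out = cur_bgr.copy()
    out[mask > 0] = warped[mask > 0]
    return out
```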
Notes
The dataset is available online at this site.
Acknowledgements
This research was supported in part by the IITP/MSIT of Korea under the ITRC support program (IITP-2021-2016-0-00312), by the KEA/KIAT/MOTIE Competency Development Program for Industry Specialist (N000999), and by the NRF Korea through the Basic Science Research Program (2019R1A2C1086649).
Ethics declarations
Conflict of interest
Both Taehyung Kim and Gerard J. Kim declare that they have no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Questionnaire
Distraction
DS1 | I was not able to concentrate on the AR scene because of the passerby roaming around in the background
DS2 | The passerby’s existence bothered me when observing and interacting with the virtual avatar
DS3 | I became conscious of the people passing across the AR screen
DS4 | I did not pay attention to the passerby
Visual inconsistency
VI1 | The visual mismatch of the passerby between outside and inside the screen was obvious to me
VI2 | The different visual representations of the passerby in the AR scene felt awkward
VI3 | I did not notice the visual inconsistency between the AR scene and the real scene
VI4 | The passerby’s body parts in the AR scene did not feel awkward at all
Object presence
OP1 | I felt like the avatar was a part of the environment
OP2 | I did not feel like the avatar was actually there in the environment
OP3 | It seemed as though an avatar was present in the environment
OP4 | I felt as though an avatar was physically present in the environment
About this article
Cite this article
Kim, T., Kim, G.J. Real-time and on-line removal of moving human figures in hand-held mobile augmented reality. Vis Comput 39, 2571–2582 (2023). https://doi.org/10.1007/s00371-022-02479-1