DOI: 10.1145/3539637.3557930

360RAT: A Tool for Annotating Regions of Interest in 360-degree Videos

Published: 07 November 2022

Abstract

This paper introduces 360RAT, a software tool for annotating regions of interest (RoIs) in 360-degree videos. These regions represent the portions of the video content that are important for telling a story throughout the video. We believe this software is a valuable tool for studying different aspects of 360-degree videos, including what viewers consider relevant and interesting to the experience. As part of this work, we conducted a subjective experiment in which 9 human observers used the proposed software to annotate 11 360-degree videos. As a result, we created a dataset of annotated 360-degree videos, i.e., videos with marked RoIs and their semantic classification. We present a simple analysis of the annotations gathered in the experiment for a subset of the videos. We observed higher agreement among participants' annotations for videos containing fewer objects. We also compared the RoI maps with saliency maps computed with the Cube Padding saliency model and found a strong correlation between them, indicating a link between the annotated RoIs and the saliency properties of the content.
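The abstract does not state how the RoI maps and the saliency maps were compared; as an illustration only, the sketch below computes the Pearson correlation coefficient (CC), a standard saliency-evaluation metric, between a binary RoI map and a predicted saliency map for a single equirectangular frame. The file names, array shapes, and the choice of CC are assumptions made for this sketch, not the paper's documented procedure.

```python
# Illustrative sketch (not the paper's exact procedure): compare an annotated
# RoI map with a model-generated saliency map using the Pearson correlation
# coefficient (CC). File names and array shapes below are hypothetical.
import numpy as np

def pearson_cc(roi_map: np.ndarray, saliency_map: np.ndarray) -> float:
    """Pearson correlation between two maps of identical (H, W) shape."""
    a = roi_map.astype(np.float64).ravel()
    b = saliency_map.astype(np.float64).ravel()
    a = (a - a.mean()) / (a.std() + 1e-12)   # z-score each map
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))

# Hypothetical inputs: a binary RoI mask exported from the annotation tool and
# a saliency map predicted for the same equirectangular frame.
roi = np.load("frame_0001_roi.npy")        # shape (H, W), values in {0, 1}
sal = np.load("frame_0001_saliency.npy")   # shape (H, W), values in [0, 1]
print(f"CC = {pearson_cc(roi, sal):.3f}")
```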



Published In

WebMedia '22: Proceedings of the Brazilian Symposium on Multimedia and the Web
November 2022
389 pages
ISBN:9781450394093
DOI:10.1145/3539637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. 360-degree video
  2. attention guiding
  3. cinematic VR
  4. user experience

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • CAPES
  • FAPESP

Conference

WebMedia '22
WebMedia '22: Brazilian Symposium on Multimedia and Web
November 7 - 11, 2022
Curitiba, Brazil

Acceptance Rates

Overall Acceptance Rate 270 of 873 submissions, 31%


Cited By

  • (2024) 360Align: An Open Dataset and Software for Investigating QoE and Head Motion in 360° Videos with Alignment Edits. Proceedings of the 2024 ACM International Conference on Interactive Media Experiences, 41-55. https://doi.org/10.1145/3639701.3656311
  • (2024) Object Detection with YOLOv5 in Indoor Equirectangular Panoramas. Procedia Computer Science 225:C, 2420-2428. https://doi.org/10.1016/j.procs.2023.10.233
  • (2023) Impact of Alignment Edits on the Quality of Experience of 360° Videos. IEEE Access 11, 108475-108492. https://doi.org/10.1109/ACCESS.2023.3319346
  • (2023) Tracing the Visual Path: Gaze Direction in the 360 Video Experience. Advances in Computing, 406-415. https://doi.org/10.1007/978-3-031-47372-2_32
