Research article · Free Access
DOI: 10.1145/3652212.3652213

An Extensible Architecture for Recognizing Sensory Effects in 360° Images

Published: 15 April 2024

ABSTRACT

The use of 360° content with sensory effects can enhance user immersion. However, creating such effects is complex and time-consuming, as authors must annotate the spatial position (i.e., the "origin" of each effect) within the 360° scene. To tackle this multimedia authoring issue, this paper presents an extensible architecture for automatically recognizing sensory effects in 360° images. The architecture is based on a data treatment strategy that divides the multimedia content into several manageable parts, operates on each part independently, and then joins the responses. The proposed architecture can take advantage of the diversity of existing recognition solutions and adapt to an author-supplied configuration. We also propose an implementation that provides three effect recognition modules, including a neural network for locating effects in equirectangular projections and a computer vision algorithm for sun localization. The results offer valuable insights into the effectiveness of the system and highlight areas for improvement.
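The divide/operate/join strategy described in the abstract can be sketched as follows. This is an illustrative sketch only: the strip-based partitioning of the equirectangular image, the module call signature, and the highest-score merge policy are assumptions for the example, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Effect:
    kind: str      # e.g. "wind", "light" (hypothetical effect labels)
    yaw: float     # horizontal origin of the effect, in degrees [0, 360)
    score: float   # recognizer confidence

def split_equirectangular(width, n_parts):
    """Divide the image's pixel columns into n contiguous vertical strips."""
    step = width / n_parts
    return [(round(i * step), round((i + 1) * step)) for i in range(n_parts)]

def recognize(strips, modules, width):
    """Run every recognition module on each strip independently, map
    strip-local x positions back to a global yaw angle, then join results."""
    results = []
    for x0, x1 in strips:
        for module in modules:
            # each module returns (kind, local_x, score) tuples for its strip
            for kind, local_x, score in module(x0, x1):
                yaw = 360.0 * (x0 + local_x) / width
                results.append(Effect(kind, yaw, score))
    # joining step: keep the highest-scoring detection per effect kind
    best = {}
    for e in results:
        if e.kind not in best or e.score > best[e.kind].score:
            best[e.kind] = e
    return list(best.values())
```

A recognition module here is just a callable over a strip; for instance, a hypothetical sun-localization module would return a `("light", x, score)` tuple when the sun falls inside its strip, and the join step would resolve duplicate detections across overlapping modules.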


Published in

MMVE '24: Proceedings of the 16th International Workshop on Immersive Mixed and Virtual Environment Systems
April 2024, 101 pages
ISBN: 9798400706189
DOI: 10.1145/3652212

          Copyright © 2024 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


          Qualifiers

          • research-article
          • Research
          • Refereed limited

Acceptance Rates

Overall acceptance rate: 26 of 44 submissions, 59%
