An Extensible Architecture for Recognizing Sensory Effects in 360° Images

ABSTRACT
The use of 360° content with sensory effects can enhance user immersion. However, creating such effects is complex and time-consuming, as authors must annotate the spatial position (i.e., the "origin" of each effect) in 360° space. To tackle this multimedia authoring issue, this paper presents an extensible architecture that automatically recognizes sensory effects in 360° images. The architecture is based on a data-treatment strategy that divides the multimedia content into manageable parts, operates on each part independently, and then joins the responses. It can take advantage of diverse recognition solutions and adapt to an author-provided configuration. We also describe an implementation that provides three effect-recognition modules, including a neural network for locating effects in equirectangular projections and a computer vision algorithm for sun localization. The results offer insight into the effectiveness of the system and highlight areas for improvement.
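The divide/process/join strategy described above can be sketched in a few lines. This is an illustrative example, not the paper's implementation: all names (`EffectAnnotation`, `recognize`, `brightness_recognizer`, the tile grid layout, and the confidence threshold) are hypothetical, and the toy brightness recognizer merely stands in for the sun-localization module the paper describes.

```python
# Sketch (assumed, not the paper's code): split an equirectangular 360° image
# into a grid of tiles, run pluggable effect recognizers on each tile
# independently, then join the per-tile responses into spatial annotations.
from dataclasses import dataclass
from typing import List


@dataclass
class EffectAnnotation:
    effect: str        # e.g. "wind", "heat", "light"
    yaw_deg: float     # horizontal origin of the effect, 0..360
    pitch_deg: float   # vertical origin of the effect, -90..90
    confidence: float


def tile_to_sphere(col, row, n_cols, n_rows):
    """Map a tile's center in the equirectangular grid to yaw/pitch degrees."""
    yaw = (col + 0.5) / n_cols * 360.0
    pitch = 90.0 - (row + 0.5) / n_rows * 180.0
    return yaw, pitch


def recognize(tiles, recognizers, n_cols, n_rows, min_conf=0.5):
    """Divide: run every recognizer on every tile independently.
    Join: keep detections above an author-configured confidence threshold."""
    results: List[EffectAnnotation] = []
    for row in range(n_rows):
        for col in range(n_cols):
            for rec in recognizers:
                for effect, conf in rec(tiles[row][col]):
                    if conf >= min_conf:
                        yaw, pitch = tile_to_sphere(col, row, n_cols, n_rows)
                        results.append(EffectAnnotation(effect, yaw, pitch, conf))
    return results


def brightness_recognizer(tile):
    """Toy recognizer: flags 'light' where mean tile brightness is high."""
    mean = sum(tile) / len(tile)
    return [("light", mean / 255.0)] if mean > 200 else []


# 2x4 grid of fake single-pixel tiles; the bright tile sits at row 0, col 3.
tiles = [[[30], [40], [50], [250]],
         [[20], [25], [35], [45]]]
annotations = recognize(tiles, [brightness_recognizer], n_cols=4, n_rows=2)
print(annotations[0].effect, annotations[0].yaw_deg, annotations[0].pitch_deg)
# → light 315.0 45.0
```

Because each tile is processed independently, new recognizer modules can be plugged in without touching the split or join steps, which is the extensibility property the architecture claims.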