See ColOr: Seeing Colours with an Orchestra

  • Chapter
Human Machine Interaction

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5440)

Abstract

The See ColOr interface transforms a small portion of a coloured video image into sound sources rendered as spatialised musical instruments. The conversion of colours into sounds is achieved by quantising the HSL (Hue, Saturation and Luminosity) colour system. Our purpose is to enable visually impaired individuals to perceive their environment in real time. In this work we present the system’s design principles and several experiments carried out by blindfolded participants. The goal of the first experiment was to identify the colours of the main features in static pictures in order to interpret the image scenes. Participants found that colours helped to narrow down the possible interpretations of the images.
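
To make the quantisation concrete, the sketch below maps a pixel’s hue to an instrument and its luminosity to a coarse pitch band. It is a minimal illustration only: the instrument table and the luminosity-to-pitch rule are assumptions made for this sketch, not the exact mapping used by See ColOr, and spatialisation and the saturation dimension are omitted.

    import colorsys

    # Illustrative hue-to-instrument bins (placeholder names, not the
    # authors' published mapping). Hue is in degrees, [0, 360).
    HUE_INSTRUMENTS = [
        (60.0, "oboe"),        # reds/oranges
        (120.0, "flute"),      # yellows/greens
        (180.0, "trumpet"),    # greens/cyans
        (240.0, "piano"),      # blues
        (300.0, "saxophone"),  # violets
        (360.0, "oboe"),       # magentas wrap back towards red
    ]

    def sonify_pixel(r, g, b):
        """Quantise one RGB pixel via HSL into an (instrument, pitch) pair."""
        # colorsys expects [0, 1] floats and returns hue, lightness, saturation.
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        hue_deg = h * 360.0
        instrument = next(name for bound, name in HUE_INSTRUMENTS if hue_deg < bound)
        # Assumed rule for illustration: luminosity selects a coarse pitch band.
        pitch = ("low", "mid", "high")[min(int(l * 3), 2)]
        return instrument, pitch

    print(sonify_pixel(200, 30, 30))  # a saturated red -> ('oboe', 'mid')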

Two further experiments were then performed with a head-mounted camera. The first pertains to object manipulation and is based on pairing coloured socks, while the second relates to outdoor navigation, with the goal of following a sinuous coloured path painted on the ground. The socks experiment demonstrated that blindfolded individuals were able to match pairs of coloured socks accurately, and the same participants then successfully followed a red serpentine path for more than 80 metres.

Finally, we propose an original approach to real-time alerting based on the detection of visually salient parts of videos. The particularity of our approach lies in the use of a new feature map constructed from the depth gradient. From the computed feature maps we infer conspicuity maps that indicate areas appreciably different from their surroundings. We then describe a specific distance function that takes into account both the limitations of the stereoscopic camera and the user’s choices. We also report how we automatically estimate the relative contribution of each conspicuity map, which enables the unsupervised determination of the final saliency map, indicating the visual salience of every point in the image. We demonstrate that this additional depth-based feature map allows the system to detect salient regions accurately in most situations, even in the presence of noisy disparity maps.
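
The fusion of conspicuity maps can be sketched as follows, in Python with NumPy. The crude centre-surround normalisation and the variance-based weighting below are simplifying assumptions that stand in for the chapter’s actual conspicuity operator and its unsupervised estimate of each map’s contribution; only the depth-gradient feature map follows directly from the description above.

    import numpy as np

    def conspicuity(feature, block=8):
        """Crude centre-surround normalisation of a feature map to [0, 1].

        A coarse downscale/upscale stands in for the surround; regions that
        differ from their surround keep a positive response.
        """
        h, w = feature.shape
        small = feature[::block, ::block]
        surround = np.kron(small, np.ones((block, block)))[:h, :w]
        c = np.clip(feature - surround, 0.0, None)
        return c / (c.max() + 1e-9)

    def saliency_map(intensity, depth):
        """Fuse intensity and depth-gradient conspicuity maps into one map.

        The variance-based weights are an assumption, standing in for the
        unsupervised contribution estimate described in the chapter.
        """
        gy, gx = np.gradient(depth.astype(float))
        depth_grad = np.hypot(gx, gy)  # the depth-gradient feature map
        maps = [conspicuity(intensity.astype(float)), conspicuity(depth_grad)]
        weights = np.array([m.var() for m in maps])
        weights = weights / (weights.sum() + 1e-9)
        return sum(w * m for w, m in zip(weights, maps))

    # Toy usage: a square that pops out in depth dominates the saliency map.
    img = np.zeros((64, 64)); img[24:40, 24:40] = 1.0
    dep = np.ones((64, 64));  dep[24:40, 24:40] = 2.0
    print(saliency_map(img, dep).max())  # prints roughly 1.0 at the square's border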





Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Deville, B., Bologna, G., Vinckenbosch, M., Pun, T. (2009). See ColOr: Seeing Colours with an Orchestra. In: Lalanne, D., Kohlas, J. (eds) Human Machine Interaction. Lecture Notes in Computer Science, vol 5440. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00437-7_10

  • DOI: https://doi.org/10.1007/978-3-642-00437-7_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00436-0

  • Online ISBN: 978-3-642-00437-7

  • eBook Packages: Computer Science, Computer Science (R0)
