Abstract
Database technology can now host multimedia applications through the representation of sounds and images, but such new applications also require extensions to HCI technology. This paper examines the problems of querying and manipulating audio information. We argue that no single “style” of user interface can provide a complete solution, and propose two novel types of interface to complement conventional database languages. The first is gestural, and allows users literally to reach into spaces of sounds and to “grab” the required objects. The second involves retrieval by mimicry. The main part of this paper describes our research into the viability of the gestural interface. We have experimented using the ISEE (Intuitive Sound Editing Environment) interface, a four-dimensional perceptually-based space of sounds. Our experiments have involved a user population and a range of multidimensional input devices, and have provided strong evidence that the approach is viable, but that the choice of input devices has a significant impact on the usability of the system. The second proposed interface, which we are currently researching, involves the use of neural networks within the data model to derive perceptually-based attributes. The neural networks can be trained on expertly created sound spaces, together with vocal imitations of the sounds, and subsequently used to retrieve on the basis of vocal imitations of the required sounds.
Copyright information
© 1995 Springer-Verlag London
About this paper
Cite this paper
Eaglestone, B., Vertegaal, R. (1995). Intuitive Human Interfaces for an Audio-database. In: Sawyer, P. (eds) Interfaces to Database Systems (IDS94). Workshops in Computing. Springer, London. https://doi.org/10.1007/978-1-4471-3818-1_18
DOI: https://doi.org/10.1007/978-1-4471-3818-1_18
Publisher Name: Springer, London
Print ISBN: 978-3-540-19910-6
Online ISBN: 978-1-4471-3818-1
eBook Packages: Springer Book Archive